What's Node.js Streams and How to Work with them?

What's Node.js Streams and How to Work with them?

In this Node.js tutorial, we'll learn what Node.js Streams is and how to work with them. Understanding Streams in Node.js: Streams in Node.js have a reputation for being hard to work with, and even harder to understand.

Streams in Node.js have a reputation for being hard to work with, and even harder to understand.

In the words of Dominic Tarr: “Streams are Node’s best and most misunderstood idea.” Even Dan Abramov, creator of Redux and core team member of React.js is afraid of Node streams.

This article will help you understand streams and how to work with them. So, don’t be afraid. We can figure this out!

What are streams?

Streams are one of the fundamental concepts that power Node.js applications. They are data-handling method and are used to read or write input into output sequentially.

Streams are a way to handle reading/writing files, network communications, or any kind of end-to-end information exchange in an efficient way.

What makes streams unique, is that instead of a program reading a file into memory all at once like in the traditional way, streams read chunks of data piece by piece, processing its content without keeping it all in memory.

This makes streams really powerful when working with large amounts of data, for example, a file size can be larger than your free memory space, making it impossible to read the whole file into the memory in order to process it. That’s where streams come to the rescue!

Using streams to process smaller chunks of data, makes it possible to read larger files.

Let’s take a “streaming” services such as YouTube or Netflix for example: these services don’t make you download the video and audio feed all at once. Instead, your browser receives the video as a continuous flow of chunks, allowing the recipients to start watching and/or listening almost immediately.

However, streams are not only about working with media or big data. They also give us the power of ‘composability’ in our code. Designing with composability in mind means several components can be combined in a certain way to produce the same type of result. In Node.js it’s possible to compose powerful pieces of code by piping data to and from other smaller pieces of code, using streams.

Why streams

Streams basically provide two major advantages compared to other data handling methods:

  1. Memory efficiency: you don’t need to load large amounts of data in memory before you are able to process it
  2. Time efficiency: it takes significantly less time to start processing data as soon as you have it, rather than having to wait with processing until the entire payload has been transmitted
There are 4 types of streams in Node.js:
  1. Writable: streams to which we can write data. For example, fs.createWriteStream() lets us write data to a file using streams.
  2. Readable: streams from which data can be read. For example: fs.createReadStream() lets us read the contents of a file.
  3. Duplex: streams that are both Readable and Writable. For example, net.Socket
  4. Transform: streams that can modify or transform the data as it is written and read. For example, in the instance of file-compression, you can write compressed data and read decompressed data to and from a file.

If you have already worked with Node.js, you may have come across streams. For example, in a Node.js based HTTP server, request is a readable stream and response is a writable stream. You might have used the fs module, which lets you work with both readable and writable file streams. Whenever you’re using Express you are using streams to interact with the client, also, streams are being used in every database connection driver that you can work with, because of TCP sockets, TLS stack and other connections are all based on Node.js streams.

A practical example

How to create a readable stream

We first require the Readable stream, and we initialize it.

const Stream = require('stream')
const readableStream = new Stream.Readable()

Now that the stream is initialized, we can send data to it:

readableStream.push('ping!')
readableStream.push('pong!')

async iterator

It’s highly recommended to use async iterator when working with streams. According to Dr. Axel Rauschmayer, Asynchronous iteration is a protocol for retrieving the contents of a data container asynchronously (meaning the current “task” may be paused before retrieving an item). Also, it’s important to mention that the stream async iterator implementation use the ‘readable’ event inside.

You can use async iterator when reading from readable streams:

import * as fs from 'fs';

async function logChunks(readable) {
  for await (const chunk of readable) {
    console.log(chunk);
  }
}

const readable = fs.createReadStream(
  'tmp/test.txt', {encoding: 'utf8'});
logChunks(readable);

// Output:
// 'This is a test!\n'

It’s also possible to collect the contents of a readable stream in a string:

import {Readable} from 'stream';

async function readableToString2(readable) {
  let result = '';
  for await (const chunk of readable) {
    result += chunk;
  }
  return result;
}

const readable = Readable.from('Good morning!', {encoding: 'utf8'});
assert.equal(await readableToString2(readable), 'Good morning!');

Note that, in this case, we had to use an async function because we wanted to return a Promise.

It’s important to keep in mind to not mix async functions with EventEmitter because currently, there is no way to catch a rejection when it is emitted within an event handler, causing hard to track bugs and memory leaks. The best current practice is to always wrap the content of an async function in a try/catch block and handle errors, but this is error prone. This pull request aims to solve this issue once it lands on Node core.

Readable.from(): Creating readable streams from iterables

stream.Readable.from(iterable, [options]) it’s a utility method for creating Readable Streams out of iterators, which holds the data contained in iterable. Iterable can be a synchronous iterable or an asynchronous iterable. The parameter options is optional and can, among other things, be used to specify a text encoding.

const { Readable } = require('stream');

async function * generate() {
  yield 'hello';
  yield 'streams';
}

const readable = Readable.from(generate());

readable.on('data', (chunk) => {
  console.log(chunk);
});

Two Reading Modes

According to Streams API, readable streams effectively operate in one of two modes: flowing and paused. A Readable stream can be in object mode or not, regardless of whether it is in flowing mode or paused mode.

  • In flowing mode, data is read from the underlying system automatically and provided to an application as quickly as possible using events via the EventEmitter interface.

  • In paused mode, the stream.read() method must be called explicitly to read chunks of data from the stream.

In a flowing mode, to read data from a stream, it’s possible to listen to data event and attach a callback. When a chunk of data is available, the readable stream emits a data event and your callback executes. Take a look at the following snippet:

var fs = require("fs");
var data = '';

var readerStream = fs.createReadStream('file.txt'); //Create a readable stream

readerStream.setEncoding('UTF8'); // Set the encoding to be utf8\. 

// Handle stream events --> data, end, and error
readerStream.on('data', function(chunk) {
   data += chunk;
});

readerStream.on('end',function() {
   console.log(data);
});

readerStream.on('error', function(err) {
   console.log(err.stack);
});

console.log("Program Ended");

The function call fs.createReadStream() gives you a readable stream. Initially, the stream is in a static state. As soon as you listen to data event and attach a callback it starts flowing. After that, chunks of data are read and passed to your callback. The stream implementor decides how often a data event is emitted. For example, an HTTP request may emit a data event once every few KBs of data are read. When you are reading data from a file you may decide you emit a data event once a line is read.

When there is no more data to read (end is reached), the stream emits an end event. In the above snippet, we listen to this event to get notified when the end is reached.

Also, if there is an error, the stream will emit and notify the error.

In paused mode, you just need to call read() on the stream instance repeatedly until every chunk of data has been read, like in the following example:

var fs = require('fs');
var readableStream = fs.createReadStream('file.txt');
var data = '';
var chunk;

readableStream.on('readable', function() {
    while ((chunk=readableStream.read()) != null) {
        data += chunk;
    }
});

readableStream.on('end', function() {
    console.log(data)
});

The read() function reads some data from the internal buffer and returns it. When there is nothing to read, it returns null. So, in the while loop, we check for null and terminate the loop. Note that the readable event is emitted when a chunk of data can be read from the stream.


All Readable streams begin in paused mode but can be switched to flowing mode in one of the following ways:

  • Adding a 'data' event handler.
  • Calling the stream.resume() method.
  • Calling the stream.pipe() method to send the data to a Writable.

The Readable can switch back to paused mode using one of the following:

  • If there are no pipe destinations, by calling the stream.pause() method.
  • If there are pipe destinations, by removing all pipe destinations. Multiple pipe destinations may be removed by calling the stream.unpipe() method.

The important concept to remember is that a Readable will not generate data until a mechanism for either consuming or ignoring that data is provided. If the consuming mechanism is disabled or taken away, the Readable will attempt to stop generating the data.
Adding a readable event handler automatically make the stream to stop flowing, and the data to be consumed via readable.read(). If the 'readable' event handler is removed, then the stream will start flowing again if there is a 'data' event handler.

How to create a writable stream

To write data to a writable stream you need to call write() on the stream instance. Like in the following example:

var fs = require('fs');
var readableStream = fs.createReadStream('file1.txt');
var writableStream = fs.createWriteStream('file2.txt');

readableStream.setEncoding('utf8');

readableStream.on('data', function(chunk) {
    writableStream.write(chunk);
});

The above code is straightforward. It simply reads chunks of data from an input stream and writes to the destination using write(). This function returns a boolean value indicating if the operation was successful. If true, then the write was successful and you can keep writing more data. If false is returned, it means something went wrong and you can’t write anything at the moment. The writable stream will let you know when you can start writing more data by emitting a drain event.

Calling the writable.end() method signals that no more data will be written to the Writable. If provided, the optional callback function is attached as a listener for the 'finish' event.

// Write 'hello, ' and then end with 'world!'.
const fs = require('fs');
const file = fs.createWriteStream('example.txt');
file.write('hello, ');
file.end('world!');
// Writing more now is not allowed!

Using a writable stream you can read data from a readable stream:

const Stream = require('stream')

const readableStream = new Stream.Readable()
const writableStream = new Stream.Writable()

writableStream._write = (chunk, encoding, next) => {
    console.log(chunk.toString())
    next()
}

readableStream.pipe(writableStream)

readableStream.push('ping!')
readableStream.push('pong!')

writableStream.end()

You can also use async iterators to write to a writable stream, which is recommended

import * as util from 'util';
import * as stream from 'stream';
import * as fs from 'fs';
import {once} from 'events';

const finished = util.promisify(stream.finished); // (A)

async function writeIterableToFile(iterable, filePath) {
  const writable = fs.createWriteStream(filePath, {encoding: 'utf8'});
  for await (const chunk of iterable) {
    if (!writable.write(chunk)) { // (B)
      // Handle backpressure
      await once(writable, 'drain');
    }
  }
  writable.end(); // (C)
  // Wait until done. Throws if there are errors.
  await finished(writable);
}

await writeIterableToFile(
  ['One', ' line of text.\n'], 'tmp/log.txt');
assert.equal(
  fs.readFileSync('tmp/log.txt', {encoding: 'utf8'}),
  'One line of text.\n');

The default version of stream.finished() is callback-based but can be turned into a Promise-based version via util.promisify() (line A).

In this example, it is used the following two patterns:

Writing to a writable stream while handling backpressure (line B):

if (!writable.write(chunk)) {
  await once(writable, 'drain');
}

Closing a writable stream and waiting until writing is done (line C):

writable.end();
await finished(writable);

pipeline()

Piping is a mechanism where we provide the output of one stream as the input to another stream. It is normally used to get data from one stream and to pass the output of that stream to another stream. There is no limit on piping operations. In other words, piping is used to process streamed data in multiple steps.

In Node 10.x was introduced stream.pipeline(). This is a module method to pipe between streams forwarding errors and properly cleaning up and provide a callback when the pipeline is complete.

Here is an example of using pipeline:

const { pipeline } = require('stream');
const fs = require('fs');
const zlib = require('zlib');

// Use the pipeline API to easily pipe a series of streams
// together and get notified when the pipeline is fully done.
// A pipeline to gzip a potentially huge video file efficiently:

pipeline(
  fs.createReadStream('The.Matrix.1080p.mkv'),
  zlib.createGzip(),
  fs.createWriteStream('The.Matrix.1080p.mkv.gz'),
  (err) => {
    if (err) {
      console.error('Pipeline failed', err);
    } else {
      console.log('Pipeline succeeded');
    }
  }
);

pipeline should be used instead of pipe, as pipe is unsafe.

The Stream Module

The Node.js stream module provides the foundation upon which all streaming APIs are build.

The Stream module is a native module that shipped by default in Node.js. The Stream is an instance of the EventEmitter class which handles events asynchronously in Node. Because of this, streams are inherently event-based.

To access the stream module:

const stream = require('stream');

The stream module is useful for creating new types of stream instances. It is usually not necessary to use the stream module to consume streams.

Streams-powered Node APIs

Due to their advantages, many Node.js core modules provide native stream handling capabilities, most notably:

  • net.Socket is the main node api that is stream are based on, which underlies most of the following APIs
  • process.stdin returns a stream connected to stdin
  • process.stdout returns a stream connected to stdout
  • process.stderr returns a stream connected to stderr
  • fs.createReadStream() creates a readable stream to a file
  • fs.createWriteStream() creates a writable stream to a file
  • net.connect() initiates a stream-based connection
  • http.request() returns an instance of the http.ClientRequest class, which is a writable stream
  • zlib.createGzip() compress data using gzip (a compression algorithm) into a stream
  • zlib.createGunzip() decompress a gzip stream.
  • zlib.createDeflate() compress data using deflate (a compression algorithm) into a stream
  • zlib.createInflate() decompress a deflate stream

Streams Cheat Sheet:

Here are some important events related to writable streams:

  • error – Emitted to indicate that an error has occurred while writing/piping.
  • pipeline – When a readable stream is piped into a writable stream, this event is emitted by the writable stream.
  • unpipe – Emitted when you call unpipe on the readable stream and stop it from piping into the destination stream.
Conclusion

This was all about the basics of streams. Streams, pipes, and chaining are the core and most powerful features in Node.js. Streams can indeed help you write neat and performant code to perform I/O.

Also, there is a Node.js strategic initiative worth looking to, called BOB, aiming to improve Node.js streaming data interfaces, both within Node.js core internally, and hopefully also as future public APIs.

Node.js Streams: Stream Into the Future

Node.js Streams: Stream Into the Future

Node.js Streams: Stream Into the Future. In this Node.js Streams tutorial, a Node.js Core Streams maintainer presents a stream-less future by demonstrating how to use pure JavaScript: Async Iterators and Generators can give us everything Streams can while being completely cross-platform and highly performant.

Stream Into the Future

There was a time when Node.js streams were all the rage but over time the Node.js Core Streams codebase became extremely complex and hard to understand. Worse still, WHATWG introduced an API for browser Streams. The two Streams API’s are incompatible with each other and both are complex and leaky abstractions. In this talk, a Node.js Core Streams maintainer presents a stream-less future by demonstrating how to use pure JavaScript: Async Iterators and Generators can give us everything Streams can while being completely cross-platform and highly performant.

Everything You Need to Understand Streams in Node.js

Everything You Need to Understand Streams in Node.js

In this Node.js Streams tutorial, you'll see everything You need to understand Streams in Node.js. Node.js Streams have a reputation for being hard to work with, and even harder to understand

Node.js streams have a reputation for being hard to work with, and even harder to understand. Well I’ve got good news for you — that’s no longer the case.

Over the years, developers created lots of packages out there with the sole purpose of making working with streams easier. But in this article, I’m going to focus on the native Node.js stream API.

“Streams are Node’s best and most misunderstood idea.”

— Dominic Tarr

What exactly are streams?

Streams are collections of data — just like arrays or strings. The difference is that streams might not be available all at once, and they don’t have to fit in memory. This makes streams really powerful when working with large amounts of data, or data that’s coming from an external source one chunk at a time.

However, streams are not only about working with big data. They also give us the power of composability in our code. Just like we can compose powerful linux commands by piping other smaller Linux commands, we can do exactly the same in Node with streams.

const grep = ... // A stream for the grep output
const wc = ... // A stream for the wc input

grep.pipe(wc)

Many of the built-in modules in Node implement the streaming interface:

The list above has some examples for native Node.js objects that are also readable and writable streams. Some of these objects are both readable and writable streams, like TCP sockets, zlib and crypto streams.

Notice that the objects are also closely related. While an HTTP response is a readable stream on the client, it’s a writable stream on the server. This is because in the HTTP case, we basically read from one object (http.IncomingMessage) and write to the other (http.ServerResponse).

Also note how the stdio streams (stdin, stdout, stderr) have the inverse stream types when it comes to child processes. This allows for a really easy way to pipe to and from these streams from the main process stdio streams.

A streams practical example

Theory is great, but often not 100% convincing. Let’s see an example demonstrating the difference streams can make in code when it comes to memory consumption.

Let’s create a big file first:

const fs = require('fs');
const file = fs.createWriteStream('./big.file');

for(let i=0; i<= 1e6; i++) {
  file.write('Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.\n');
}

file.end();

Look what I used to create that big file. A writable stream!

The fs module can be used to read from and write to files using a stream interface. In the example above, we’re writing to that big.file through a writable stream 1 million lines with a loop.

Running the script above generates a file that’s about ~400 MB.

Here’s a simple Node web server designed to exclusively serve the big.file:

const fs = require('fs');
const server = require('http').createServer();

server.on('request', (req, res) => {
  fs.readFile('./big.file', (err, data) => {
    if (err) throw err;

    res.end(data);
  });
});

server.listen(8000);

When the server gets a request, it’ll serve the big file using the asynchronous method, fs.readFile. But hey, it’s not like we’re blocking the event loop or anything. Every thing is great, right? Right?

Well, let’s see what happens when we run the server, connect to it, and monitor the memory while doing so.

When I ran the server, it started out with a normal amount of memory, 8.7 MB:

Then I connected to the server. Note what happened to the memory consumed:

Wow — the memory consumption jumped to 434.8 MB.

We basically put the whole big.file content in memory before we wrote it out to the response object. This is very inefficient.

The HTTP response object (res in the code above) is also a writable stream. This means if we have a readable stream that represents the content of big.file, we can just pipe those two on each other and achieve mostly the same result without consuming ~400 MB of memory.

Node’s fs module can give us a readable stream for any file using the createReadStream method. We can pipe that to the response object:

const fs = require('fs');
const server = require('http').createServer();

server.on('request', (req, res) => {
  const src = fs.createReadStream('./big.file');
  src.pipe(res);
});

server.listen(8000);

Now when you connect to this server, a magical thing happens (look at the memory consumption):

What’s happening?

When a client asks for that big file, we stream it one chunk at a time, which means we don’t buffer it in memory at all. The memory usage grew by about 25 MB and that’s it.

You can push this example to its limits. Regenerate the big.file with five million lines instead of just one million, which would take the file to well over 2 GB, and that’s actually bigger than the default buffer limit in Node.

If you try to serve that file using fs.readFile, you simply can’t, by default (you can change the limits). But with fs.createReadStream, there is no problem at all streaming 2 GB of data to the requester, and best of all, the process memory usage will roughly be the same.

Streams 101

There are four fundamental stream types in Node.js: Readable, Writable, Duplex, and Transform streams.

  • A readable stream is an abstraction for a source from which data can be consumed. An example of that is the fs.createReadStream method.
  • A writable stream is an abstraction for a destination to which data can be written. An example of that is the fs.createWriteStream method.
  • A duplex streams is both Readable and Writable. An example of that is a TCP socket.
  • A transform stream is basically a duplex stream that can be used to modify or transform the data as it is written and read. An example of that is the zlib.createGzip stream to compress the data using gzip. You can think of a transform stream as a function where the input is the writable stream part and the output is readable stream part. You might also hear transform streams referred to as “through streams.”

All streams are instances of EventEmitter. They emit events that can be used to read and write data. However, we can consume streams data in a simpler way using the pipe method.

The pipe method

Here’s the magic line that you need to remember:

readableSrc.pipe(writableDest)

In this simple line, we’re piping the output of a readable stream — the source of data, as the input of a writable stream — the destination. The source has to be a readable stream and the destination has to be a writable one. Of course, they can both be duplex/transform streams as well. In fact, if we’re piping into a duplex stream, we can chain pipe calls just like we do in Linux:

readableSrc
  .pipe(transformStream1)
  .pipe(transformStream2)
  .pipe(finalWrtitableDest)

The pipe method returns the destination stream, which enabled us to do the chaining above. For streams a (readable), b and c (duplex), and d (writable), we can:

a.pipe(b).pipe(c).pipe(d)

# Which is equivalent to:
a.pipe(b)
b.pipe(c)
c.pipe(d)

# Which, in Linux, is equivalent to:
$ a | b | c | d

The pipe method is the easiest way to consume streams. It’s generally recommended to either use the pipe method or consume streams with events, but avoid mixing these two. Usually when you’re using the pipe method you don’t need to use events, but if you need to consume the streams in more custom ways, events would be the way to go.

Stream events

Beside reading from a readable stream source and writing to a writable destination, the pipe method automatically manages a few things along the way. For example, it handles errors, end-of-files, and the cases when one stream is slower or faster than the other.

However, streams can also be consumed with events directly. Here’s the simplified event-equivalent code of what the pipe method mainly does to read and write data:

# readable.pipe(writable)

readable.on('data', (chunk) => {
  writable.write(chunk);
});

readable.on('end', () => {
  writable.end();
});

Here’s a list of the important events and functions that can be used with readable and writable streams:

The events and functions are somehow related because they are usually used together.

The most important events on a readable stream are:

  • The data event, which is emitted whenever the stream passes a chunk of data to the consumer
  • The end event, which is emitted when there is no more data to be consumed from the stream.

The most important events on a writable stream are:

  • The drain event, which is a signal that the writable stream can receive more data.
  • The finish event, which is emitted when all data has been flushed to the underlying system.

Events and functions can be combined to make for a custom and optimized use of streams. To consume a readable stream, we can use the pipe/unpipe methods, or the read/unshift/resume methods. To consume a writable stream, we can make it the destination of pipe/unpipe, or just write to it with the write method and call the end method when we’re done.

Paused and Flowing Modes of Readable Streams

Readable streams have two main modes that affect the way we can consume them:

  • They can be either in the paused mode
  • Or in the flowing mode

Those modes are sometimes referred to as pull and push modes.

All readable streams start in the paused mode by default but they can be easily switched to flowing and back to paused when needed. Sometimes, the switching happens automatically.

When a readable stream is in the paused mode, we can use the read() method to read from the stream on demand, however, for a readable stream in the flowing mode, the data is continuously flowing and we have to listen to events to consume it.

In the flowing mode, data can actually be lost if no consumers are available to handle it. This is why, when we have a readable stream in flowing mode, we need a data event handler. In fact, just adding a data event handler switches a paused stream into flowing mode and removing the data event handler switches the stream back to paused mode. Some of this is done for backward compatibility with the older Node streams interface.

To manually switch between these two stream modes, you can use the resume() and pause() methods.

When consuming readable streams using the pipe method, we don’t have to worry about these modes as pipe manages them automatically.

Implementing Streams

When we talk about streams in Node.js, there are two main different tasks:

  • The task of implementing the streams.
  • The task of consuming them.

So far we’ve been talking about only consuming streams. Let’s implement some!

Stream implementers are usually the ones who require the stream module.

Implementing a Writable Stream

To implement a writable stream, we need to to use the Writable constructor from the stream module.

const { Writable } = require('stream');

We can implement a writable stream in many ways. We can, for example, extend the Writable constructor if we want

class myWritableStream extends Writable {
}

However, I prefer the simpler constructor approach. We just create an object from the Writable constructor and pass it a number of options. The only required option is a write function which exposes the chunk of data to be written.

const { Writable } = require('stream');

const outStream = new Writable({
  write(chunk, encoding, callback) {
    console.log(chunk.toString());
    callback();
  }
});

process.stdin.pipe(outStream);

This write method takes three arguments.

  • The chunk is usually a buffer unless we configure the stream differently.
  • The encoding argument is needed in that case, but usually we can ignore it.
  • The callback is a function that we need to call after we’re done processing the data chunk. It’s what signals whether the write was successful or not. To signal a failure, call the callback with an error object.

In outStream, we simply console.log the chunk as a string and call the callback after that without an error to indicate success. This is a very simple and probably not so useful echo stream. It will echo back anything it receives.

To consume this stream, we can simply use it with process.stdin, which is a readable stream, so we can just pipe process.stdin into our outStream.

When we run the code above, anything we type into process.stdin will be echoed back using the outStream console.log line.

This is not a very useful stream to implement because it’s actually already implemented and built-in. This is very much equivalent to process.stdout. We can just pipe stdin into stdout and we’ll get the exact same echo feature with this single line:

process.stdin.pipe(process.stdout);

Implement a Readable Stream

To implement a readable stream, we require the Readable interface, and construct an object from it, and implement a read() method in the stream’s configuration parameter:

const { Readable } = require('stream');

const inStream = new Readable({
  read() {}
});

There is a simple way to implement readable streams. We can just directly push the data that we want the consumers to consume.

const { Readable } = require('stream'); 

const inStream = new Readable({
  read() {}
});

inStream.push('ABCDEFGHIJKLM');
inStream.push('NOPQRSTUVWXYZ');

inStream.push(null); // No more data

inStream.pipe(process.stdout);

When we push a null object, that means we want to signal that the stream does not have any more data.

To consume this simple readable stream, we can simply pipe it into the writable stream process.stdout.

When we run the code above, we’ll be reading all the data from inStream and echoing it to the standard out. Very simple, but also not very efficient.

We’re basically pushing all the data in the stream before piping it to process.stdout. The much better way is to push data on demand, when a consumer asks for it. We can do that by implementing the read() method in the configuration object:

const inStream = new Readable({
  read(size) {
    // there is a demand on the data... Someone wants to read it.
  }
});

When the read method is called on a readable stream, the implementation can push partial data to the queue. For example, we can push one letter at a time, starting with character code 65 (which represents A), and incrementing that on every push:

const inStream = new Readable({
  read(size) {
    this.push(String.fromCharCode(this.currentCharCode++));
    if (this.currentCharCode > 90) {
      this.push(null);
    }
  }
});

inStream.currentCharCode = 65;

inStream.pipe(process.stdout);

While the consumer is reading a readable stream, the read method will continue to fire, and we’ll push more letters. We need to stop this cycle somewhere, and that’s why an if statement to push null when the currentCharCode is greater than 90 (which represents Z).

This code is equivalent to the simpler one we started with but now we’re pushing data on demand when the consumer asks for it. You should always do that.

Implementing Duplex/Transform Streams

With Duplex streams, we can implement both readable and writable streams with the same object. It’s as if we inherit from both interfaces.

Here’s an example duplex stream that combines the two writable and readable examples implemented above:

const { Duplex } = require('stream');

const inoutStream = new Duplex({
  write(chunk, encoding, callback) {
    console.log(chunk.toString());
    callback();
  },

  read(size) {
    this.push(String.fromCharCode(this.currentCharCode++));
    if (this.currentCharCode > 90) {
      this.push(null);
    }
  }
});

inoutStream.currentCharCode = 65;

process.stdin.pipe(inoutStream).pipe(process.stdout);

By combining the methods, we can use this duplex stream to read the letters from A to Z and we can also use it for its echo feature. We pipe the readable stdin stream into this duplex stream to use the echo feature and we pipe the duplex stream itself into the writable stdout stream to see the letters A through Z.

It’s important to understand that the readable and writable sides of a duplex stream operate completely independently from one another. This is merely a grouping of two features into an object.

A transform stream is the more interesting duplex stream because its output is computed from its input.

For a transform stream, we don’t have to implement the read or write methods, we only need to implement a transform method, which combines both of them. It has the signature of the write method and we can use it to push data as well.

Here’s a simple transform stream which echoes back anything you type into it after transforming it to upper case format:

const { Transform } = require('stream');

const upperCaseTr = new Transform({
  transform(chunk, encoding, callback) {
    this.push(chunk.toString().toUpperCase());
    callback();
  }
});

process.stdin.pipe(upperCaseTr).pipe(process.stdout);

In this transform stream, which we’re consuming exactly like the previous duplex stream example, we only implemented a transform() method. In that method, we convert the chunk into its upper case version and then push that version as the readable part.

Streams Object Mode

By default, streams expect Buffer/String values. There is an objectMode flag that we can set to have the stream accept any JavaScript object.

Here’s a simple example to demonstrate that. The following combination of transform streams makes for a feature to map a string of comma-separated values into a JavaScript object. So “a,b,c,d” becomes {a: b, c: d}.

const { Transform } = require('stream');

const commaSplitter = new Transform({
  readableObjectMode: true,

  transform(chunk, encoding, callback) {
    this.push(chunk.toString().trim().split(','));
    callback();
  }
});

const arrayToObject = new Transform({
  readableObjectMode: true,
  writableObjectMode: true,

  transform(chunk, encoding, callback) {
    const obj = {};
    for(let i=0; i < chunk.length; i+=2) {
      obj[chunk[i]] = chunk[i+1];
    }
    this.push(obj);
    callback();
  }
});

const objectToString = new Transform({
  writableObjectMode: true,

  transform(chunk, encoding, callback) {
    this.push(JSON.stringify(chunk) + '\n');
    callback();
  }
});

process.stdin
  .pipe(commaSplitter)
  .pipe(arrayToObject)
  .pipe(objectToString)
  .pipe(process.stdout)

We pass the input string (for example, “a,b,c,d”) through commaSplitter which pushes an array as its readable data ([“a”, “b”, “c”, “d”]). Adding the readableObjectMode flag on that stream is necessary because we’re pushing an object there, not a string.

We then take the array and pipe it into the arrayToObject stream. We need a writableObjectMode flag to make that stream accept an object. It’ll also push an object (the input array mapped into an object) and that’s why we also needed the readableObjectMode flag there as well. The last objectToString stream accepts an object but pushes out a string, and that’s why we only needed a writableObjectMode flag there. The readable part is a normal string (the stringified object).

Node’s built-in transform streams

Node has a few very useful built-in transform streams. Namely, the zlib and crypto streams.

Here’s an example that uses the zlib.createGzip() stream combined with the fs readable/writable streams to create a file-compression script:

const fs = require('fs');
const zlib = require('zlib');
const file = process.argv[2];

fs.createReadStream(file)
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream(file + '.gz'));

You can use this script to gzip any file you pass as the argument. We’re piping a readable stream for that file into the zlib built-in transform stream and then into a writable stream for the new gzipped file. Simple.

The cool thing about using pipes is that we can actually combine them with events if we need to. Say, for example, I want the user to see a progress indicator while the script is working and a “Done” message when the script is done. Since the pipe method returns the destination stream, we can chain the registration of events handlers as well:

const fs = require('fs');
const zlib = require('zlib');
const file = process.argv[2];

fs.createReadStream(file)
  .pipe(zlib.createGzip())
  .on('data', () => process.stdout.write('.'))
  .pipe(fs.createWriteStream(file + '.zz'))
  .on('finish', () => console.log('Done'));

So with the pipe method, we get to easily consume streams, but we can still further customize our interaction with those streams using events where needed.

What’s great about the pipe method though is that we can use it to compose our program piece by piece, in a much readable way. For example, instead of listening to the data event above, we can simply create a transform stream to report progress, and replace the .on() call with another .pipe() call:

const fs = require('fs');
const zlib = require('zlib');
const file = process.argv[2];

const { Transform } = require('stream');

const reportProgress = new Transform({
  transform(chunk, encoding, callback) {
    process.stdout.write('.');
    callback(null, chunk);
  }
});

fs.createReadStream(file)
  .pipe(zlib.createGzip())
  .pipe(reportProgress)
  .pipe(fs.createWriteStream(file + '.zz'))
  .on('finish', () => console.log('Done'));

This reportProgress stream is a simple pass-through stream, but it reports the progress to standard out as well. Note how I used the second argument in the callback() function to push the data inside the transform() method. This is equivalent to pushing the data first.

The applications of combining streams are endless. For example, if we need to encrypt the file before or after we gzip it, all we need to do is pipe another transform stream in that exact order that we needed. We can use Node’s crypto module for that:

const crypto = require('crypto');
// ...

fs.createReadStream(file)
  .pipe(zlib.createGzip())
  .pipe(crypto.createCipher('aes192', 'a_secret'))
  .pipe(reportProgress)
  .pipe(fs.createWriteStream(file + '.zz'))
  .on('finish', () => console.log('Done'));

The script above compresses and then encrypts the passed file and only those who have the secret can use the outputted file. We can’t unzip this file with the normal unzip utilities because it’s encrypted.

To actually be able to unzip anything zipped with the script above, we need to use the opposite streams for crypto and zlib in a reverse order, which is simple:

fs.createReadStream(file)
  .pipe(crypto.createDecipher('aes192', 'a_secret'))
  .pipe(zlib.createGunzip())
  .pipe(reportProgress)
  .pipe(fs.createWriteStream(file.slice(0, -3)))
  .on('finish', () => console.log('Done'));

Assuming the passed file is the compressed version, the code above will create a read stream from that, pipe it into the crypto createDecipher() stream (using the same secret), pipe the output of that into the zlib createGunzip() stream, and then write things out back to a file without the extension part.

Thanks for reading!

Node.js for Beginners - Learn Node.js from Scratch (Step by Step)

Node.js for Beginners - Learn Node.js from Scratch (Step by Step)

Node.js for Beginners - Learn Node.js from Scratch (Step by Step) - Learn the basics of Node.js. This Node.js tutorial will guide you step by step so that you will learn basics and theory of every part. Learn to use Node.js like a professional. You’ll learn: Basic Of Node, Modules, NPM In Node, Event, Email, Uploading File, Advance Of Node.

Node.js for Beginners

Learn Node.js from Scratch (Step by Step)

Welcome to my course "Node.js for Beginners - Learn Node.js from Scratch". This course will guide you step by step so that you will learn basics and theory of every part. This course contain hands on example so that you can understand coding in Node.js better. If you have no previous knowledge or experience in Node.js, you will like that the course begins with Node.js basics. otherwise if you have few experience in programming in Node.js, this course can help you learn some new information . This course contain hands on practical examples without neglecting theory and basics. Learn to use Node.js like a professional. This comprehensive course will allow to work on the real world as an expert!
What you’ll learn:

  • Basic Of Node
  • Modules
  • NPM In Node
  • Event
  • Email
  • Uploading File
  • Advance Of Node