This article focuses on how to use worker threads to execute a task asynchronously and stream data from that task back to the rest of your Node application using RxJS Observables.
Before we get started, if you want to learn more about worker threads and why you might want to use them, I recommend reading Node.js multithreading: What are Worker Threads and why do they matter? by Alberto Gimeno. Alberto does a fantastic job of explaining the purpose of the worker_threads module, provides some solid examples of where it makes sense to use it, and demonstrates some alternate ways to build a multi-threaded Node app.
We are going to be building a simple Node app that creates a worker thread running a simulated long-running task that reports status back at regular intervals until it completes or until time runs out. The worker thread will be wrapped in an RxJS Observable so that the rest of the application can stream messages returned from the worker thread using the powerful RxJS library.
If you want to jump ahead and see the final solution, you can check it out on GitHub at briandesousa/node-worker-thread-rxjs.
The first thing we need to do is ensure our environment is ready to go:
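Use npm init to initialize a new NPM package.
The app's entry point is node-parent-thread-rxjs.js, so it can be started with node node-parent-thread-rxjs.js (for example, from the start script in package.json).
Install the RxJS package with npm install -s rxjs.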
The worker thread has the logic to simulate a long-running task:
const { workerData, parentPort } = require('worker_threads');

parentPort.postMessage(`starting heavy duty work from process ${process.pid} that will take ${workerData}s to complete`);

const timeLimit = workerData;
let timer = 0;

// simulate a long-running process with updates posted back on a regular interval
do {
    // schedule one status update per second; the current timer value is passed to the callback as count
    setTimeout(
        (count) => {
            parentPort.postMessage(`heavy duty work in progress...${count + 1}s`);
            // count runs from 0 to timeLimit - 1, so the last update is the one where count + 1 equals timeLimit
            if (count + 1 === timeLimit) {
                parentPort.postMessage('done heavy duty work');
            }
        },
        1000 * timer,
        timer);
} while (++timer < timeLimit);
node-worker-thread-rxjs.js
Let’s break this script down a bit:
We use parentPort from the worker_threads module to communicate back to the parent thread at 3 different points: before the work begins, once per second while the work is in progress, and when the work completes.
We use workerData from the worker_threads module to pass in a time limit for how long (in seconds) the task should run. The task completes when this time limit is reached (the count check inside the timer callback).
This worker thread doesn’t do anything particularly useful, but it does demonstrate how a thread might receive instructions from its parent and stream multiple updates back to its parent.
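To see the workerData and parentPort round trip in isolation, here is a minimal single-file sketch. It is not part of the original solution: the worker source is kept inline and started with the eval option purely so the example is self-contained, and collectMessages() is a hypothetical helper name.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source (assumption: used instead of a separate file to keep the sketch in one place).
// The worker reads its instructions from workerData and reports back through parentPort.
const workerSource = `
  const { workerData, parentPort } = require('worker_threads');
  for (let step = 1; step <= workerData; step++) {
    parentPort.postMessage('step ' + step + ' of ' + workerData);
  }
`;

// Hypothetical helper: start the worker and resolve with every message it posted before exiting.
function collectMessages(steps) {
  return new Promise((resolve, reject) => {
    const messages = [];
    const worker = new Worker(workerSource, { eval: true, workerData: steps });
    worker.on('message', message => messages.push(message));
    worker.on('error', reject);
    worker.on('exit', code => {
      if (code === 0) {
        resolve(messages);
      } else {
        reject(new Error(`Worker stopped with exit code ${code}`));
      }
    });
  });
}

collectMessages(3).then(messages => console.log(messages));
```

The parent only learns that the worker is finished from the exit event, which is the same signal the Observable wrapper later in this article relies on.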
The parent thread has the following responsibilities:
Start the worker thread and pass it the time limit through workerData.
Wrap the worker thread's events in an RxJS Observable.
Stream the worker's messages until the worker completes or a maximum wait time elapses.
const Rxjs = require('rxjs');
const RxjsOperators = require('rxjs/operators');
const { Worker } = require('worker_threads');

console.log("\nNode multi-threading demo using worker_threads module in Node 11.7.0\n");

const COMPLETE_SIGNAL = 'COMPLETE';

// wrap a worker thread in an Observable: worker events are mapped to next, error and complete
function runTask(workerData, completedOnTime) {
    return new Rxjs.Observable(observer => {
        const worker = new Worker('./node-worker-thread-rxjs.js', { workerData });
        worker.on('message', message => observer.next(message));
        worker.on('error', error => observer.error(error));
        worker.on('exit', code => {
            if (code !== 0) {
                observer.error(new Error(`Worker stopped with exit code ${code}`));
            } else {
                completedOnTime();
                observer.next(COMPLETE_SIGNAL);
                observer.complete();
            }
        });
    });
}

const MAX_WAIT_TIME = 3;
const WORKER_TIME = 10;

function main() {
    let completedOnTime = false;
    console.log(`[Main] Starting worker from process ${process.pid}`);
    const worker$ = runTask(WORKER_TIME, () => completedOnTime = true);
    // receive messages from worker until it completes but only wait for MAX_WAIT_TIME
    worker$.pipe(
        RxjsOperators.takeWhile(message => message !== COMPLETE_SIGNAL),
        RxjsOperators.takeUntil(Rxjs.timer(MAX_WAIT_TIME * 1000))
    ).subscribe(
        result => console.log(`[Main] worker says: ${result}`),
        error => console.error(`[Main] worker error: ${error}`),
        () => {
            if (!completedOnTime) {
                console.log(`[Main] worker could not complete its work in the allowed ${MAX_WAIT_TIME}s, exiting Node process`);
                process.exit(0);
            } else {
                console.log(`[Main] worker completed its work in the allowed ${WORKER_TIME}s`);
            }
        }
    );
}

main();
node-parent-thread-rxjs.js
There is a lot going on here. Let’s focus on the runTask() function first:
It creates and returns an Observable; the worker thread is not started until something subscribes to it.
On subscription, it creates a new Worker from node-worker-thread-rxjs.js and passes the time limit in through workerData.
It maps the worker's events onto the Observable interface: each message event becomes a next notification, an error event becomes an error notification, and an exit event becomes either an error (non-zero exit code) or a COMPLETE_SIGNAL message followed by completion.
The runTask() function doesn’t have a very descriptive name; however, you can now see that it encapsulates the mapping logic between worker thread events and the Observable interface.
Next, let’s look at the main() function:
It calls runTask() with WORKER_TIME and a callback that sets the completedOnTime flag once the worker exits cleanly.
It pipes the resulting stream through takeWhile, which ends the stream when COMPLETE_SIGNAL arrives, and takeUntil, which ends the stream after MAX_WAIT_TIME seconds.
It subscribes to the stream to log each message and any error. When the stream ends, it checks completedOnTime: if the worker did not finish in time, it exits the Node process.
Run the solution with your npm start command. Assuming MAX_WAIT_TIME is still set to 3 and WORKER_TIME is set to 10, you will see the following output:
Node multi-threading demo using worker_threads module in Node 11.7.0
[Main] Starting worker from process 4764
[Main] worker says: starting heavy duty work from process 4764 that will take 10s to complete
[Main] worker says: heavy duty work in progress...1s
[Main] worker says: heavy duty work in progress...2s
[Main] worker says: heavy duty work in progress...3s
[Main] worker could not complete its work in the allowed 3s, exiting Node process
The worker thread started to do its work, but after 3 seconds, the app signaled to stop the stream. The main process was forcefully exited along with the worker thread before it had a chance to complete its task.
You can also try adjusting the solution to see what happens when MAX_WAIT_TIME is greater than WORKER_TIME, so that the worker has enough time to complete its task.
We have only just scratched the surface of what is possible when you combine the streaming power and beauty of RxJS Observables with the worker_threads module. Happy threading!
Check out the full solution on GitHub at briandesousa/node-worker-thread-rxjs.