A better way to check big amount of data against database in Node.js

A better way to check big amount of data against database in Node.js

While working on some projects there might be some obvious performance optimisation tasks and not so obvious as well. Sometimes it's even hard to find a good solution especially when a lot of sources all over the internet look like copy paste.

While working on some projects there might be some obvious performance optimisation tasks and not so obvious as well. Sometimes it's even hard to find a good solution especially when a lot of sources all over the internet look like copy paste.

Unfortunately I couldn't find good solution for this issue all over internet. Fortunately I've got help from my colleague and would like to share it with you.

So how is it possible to check if some data already exists in database? The solution is very simple we just need to know some unique identifier which we will use to query data from database:

const pool = ... // Database connection pool
const product = {
    id: 1,
    title: "Best Product Ever",
    ...
}

pool.product.find({where: {title: product.title}}).then(result => { // do something... });

As you can see nothing complex. But what if we need to check if multiple products exist in our database? We can use loop:

const pool = ... // Database connection pool
const products = [{
    id: 1,
    title: "Best Product Ever",
    ...
}, {
    ...
}]

products.forEach(product =>{ pool.product.find({where: {title: product.title}}).then(result => { // Do something... }); })

This also doesn't look so complex. As you may already know one of the good practices of connecting to database is using pools. This gives ability to connect to database multiple times without need to close the connection.

If we take a closer look at our loop it wouldn't be hard to understand that as many items we have in array as many connections we might have. It might not sound like a problem when there are 5 or 6 items we have to check for existence in our database. But it can be a big performance issue when we would like to check hundreds, thousands or even hundreds of thousand of records.

Solution

So how can we solve this issue? Instead of one by one connecting to database and checking if the record exists it's possible to query based on data we have compare queried data with our income data.

const pool = ... // Database connection pool
const products = [
    {
        title: 'Some Product Title'...
        ...
    },
    {...},
    {...}
]

const productTitles = products.map(product => item.title) const query = { where: { title: productTitles }, attributes: ['title'] }

pool.product.find(query).then(result => { const resultTitles = result.map(item => item.title);

const newProducts = products.filter(product => resultTitles.indexOf(product.title) === -1);
// Do something...

});

In the above code I've made a query based on list of products, selected all products form database based on query. If there is some title in my query that doesn't exist in database it just won't select it. Then I compared to lists with help of .map and .filter methods. As a result I've got new products in 'newProducts' variable.

In this case we will only one connection pool. Depending on project requirements there might be more connection pools for example one more connection pool to save the data little bit later.

This is useful when you would like to process only the data that doesn't exists in your database and save to database only after few additional process steps.

I hope this small trick will help you to make your project a little bit better.


By :  Eduard Pochtar


Angular 9 Tutorial: Learn to Build a CRUD Angular App Quickly

What's new in Bootstrap 5 and when Bootstrap 5 release date?

Brave, Chrome, Firefox, Opera or Edge: Which is Better and Faster?

How to Build Progressive Web Apps (PWA) using Angular 9

What is new features in Javascript ES2020 ECMAScript 2020

How to Use Express.js, Node.js and MongoDB.js

In this post, I will show you how to use Express.js, Node.js and MongoDB.js. We will be creating a very simple Node application, that will allow users to input data that they want to store in a MongoDB database. It will also show all items that have been entered into the database.

Top 7 Most Popular Node.js Frameworks You Should Know

Node.js is an open-source, cross-platform, runtime environment that allows developers to run JavaScript outside of a browser. In this post, you'll see top 7 of the most popular Node frameworks at this point in time (ranked from high to low by GitHub stars).

Node.js for Beginners - Learn Node.js from Scratch (Step by Step)

Node.js for Beginners - Learn Node.js from Scratch (Step by Step) - Learn the basics of Node.js. This Node.js tutorial will guide you step by step so that you will learn basics and theory of every part. Learn to use Node.js like a professional. You’ll learn: Basic Of Node, Modules, NPM In Node, Event, Email, Uploading File, Advance Of Node.