How to write queries using Elasticsearch query body builder in Node.js

This Node.js tutorial explains how to write queries using the Elasticsearch query body builder in Node.js.

Introduction

Elasticsearch query body builder is a query DSL (domain-specific language) or client that provides an API layer over raw Elasticsearch queries. It makes full-text search data querying and complex data aggregation easier, more convenient, and cleaner in terms of syntax.

In this tutorial, we will learn how writing queries using the builder syntax offers advantages over raw Elasticsearch queries, which can quickly become cumbersome, unstructured, less idiomatic, and even error-prone.

We are going to achieve this by leveraging elastic-builder, a query builder library. According to its documentation, it is a tool for quickly building request bodies for complex search queries and aggregations. Additionally, it conforms to the API specification standard of native Elasticsearch queries with no performance bottleneck whatsoever.

Essentially, this means we can write queries using the builder syntax, matching equivalent queries provided by native Elasticsearch. Not to worry — we will learn and understand the builder syntax as we progress with this tutorial.

To begin, let’s examine a simple example of a generic car query to understand why using ES query builder would make querying Elasticsearch data easier, and how it contributes to a faster development lifecycle.

{
  "query": {
    "bool": {
      "must": {
        "match": {
          "Origin": "USA"
        }
      },
      "filter": {
        "range": {
          "Cylinders": {
            "gte": 4,
            "lte": 6
          }
        }
      },
      "must_not": {
        "range": {
          "Horsepower": {
            "gte": 75
          }
        }
      },
      "should": {
        "term": {
          "Name": "ford"
        }
      }
    }
  }
}

Looking at the above, we are running a query for a car whose origin is the USA, while performing a filter where the engine's cylinder count is between 4 and 6, inclusive. Also, we are running a range query, where the horsepower of the car must not be greater than or equal to 75. Finally, the name of the car should be Ford.

Now, the problems with writing these kinds of queries are:

  • They are overly verbose
  • They are prone to syntax errors, possibly as a result of badly nested fields
  • They may be difficult to maintain, or even to make small incremental changes to over time. For example, knowing where to add another filter or query field may become confusing
  • They may be difficult to pick up by new members of a dev team
  • They are not fun or interesting to write for more complex queries

Now consider an equivalent of the above query using the builder syntax, shown below:

esb.requestBodySearch()
     .query(
        esb.boolQuery()
            .must(esb.matchQuery('Origin', 'USA'))
            .filter(esb.rangeQuery('Cylinders').gte(4).lte(6))
            .should(esb.termQuery('Name', 'ford'))
            .mustNot(esb.rangeQuery('Horsepower').gte(75))
    )

The query above does exactly the same thing as the raw ES query we previously reviewed, and as we can see, this is more intuitive and intentional.

Here, we are making use of the requestBodySearch API from elastic-builder. This API helps us build queries that express our intent in a readable, idiomatic way. We can also decide to add even more fields so as to obtain an entirely different query result, as the case may be.
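
For instance, here is a sketch (reusing the same car fields from the example above) of how we might chain extra fields, a sort and a result size, onto the very same query to shape the output differently:

const esb = require('elastic-builder');

const body = esb.requestBodySearch()
  .query(
    esb.boolQuery()
      .must(esb.matchQuery('Origin', 'USA'))
      .filter(esb.rangeQuery('Cylinders').gte(4).lte(6))
      .should(esb.termQuery('Name', 'ford'))
      .mustNot(esb.rangeQuery('Horsepower').gte(75))
  )
  .sort(esb.sort('Year', 'desc')) // newest matching cars first
  .size(5)                        // cap the result at five hits
  .toJSON();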

Prerequisites

In order to easily follow along with this tutorial, I would recommend going through this introductory tutorial on getting started with Elasticsearch and Node.js. Note that this action is only necessary if you lack prior experience working with Elasticsearch or if you want a little refresher on it. Otherwise, you should be able to follow this tutorial with ease.

For a start, ensure that you have Node.js and npm installed on your machine. Also, I would recommend that you download the Elasticsearch binaries and install them, just in case you intend to run it locally. However, for the purposes of this tutorial, we will be setting up Elasticsearch with Elastic Cloud, for which you can use a 14-day free trial.

After you’re done with the entire setup (like choosing a cloud provider and region of your choice, since it is a managed service), you should get a username (which most likely would be elastic), a password, a host and a port. Note that we will need these credentials or secrets to connect to our ES cluster later on.

Although the UI is quite intuitive, to have a visual cue of where to locate these parameters, here are some screenshots that point out where to look.


Elastic user and password fields.


Elasticsearch URL endpoint.

The first screenshot shows the Elasticsearch user and where we can find our password or generate a new one. The second screenshot shows a link where we can easily copy the Elasticsearch endpoint URL. After this setup, we should be good to go, unless we intend to explore other Elasticsearch services in the stack like Kibana.


The Kibana UI, which provides a sort of dashboard visualization and allows monitoring and viewing metrics on your data and the entire Elastic stack.

You can check out more information on Kibana and the entire Elastic stack. To proceed, let’s get a clear context on what we will be building.

Bootstrapping our application

In this tutorial, we are going to build a few API endpoints to demonstrate how to perform full-text search queries on data stored in our Elasticsearch cluster. Of course, we will be using the builder syntax to construct our queries and compare them alongside raw ES queries.

We can go ahead and create a new folder for our project and call it any name we want. As usual, before we begin a new Node.js project, we run npm init inside the project directory. This would create a new package.json file for us.

Then, we can go ahead and install our application dependencies. The dependencies we need for this project are the official Elasticsearch client for Node, the elastic-builder library, Express, body-parser, and the dotenv package.

To install them, we can run the following command in our terminal/command prompt:

npm install @elastic/elasticsearch body-parser dotenv elastic-builder express --save

After the installation, our package.json file should look like this:

{
  "name": "logrocket_elasticsearch_tutorial",
  "version": "1.0.0",
  "description": "LogRocket ElasticSearch Tutorial with ES Builder",
  "main": "index.js",
  "scripts": {
    "start": "node ./app/server.js"
  },
  "author": "Alexander Nnakwue",
  "license": "ISC",
  "dependencies": {
    "@elastic/elasticsearch": "^7.4.0",
    "body-parser": "^1.19.0",
    "dotenv": "^8.2.0",
    "elastic-builder": "^2.4.0",
    "express": "^4.17.1"
  }
}

Now we’ll proceed to create all the necessary files and folders we require. Note that the start script is based on the relative path of our server.js file. First, make sure you are inside the project directory, then run mkdir app to create a new folder called app.

After creating the app folder, we can then navigate into it and create all the necessary files, as shown in the screenshot below. Also, we can go ahead and create all the other files in the project’s root directory as shown.

The next step is for us to create a connection to the Elasticsearch cluster. To do so, we will need to create a .env file to store all our environment variables or secrets. The sample.env file exactly mirrors what should be contained in our .env. The contents of the file are as follows:

ELASTICSEARCH_USERNAME=username
ELASTICSEARCH_PASSWORD=password
ELASTICSEARCH_HOST=host
ELASTICSEARCH_PORT=port
APP_PORT=3004
ELASTICSEARCH_INDEX=index
ELASTICSEARCH_TYPE=type

We can go ahead and copy these parameters, create a .env file in our project’s root directory, and fill in the real credentials. After that, we should be good to create our config.js file, which should provide access to the variables defined or added in our newly created .env file.

The config.js file should contain the following code:

const result = require('dotenv').config();

if (result.error) {
  console.log(result.error, "[Error Parsing env variables!]");
  throw result.error;
}
// console.log(result.parsed, '[Parsed env variables!]');

module.exports = {
  es_host: process.env.ELASTICSEARCH_HOST,
  es_pass: process.env.ELASTICSEARCH_PASSWORD,
  es_port: process.env.ELASTICSEARCH_PORT,
  es_user: process.env.ELASTICSEARCH_USERNAME,
  es_index: process.env.ELASTICSEARCH_INDEX,
  es_type: process.env.ELASTICSEARCH_TYPE,
  app_port: process.env.APP_PORT
};

As we can see, we are getting access to the variables contained in the .env file and storing them with different variable names. Also note that we have added the app_port, es_index, es_type, and other variables needed for our Elasticsearch connection.

Now, let’s go ahead and connect to our Elasticsearch cluster with these parameters. To do so, we can copy the following to the esConfig.js file:

'use strict'

const { Client } = require('@elastic/elasticsearch');
const config = require('./config');
const client = new Client({ node: `https://${config.es_user}:${config.es_pass}@${config.es_host}:${config.es_port}`});

module.exports.esClient = client;

Here we are adding a reference to the official Elasticsearch Node.js client library, then we are using the contents contained in our config.js file created earlier to instantiate a new ES client connection to our cluster.
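
Before moving on, we can quickly verify that the client actually reaches the cluster. The snippet below is a small one-off sanity check, assuming the credentials in our .env file are valid; client.info() returns metadata about the cluster:

const { esClient } = require('./esConfig');

// Ask the cluster for its metadata to confirm the connection works.
esClient.info()
  .then(({ body }) => console.log('Connected to cluster:', body.cluster_name))
  .catch((error) => console.error('Could not reach the ES cluster', error));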

Writing data to our ES cluster

Now that our cluster is set up, we can go ahead and create a new file that contains the JSON data we intend to write to our Elasticsearch index. We can go ahead and create the new file, dataToEs.json, if we haven’t done so earlier. The contents of the file can be credited to this source on GitHub. It basically contains the JSON-based dataset we will be writing to our ES index.

After we are done with the above, we can create a utility.js file, which would contain the functions required to create our ES index; create a new mapping based on the available fields with their respective data types for our datasets; and then, finally, write the JSON data to the index we created on our cluster.

Note that Elasticsearch is schemaless by default, but we can go ahead and define our own schema beforehand to help define a standard structure and format for our data. This, of course, has its own advantages, like data uniformity and so on. Now let’s understand what is going on in the utility.js file:

const fs = require('fs');
const esconfig = require('./esConfig');
const client = esconfig.esClient;
const data = JSON.parse(fs.readFileSync(__dirname + '/dataToEs.json'));
const config = require('./config');

const index= config.es_index;
const type = config.es_type;

async function writeCarDataToEs(index, data){
  for (let i = 0; i < data.length; i++) {
    try {
      await client.create({
        refresh: true,
        index: index,
        id: i,
        body: data[i]
      });
      console.log("Successfully imported data", data[i]);
    } catch (error) {
      console.error("Failed to import data", error);
      return;
    }
  }
}
async function createCarMapping (index, type) {
  const carSchema = {
      "Acceleration": {
        "type": "long"
      },
      "Cylinders": {
        "type": "long"
      },
      "Displacement": {
        "type": "long"
      },
      "Horsepower": {
        "type": "long"
      },
      "Miles_per_Gallon": {
        "type": "long"
      },
      "Name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "Origin": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "Weight_in_lbs": {
        "type": "long"
      },
      "Year": {
        "type": "date"
    }
  }
  return client.indices.putMapping({index, type, body:{properties:carSchema}});
}
module.exports = {
  async resetIndex(){
    const { body: indexExists } = await client.indices.exists({ index });
    if (indexExists) {
      await client.indices.delete({ index });
    }
    await client.indices.create({ index });
    await createCarMapping(index, type);
    await writeCarDataToEs(index, data);
  }
};

In the file above, we are first dynamically reading the JSON data contained in the dataToEs.json file we talked about earlier. As shown, we have made use of the native filesystem package for Node.js.

We are also making use of __dirname to get access to the directory name of the current module and appending the relative file path of the dataset to it. Additionally, we are importing a reference to our ES client connection. The first function, writeCarDataToEs, loops through the entire JSON dataset and writes it to our Elasticsearch index.

Note that there is a caveat here: for very large datasets, we should use the ES bulk API instead of the create API. However, for our current use case, this should work fine. To see how to use the ES bulk API, you can check the official example provided in this GitHub repo.
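
For reference, here is a minimal sketch of what bulk-writing the same dataset might look like with the v7 client; the helper name writeCarDataToEsBulk is ours, not part of the repo:

async function writeCarDataToEsBulk(index, data) {
  // The bulk API expects an action line to precede each document.
  const body = data.flatMap((doc, i) => [{ index: { _index: index, _id: i } }, doc]);
  const { body: bulkResponse } = await client.bulk({ refresh: true, body });
  if (bulkResponse.errors) {
    console.error('Some documents failed to index');
  }
}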

After that, we can now create mappings for our data, which represent the expected data type and format. We do so by calling the putMapping API while passing the index, type, and the JSON body.

Lastly, we create the function resetIndex, which checks if the index we are trying to create already exists and, if it does, deletes it for us. Otherwise, we create a new index with the name we pass from our env variable, create the mappings for our JSON dataset, and call the writeCarDataToEs() function, which then writes the data to the index in accordance with the mappings already specified.

Now we can go ahead and create our server.js file, which is basically a simple Express server.

const express = require('express');
const bodyParser = require('body-parser')
require("dotenv").config();
require("./utility").resetIndex();
const app = express();
const esconfig = require('./esConfig');
const client = esconfig.esClient;
const router  = require("./router");

app.use(bodyParser.urlencoded({ extended: false }));
app.use(bodyParser.json());
app.use("/",router);

app.set('port', process.env.APP_PORT || 3000);

client.ping({}, function(error) {
  if (error) {
      console.log('ES Cluster is down', error);
  } else {
      console.log('ES Cluster is up!');
  }
});


app.listen(app.get('port'), ()=>{
  console.log(`Express server listening on port, ${app.get('port')}`);
} );

Here, we are importing the resetIndex() function from the utility.js file, which will make it run automatically when we spin up our app. We can decide to comment that import out, as it won’t be needed for subsequent app restarts since we should already have our index, mappings, and data all created and set up in our ES cluster.

Writing elastic-builder queries

Now we can get to writing queries for our data. Let’s begin by writing a multiple match query that matches a car’s name and its origin, while its weight is greater than or equal to a particular number (rangeQuery). We can check out the services.js file to understand how this query works:

async fetchMatchMultipleQuery(origin, name, weight){
  const requestBody = esb.requestBodySearch()
      .query(
        esb.boolQuery()
          .must([
            esb.matchQuery('Origin', origin),
            esb.matchQuery('Name', name),
          ])
          .filter(esb.rangeQuery('Weight_in_lbs').gte(weight))
      );
  return client.search({index: index, body: requestBody.toJSON()});
}

Looking at the above function, it is quite clear what we are trying to achieve. This query is a boolean that must match cars from a particular origin and a specific name. Also, we are filtering the cars using a range query, where the weight must be greater than or equal to the particular weight we specify.

As an aside, let’s take a look at the equivalent raw query for the above:

{
  "bool": {
    "must": [
      {
        "match": {
          "Origin": "https://elastic-builder.js.org"
        }
      },
      {
        "match": {
          "Name": "name"
        }
      }
    ],
    "filter": {
      "range": {
        "Weight_in_lbs": {
          "gte": "weight"
        }
      }
    }
  }
}

As we can see, this is prone to mistakes due to the deeply nested nature of the query, which we pointed out earlier. Now that we have a visual cue for this, let’s understand the flow in actually calling this API.

First of all, check out the services.js file. This file handles everything related to building our queries using the builder syntax and then calling our ES client to actually perform those calls. Also, inside the file, we will find the same function shown above.

The controller.js file takes care of routing our requests based on the app route specified in the routes.js file. When requests are routed, the functions in the controller.js file call those in the services.js file.

Let’s illustrate this with a simple example. For the previous query defined above, the corresponding call in the controller file is shown below:

async fetchMatchMultipleQuery(req,res) {
    const origin = req.query.Origin;
    const name = req.query.Name;
    const weight = req.query.Weight_in_lbs;
    try {
      const result = await Services.fetchMatchMultipleQuery(origin, name, weight);
      const data = result.body.hits.hits.map((car)=>{
        return {
          id: car._id,
          data: car._source
        }
      })
      res.json({status_code: 200, success: true, data: data, message: "fetch match query for multiple requests successful!" });
    } catch (err) {
      res.json({status_code: 500, success: false, data: [], message: err});
    }
  }

Subsequently, the routing for this call is contained in the routes.js file:

routes.route("/search-by-multiple").get(controller.fetchMatchMultipleQuery);
Testing our implementation

We can now go ahead and test our implementation. First, let’s start our server by running npm start. Then we can visit the URL below to run our query with the provided filters: Name, Origin, and Weight_in_lbs. Note that the port matches the APP_PORT we set in our .env file:

http://localhost:3004/search-by-multiple?Name=ford&Origin=USA&Weight_in_lbs=3000

Note that the request above is a GET request, and the parameters after the URL are the query parameters required to give us our desired filtered results. The results for the API call are shown below:

{
  "status_code": 200,
  "success": true,
  "data": [
    {
      "id": "221",
      "data": {
        "Name": "ford f108",
        "Miles_per_Gallon": 13,
        "Cylinders": 8,
        "Displacement": 302,
        "Horsepower": 130,
        "Weight_in_lbs": 3870,
        "Acceleration": 15,
        "Year": "1976-01-01",
        "Origin": "USA"
      }
    },
    {
      "id": "99",
      "data": {
        "Name": "ford ltd",
        "Miles_per_Gallon": 13,
        "Cylinders": 8,
        "Displacement": 351,
        "Horsepower": 158,
        "Weight_in_lbs": 4363,
        "Acceleration": 13,
        "Year": "1973-01-01",
        "Origin": "USA"
      }
    },
{
      "id": "235",
      "data": {
        "Name": "ford granada",
        "Miles_per_Gallon": 18.5,
        "Cylinders": 6,
        "Displacement": 250,
        "Horsepower": 98,
        "Weight_in_lbs": 3525,
        "Acceleration": 19,
        "Year": "1977-01-01",
        "Origin": "USA"
      }
    },
    {
      "id": "31",
      "data": {
        "Name": "ford f250",
        "Miles_per_Gallon": 10,
        "Cylinders": 8,
        "Displacement": 360,
        "Horsepower": 215,
        "Weight_in_lbs": 4615,
        "Acceleration": 14,
        "Year": "1970-01-01",
        "Origin": "USA"
      }
    },
    "messsage": "fetch match query for multiple requests successful!"
}

Note that the above query result has been truncated for brevity. When you run this query locally, you will get the entire result. Not to worry: the link to the Postman collection is here. You can copy it, import it into your Postman app, and test it as well.

The entire code for the services.js file, which contains all the queries made to our data in the cluster, is shown below:

const esconfig = require('./esConfig');
const client = esconfig.esClient;
const config = require('./config');
const index = config.es_index;
const esb = require('elastic-builder'); //the builder

module.exports = {
  async search(){
    const requestBody = esb.requestBodySearch()
    .query(esb.matchAllQuery())
    .size(10)
    .from(1);
    return client.search({index: index, body: requestBody.toJSON()});
  },


  async filterCarsByYearMade(param) {
    const requestBody = esb.requestBodySearch()
                            .query(
                              esb.boolQuery()
                              .must(esb.matchAllQuery())
                              .filter(esb.rangeQuery('Year').gte(param).lte(param))
                            )
                            .from(1)
                            .size(5);
    return client.search({index: index, body: requestBody.toJSON()});
  },

async filterCarsByName(param) {
  const requestBody = esb.requestBodySearch()
    .query(esb.termQuery('Name', param))
    .sort(esb.sort('Year', 'asc'))
    .from(1)
    .size(10);
  return client.search({index: index, body: requestBody.toJSON()});
},
async fetchCarByName(param) {
  const requestBody = esb.requestBodySearch()
    .query(
      esb.boolQuery()
        .must(esb.matchPhraseQuery('Name', param))
    );
  return client.search({index: index, body: requestBody.toJSON()});
},
async fetchMatchMultipleQuery(origin, name, weight){
  const requestBody = esb.requestBodySearch()
      .query(
        esb.boolQuery()
          .must([
            esb.matchQuery('Origin', origin),
            esb.matchQuery('Name', name),
          ])
          .filter(esb.rangeQuery('Weight_in_lbs').gte(weight))
      );
  return client.search({index: index, body: requestBody.toJSON()});
},
async aggregateQuery(origin,cylinder,name,horsePower) {
const requestBody = esb.requestBodySearch()
.query(
    esb.boolQuery()
        .must(esb.matchQuery('Origin', origin))
        .filter(esb.rangeQuery('Cylinders').gte(cylinder))
        .should(esb.termQuery('Name', name))
        .mustNot(esb.rangeQuery('Horsepower').gte(horsePower))
        // .agg(esb.avgAggregation('avg_miles', 'Miles_per_Gallon'))
)
return client.search({index: index, body: requestBody.toJSON()});
},
};

As we can see in the file above, the queries are quite readable and easy to grasp. We have made use of the matchQuery, rangeQuery, termQuery, matchPhraseQuery, boolQuery, and matchAllQuery queries provided by the builder library. For other available queries and how to use them, we can check out the query sections of the elastic-builder documentation.
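
As an illustration, here is a sketch of what enabling the commented-out aggregation in aggregateQuery could look like as its own service function; avg_miles is simply the label we pick for the aggregation, and the computed value comes back under result.body.aggregations:

async fetchAverageMilesPerGallon(origin) {
  const requestBody = esb.requestBodySearch()
    .query(esb.matchQuery('Origin', origin))
    .agg(esb.avgAggregation('avg_miles', 'Miles_per_Gallon'))
    .size(0); // we only want the aggregation, not the matching documents
  const result = await client.search({index: index, body: requestBody.toJSON()});
  return result.body.aggregations.avg_miles.value;
},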

The sort command, as the name implies, sorts the returned results in either ascending or descending order, as the case may be. The from and size parameters help control the output of our data by paginating the returned results.
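
For example, a hypothetical pagination helper (not part of the project’s code) could map a 1-based page number onto from and size like so:

async function searchPage(page = 1, pageSize = 10) {
  const requestBody = esb.requestBodySearch()
    .query(esb.matchAllQuery())
    .sort(esb.sort('Year', 'asc'))
    .from((page - 1) * pageSize) // skip the hits from previous pages
    .size(pageSize);             // return exactly one page of hits
  return client.search({index: index, body: requestBody.toJSON()});
}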

Also, the code for the controller.js file is shown below:

const Services = require('./services');

module.exports = {
  async search(req, res) {
    try {
      const result = await Services.search();
      const data = result.body.hits.hits.map((car)=>{
        return {
          id: car._id,
          data: car._source
        }
      })
      res.json({ status_code: 200, success: true, data: data, message: "Cars data successfully fetched!" });
    } catch (err) {
      res.json({ status_code: 500, success: false, data: [], message: err});
    }
  },

  async  filterCarsByYearMade(req, res) {
    let {year} = req.query;
    try {
      const result = await Services.filterCarsByYearMade(year);
      const data = result.body.hits.hits.map((car)=>{
        return {
          id: car._id,
          data: car._source
        }
      })
      res.json({ status_code: 200, success: true, data: data, message: "Filter Cars by year made data fetched successfully" });
    } catch (err) {
      res.json({ status_code: 500, success: false, data: [],  message: err});
    }
  },

  async filterCarsByName(req,res) {
    let param = req.query.Name;
    try {
      const result = await Services.filterCarsByName(param);
      const data = result.body.hits.hits.map((car)=>{
        return {
          id: car._id,
          data: car._source
        }
      })
      res.json({status_code: 200, success: true, data:data , message: "Filter cars by name data fetched successfully!" });
    } catch (err) {
      res.json({ status_code: 500, success: false, data: [], message: err});
    }
  },


  async filterCarByName(req,res) {
    const param = req.query.Name;
    try {
      const result = await Services.fetchCarByName(param);
      const data = result.body.hits.hits.map((car)=>{
        return {
          id: car._id,
          data: car._source
        }
      })
      res.json({ status_code: 200, success: true, data: data , message: "Filter a car by name query data fetched successfully!"});
    } catch (err) {
      res.json({ status_code: 500, success: false, data: [], message: err});
    }
  },

  async fetchMatchMultipleQuery(req,res) {
    const origin = req.query.Origin;
    const name = req.query.Name;
    const weight = req.query.Weight_in_lbs;
    try {
      const result = await Services.fetchMatchMultipleQuery(origin, name, weight);
      const data = result.body.hits.hits.map((car)=>{
        return {
          id: car._id,
          data: car._source
        }
      })
      res.json({status_code: 200, success: true, data: data, message: "fetch match query for multiple requests successful!" });
    } catch (err) {
      res.json({status_code: 500, success: false, data: [], message: err});
    }
  },

  async aggregateQuery(req,res) {
    const origin = req.query.Origin;
    const cylinder = req.query.Cylinder;
    const name = req.query.Name;
    const horsePower = req.query.Horsepower;
    try {
    const result = await Services.aggregateQuery(origin, cylinder, name, horsePower);
      const data = result.body.hits.hits.map((car)=>{
        return {
          id: car._id,
          data: car._source
        }
      })
      res.json({ status_code: 200, success: true, data: data, message: "Data successfully fetched!" });
    } catch (err) {
      res.json({ status_code: 500, success: false, data: [], message: err});
    }
  },

}

The above file contains the code that calls our services.js file and helps route the requests. As we can see, for each query above, we are doing a map on the returned data and outputting the id and the _source fields alone.

The routes for all the queries as contained in the routes.js file are shown below:

const express    = require("express");
const controller = require("./controller");
const routes     = express.Router();

routes.route("/search-all").get(controller.search);
routes.route("/search-by-year").get(controller.filterCarsByYearMade);
routes.route("/search-by-name").get(controller.filterCarsByName);
routes.route("/search-by-name-single").get(controller.filterCarByName);
routes.route("/search-by-multiple").get(controller.fetchMatchMultipleQuery);
routes.route("/seach-avg-query").get(controller.aggregateQuery);

module.exports = routes;

This file helps in calling and routing all the functions provided in the controller.js file. Note that the entire code for this project can be found on GitHub.

Conclusion

Elasticsearch is essential when we need data aggregation, metrics, complex filtering, and full-text search capabilities in highly search-intensive applications.

While there are other builder libraries out there, elastic-builder is quite reliable, stable, and has a clear, readable, easily understandable syntax.

Top 7 Most Popular Node.js Frameworks You Should Know

Node.js is an open-source, cross-platform, runtime environment that allows developers to run JavaScript outside of a browser.

One of the main advantages of Node is that it enables developers to use JavaScript on both the front-end and the back-end of an application. This not only makes the source code of any app cleaner and more consistent, but it significantly speeds up app development too, as developers only need to use one language.

Node is fast, scalable, and easy to get started with. Its default package manager is npm, which means it also sports the largest ecosystem of open-source libraries. Node is used by companies such as NASA, Uber, Netflix, and Walmart.

But Node doesn't come alone. It comes with a plethora of frameworks. A Node framework can be pictured as the external scaffolding that you can build your app in. These frameworks are built on top of Node and extend the technology's functionality, mostly by making apps easier to prototype and develop, while also making them faster and more scalable.

Below are 7 of the most popular Node frameworks at this point in time (ranked from high to low by GitHub stars).

Express

With over 43,000 GitHub stars, Express is the most popular Node framework. It brands itself as a fast, unopinionated, and minimalist framework. Express acts as middleware: it helps set up and configure routes to send and receive requests between the front-end and the database of an app.

Express provides lightweight, powerful tools for HTTP servers. It's a great framework for single-page apps, websites, hybrids, or public HTTP APIs. It supports over fourteen different template engines, so developers aren't locked into any specific one.

Meteor

Meteor is a full-stack JavaScript platform. It allows developers to build real-time web apps, i.e. apps where code changes are pushed to all browsers and devices in real-time. Additionally, servers send data over the wire, instead of HTML. The client renders the data.

The project has over 41,000 GitHub stars and is built to power large projects. Meteor is used by companies such as Mazda, Honeywell, Qualcomm, and IKEA. It has excellent documentation and a strong community behind it.

Koa

Koa is built by the same team that built Express. It uses ES6 methods that allow developers to work without callbacks. Developers also have more control over error handling. Koa has no middleware within its core, which means that developers have more control over configuration, but also that traditional Node middleware (e.g., functions using req, res, next) won't work with Koa.

Koa already has over 26,000 GitHub stars. The Express developers built Koa because they wanted a lighter framework that was more expressive and more robust than Express. You can find out more about the differences between Koa and Express here.

Sails

Sails is a real-time, MVC framework for Node that's built on Express. It supports auto-generated REST APIs and comes with an easy WebSocket integration.

The project has over 20,000 stars on GitHub and is compatible with almost all databases (MySQL, MongoDB, PostgreSQL, Redis). It's also compatible with most front-end technologies (Angular, iOS, Android, React, and even Windows Phone).

Nest

Nest has over 15,000 GitHub stars. It uses progressive JavaScript and is built with TypeScript, which means it comes with strong typing. It combines elements of object-oriented programming, functional programming, and functional reactive programming.

Nest is packaged in such a way it serves as a complete development kit for writing enterprise-level apps. The framework uses Express, but is compatible with a wide range of other libraries.

LoopBack

LoopBack is a framework that allows developers to quickly create REST APIs. It has an easy-to-use CLI wizard and allows developers to create models either on their schema or dynamically. It also has a built-in API explorer.

LoopBack has over 12,000 GitHub stars and is used by companies such as GoDaddy, Symantec, and the Bank of America. It's compatible with many REST services and a wide variety of databases (MongoDB, Oracle, MySQL, PostgreSQL).

Hapi

Similar to Express, hapi serves data by intermediating between the server side and client side. As such, it can serve as a substitute for Express. Hapi allows developers to focus on writing reusable app logic in a modular and prescriptive fashion.

The project has over 11,000 GitHub stars. It has built-in support for input validation, caching, authentication, and more. Hapi was originally developed to handle all of Walmart's mobile traffic during Black Friday.

Node.js for Beginners - Learn Node.js from Scratch (Step by Step)

Welcome to my course "Node.js for Beginners - Learn Node.js from Scratch". This course will guide you step by step so that you learn the basics and theory of every part. It contains hands-on examples so that you can better understand coding in Node.js. If you have no previous knowledge or experience in Node.js, you will like that the course begins with the basics. If you already have some experience programming in Node.js, the course can still help you learn something new. It covers practical, hands-on examples without neglecting theory and fundamentals. Learn to use Node.js like a professional. This comprehensive course will allow you to work in the real world as an expert!
What you’ll learn:

  • Basic Of Node
  • Modules
  • NPM In Node
  • Event
  • Email
  • Uploading File
  • Advance Of Node

Learn to generate video previews by using FFmpeg and Node.js

Every website that deals with video streaming in any way has a way of showing a short preview of a video without actually playing it. YouTube, for instance, plays a 3- to 4-second excerpt from a video whenever users hover over its thumbnail. Another popular way of creating a preview is to take a few frames from a video and make a slideshow.

We are going to take a closer look at how to implement both of these approaches.

How to manipulate a video with Node.js

Manipulating a video with Node.js itself would be extremely hard, so instead we are going to use the most popular video manipulation tool: FFmpeg. In the documentation, we read:

FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created. It supports the most obscure ancient formats up to the cutting edge. No matter if they were designed by some standards committee, the community or a corporation. It is also highly portable: FFmpeg compiles, runs, and passes our testing infrastructure FATE across Linux, Mac OS X, Microsoft Windows, the BSDs, Solaris, etc. under a wide variety of build environments, machine architectures, and configurations.

Boasting such an impressive resume, FFmpeg is the perfect choice for video manipulation done from inside of the program, able to run in many different environments.

FFmpeg is accessible through CLI, but the framework can be easily controlled through the node-fluent-ffmpeg library. The library, available on npm, generates the FFmpeg commands for us and executes them. It also implements many useful features, such as tracking the progress of a command and error handling.

Although the commands can get pretty complicated quickly, there’s very good documentation available for the tool. Also, in our examples, there won’t be anything too fancy going on.
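
As a taste of the library, here is a small sketch (with placeholder file names) that transcodes a file while logging progress, one of the features mentioned above:

const ffmpeg = require('fluent-ffmpeg');

ffmpeg('video.mp4')
  .output('video.webm')
  .on('progress', (progress) => {
    // progress.percent may be undefined for some inputs
    console.log(`Processing: ${progress.percent}% done`);
  })
  .on('end', () => console.log('Transcoding finished'))
  .on('error', (error) => console.error('Transcoding failed:', error.message))
  .run();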

The installation process is pretty straightforward if you are on a Mac or Linux machine. For Windows, please refer here. The fluent-ffmpeg library depends on the ffmpeg executable being either on our $PATH (so it is callable from the CLI like: ffmpeg ...) or on our providing the paths to the executables through environment variables.

An example .env file:

FFMPEG_PATH="D:/ffmpeg/bin/ffmpeg.exe"
FFPROBE_PATH="D:/ffmpeg/bin/ffprobe.exe"

Both paths have to be set if they are not already available in our $PATH.
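
For instance, fluent-ffmpeg exposes setters that we could feed from those environment variables; a sketch:

const ffmpeg = require('fluent-ffmpeg');
require('dotenv').config();

// Point fluent-ffmpeg at the binaries when they are not on $PATH.
if (process.env.FFMPEG_PATH) ffmpeg.setFfmpegPath(process.env.FFMPEG_PATH);
if (process.env.FFPROBE_PATH) ffmpeg.setFfprobePath(process.env.FFPROBE_PATH);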

Creating a preview

Now that we know what tools to use for video manipulation from within Node.js runtime, let’s create the previews in the formats mentioned above. I will be using Childish Gambino’s “This is America” video for testing purposes.

Video fragment

The video fragment preview is pretty straightforward to create; all we have to do is slice the video at the right moment. In order for the fragment to be a meaningful and representative sample of the video content, it is best if we get it from a point somewhere around 25–75 percent of the total length of the video. For this, of course, we must first get the video duration.

In order to get the duration of the video, we can use ffprobe, which comes with FFmpeg. ffprobe is a tool that lets us get the metadata of a video, among other things.

Let’s create a helper function that gets the duration for us:

export const getVideoInfo = (inputPath: string) => {
  return new Promise((resolve, reject) => {
    return ffmpeg.ffprobe(inputPath, (error, videoInfo) => {
      if (error) {
        return reject(error);
      }

      const { duration, size } = videoInfo.format;

      return resolve({
        size,
        durationInSeconds: Math.floor(duration),
      });
    });
  });
};

The ffmpeg.ffprobe method calls the provided callback with the video metadata. The videoInfo is an object containing many useful properties, but we are interested only in the format object, in which there is the duration property. The duration is provided in seconds.

Now we can create a function for creating the preview.

Before we do that, let’s break down the FFmpeg command used to create the fragment:

ffmpeg -ss 146 -i video.mp4 -y -an -t 4 fragment-preview.mp4
  • -ss 146: Start video processing at the 146-second mark of the video (146 is just a placeholder here, our code will randomly generate the number of seconds)
  • -i video.mp4: The input file path
  • -y: Overwrite any existing files while generating the output
  • -an: Remove audio from the generated fragment
  • -t 4: The duration of the fragment (in seconds)
  • fragment-preview.mp4: The path of the output file

Now that we know what the command will look like, let’s take a look at the Node code that will generate it for us.

const createFragmentPreview = async (
  inputPath,
  outputPath,
  fragmentDurationInSeconds = 4,
) => {
  return new Promise(async (resolve, reject) => {
    const { durationInSeconds: videoDurationInSeconds } = await getVideoInfo(
      inputPath,
    );

    const startTimeInSeconds = getStartTimeInSeconds(
      videoDurationInSeconds,
      fragmentDurationInSeconds,
    );

    return ffmpeg()
      .input(inputPath)
      .inputOptions([`-ss ${startTimeInSeconds}`])
      .outputOptions([`-t ${fragmentDurationInSeconds}`])
      .noAudio()
      .output(outputPath)
      .on('end', resolve)
      .on('error', reject)
      .run();
  });
};

First, we use the previously created getVideoInfo function to get the duration of the video. Then we get the start time using the getStartTimeInSeconds helper function.

Let’s think about the start time (the -ss parameter) because it may be tricky to get it right. The start time has to be somewhere between 25–75 percent of the video length since that is where the most representative fragment will be.

But we also have to make sure that the randomly generated start time plus the duration of the fragment is not larger than the duration of the video (startTime + fragmentDuration <= videoDuration). If that were the case, the fragment would be cut short, since there wouldn’t be enough video left.

With these requirements in mind, let’s create the function:

const getStartTimeInSeconds = (
  videoDurationInSeconds,
  fragmentDurationInSeconds,
) => {
  // by subtracting the fragment duration we can be sure that the resulting
  // start time + fragment duration will be less than the video duration
  const safeVideoDurationInSeconds =
    videoDurationInSeconds - fragmentDurationInSeconds;

  // if the fragment duration is longer than the video duration
  if (safeVideoDurationInSeconds <= 0) {
    return 0;
  }

  return getRandomIntegerInRange(
    0.25 * safeVideoDurationInSeconds,
    0.75 * safeVideoDurationInSeconds,
  );
};

First, we subtract the fragment duration from the video duration. By doing so, we can be sure that the resulting start time plus the fragment duration will be smaller than the video duration.

If the result of the subtraction is less than 0, then the start time has to be 0 because the fragment duration is longer than the actual video. For example, if the video were 4 seconds long and the expected fragment were to be 6 seconds long, the fragment would be the entire video.

The function returns a random number of seconds from the range between 25–75 percent of the video length using the helper function: getRandomIntegerInRange.

export const getRandomIntegerInRange = (min, max) => {
  const minInt = Math.ceil(min);
  const maxInt = Math.floor(max);

  return Math.floor(Math.random() * (maxInt - minInt + 1) + minInt);
};

It makes use of, among other things, Math.random() to get a pseudo-random integer in the range. The helper is brilliantly explained here.

Now, coming back to the command, all that’s left to do is set the command’s parameters with the generated values and run it.

return ffmpeg()
  .input(inputPath)
  .inputOptions([`-ss ${startTimeInSeconds}`])
  .outputOptions([`-t ${fragmentDurationInSeconds}`])
  .noAudio()
  .output(outputPath)
  .on('end', resolve)
  .on('error', reject)
  .run();

The code is self-explanatory. We make use of the .noAudio() method to generate the -an parameter. We also attach the resolve and reject listeners on the end and error events, respectively. As a result, we have a function that is easy to deal with because it’s wrapped in a promise.
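
For example, a hypothetical usage of the function, assuming ./video.mp4 exists locally:

createFragmentPreview('./video.mp4', './fragment-preview.mp4')
  .then(() => console.log('Fragment preview created'))
  .catch((error) => console.error('Preview generation failed:', error));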

In a real-world setting, we would probably take in a stream and output a stream from the function, but here I decided to use promises to make the code easier to understand.

Here are a few sample results from running the function on the “This is America” video. The videos were converted to gifs to embed them more easily.

Since the users are probably going to view the previews in small viewports, we could do without an unnecessarily high resolution and thus save on the file size.
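
As a sketch of that idea, fluent-ffmpeg’s size() method could downscale the fragment; here '320x?' means 320 pixels wide with the height derived from the aspect ratio, and the helper name is ours:

const createSmallFragmentPreview = (inputPath, outputPath, startTimeInSeconds, fragmentDurationInSeconds = 4) =>
  new Promise((resolve, reject) => {
    ffmpeg()
      .input(inputPath)
      .inputOptions([`-ss ${startTimeInSeconds}`])
      .outputOptions([`-t ${fragmentDurationInSeconds}`])
      .noAudio()
      .size('320x?') // downscale to keep the preview file small
      .output(outputPath)
      .on('end', resolve)
      .on('error', reject)
      .run();
  });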

Frames interval

The second option is to get x frames evenly spread throughout the video. For example, if we had a video that was 100 seconds long and we wanted 5 frames out of it for the preview, we would take a frame every 20 seconds. Then we could either merge them together into a video (using FFmpeg) or load them into the website and manipulate them with JavaScript.

Let’s break down the command:

ffmpeg -i video.mp4 -y -vf fps=1/24 thumb%04d.jpg
  • -i video.mp4: The input video file
  • -y: Output overwrites any existing files
  • -vf fps=1/24: The filter that takes a frame every (in this case) 24 seconds
  • thumb%04d.jpg: The output pattern that generates files in the following fashion: thumb0001.jpg, thumb0002.jpg, etc. The %04d part specifies that the number should be four digits, zero-padded

With the command also being pretty straightforward, let’s implement it in Node.

export const createXFramesPreview = (
  inputPath,
  outputPattern,
  numberOfFrames,
) => {
  return new Promise(async (resolve, reject) => {
    const { durationInSeconds } = await getVideoInfo(inputPath);

    // 1/frameIntervalInSeconds = 1 frame each x seconds
    const frameIntervalInSeconds = Math.floor(
      durationInSeconds / numberOfFrames,
    );

    return ffmpeg()
      .input(inputPath)
      .outputOptions([`-vf fps=1/${frameIntervalInSeconds}`])
      .output(outputPattern)
      .on('end', resolve)
      .on('error', reject)
      .run();
  });
};

As was the case with the previous function, we must first know the length of the video in order to calculate when to extract each frame. We get it with the previously defined helper getVideoInfo.

Next, we divide the duration of the video by the number of frames (passed as an argument, numberOfFrames). We use the Math.floor() function to make sure that the interval is an integer, which in turn guarantees that the interval multiplied by the number of frames is less than or equal to the duration of the video.

Then we generate the command with the values and execute it. Once again, we attach the resolve and reject functions to the end and error events, respectively, to wrap the output in the promise.

Here are some of the generated images (frames):

As stated above, we could now load the images in a browser and use JavaScript to make them into a slideshow or generate a slideshow with FFmpeg. Let’s create a command for the latter approach as an exercise:

ffmpeg -framerate 1/0.6 -i thumb%04d.jpg slideshow.mp4
  • -framerate 1/0.6: Each frame should be seen for 0.6 seconds
  • -i thumb%04d.jpg: The pattern for the images to be included in the slideshow
  • slideshow.mp4: The output video file name
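
Translated into Node, a sketch of the same slideshow command with fluent-ffmpeg might look like this (the helper name is ours):

const createSlideshow = (inputPattern, outputPath, secondsPerFrame = 0.6) => {
  return new Promise((resolve, reject) => {
    ffmpeg()
      .input(inputPattern) // e.g. 'thumb%04d.jpg'
      .inputOptions([`-framerate 1/${secondsPerFrame}`])
      .output(outputPath)
      .on('end', resolve)
      .on('error', reject)
      .run();
  });
};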

Here’s the slideshow video generated from 10 extracted frames. A frame was extracted every 24 seconds.


This preview shows us a very good overview of the content of the video.

Fun fact

In order to prepare the resulting videos for embedding in the article, I had to convert them to the .gif format. There are many online converters available, as well as apps that could do this for me. But since I was writing a post about using FFmpeg, it felt weird not to at least try to use it in this situation. Sure enough, converting a video to the gif format could be done with one command:

ffmpeg -i video.mp4 -filter_complex "[0:v] split [a][b];[a] palettegen [p];[b][p] paletteuse" converted-video.gif

Here’s the blog post explaining the logic behind it.

Now, sure, this command is not that easy to understand because of the complex filter, but it goes a long way in showing how many use cases FFmpeg has and how useful it is to be familiar with this tool.

Instead of using online converters, where the conversion could take some time due to the tools being free and doing it on the server side, I executed the command and had the gif ready after only a few seconds.
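
For completeness, the same conversion could be scripted from Node through fluent-ffmpeg’s complexFilter method, passing the filter chain verbatim; a sketch:

const convertToGif = (inputPath, outputPath) => {
  return new Promise((resolve, reject) => {
    ffmpeg(inputPath)
      .complexFilter('[0:v] split [a][b];[a] palettegen [p];[b][p] paletteuse')
      .output(outputPath)
      .on('end', resolve)
      .on('error', reject)
      .run();
  });
};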

Summary

It is not very likely that you will need to create previews of videos yourself, but hopefully by now you know how to use FFmpeg and its basic command syntax well enough to use it in any potential projects. Regarding the preview formats, I would probably go with the video fragment option, as more people will be familiar with it because of YouTube.

We should probably generate the previews of the video with low quality to keep the preview file sizes small since they have to be loaded on users’ browsers. The previews are usually shown in a very small viewport, so the low resolution should not be a problem.

Originally published by Maciej Cieślar at https://blog.logrocket.com