Simple Digit Recognition OCR in OpenCV-Python

I am trying to implement a "Digit Recognition OCR" in OpenCV-Python (cv2). It is just for learning purposes. I would like to learn both KNearest and SVM features in OpenCV.

I am trying to implement a "Digit Recognition OCR" in OpenCV-Python (cv2). It is just for learning purposes. I would like to learn both KNearest and SVM features in OpenCV.

I have 100 samples (i.e. images) of each digit. I would like to train with them.

There is a sample that comes with OpenCV sample. But I still couldn't figure out on how to use it. I don't understand what are the samples, responses etc. Also, it loads a txt file at first, which I didn't understand first.

Later on searching a little bit, I could find a in cpp samples. I used it and made a code for cv2.KNearest in the model of (just for testing):

import numpy as np
import cv2

fn = ''
a = np.loadtxt(fn, np.float32, delimiter=',', converters={ 0 : lambda ch : ord(ch)-ord('A') })
samples, responses = a[:,1:], a[:,0]

model = cv2.KNearest()
retval = model.train(samples,responses)
retval, results, neigh_resp, dists = model.find_nearest(samples, k = 10)

It gave me an array of size 20000, I don't understand what it is.


1) What is file? How to build that file from my own data set?

2) What does results.reval() denote?

3) How we can write a simple digit recognition tool using file (either KNearest or SVM)?

OpenCV Python Tutorial - Computer Vision With OpenCV In Python

OpenCV Python Tutorial - Computer Vision With OpenCV In Python

In this OpenCV Python Tutorial article, we will be covering various aspects of Computer Vision using OpenCV in Python. OpenCV has been a vital part in the development of software for a long time. Learning OpenCV is a good asset to the developer to improve aspects of coding and also helps in building a software development career.

We will be checking out the following concepts:

  • What is Computer Vision?
  • How a computer reads an image?
  • What is OpenCV?
  • Basics of OpenCV
  • Image Detection using OpenCV
  • Motion Detector using OpenCV

What Is Computer Vision?

To simplify the answer to this — Let us consider a scenario.

We all use Facebook, correct? Let us say you and your friends went on a vacation and you clicked a lot of pictures and you want to upload them on Facebook and you did. But now, wouldn’t it take so much time just to find your friends faces and tag them in each and every picture? Well, Facebook is intelligent enough to actually tag people for you.

So, how do you think the auto tag feature works? In simple terms, it works on computer vision.

Computer Vision is an interdisciplinary field that deals with how computers can be made to gain a high-level understanding from digital images or videos.

The idea here is to automate tasks that the human visual systems can do. So, a computer should be able to recognize objects such as that of a face of a human being or a lamppost or even a statue.

How Does A Computer Read An Image?

Consider the below image:

We can figure out that it is an image of the New York Skyline. But, can a computer find this out all on its own? The answer is no!

The computer reads any image as a range of values between 0 and 255.

For any color image, there are 3 primary channels — Red, green and blue. How it works is pretty simple.

A matrix is formed for every primary color and later these matrices combine to provide a Pixel value for the individual R, G, B colors.

Each element of the matrices provide data pertaining to the intensity of brightness of the pixel.

Consider the following image:

As shown, the size of the image here can be calculated as B x A x 3.

Note: For a black-white image, there is only one single channel.

Next in this article, let us look at what OpenCV actually is.

What Is OpenCV?

OpenCV is a Python library which is designed to solve computer vision problems. OpenCV was originally developed in 1999 by Intel but later it was supported by Willow Garage.

OpenCV supports a wide variety of programming languages such as C++, Python, Java etc. Support for multiple platforms including Windows, Linux, and MacOS.

OpenCV Python is nothing but a wrapper class for the original C++ library to be used with Python. Using this, all of the OpenCV array structures gets converted to/from NumPy arrays.

This makes it easier to integrate it with other libraries which use NumPy. For example, libraries such as SciPy and Matplotlib.

Next in this article, let us look at some of the basic operations that we can perform with OpenCV.

Basic Operations With OpenCV?

Let us look at various concepts ranging from loading images to resizing them and so on.

Loading an image using OpenCV:

Import cv2
# colored Image
Img = cv2.imread (“Penguins.jpg”,1)
# Black and White (gray scale)
Img_1 = cv2.imread (“Penguins.jpg”,0))

As seen in the above piece of code, the first requirement is to import the OpenCV module.

Later we can read the image using imread module. The 1 in the parameters denotes that it is a color image. If the parameter was 0 instead of 1, it would mean that the image being imported is a black and white image. The name of the image here is ‘Penguins’. Pretty straightforward, right?

Image Shape/Resolution:

We can make use of the shape sub-function to print out the shape of the image. Refer below.

Import cv2
# Black and White (gray scale)
Img = cv2.imread (“Penguins.jpg”,0)

By shape of the image, we mean the shape of the NumPy array. As you see from executing the code, the matrix consists of 768 rows and 1024 columns.

Displaying the image:

Displaying an image using OpenCV is pretty simple and straightforward. Refer below.

import cv2
# Black and White (gray scale)
Img = cv2.imread (“Penguins.jpg”,0)
cv2.imshow(“Penguins”, img)
# cv2.waitKey(2000)

As you can see, we first import the image using imread. We require a window output to display the images, right?

We use the imshow function to display the image by opening a window. There are 2 parameters to theimshow function which is the name of the window and the image object to be displayed.

Later, we wait for a user event. waitKey makes the window static until the user presses a key. The parameter passed to it is the time in milliseconds.

And lastly, we use destroyAllWindows to close the window based on the waitForKey parameter.

Resizing the image:

Similarly, resizing an image is very easy. Here’s another code snippet:

import cv2
# Black and White (gray scale)
img = cv2.imread (“Penguins.jpg”,0)
resized_image = cv2.resize(img, (650,500))
cv2.imshow(“Penguins”, resized_image)

Here, resize function is used to resize an image to the desired shape. The parameter here is the shape of the new resized image.

Later, do note that the image object changes from img to resized_image, because of the image object changes now.

Rest of the code is pretty simple to the previous one, correct?

I am sure you guys are curious to look at the penguins, right? This is the image we were looking to output all this while!

There is another way to pass the parameters to the resize function. Refer below.

Resized_image = cv2.resize(img, int(img.shape[1]/2), int(img.shape[0]/2)))

Here, we get the new image shape to be half of that of the original image.

Next up in this article, let us look at how we perform face detection using OpenCV.

Face Detection Using OpenCV

This seems complex at first but it is very easy. Let me walk you through the entire process and you will feel the same.

Step 1: Considering our prerequisites, we will require an image, to begin with. Later we need to create a cascade classifier which will eventually give us the features of the face.

Step 2: This step involves making use of OpenCV which will read the image and the features file. So at this point, there are NumPy arrays at the primary data points.

All we need to do is to search for the row and column values of the face NumPy ndarray. This is the array with the face rectangle coordinates.

Step 3: This final step involves displaying the image with the rectangular face box.

Check out the following image, here I have summarized the 3 steps in the form of an image for easier readability:

Pretty straightforward, correct?

First, we create a CascadeClassifier object to extract the features of the face as explained earlier. The path to the XML file which contains the face features is the parameter here.

The next step would be to read an image with a face on it and convert it into a black and white image using COLOR_BGR2GREY. Followed by this, we search for the coordinates for the image. This is done using detectMultiScale.

What coordinates, you ask? It’s the coordinates for the face rectangle. The scaleFactor is used to decrease the shape value by 5% until the face is found. So, on the whole — Smaller the value, greater is the accuracy.

Finally, the face is printed on the window.

Adding the rectangular face box:

This logic is very simple — As simple as making use of a for loop statement. Check out the following image.

We define the method to create a rectangle using cv2.rectangle by passing parameters such as the image object, RGB values of the box outline and the width of the rectangle.

Let us check out the entire code for face detection:

import cv2
# Create a CascadeClassifier Object
face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
# Reading the image as it is
img = cv2.imread("photo.jpg")
# Reading the image as gray scale image
gray_img = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Search the co-ordintes of the image
faces = face_cascade.detectMultiScale(gray_img, scaleFactor = 1.05,
for x,y,w,h in faces:
    img = cv2.rectangle(img, (x,y), (x+w,y+h),(0,255,0),3)
resized = cv2.resize(img, (int(img.shape[1]/7),int(img.shape[0]/7)))
cv2.imshow("Gray", resized)

Next up in this article, let us look at how to use OpenCV to capture video with the computer webcam.

Capturing Video Using OpenCV

Capturing videos using OpenCV is pretty simple as well. the following loop will give you a better idea. Check it out:

The images are read one-by-one and hence videos are produced due to fast processing of frames which makes the individual images move.

Capturing Video:

Check out the following image:

First, we import the OpenCV library as usual. Next, we have a method called VideoCapture which is used to create the VideoCapture object. This method is used to trigger the camera on the user’s machine. The parameter to this function denotes if the program should make use of the built-in camera or an add-on camera. ‘0’ denotes the built-in camera in this case.

And lastly, the release method is used to release the camera in a few milliseconds.

When you go ahead and type in and try to execute the above code, you will notice that the camera light switches on for a split second and turns off later. Why does this happen?

This happens because there is no time delay to keep the camera functional.

Looking at the above code, we have a new line called time.sleep(3) — This makes the script to stop for 3 seconds. Do note that the parameter passed is the time in seconds. So, when the code is executed, the webcam will be turned on for 3 seconds.

Adding the window:

Adding a window to show the video output is pretty simple and can be compared to the same methods used for images. However, there is a slight change. Check out the following code:

I am pretty sure you can make the most sense from the above code apart from one or two lines.

Here, we have defined a NumPy array which we use to represent the first image that the video captures — This is stored in the frame array.

We also have check — This is a boolean datatype which returns True if Python is able to access and read the VideoCapture object.

Check out the output below:

As you can check out, we got the output as True and the part of the frame array is printed.

But we need to read the first frame/image of the video to begin, correct?

To do exactly that, we need to first create a frame object which will read the images of the VideoCapture object.

As seen above, the imshow method is used to capture the first frame of the video.

All this while, we have tried to capture the first image/frame of the video but directly capturing the video.

So how do we go about capturing the video instead of the first image in OpenCV?

Capturing Video Directly:

In order to capture the video, we will be using the while loop. While condition will be such that, until unless ‘check’ is True. If it is, then Python will display the frames.

Here’s the code snippet image:

We make use of the cvtColor function to convert each frame into a grey-scale image as explained earlier.

waitKey(1) will make sure to generate a new frame after every millisecond of a gap.

It is important here that you note that the while loop is completely in play to help iterate through the frames and eventually display the video.

There is a user event trigger here as well. Once the ‘q’ key is pressed by the user, the program window closes.

OpenCV is pretty easy to grasp, right? I personally love how good the readability is and how quickly a beginner can get started working with OpenCV.

Next up in this article, let us look at how to use a very interesting motion detector use case using OpenCV.

Use Case: Motion Detector Using OpenCV

Problem Statement:

You have been approached by a company that is studying human behavior. Your task is to give them a webcam, that can detect the motion or any movement in front of it. This should return a graph, this graph should contain how long the human/object was in front of the camera.

So, now that we have defined our problem statement, we need to build a solution logic to approach the problem in a structured way.

Consider the below diagram:

Initially, we save the image in a particular frame.

The next step involves converting the image to a Gaussian blur image. This is done so as to ensure we calculate a palpable difference between the blurred image and the actual image.

At this point, the image is still not an object. We define a threshold to remove blemishes such as shadows and other noises in the image.

Borders for the object are defined later and we add a rectangular box around the object as we discussed earlier on the blog.

Lastly, we calculate the time at which the object appears and exits the frame.

Pretty easy, right?

Here’s the code snippet:

The same principle follows through here as well. We first import the package and create the VideoCapture object to ensure we capture video using the webcam.

The while loop iterates through the individual frames of the video. We convert the color frame to a grey-scale image and later we convert this grey-scale image to Gaussian blur.

We need to store the first image/frame of the video, correct? We make use of the if statement for this purpose alone.

Now, let us dive into a little more code:

We make use of the absdiff function to calculate the difference between the first occurring frame and all the other frames.

The threshold function provides a threshold value, such that it will convert the difference value with less than 30 to black. If the difference is greater than 30 it will convert those pixels to white color. THRESH_BINARY is used for this purpose.

Later, we make use of the findContours function to define the contour area for our image. And we add in the borders at this stage as well.

The contourArea function, as previously explained, removes the noises and the shadows. To make it simple, it will keep only that part white, which has an area greater than 1000 pixels as we’ve defined for that.

Later, we create a rectangular box around our object in the working frame.

And followed by this is this simple code:

As discussed earlier, the frame changes every 1 millisecond and when the user enters ‘q’, the loop breaks and the window closes.

We’ve covered all of the major details on this OpenCV Python Tutorial blog. One thing that remains with our use-case is that we need to calculate the time for which the object was in front of the camera.

Calculating the time:

We make use of DataFrame to store the time values during which object detection and movement appear in the frame.

Followed by that is VideoCapture function as explained earlier. But here, we have a flag bit we call status. We use this status at the beginning of the recording to be zero as the object is not visible initially.

We will change the status flag to 1 when the object is being detected as shown in the above figure. Pretty simple, right?

We are going to make a list of the status for every scanned frame and later record the date and time using datetime in a list if and where a change occurs.

And we store the time values in a DataFrame as shown in the above explanatory diagram. We’ll conclude by writing the DataFrame to a CSV file as shown.

Plotting the Motion Detection Graph:

The final step in our use-case to display the results. We are displaying the graph which denotes the motion on 2-axes. Consider the below code:

To begin with, we import the DataFrame from the file.

The next step involves converting time to a readable string format which can be parsed.

Lastly, the DataFrame of time values is plotted on the browser using Bokeh plots.


I hope this article helps you in learning all the fundamentals needed to get started with OpenCV using Python.

This will be very handy when you are trying to develop applications that require image recognition and similar principles. Now, you should also be able to use these concepts to develop applications easily with the help of OpenCV in Python.

Originally published at

Learn More

Complete Python: Go from zero to hero in Python

Computer Vision Using OpenCV

Learn Python 3 Programming for Beginners

An A-Z of useful Python tricks

A Complete Machine Learning Project Walk-Through in Python

A Feature Selection Tool for Machine Learning in Python

Learning Python: From Zero to Hero

Automated Machine Learning on the Cloud in Python

MongoDB with Python Crash Course - Tutorial for Beginners

Complete Python Bootcamp: Go from zero to hero in Python 3

Complete Python Masterclass

Python and Django Full Stack Web Developer Bootcamp

Node.js bindings to OpenCV 3 and OpenCV 4

Node.js bindings to OpenCV 3 and OpenCV 4

The ultimate goal of this article is to provide a comprehensive collection of nodejs bindings to the API of OpenCV and the OpenCV-contrib modules.

The ultimate goal of this article is to provide a comprehensive collection of nodejs bindings to the API of OpenCV and the OpenCV-contrib modules.

opencv4nodejs allows you to use the native OpenCV library in Nodejs. Besides a synchronous API the package provides an asynchronous API, which allows you to build non-blocking and multithreaded computer vision tasks. **opencv4nodejs **supports OpenCV 3 and OpenCV 4.

To get an overview of the currently implemented bindings, have a look at the type declarations of this package. Furthermore, contribution is highly appreciated. If you want to add missing bindings check out the contribution guide.

Table content

  • How to install
  • Usage with Docker
  • Usage with Electron
  • Usage with NW.js
  • Quick Start
  • Async API
  • With TypeScript
  • External Memory Tracking (v4.0.0)
How to install
npm install --save opencv4nodejs

Native node modules are built via node-gyp, which already comes with npm by default. However, node-gyp requires you to have python installed. If you are running into node-gyp specific issues have a look at known issues with node-gyp first.

Important note: node-gyp won't handle whitespaces properly, thus make sure, that the path to your project directory does not contain any whitespaces. Installing opencv4nodejs under "C:\Program Files\some_dir" or similar will not work and will fail with: "fatal error C1083: Cannot open include file: 'opencv2/core.hpp'"!**

On Windows you will furthermore need Windows Build Tools to compile OpenCV and opencv4nodejs. If you don't have Visual Studio or Windows Build Tools installed, you can easily install the VS2015 build tools:

npm install --global windows-build-tools

Installing OpenCV Manually

Setting up OpenCV on your own will require you to set an environment variable to prevent the auto build script to run:

# linux and osx:
# on windows:


You can install any of the OpenCV 3 or OpenCV 4 releases manually or via the Chocolatey package manager:

# to install OpenCV 4.1.0
choco install OpenCV -y -version 4.1.0

Note, this will come without contrib modules. To install OpenCV under windows with contrib modules you have to build the library from source or you can use the auto build script.

Before installing opencv4nodejs with an own installation of OpenCV you need to expose the following environment variables:

  • OPENCV_INCLUDE_DIR pointing to the directory with the subfolder opencv2 containing the header files
  • OPENCV_LIB_DIR pointing to the lib directory containing the OpenCV .lib files

Also you will need to add the OpenCV binaries to your system path:

  • add an environment variable OPENCV_BIN_DIR pointing to the binary directory containing the OpenCV .dll files
  • append ;%OPENCV_BIN_DIR%; to your system path variable

Note: Restart your current console session after making changes to your environment.


Under OSX we can simply install OpenCV via brew:

brew update
brew install [email protected]
brew link --force [email protected]


Under Linux we have to build OpenCV from source manually or using the auto build script.

Installing OpenCV via Auto Build Script

The auto build script comes in form of the opencv-build npm package, which will run by default when installing opencv4nodejs. The script requires you to have git and a recent version of cmake installed.

Auto Build Flags

You can customize the autobuild flags using OPENCV4NODEJS_AUTOBUILD_FLAGS=. Flags must be space-separated.

This is an advanced customization and you should have knowledge regarding the **OpenCV **compilation flags. Flags added by default are listed here.

Installing a Specific Version of OpenCV

You can specify the Version of **OpenCV **you want to install via the script by setting an environment variable: export OPENCV4NODEJS_AUTOBUILD_OPENCV_VERSION=4.1.0

Installing only a Subset of OpenCV modules

If you only want to build a subset of the OpenCV modules you can pass the -DBUILD_LIST cmake flag via the OPENCV4NODEJS_AUTOBUILD_FLAGS environment variable. For example export OPENCV4NODEJS_AUTOBUILD_FLAGS=-DBUILD_LIST=dnn will build only modules required for dnn and reduces the size and compilation time of the OpenCV package.

Usage with Docker

opencv-express - example for opencv4nodejs with express.js and docker

Or simply pull from justadudewhohacks/opencv-nodejs for opencv-3.2 + contrib-3.2 with opencv4nodejs globally installed:

FROM justadudewhohacks/opencv-nodejs 

Note: The aforementioned Docker image already has opencv4nodejs installed globally. In order to prevent build errors during an npm install, your package.json should not include opencv4nodejs, and instead should include/require the global package either by requiring it by absolute path or setting the NODE_PATH environment variable to /usr/lib/node_modules in your Dockerfile and requiring the package as you normally would.

Different OpenCV 3.x base images can be found here:

Usage with Electron

opencv-electron - example for opencv4nodejs with electron

Add the following script to your package.json:

"electron-rebuild": "electron-rebuild -w opencv4nodejs"

Run the script:

$ npm run electron-rebuild

Require it in the application:

const cv = require('opencv4nodejs');

Usage with NW.js

Any native modules, including opencv4nodejs, must be recompiled to be used with NW.js. Instructions on how to do this are available in the Use Native Modules section of the the NW.js documentation.

Once recompiled, the module can be installed and required as usual:

const cv = require('opencv4nodejs');

Quick Start
const cv = require('opencv4nodejs');

Initializing Mat (image matrix), Vec, Point

const rows = 100; // height
const cols = 100; // width

// empty Mat
const emptyMat = new cv.Mat(rows, cols, cv.CV_8UC3);

// fill the Mat with default value
const whiteMat = new cv.Mat(rows, cols, cv.CV_8UC1, 255);
const blueMat = new cv.Mat(rows, cols, cv.CV_8UC3, [255, 0, 0]);

// from array (3x3 Matrix, 3 channels)
const matData = [
  [[255, 0, 0], [255, 0, 0], [255, 0, 0]],
  [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
  [[255, 0, 0], [255, 0, 0], [255, 0, 0]]
const matFromArray = new cv.Mat(matData, cv.CV_8UC3);

// from node buffer
const charData = [255, 0, ...];
const matFromArray = new cv.Mat(Buffer.from(charData), rows, cols, cv.CV_8UC3);

// Point
const pt2 = new cv.Point(100, 100);
const pt3 = new cv.Point(100, 100, 0.5);

// Vector
const vec2 = new cv.Vec(100, 100);
const vec3 = new cv.Vec(100, 100, 0.5);
const vec4 = new cv.Vec(100, 100, 0.5, 0.5);

Mat and Vec operations

const mat0 = new cv.Mat(...);
const mat1 = new cv.Mat(...);

// arithmetic operations for Mats and Vecs
const matMultipliedByScalar = mat0.mul(0.5);  // scalar multiplication
const matDividedByScalar = mat0.div(2);       // scalar division
const mat0PlusMat1 = mat0.add(mat1);          // addition
const mat0MinusMat1 = mat0.sub(mat1);         // subtraction
const mat0MulMat1 = mat0.hMul(mat1);          // elementwise multiplication
const mat0DivMat1 = mat0.hDiv(mat1);          // elementwise division

// logical operations Mat only
const mat0AndMat1 = mat0.and(mat1);
const mat0OrMat1 = mat0.or(mat1);
const mat0bwAndMat1 = mat0.bitwiseAnd(mat1);
const mat0bwOrMat1 = mat0.bitwiseOr(mat1);
const mat0bwXorMat1 = mat0.bitwiseXor(mat1);
const mat0bwNot = mat0.bitwiseNot();

Accessing Mat data

const matBGR = new cv.Mat(..., cv.CV_8UC3);
const matGray = new cv.Mat(..., cv.CV_8UC1);

// get pixel value as vector or number value
const vec3 =, 100);
const grayVal =, 100);

// get raw pixel value as array
const [b, g, r] = matBGR.atRaw(200, 100);

// set single pixel values
matBGR.set(50, 50, [255, 0, 0]);
matBGR.set(50, 50, new Vec(255, 0, 0));
matGray.set(50, 50, 255);

// get a 25x25 sub region of the Mat at offset (50, 50)
const width = 25;
const height = 25;
const region = matBGR.getRegion(new cv.Rect(50, 50, width, height));

// get a node buffer with raw Mat data
const matAsBuffer = matBGR.getData();

// get entire Mat data as JS array
const matAsArray = matBGR.getDataAsArray();


// load image from file
const mat = cv.imread('./path/img.jpg');
cv.imreadAsync('./path/img.jpg', (err, mat) => {

// save image
cv.imwrite('./path/img.png', mat);
cv.imwriteAsync('./path/img.jpg', mat,(err) => {

// show image
cv.imshow('a window name', mat);

// load base64 encoded image
const base64text='..';//Base64 encoded string
const base64data =base64text.replace('data:image/jpeg;base64','')
                            .replace('data:image/png;base64','');//Strip image type prefix
const buffer = Buffer.from(base64data,'base64');
const image = cv.imdecode(buffer); //Image is now represented as Mat

// convert Mat to base64 encoded jpg image
const outBase64 =  cv.imencode('.jpg', croppedImage).toString('base64'); // Perform base64 encoding
const htmlImg='<img src=data:image/jpeg;base64,'+outBase64 + '>'; //Create insert into HTML compatible <img> tag

// open capture from webcam
const devicePort = 0;
const wCap = new cv.VideoCapture(devicePort);

// open video capture
const vCap = new cv.VideoCapture('./path/video.mp4');

// read frames from capture
const frame =;
vCap.readAsync((err, frame) => {

// loop through the capture
const delay = 10;
let done = false;
while (!done) {
  let frame =;
  // loop back to start on end of stream reached
  if (frame.empty) {
    frame =;

  // ...

  const key = cv.waitKey(delay);
  done = key !== 255;

Useful Mat methods

const matBGR = new cv.Mat(..., cv.CV_8UC3);

// convert types
const matSignedInt = matBGR.convertTo(cv.CV_32SC3);
const matDoublePrecision = matBGR.convertTo(cv.CV_64FC3);

// convert color space
const matGray = matBGR.bgrToGray();
const matHSV = matBGR.cvtColor(cv.COLOR_BGR2HSV);
const matLab = matBGR.cvtColor(cv.COLOR_BGR2Lab);

// resize
const matHalfSize = matBGR.rescale(0.5);
const mat100x100 = matBGR.resize(100, 100);
const matMaxDimIs100 = matBGR.resizeToMax(100);

// extract channels and create Mat from channels
const [matB, matG, matR] = matBGR.splitChannels();
const matRGB = new cv.Mat([matR, matB, matG]);

Drawing a Mat into HTML Canvas

const img = ...

// convert your image to rgba color space
const matRGBA = img.channels === 1
  ? img.cvtColor(cv.COLOR_GRAY2RGBA)
  : img.cvtColor(cv.COLOR_BGR2RGBA);

// create new ImageData from raw mat data
const imgData = new ImageData(
  new Uint8ClampedArray(matRGBA.getData()),

// set canvas dimensions
const canvas = document.getElementById('myCanvas');
canvas.height = img.rows;
canvas.width = img.cols;

// set image data
const ctx = canvas.getContext('2d');
ctx.putImageData(imgData, 0, 0);

Method Interface

OpenCV method interface from official docs or src:

void GaussianBlur(InputArray src, OutputArray dst, Size ksize, double sigmaX, double sigmaY = 0, int borderType = BORDER_DEFAULT);

translates to:

const src = new cv.Mat(...);
// invoke with required arguments
const dst0 = src.gaussianBlur(new cv.Size(5, 5), 1.2);
// with optional paramaters
const dst2 = src.gaussianBlur(new cv.Size(5, 5), 1.2, 0.8, cv.BORDER_REFLECT);
// or pass specific optional parameters
const optionalArgs = {
  borderType: cv.BORDER_CONSTANT
const dst2 = src.gaussianBlur(new cv.Size(5, 5), 1.2, optionalArgs);

Async API

The async API can be consumed by passing a callback as the last argument of the function call. By default, if an async method is called without passing a callback, the function call will yield a Promise.

Async Face Detection

const classifier = new cv.CascadeClassifier(cv.HAAR_FRONTALFACE_ALT2);

// by nesting callbacks
cv.imreadAsync('./faceimg.jpg', (err, img) => {
  if (err) { return console.error(err); }

  const grayImg = img.bgrToGray();
  classifier.detectMultiScaleAsync(grayImg, (err, res) => {
    if (err) { return console.error(err); }

    const { objects, numDetections } = res;

// via Promise
  .then(img =>
      .then(grayImg => classifier.detectMultiScaleAsync(grayImg))
      .then((res) => {
        const { objects, numDetections } = res;
  .catch(err => console.error(err));

// using async await
try {
  const img = await cv.imreadAsync('./faceimg.jpg');
  const grayImg = await img.bgrToGrayAsync();
  const { objects, numDetections } = await classifier.detectMultiScaleAsync(grayImg);
} catch (err) {

With TypeScript
import * as cv from 'opencv4nodejs'

Check out the TypeScript examples.

External Memory Tracking (v4.0.0)

Since version 4.0.0 was released, external memory tracking has been enabled by default. Simply put, the memory allocated for Matrices (cv.Mat) will be manually reported to the node process. This solves the issue of inconsistent Garbage Collection, which could have resulted in spiking memory usage of the node process eventually leading to overflowing the RAM of your system, prior to version 4.0.0.

Note, that in doubt this feature can be disabled by setting an environment variable OPENCV4NODEJS_DISABLE_EXTERNAL_MEM_TRACKING before requiring the module:


Or directly in your code:

const cv = require('opencv4nodejs')

OpenCV Python Tutorial: Computer Vision With OpenCV In Python

OpenCV Python Tutorial: Computer Vision With OpenCV In Python

Learn Vision Includes all OpenCV Image Processing Features with Simple Examples. Face Detection, Face Recognition

OpenCV Python Tutorial: Computer Vision With OpenCV In Python

A guide to Face Detection in Python

Face Detection using Open-CV

A guide to Face Detection with Golang and OpenCV

Implement Face Detection Using Python

Python Face Detection Tutorial for Beginners

Computer Vision is an AI based, that is, Artificial Intelligence based technology that allows computers to understand and label images. Its now used in Convenience stores, Driver-less Car Testing, Security Access Mechanisms, Policing and Investigations Surveillance, Daily Medical Diagnosis monitoring health of crops and live stock and so on and so forth..

A common example will be face detection and unlocking mechanism that you use in your mobile phone. We use that daily. That is also a big application of Computer Vision. And today, top technology companies like Amazon, Google, Microsoft, Facebook etc are investing millions and millions of Dollars into Computer Vision based research and product development.

Computer vision allows us to analyze and leverage image and video data, with applications in a variety of industries, including self-driving cars, social network apps, medical diagnostics, and many more.

As the fastest growing language in popularity, Python is well suited to leverage the power of existing computer vision libraries to learn from all this image and video data.

What you'll learn

  • Use OpenCV to work with image files
  • Perform image manipulation with OpenCV, including smoothing, blurring, thresholding, and morphological operations.
  • Create Face Detection Software
  • Detect Objects, including corner, edge, and grid detection techniques with OpenCV and Python
  • Use Python and Deep Learning to build image classifiers
  • Use Python and OpenCV to draw shapes on images and videos
  • Create Color Histograms with OpenCV
  • Study from MIT notes and get Interview questions
  • Crack image processing limits by developing Applications.