Saving an image as a numpy array

I am not able to load images into a numpy array and I am getting an error like this:


ValueError: could not broadcast input array from shape (175,217,3) into shape (100,100,3)

The function code:

import cv2
import numpy as np
import os

train_data_dir = '/home/ec2-user/SageMaker/malaria-detection-model/malaria/training'
valid_data_dir = '/home/ec2-user/SageMaker/malaria-detection-model/malaria/validation'

# declare the number of samples in each category

nb_train_samples = 22045 # training samples
nb_valid_samples = 5513 # validation samples
num_classes = 2
img_rows_orig = 100
img_cols_orig = 100

def load_training_data():
    labels = os.listdir(train_data_dir)
    total = len(labels)

    X_train = np.ndarray((nb_train_samples, img_rows_orig, img_cols_orig, 3), dtype=np.uint8)
    Y_train = np.zeros((nb_train_samples,), dtype='uint8')

    i = 0
    j = 0
    for label in labels:
        image_names_train = os.listdir(os.path.join(train_data_dir, label))
        total = len(image_names_train)
        print(label, total)
        for image_name in image_names_train:
            img = cv2.imread(os.path.join(train_data_dir, label, image_name), cv2.IMREAD_COLOR)
            img = np.array([img])
            X_train[i] = img
            Y_train[i] = j

            if i % 100 == 0:
                print('Done: {0}/{1} images'.format(i, total))
            i += 1
        j += 1
    print(i)
    print('Loading done.')

    np.save('imgs_train.npy', X_train, Y_train)
    return X_train, Y_train

This function is part of the file load_data.py that can be found in the malaria_cell_classification_code.zip file from:

https://ceb.nlm.nih.gov/repositories/malaria-datasets/


I tried changing X_train and Y_train to lists instead of numpy arrays, but then the function halts at the np.save call.

X_train = Y_train = list()
X_train.append(img)
Y_train.append(j)

What is the correct and standard way to save images in numpy?
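For reference, np.save() writes a single array per file, so the third positional argument in the call above is interpreted as the allow_pickle flag rather than saved. A minimal sketch of saving both arrays (the file names here are only examples):

np.save('imgs_train.npy', X_train)
np.save('labels_train.npy', Y_train)

# or keep both arrays in a single .npz archive
np.savez('train_data.npz', X_train=X_train, Y_train=Y_train)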


After resizing the images, I get a different error:

Done: 19400/9887 images
Done: 19500/9887 images
Done: 19600/9887 images
Done: 19700/9887 images
Done: 19800/9887 images
19842
Loading done.
Transform targets to keras compatible format.
Creating validation images... Parasitized 1098

error Traceback (most recent call last)
<ipython-input-6-8008be74f482> in <module>()
2 #load data for training
3 X_train, Y_train = load_resized_training_data(img_rows, img_cols)
----> 4 X_valid, Y_valid = load_resized_validation_data(img_rows, img_cols)
5 #print the shape of the data
6 print(X_train.shape, Y_train.shape, X_valid.shape, Y_valid.shape)

~/SageMaker/malaria-detection-model/malaria_cell_classification_code/load_data.py in load_resized_validation_data(img_rows, img_cols)
103 def load_resized_validation_data(img_rows, img_cols):
104
--> 105 X_valid, Y_valid = load_validation_data()
106
107 # Resize images

~/SageMaker/malaria-detection-model/malaria_cell_classification_code/load_data.py in load_validation_data()
75
76 img = np.array([img])
---> 77 img2 = cv2.resize(img, (100, 100))
78 X_valid[i] = img2
79 Y_valid[i] = j

error: OpenCV(4.0.0) /io/opencv/modules/imgproc/src/resize.cpp:3427: error: (-215:Assertion failed) !dsize.empty() in function 'resize'


The complete script can be found here...

https://gist.github.com/shantanuo/cfe0913b367647890451f5ae3f6fb691

Image Panorama Stitching with OpenCV and Python


In this article, we will learn how to perform image stitching using Python and OpenCV

Image stitching is one of the most successful applications in Computer Vision. Nowadays, it is hard to find a cell phone or an image processing API that does not contain this functionality.

Given a pair of images that share some common region, our goal is to “stitch” them and create a panoramic image scene.

Throughout this article, we go over some of the most famous Computer Vision techniques. These include:

  • Keypoint detection
  • Local invariant descriptors (SIFT, SURF, etc)
  • Feature matching
  • Homography estimation using RANSAC
  • Perspective warping

We explore many feature extractors like SIFT, SURF, BRISK, and ORB. You can follow along using this Colab notebook and even try it out with your pictures.

Feature Detection and Extraction

Given a pair of images like the ones above, we want to stitch them to create a panoramic scene. It is important to note that both images need to share some common region.

Moreover, our solution has to be robust even if the pictures have differences in one or more of the following aspects:

  • Scaling
  • Angle
  • Spatial position
  • Capturing devices

The first step in that direction is to extract some key points and features of interest. These features, however, need to have some special properties.

Let’s first consider a simple solution.

Keypoints Detection

An initial and probably naive approach would be to extract key points using an algorithm such as Harris Corners. Then, we could try to match the corresponding key points based on some measure of similarity like Euclidean distance. As we know, corners have one nice property: they are invariant to rotation. It means that, once we detect a corner, if we rotate an image, that corner will still be there.

However, what if we rotate and then scale an image? In this situation, we would have a hard time because corners are not invariant to scale. That is to say, if we zoom in on an image, the previously detected corner might become a line!

In summary, we need features that are invariant to rotation and scaling. That is where more robust methods like SIFT, SURF, and ORB come in.

Keypoints and Descriptors.

Methods like SIFT and SURF try to address the limitations of corner detection algorithms. Usually, corner detector algorithms use a fixed size kernel to detect regions of interest (corners) on images. It is easy to see that when we scale an image, this kernel might become too small or too big.

To address this limitation, methods like SIFT use the Difference of Gaussians (DoG). The idea is to apply DoG on differently scaled versions of the same image. It also uses the neighboring pixel information to find and refine key points and corresponding descriptors.

To start, we need to load 2 images, a query image, and a training image. Initially, we begin by extracting key points and descriptors from both. We can do it in one step by using the OpenCV detectAndCompute() function. Note that in order to use detectAndCompute() we need an instance of a keypoint detector and descriptor object. It can be ORB, SIFT or SURF, etc. Also, before feeding the images to detectAndCompute() we convert them to grayscale.

def detectAndDescribe(image, method=None):
    """
    Compute key points and feature descriptors using a specific method
    """
    
    assert method is not None, "You need to define a feature detection method. Values are: 'sift', 'surf', 'brisk', 'orb'"
    
    # detect and extract features from the image
    if method == 'sift':
        descriptor = cv2.xfeatures2d.SIFT_create()
    elif method == 'surf':
        descriptor = cv2.xfeatures2d.SURF_create()
    elif method == 'brisk':
        descriptor = cv2.BRISK_create()
    elif method == 'orb':
        descriptor = cv2.ORB_create()
        
    # get keypoints and descriptors
    (kps, features) = descriptor.detectAndCompute(image, None)
    
    return (kps, features)

We run detectAndCompute() on both the query and the train image. At this point, we have a set of key points and descriptors for both images. If we use SIFT as the feature extractor, it returns a 128-dimensional feature vector for each key point. If SURF is chosen, we get a 64-dimensional feature vector. The following images show some of the features extracted using SIFT, SURF, BRISK, and ORB.

Detection of key points and descriptors using SIFT

Detection of key points and descriptors using SURF

Detection of key points and descriptors using BRISK and Hamming distances.

Detection of key points and descriptors using ORB and Hamming distances.
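For reference, a minimal sketch of how detectAndDescribe() might be called on the two images; trainImg_gray and queryImg_gray are assumed to be the grayscale versions of the train and query images, loaded beforehand:

feature_extractor = 'orb'  # could also be 'sift', 'surf' or 'brisk'

# key points and descriptors for the train image (A) and the query image (B)
kpsA, featuresA = detectAndDescribe(trainImg_gray, method=feature_extractor)
kpsB, featuresB = detectAndDescribe(queryImg_gray, method=feature_extractor)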

Feature Matching

As we can see, we have a large number of features from both images. Now, we would like to compare the 2 sets of features and stick with the pairs that show more similarity.

With OpenCV, feature matching requires a Matcher object. Here, we explore two flavors:

  • Brute Force Matcher
  • KNN (k-Nearest Neighbors)

The BruteForce (BF) Matcher does exactly what its name suggests. Given 2 sets of features (from image A and image B), each feature from set A is compared against all features from set B. By default, BF Matcher computes the Euclidean distance between two points. Thus, for every feature in set A, it returns the closest feature from set B. For SIFT and SURF OpenCV recommends using Euclidean distance. For other feature extractors like ORB and BRISK, Hamming distance is suggested.

To create a BruteForce Matcher using OpenCV we only need to specify 2 parameters. The first is the distance metric. The second is the crossCheck boolean parameter.

def createMatcher(method,crossCheck):
    "Create and return a Matcher Object"
    
    if method == 'sift' or method == 'surf':
        bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=crossCheck)
    elif method == 'orb' or method == 'brisk':
        bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=crossCheck)
    return bf

The crossCheck bool parameter indicates whether the two features have to match each other to be considered valid. In other words, for a pair of features (f1, f2) to be considered valid, f1 needs to match f2 and f2 has to match f1 as the closest match as well. This procedure ensures a more robust set of matching features and is described in the original SIFT paper.
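As an illustration, brute-force matching with cross-checking enabled could look like the sketch below (matchKeyPointsBF is just a helper name for this example, not an OpenCV function):

def matchKeyPointsBF(featuresA, featuresB, method):
    bf = createMatcher(method, crossCheck=True)
    # match the descriptors and sort the result by distance (smaller is better)
    best_matches = bf.match(featuresA, featuresB)
    rawMatches = sorted(best_matches, key=lambda x: x.distance)
    return rawMatches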

However, for cases where we want to consider more than one candidate match, we can use a KNN based matching procedure.

Instead of returning the single best match for a given feature, KNN returns the k best matches.
Note that the value of k has to be pre-defined by the user. As we expect, KNN provides a larger set of candidate features. However, we need to ensure that all these matching pairs are robust before going further.

Ratio Testing

To make sure the features returned by KNN are well comparable, the author of the SIFT paper suggests a technique called the ratio test. Basically, we iterate over each of the pairs returned by KNN and perform a distance test: for each key point we keep its best match only if its distance is smaller, by a chosen ratio, than the distance to the second-best match; otherwise, we throw it away. The ratio value must be chosen manually.
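A minimal sketch of KNN matching followed by the ratio test (matchKeyPointsKNN is a helper name for this example; a ratio around 0.75 is a common choice, but it remains a manual decision):

def matchKeyPointsKNN(featuresA, featuresB, ratio, method):
    bf = createMatcher(method, crossCheck=False)
    # get the 2 best matches for each descriptor
    rawMatches = bf.knnMatch(featuresA, featuresB, 2)
    matches = []
    for pair in rawMatches:
        # Lowe's ratio test: keep a match only if it is clearly better than the runner-up
        if len(pair) == 2 and pair[0].distance < pair[1].distance * ratio:
            matches.append(pair[0])
    return matches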

In essence, ratio testing does the same job as the cross-checking option from the BruteForce Matcher. Both ensure that a pair of detected features are indeed close enough to be considered similar. The 2 figures below show the results of the BF and KNN Matchers on SIFT features. We chose to display only 100 matching points for clearer visualization.

Feature matching using KNN and Ratio Testing on SIFT features

Feature matching using Brute Force Matcher on SIFT features

Note that even after cross-checking for Brute force and ratio testing in KNN, some of the features do not match properly.

Nevertheless, the Matcher algorithm will give us the best (most similar) set of features from both images. Now, we need to take these points and find the transformation matrix that will stitch the 2 images together based on their matching points.

Such a transformation is called the Homography matrix. Briefly, the homography is a 3x3 matrix that can be used in many applications such as camera pose estimation, perspective correction, and image stitching. The Homography is a 2D transformation. It maps points from one plane (image) to another. Let’s see how we get it.
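As a quick illustration (not from the original article), a homography H maps a point (x, y) by multiplying its homogeneous coordinates by H and dividing by the third component; x, y and the 3x3 matrix H are assumed to be already defined here:

import numpy as np

# H is assumed to be an already estimated 3x3 homography matrix
point = np.array([x, y, 1.0])                  # homogeneous coordinates of a point in one image
projected = H.dot(point)
x_new, y_new = projected[:2] / projected[2]    # back to Cartesian coordinates in the other image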

Estimating the Homography

RANdom SAmple Consensus or RANSAC is an iterative algorithm to fit linear models. Different from other linear regressors, RANSAC is designed to be robust to outliers.

Models like Linear Regression use least-squares estimation to fit the best model to the data. However, ordinary least squares is very sensitive to outliers. As a result, it might fail if the number of outliers is significant.

RANSAC solves this problem by estimating parameters only using a subset of inliers in the data. The figure below shows a comparison between Linear Regression and RANSAC. First, note that the dataset contains a fairly high number of outliers.

We can see that the Linear Regression model gets easily influenced by the outliers. That is because it is trying to reduce the average error. Thus, it tends to favor models that minimize the overall distance from all data points to the model itself. And that includes outliers.

On the contrary, RANSAC only fits the model on the subset of points identified as the inliers.

This characteristic is very important to our use case. Here, we are going to use RANSAC to estimate the Homography matrix. It turns out that the Homography is very sensitive to the quality of data we pass to it. Hence, it is important to have an algorithm (RANSAC) that can filter points that clearly belong to the data distribution from the ones which do not.

Comparison between Least Squares and RANSAC model fitting. Note the substantial number of outliers in the data.
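A minimal sketch of how the homography could be estimated with OpenCV's RANSAC-based findHomography(), assuming kpsA, kpsB and the filtered matches come from the previous steps (the 4.0 reprojection threshold is an example value):

# convert the matched key points to two arrays of (x, y) coordinates
ptsA = np.float32([kpsA[m.queryIdx].pt for m in matches])
ptsB = np.float32([kpsB[m.trainIdx].pt for m in matches])

# estimate the homography with RANSAC; 4.0 is the reprojection error tolerance in pixels
(H, status) = cv2.findHomography(ptsA, ptsB, cv2.RANSAC, 4.0)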

Once we have the estimated Homography, we need to warp one of the images to a common plane.

Here, we are going to apply a perspective transformation to one of the images. Basically, a perspective transform may combine one or more operations like rotation, scale, translation, or shear. The idea is to transform one of the images so that both images merge as one. To do this, we can use the OpenCV warpPerspective() function. It takes an image and the homography as input. Then, it warps the source image to the destination based on the homography.

# Apply panorama correction
width = trainImg.shape[1] + queryImg.shape[1]
height = trainImg.shape[0] + queryImg.shape[0]

result = cv2.warpPerspective(trainImg, H, (width, height))
result[0:queryImg.shape[0], 0:queryImg.shape[1]] = queryImg

plt.figure(figsize=(20,10))
plt.imshow(result)

plt.axis('off')
plt.show()

The resulting panorama image is shown below. As we see, there are a couple of artifacts in the result. More specifically, we can see some problems related to lighting conditions and edge effects at the image boundaries. Ideally, we can perform post-processing techniques to normalize the intensities like histogram matching. This would likely make the result look more realistic.

Thanks for reading


Python Tutorial: Image processing with Python (Using OpenCV)


In this tutorial, you will learn how you can process images in Python using the OpenCV library.


OpenCV is a free open source library used in real-time image processing. It’s used to process images, videos, and even live streams, but in this tutorial, we will process images only as a first step. Before getting started, let’s install OpenCV.


Install OpenCV

To install OpenCV on your system, run the following pip command:

 pip install opencv-python

Now OpenCV is installed successfully and we are ready. Let’s have some fun with some images!

Rotate an Image

First of all, import the cv2 module.

 import cv2

Now to read the image, use the imread() method of the cv2 module, specify the path to the image in the arguments and store the image in a variable as below:

 img = cv2.imread("pyimg.jpg")

The image is now treated as a matrix with rows and columns values stored in img.

Actually, if you check the type of the img, it will give you the following result:

>>>print(type(img))
 
<class 'numpy.ndarray'>

It’s a NumPy array! That’s why image processing using OpenCV is so easy: all the time you are working with NumPy arrays.

To display the image, you can use the imshow() method of cv2.

cv2.imshow('Original Image', img) 
 
cv2.waitKey(0)

The waitKey() function takes a time in milliseconds as an argument, a delay before the window closes. Here we set the time to zero to show the window until we close it manually.

To rotate this image, you need the width and the height of the image because you will use them in the rotation process as you will see later.

 height, width = img.shape[0:2]

The shape attribute returns the height and width of the image matrix. If you print img.shape[0:2], you will see the height and width of the image.

Okay, now we have our image matrix and we want to get the rotation matrix. To get the rotation matrix, we use the getRotationMatrix2D() method of cv2. The syntax of getRotationMatrix2D() is:

 cv2.getRotationMatrix2D(center, angle, scale)

Here the center is the center point of rotation, the angle is the rotation angle in degrees, and scale is a scaling factor applied to the image (below we will use 0.5 so the rotated image still fits on the screen).

To get the rotation matrix of our image, the code will be:

 rotationMatrix = cv2.getRotationMatrix2D((width/2, height/2), 90, .5)

The next step is to rotate our image with the help of the rotation matrix.

To rotate the image, we have a cv2 method named warpAffine which takes the original image, the rotation matrix of the image and the width and height of the image as arguments.

 rotatedImage = cv2.warpAffine(img, rotationMatrix, (width, height))

The rotated image is stored in the rotatedImage matrix. To show the image, use imshow() as below:

cv2.imshow('Rotated Image', rotatedImage)
 
cv2.waitKey(0)

After running the above lines of code, you will have the following output:

Crop an Image

First, we need to import the cv2 module and read the image and extract the width and height of the image:

import cv2
 
img = cv2.imread("pyimg.jpg")
 
height, width = img.shape[0:2]

Now get the starting and ending index of the row and column. This will define the size of the newly created image. For example, starting from row number 10 up to row number 15 gives the height of the new image.

Similarly, starting from column number 10 up to column number 15 gives its width.

You can get the starting point by specifying the percentage value of the total height and the total width. Similarly, to get the ending point of the cropped image, specify the percentage values as below:

startRow = int(height*.15)
 
startCol = int(width*.15)
 
endRow = int(height*.85)
 
endCol = int(width*.85)

Now map these values to the original image. Note that you have to cast the starting and ending values to integers because array indexes are always integers.

 croppedImage = img[startRow:endRow, startCol:endCol]

Here we specified the range from starting to ending of rows and columns.

Now display the original and cropped image in the output:

cv2.imshow('Original Image', img)
 
cv2.imshow('Cropped Image', croppedImage)
 
cv2.waitKey(0)

The result will be as follows:

Resize an Image

To resize an image, you can use the resize() method of OpenCV. In the resize method, you can either specify scaling factors for the x and y axes or the number of columns and rows, which determines the size of the output image.

Import and read the image:

import cv2
 
img = cv2.imread("pyimg.jpg")

Now using the resize method with axis values:

newImg = cv2.resize(img, (0,0), fx=0.75, fy=0.75)
 
cv2.imshow('Resized Image', newImg)
 
cv2.waitKey(0)

The result will be as follows:

Now using the row and column values to resize the image:

newImg = cv2.resize(img, (550, 350))
 
cv2.imshow('Resized Image', newImg)
 
cv2.waitKey(0)

We say we want 550 columns (the width) and 350 rows (the height).

The result will be:

Adjust Image Contrast

In the Python OpenCV module, there is no particular function to adjust image contrast, but the official documentation of OpenCV suggests an equation that can adjust image brightness and image contrast at the same time.

 new_img = a * original_img + b

Here a is alpha which defines contrast of the image. If a is greater than 1, there will be higher contrast.

If the value of a is between 0 and 1 (smaller than 1 but greater than 0), there would be lower contrast. If a is 1, there will be no contrast effect on the image.

b stands for beta, which adjusts brightness. The values of b vary from -127 to +127.
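As an illustration (not part of the tutorial's own code), the equation can also be applied directly with NumPy; img is assumed to be an image already loaded with cv2.imread, the a and b values are example choices, and the result is clipped to the valid 0-255 range:

import numpy as np

a, b = 1.5, 20  # example contrast (alpha) and brightness (beta) values
new_img = np.clip(a * img.astype(np.float32) + b, 0, 255).astype(np.uint8)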

To implement this equation in Python OpenCV, you can use the addWeighted() method. We use the addWeighted() method because it keeps the output in the range of 0 to 255 for a 24-bit color image.

The syntax of addWeighted() method is as follows:

 cv2.addWeighted(source_img1, alpha1, source_img2, alpha2, beta)

This will blend the two images: the first source image (source_img1) gets a weight of alpha1, the second source image (source_img2) gets a weight of alpha2, and beta is a scalar added to the weighted sum.

If you only want to apply contrast to one image, you can pass an array of zeros created with NumPy as the second image source.

Let’s work on a simple example. Import the following modules:

import cv2
 
import numpy as np

Read the original image:

 img = cv2.imread("pyimg.jpg")

Now apply the contrast. Since there is no other image, we will use np.zeros, which creates an array of the same shape and data type as the original image, filled with zeros.

contrast_img = cv2.addWeighted(img, 2.5, np.zeros(img.shape, img.dtype), 0, 0)
 
cv2.imshow('Original Image', img)
 
cv2.imshow('Contrast Image', contrast_img)
 
cv2.waitKey(0)

In the above code, the brightness is set to 0 as we only want to apply contrast.

The comparison of the original and contrast image is as follows:

Make an image blurry

Gaussian Blur

To make an image blurry, you can use the GaussianBlur() method of OpenCV.

The GaussianBlur() method uses a Gaussian kernel. The height and width of the kernel should be positive and odd.

Then you have to specify the standard deviations in the X and Y directions, that is sigmaX and sigmaY respectively. If only one is specified, both are considered the same.

Consider the following example:

import cv2
 
img = cv2.imread("pyimg.jpg")
 
blur_image = cv2.GaussianBlur(img, (7,7), 0)
 
cv2.imshow('Original Image', img)
 
cv2.imshow('Blur Image', blur_image)
 
cv2.waitKey(0)

In the above snippet, the actual image is passed to GaussianBlur() along with height and width of the kernel and the X and Y directions.

The comparison of the original and blurry image is as follows:

Median Blur

In median blurring, the median of all the pixels inside the kernel area is calculated, and the central value is then replaced with that median. Median blurring is used when there is salt-and-pepper noise in the image.

To apply median blurring, you can use the medianBlur() method of OpenCV.

Consider the following example where we have a salt and pepper noise in the image:

import cv2
 
img = cv2.imread("pynoise.png")
 
blur_image = cv2.medianBlur(img,5)

This applies a median blur with a kernel size of 5 to the noisy image. Now show the images:

cv2.imshow('Original Image', img)
 
cv2.imshow('Blur Image', blur_image)
 
cv2.waitKey(0)

The result will be like the following:

Another comparison of the original image and after blurring:

Detect Edges

To detect the edges in an image, you can use the Canny() method of cv2 which implements the Canny edge detector. The Canny edge detector is also known as the optimal detector.

The syntax to Canny() is as follows:

 cv2.Canny(image, minVal, maxVal)

Here minVal and maxVal are the lower and upper thresholds for the intensity gradient, used in the hysteresis step of the detector.

Consider the following code:

import cv2
 
img = cv2.imread("pyimg.jpg")
 
edge_img = cv2.Canny(img,100,200)
 
cv2.imshow("Detected Edges", edge_img)
 
cv2.waitKey(0)

The output will be the following:

Here is the result of the above code on another image:

Convert image to grayscale (Black & White)

The easy way to convert an image to grayscale is to load it in grayscale mode directly:

 img = cv2.imread("pyimg.jpg", 0)

There is another method using COLOR_BGR2GRAY.

To convert a color image into a grayscale image, use the cvtColor() method with the COLOR_BGR2GRAY attribute of the cv2 module. This is demonstrated in the example below:

Import the cv2 module:

 import cv2

Read the image:

 img = cv2.imread("pyimg.jpg")

Use the cvtColor() method of the cv2 module which takes the original image and the COLOR_BGR2GRAY attribute as an argument. Store the resultant image in a variable:

 gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Display the original and grayscale images:

cv2.imshow("Original Image", img)
 
cv2.imshow("Gray Scale Image", gray_img)
 
cv2.waitKey(0)

The output will be as follows:

Centroid (Center of blob) detection

To find the center of an image, the first step is to convert the original image into grayscale. We can use the cvtColor() method of cv2 as we did before.

This is demonstrated in the following code:

import cv2
 
img = cv2.imread("py.jpg")
 
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

We read the image and convert it to a grayscale image. The new image is stored in gray_img.

Now we have to calculate the moments of the image. Use the moments() method of cv2. In the moments() method, the grayscale image will be passed as below:

 moment = cv2.moments(gray_img)
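The X and Y coordinates used in the circle() call below are the centroid of the blob, computed from the moments with the standard m10/m00 and m01/m00 formulas (this step is implied but not shown above):

X = int(moment["m10"] / moment["m00"])
Y = int(moment["m01"] / moment["m00"])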

Finally, we have the center of the image. To highlight this center position, we can use the circle method which will create a circle in the given coordinates of the given radius.

The circle() method takes the img, the x and y coordinates where the circle will be created, the radius, the color that we want the circle to be and the thickness.

 cv2.circle(img, (X, Y), 15, (205, 114, 101), 1)

The circle is created on the image.

cv2.imshow("Center of the Image", img)
 
cv2.waitKey(0)

The original image is:

After detecting the center, our image will be as follows:

Apply a mask for a colored image

Image masking means to apply some other image as a mask on the original image or to change the pixel values in the image.

To apply a mask on the image, we will use the HoughCircles() method of the OpenCV module. The HoughCircles() method detects the circles in an image. After detecting the circles, we can simply apply a mask on these circles.

The HoughCircles() method takes the original image, the Hough Gradient (which detects the gradient information in the edges of the circle), and the information from the following circle equation:

 (x - x_center)^2 + (y - y_center)^2 = r^2

In this equation, (x_center, y_center) is the center of the circle and r is the radius of the circle.

Our original image is:

After detecting circles in the image, the result will be:

Okay, so we have the circles in the image and we can apply the mask. Consider the following code:

import cv2
 
import numpy as np
 
img1 = cv2.imread('pyimg.jpg')
 
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)

Detect the circles in the image using the HoughCircles() method (see OpenCV: Hough Circle Transform):

gray_img = cv2.medianBlur(cv2.cvtColor(img1, cv2.COLOR_RGB2GRAY), 3)
 
circles = cv2.HoughCircles(gray_img, cv2.HOUGH_GRADIENT, 1, 20, param1=50, param2=50, minRadius=0, maxRadius=0)
 
circles = np.uint16(np.around(circles))

To create the mask, use np.full which will return a NumPy array of given shape:

masking=np.full((img1.shape[0], img1.shape[1]),0,dtype=np.uint8)
 
for j in circles[0, :]:
 
    cv2.circle(masking, (j[0], j[1]), j[2], (255, 255, 255), -1)

The next step is to combine the image and the masking array we created using the bitwise_or operator as follows:

 final_img = cv2.bitwise_or(img1, img1, mask=masking)

Display the resultant image:
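The display code itself is missing here; a minimal sketch following the same pattern as the earlier examples (the window title is just an example):

cv2.imshow("Masked Image", final_img)

cv2.waitKey(0)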

Extracting text from Image (OCR)

To extract text from an image, you can use Google Tesseract-OCR. You can download it from this link

Then you should install the pytesseract module which is a Python wrapper for Tesseract-OCR.

The image from which we will extract the text is as follows:

Now let’s convert the text in this image to a string of characters and display the text as a string on output:

Import the pytesseract module:

 import pytesseract

Set the path of the Tesseract-OCR executable file:

 pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'

Now use the image_to_string method to convert the image into a string:

 print(pytesseract.image_to_string('pytext.png'))

The output will be as follows:

Works like a charm!

Detect and correct text skew

In this section, we will correct the text skew.

The original image is as follows:

Import the modules cv2, NumPy and read the image:

import cv2
 
import numpy as np
 
img = cv2.imread("pytext1.png")

Convert the image into a grayscale image:

 gray_img=cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Invert the grayscale image using bitwise_not:

 gray_img=cv2.bitwise_not(gray_img)

Select the x and y coordinates of the pixels greater than zero by using the column_stack method of NumPy:

 coordinates = np.column_stack(np.where(gray_img > 0))

Now we have to calculate the skew angle. We will use the minAreaRect() method of cv2 which returns an angle range from -90 to 0 degrees (where 0 is not included).

 ang=cv2.minAreaRect(coordinates)[-1]

The rotated angle of the text region will be stored in the ang variable. Now we add a condition for the angle: if the text region’s angle is smaller than -45 degrees, we add 90 degrees to it; otherwise, we negate the angle to make it positive.

if ang<-45:
 
    ang=-(90+ang)
 
else:
 
    ang=-ang

Calculate the center of the text region:

height, width = img.shape[:2]
 
center_img = (width / 2, height / 2)

Now that we have the angle of the text skew, we apply getRotationMatrix2D() to get the rotation matrix and then use the warpAffine() method to rotate the image (explained earlier).

rotationMatrix = cv2.getRotationMatrix2D(center_img, ang, 1.0)
 
rotated_img = cv2.warpAffine(img, rotationMatrix, (width, height), borderMode = cv2.BORDER_REFLECT)

Display the rotated image:

cv2.imshow("Rotated Image", rotated_img)
 
cv2.waitKey(0)

Color Detection

Let’s detect the green color from an image:

Import the modules cv2 for images and NumPy for image arrays:

import cv2
 
import numpy as np

Read the image and convert it into HSV using cvtColor():

img = cv2.imread("pydetect.png")
 
hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

Display the image:

 cv2.imshow("HSV Image", hsv_img)

Now create a NumPy array for the lower green values and the upper green values:

lower_green = np.array([34, 177, 76])
 
upper_green = np.array([255, 255, 255])

Use the inRange() method of cv2 to check if the given image array elements lie between array values of upper and lower boundaries:

 masking = cv2.inRange(hsv_img, lower_green, upper_green)

This will detect the green color.

Finally, display the original and resultant images:

 cv2.imshow("Original Image", img)

cv2.imshow("Green Color detection", masking)
 
cv2.waitKey(0)

Reduce Noise

To reduce noise from an image, OpenCV provides the following methods:

  1. fastNlMeansDenoising(): Removes noise from a grayscale image
  2. fastNlMeansDenoisingColored(): Removes noise from a colored image
  3. fastNlMeansDenoisingMulti(): Removes noise from grayscale image frames (a grayscale video)
  4. fastNlMeansDenoisingColoredMulti(): Same as 3 but works with colored frames

Let’s use fastNlMeansDenoisingColored() in our example:

Import the cv2 module and read the image:

import cv2
 
img = cv2.imread("pyn1.png")

Apply the denoising function. It takes, respectively: the original image (src); the destination (we pass None because we store the returned result); the filter strength; the value used to remove colored noise (usually equal to the filter strength, or 10); the template patch size in pixels used to compute weights, which should always be odd (recommended value 7); and the search window size in pixels used to compute the weighted average for a given pixel (recommended value 21).

 result = cv2.fastNlMeansDenoisingColored(img,None,20,10,7,21)

Display original and denoised image:

cv2.imshow("Original Image", img)
 
cv2.imshow("Denoised Image", result)
 
cv2.waitKey(0)

The output will be:

Get image contour

Contours are curves that join the continuous points along a boundary in an image. Contours are typically used to detect objects.

The original image of which we are getting the contours of is given below:

Consider the following code where we used the findContours() method to find the contours in the image:

Import cv2 module:

 import cv2

Read the image and convert it to a grayscale image:

img = cv2.imread('py1.jpg')
 
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Find the threshold:

 retval, thresh = cv2.threshold(gray_img, 127, 255, 0)

Use the findContours() method, which takes the thresholded image (thresh) and some attributes: the contour retrieval mode and the approximation method. See the official findContours() documentation.

 img_contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

Draw the contours on the image using drawContours() method:

  cv2.drawContours(img, img_contours, -1, (0, 255, 0))

Display the image:

cv2.imshow('Image Contours', img)
 
cv2.waitKey(0)

The result will be:

Remove Background from an image

To remove the background from an image, we will find the contours to detect edges of the main object and create a mask with np.zeros for the background and then combine the mask and the image using the bitwise_and operator.

Consider the example below:

Import the modules (NumPy and cv2):

import cv2
 
import numpy as np

Read the image and convert the image into a grayscale image:

img = cv2.imread("py.jpg")
 
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Find the threshold:

 _, thresh = cv2.threshold(gray_img, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

In the threshold() method, the last argument defines the style of the threshold. See Official documentation of OpenCV threshold.

Find the image contours:

 img_contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]

Sort the contours:

# sort the contours by area (ascending) and keep the first one with area larger than 100
img_contours = sorted(img_contours, key=cv2.contourArea)
 
for i in img_contours:
 
    if cv2.contourArea(i) > 100:
 
        break

Generate the mask using np.zeros:

 mask = np.zeros(img.shape[:2], np.uint8)

Draw contours:

 cv2.drawContours(mask, [i],-1, 255, -1)

Apply the bitwise_and operator:

 new_img = cv2.bitwise_and(img, img, mask=mask)

Display the original image:

 cv2.imshow("Original Image", img)

Display the resultant image:

cv2.imshow("Image with background removed", new_img)
 
cv2.waitKey(0)

Image processing is fun when using OpenCV, as you saw. I hope you found this tutorial useful. Keep coming back.

Thank you.