In the previous article, we discussed object detection using GluonCV. In this article, we implement a binary image classifier that decides whether a given image shows a tennis ball, using a pre-trained image classification network from GluonCV. We build the machine learning pipeline step by step, from loading and transforming an input image to loading and using a pre-trained model.

1. Import Libraries

To start, we import the required packages and set the path to the data.

import mxnet as mx
import gluoncv as gcv
import matplotlib.pyplot as plt
import numpy as np
import os
from pathlib import Path
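The imports above do not show the path setup that the text mentions, and a later step refers to a variable named `M3_MODELS`. A minimal sketch of what such a setup might look like follows; the directory name `models` is a hypothetical placeholder, since the article does not give the real value.

```python
from pathlib import Path

# Hypothetical location for cached model weights; the original article
# does not show the actual value of M3_MODELS.
M3_MODELS = Path("models")

# Create the directory if it does not exist yet, so that downloaded
# weights have somewhere to be stored.
M3_MODELS.mkdir(parents=True, exist_ok=True)
```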

2. Load Image

To load the image, let us implement a function that reads an image from disk given a filepath. The function should return an 8-bit image array in MXNet’s NDArray format and in HWC layout (i.e. height, width, then channel).

def load_image(filepath):
    image = mx.image.imread(filepath)
    return image

3. Transform the Image

After loading the image, we should transform it so that it can be used as input to the pre-trained network. We plan to use a network pre-trained on **ImageNet**. Therefore, the image transformation should follow the same steps used for ImageNet pre-training. The image should be transformed by:

  1. Resizing the shortest dimension to 224, e.g. (448, 1792) -> (224, 896).
  2. Cropping to a center square of dimension (224, 224).
  3. Converting the image from HWC layout to CHW layout.
  4. Normalizing the image using ImageNet statistics (i.e. per colour channel mean and standard deviation).
  5. Creating a batch of 1 image.
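To make the five steps concrete, here is a rough NumPy-only sketch of the same pipeline. It is not GluonCV's implementation: the helper names are hypothetical, the resize uses nearest-neighbour sampling for brevity (a real pipeline would normally interpolate), and the mean and standard deviation are the commonly used ImageNet values on a [0, 1] pixel scale.

```python
import numpy as np

# Commonly used ImageNet per-channel statistics (RGB, pixel values in [0, 1]).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def resize_short(image, size):
    """Nearest-neighbour resize so the shorter side equals `size` (illustrative only)."""
    h, w = image.shape[:2]
    scale = size / min(h, w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    # Map every output row/column back to a source row/column.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]


def center_crop(image, size):
    """Cut a (size, size) square out of the middle of an HWC image."""
    h, w = image.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return image[top:top + size, left:left + size]


def transform_sketch(image, size=224):
    image = resize_short(image, size)                  # step 1
    image = center_crop(image, size)                   # step 2
    image = image.astype(np.float32) / 255.0           # 8-bit values -> [0, 1]
    image = image.transpose(2, 0, 1)                   # step 3: HWC -> CHW
    mean = IMAGENET_MEAN[:, None, None]
    std = IMAGENET_STD[:, None, None]
    image = (image - mean) / std                       # step 4
    return image[np.newaxis]                           # step 5: batch of 1


# Random 8-bit "image" with the shape used in the example above.
dummy = np.random.randint(0, 256, size=(448, 1792, 3), dtype=np.uint8)
resized = resize_short(dummy, 224)
batch = transform_sketch(dummy)
print(resized.shape)  # (224, 896, 3)
print(batch.shape)    # (1, 3, 224, 224)
```

The final array is in NCHW layout, which is the batch shape that an ImageNet classification network of this kind expects as input.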

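This can be achieved using the following function. Note that, by default, `transform_eval` resizes the shortest side to 256 before cropping to 224; pass `resize_short=224` if the resize needs to match step 1 exactly.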

def transform_image(array):
    image = gcv.data.transforms.presets.imagenet.transform_eval(array)
    return image

4. Load a Model

We will use a MobileNet 1.0 image classification model that’s been pre-trained on ImageNet. The model can be loaded from the GluonCV model zoo as follows:

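def load_pretrained_classification_network():
    # M3_MODELS is the directory where the downloaded weights are cached;
    # it is not defined in this article and must be set before calling this.
    model = gcv.model_zoo.get_model('MobileNet1.0', pretrained=True, root=M3_MODELS)
    return model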

5. Use a Model

With the image transformed and the model loaded, the next task is to pass the transformed image through the pre-trained network to obtain predicted probabilities for all ImageNet classes (ignore the tennis ball class for now).

Hint #1: Don’t forget that you’re typically working with a batch of images, even when you only have one image.

Hint #2: Remember that the direct outputs of our network aren’t probabilities.
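Both hints can be illustrated without the network itself. The sketch below uses randomly generated numbers as a stand-in for the model output, assuming it has shape (1, 1000), i.e. one row of raw scores per image in the batch. A softmax turns those scores into probabilities, and indexing with `[0]` picks out the single image. In MXNet, the same step would typically be done with `mx.nd.softmax` on the network output; the NumPy version here is only meant to show the idea.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    # Subtracting the maximum does not change the result but avoids overflow.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


# Stand-in for the network output: a batch of 1 image, 1000 ImageNet classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 1000)).astype(np.float32)

# Hint #2: raw outputs are not probabilities, so apply softmax.
# Hint #1: the output is still a batch, so take the first (only) element.
probabilities = softmax(logits)[0]

print(probabilities.shape)         # (1000,)
print(float(probabilities.sum()))  # approximately 1
```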
