Building Image Classification Model with Keras

Artificial intelligence (AI) relies on prodigious amounts of data to achieve cognitive abilities. Rich machine learning libraries such as Keras and TensorFlow are contributing to dynamic artificial intelligence services. Together, Keras and TensorFlow support model training for image recognition, deep video analytics, brand monitoring, facial gesture recognition, and other machine learning models.

This post highlights some common operations that you will frequently need in Keras. First, we will see how to save models and use them for prediction later. The post also explains how to display images from a dataset and how to load images from our own system to predict their class.

Training and Saving the Model
Training a model is a slow process, and nobody wants to repeat it every time. Fortunately, we only need to train the model once and save it; we can then load it at any time and use it to classify new images. Here, Keras saves the model in the HDF5 (.h5) format.

import keras
from keras.datasets import mnist
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import SGD

# load the MNIST digits: 60,000 training and 10,000 test images of 28x28 pixels
(train_x, train_y), (test_x, test_y) = mnist.load_data()

# optional: scale pixel values to the [0, 1] range
#train_x = train_x.astype('float32') / 255
#test_x = test_x.astype('float32') / 255

print(train_x.shape)
print(train_y.shape)
print(test_x.shape)
print(test_y.shape)

# flatten each 28x28 image into a vector of 784 values
train_x = train_x.reshape(60000, 784)
test_x = test_x.reshape(10000, 784)

# one-hot encode the labels (10 digit classes)
train_y = keras.utils.to_categorical(train_y, 10)
test_y = keras.utils.to_categorical(test_y, 10)

# three hidden layers of 128 units and a 10-way softmax output
model = Sequential()
model.add(Dense(units=128, activation="relu", input_shape=(784,)))
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=10, activation="softmax"))

model.compile(optimizer=SGD(0.001), loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_x, train_y, batch_size=32, epochs=10, verbose=1)

# evaluate() returns [loss, accuracy]
accuracy = model.evaluate(x=test_x, y=test_y, batch_size=32)
print("Accuracy: ", accuracy[1])

To save the model, add the following line after model.fit():

model.save("mnist-model.h5")
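To get a rough sense of how much the saved file has to hold, the parameters of the network above can be counted by hand. The sketch below assumes only the layer sizes from the code above (784 inputs, three hidden layers of 128 units, 10 outputs); the helper name dense_params is made up for illustration and is not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Number of weights plus biases in one fully connected layer."""
    return n_inputs * n_units + n_units


# layer widths of the MNIST network: input, three hidden layers, output
layer_sizes = [784, 128, 128, 128, 10]

# pair each layer with the next one and count its parameters
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)
print(total_params)
```

The total should agree with the "Total params" line that model.summary() prints for this architecture.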

Inference

Inference refers to the process of predicting the class of images using our trained model.

import keras
from keras.datasets import mnist
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import SGD

(train_x, train_y), (test_x, test_y) = mnist.load_data()

#train_x = train_x.astype('float32') / 255
#test_x = test_x.astype('float32') / 255

print(train_x.shape)
print(train_y.shape)
print(test_x.shape)
print(test_y.shape)

train_x = train_x.reshape(60000, 784)
test_x = test_x.reshape(10000, 784)

train_y = keras.utils.to_categorical(train_y, 10)
test_y = keras.utils.to_categorical(test_y, 10)

# rebuild the same architecture that was used for training
model = Sequential()
model.add(Dense(units=128, activation="relu", input_shape=(784,)))
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=10, activation="softmax"))

model.compile(optimizer=SGD(0.001), loss="categorical_crossentropy", metrics=["accuracy"])

# load the trained parameters instead of training again
model.load_weights("mnist-model.h5")
#model.fit(train_x,train_y,batch_size=32,epochs=10,verbose=1)
#model.save("mnist-model.h5")

accuracy = model.evaluate(x=test_x, y=test_y, batch_size=32)
print("Accuracy: ", accuracy[1])

We loaded the parameters of the model from the saved model file. The evaluate function then runs prediction over the test dataset and returns the accuracy of our predictions.

So far, we have demonstrated how to save the models and use them later for prediction. However, this is a comparatively easy and common task. The main task is being able to load a specific image and determine what class it belongs to.

img = test_x[130]
test_img = img.reshape((1,784))
# predict() returns one probability per class; argmax picks the most likely digit
img_class = model.predict(test_img).argmax(axis=-1)
prediction = img_class[0]
classname = img_class[0]
print("Class: ", classname)

Here we pick an arbitrary image, the one at index 130 of the test set, and reshape it into a batch containing a single flattened image, which is the input shape the model expects.
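The reshaping and the step from class probabilities to a class index can be tried without a trained model. This is a minimal NumPy sketch: the random array stands in for one MNIST image, and the probability vector is a made-up example of what a 10-way softmax layer could output.

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for a single 28x28 grayscale image (values 0..255)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# a batch of one flattened image, the shape a Dense(input_shape=(784,)) model expects
flat = image.reshape(1, 784)

# reshaping back recovers the original picture for plotting
restored = flat.reshape(28, 28)

# hypothetical softmax output for that one image: one probability per digit
probs = np.array([[0.01, 0.02, 0.05, 0.02, 0.70, 0.05, 0.05, 0.04, 0.03, 0.03]])

# the predicted class is the index of the largest probability
predicted = int(np.argmax(probs, axis=-1)[0])
print(flat.shape, predicted)
```

Reshaping only changes how the same 784 values are indexed, which is why the image can be flattened for the model and restored for display.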

Now that we have a prediction, we use Matplotlib to display the image and its predicted class.

img = img.reshape((28,28))

plt.imshow(img)

plt.title(classname)

plt.show()

Putting it all together, the complete script looks like this:

import keras
from keras.datasets import mnist
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import SGD
import matplotlib.pyplot as plt

(train_x, train_y), (test_x, test_y) = mnist.load_data()

train_x = train_x.reshape(60000, 784)
test_x = test_x.reshape(10000, 784)

train_y = keras.utils.to_categorical(train_y, 10)
test_y = keras.utils.to_categorical(test_y, 10)

model = Sequential()
model.add(Dense(units=128, activation="relu", input_shape=(784,)))
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=10, activation="softmax"))

model.compile(optimizer=SGD(0.001), loss="categorical_crossentropy", metrics=["accuracy"])

# use the same file name that model.save() wrote earlier
model.load_weights("mnist-model.h5")

img = test_x[130]
test_img = img.reshape((1,784))

# predict() returns one probability per class; argmax picks the most likely digit
img_class = model.predict(test_img).argmax(axis=-1)
prediction = img_class[0]
classname = img_class[0]
print("Class: ", classname)

# restore the 28x28 shape so the image can be displayed
img = img.reshape((28,28))
plt.imshow(img)
plt.title(str(classname))
plt.show()

But, what if we want to upload an image that is not included in the test set? For this test, please save the image below to your system and copy it into the directory where your python file resides.

Transfer Learning for Image Classification using Keras in Python

In this tutorial, you will learn how to use transfer learning for image classification using Keras in Python. Keras’s high-level API makes this super easy, only requiring a few simple steps.

In the real world, it is rare to train a Convolutional Neural Network (CNN) from scratch, as it is hard to collect a dataset massive enough to get good performance. Instead, it is common to take a network pretrained on a very large dataset and tune it for your classification problem. This process is called transfer learning.

What is Transfer Learning

Transfer learning is a machine learning method where a model trained on one task is reused (or tuned) for another task. It is very popular nowadays, especially in computer vision and natural language processing. Transfer learning is very handy given the enormous resources required to train deep learning models. Here are its most important benefits:

  • It speeds up training.
  • It requires less data.
  • It lets you use state-of-the-art models developed by deep learning experts.

For these reasons, it is better to use transfer learning for image classification problems instead of creating your own model and training it from scratch. Models such as ResNet, InceptionV3, Xception, and MobileNet are trained on a massive dataset called ImageNet, which contains more than 14 million images, and they classify 1,000 different object categories.

Loading & Preparing the Dataset

We are going to use the flower photos dataset, which consists of 5 types of flowers (daisy, dandelion, roses, sunflowers and tulips).

First, install the required libraries with the following command:

pip3 install tensorflow keras numpy matplotlib

Open up a new Python file and import the necessary modules:

import tensorflow as tf
from keras.models import Model
from keras.applications import MobileNetV2, ResNet50, InceptionV3 # try to use them and see which is better
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint, TensorBoard
from keras.utils import get_file
from keras.preprocessing.image import ImageDataGenerator
import os
import pathlib
import numpy as np
# needed for plotting the predictions at the end of the tutorial
import matplotlib.pyplot as plt

The dataset comes with inconsistent image sizes. As a result, we need to resize all the images to a shape that is acceptable to MobileNet (the model that we are going to use):

batch_size = 32
# 5 types of flowers
num_classes = 5
# training for 10 epochs
epochs = 10
# size of each image
IMAGE_SHAPE = (224, 224, 3)

Let's load the dataset:

def load_data():
    """This function downloads, extracts, loads, normalizes and one-hot encodes Flower Photos dataset"""
    # download the dataset and extract it
    data_dir = get_file(origin='https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
                                         fname='flower_photos', untar=True)
    data_dir = pathlib.Path(data_dir)
    # count how many images are there
    image_count = len(list(data_dir.glob('*/*.jpg')))
    print("Number of images:", image_count)
    # get all classes for this dataset (types of flowers) excluding LICENSE file
    CLASS_NAMES = np.array([item.name for item in data_dir.glob('*') if item.name != "LICENSE.txt"])
    # roses = list(data_dir.glob('roses/*'))
    # 20% validation set 80% training set
    image_generator = ImageDataGenerator(rescale=1/255, validation_split=0.2)
    # make the training dataset generator
    train_data_gen = image_generator.flow_from_directory(directory=str(data_dir), batch_size=batch_size,
                                                        classes=list(CLASS_NAMES), target_size=(IMAGE_SHAPE[0], IMAGE_SHAPE[1]),
                                                        shuffle=True, subset="training")
    # make the validation dataset generator
    test_data_gen = image_generator.flow_from_directory(directory=str(data_dir), batch_size=batch_size, 
                                                        classes=list(CLASS_NAMES), target_size=(IMAGE_SHAPE[0], IMAGE_SHAPE[1]),
                                                        shuffle=True, subset="validation")
    return train_data_gen, test_data_gen, CLASS_NAMES

The above function downloads and extracts the dataset, and then uses the ImageDataGenerator Keras utility class to wrap the dataset in a Python generator (so the images are loaded into memory in batches, not all at once).

After that, we scale and resize the images to a fixed shape and then split the dataset by 80% for training and 20% for validation.
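As a small illustration of the scaling and the 80/20 split described above, here is a NumPy sketch on made-up numbers. The pixel values and the image count of 1,000 are assumptions chosen for readability, and the split arithmetic only shows the proportions; it does not reproduce how Keras decides which files go into each subset.

```python
import numpy as np

# a tiny stand-in for 8-bit image data (0 = black, 255 = white)
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# same effect as rescale=1/255: map the 0..255 range to 0..1
scaled = pixels.astype("float32") / 255

# proportions of an 80/20 split for a hypothetical dataset of 1,000 images
n_images = 1000
validation_split = 0.2
n_validation = int(n_images * validation_split)
n_training = n_images - n_validation

print(scaled.min(), scaled.max())
print(n_training, n_validation)
```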

Constructing the Model

We are going to use the MobileNetV2 model. It is not a very heavy model, but it does a good job in the training and testing process.

As mentioned earlier, this model is trained to classify 1,000 different objects, so we need a way to tune it for our flower classification task. To do that, we are going to remove the last fully connected layer and add our own final layer, which consists of 5 units with a softmax activation function:

def create_model(input_shape):
    # load MobileNetV2 with its pretrained ImageNet weights
    model = MobileNetV2(input_shape=input_shape)
    # freeze all the weights of the model except the last 4 layers
    # that come before the original classification layer
    for layer in model.layers[:-5]:
        layer.trainable = False
    # construct our own fully connected layer for classification
    output = Dense(num_classes, activation="softmax")
    # connect that dense layer to the second-to-last layer, which leaves
    # the original 1000-class fully connected layer out of the new model
    output = output(model.layers[-2].output)
    model = Model(inputs=model.inputs, outputs=output)
    # print the summary of the model architecture
    model.summary()
    # compile the model with the Adam optimizer
    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model

The above function will first download the model weights (if not available) and then drop the last layer.

After that, we freeze all layers except the last few. The frozen layers are pretrained, so we don't want to modify their weights. However, it is good practice to retrain the last convolutional layer; since this dataset is quite similar to the original ImageNet dataset, we won't ruin the weights (that much).

Finally, we construct our own dense layer that consists of five neurons and connect it to the last remaining layer of the MobileNetV2 model. The following figure demonstrates the architecture:

Training the Model

Let's use the above two functions to start training:

if __name__ == "__main__":
    # load the data generators
    train_generator, validation_generator, class_names = load_data()
    # constructs the model
    model = create_model(input_shape=IMAGE_SHAPE)
    # model name
    model_name = "MobileNetV2_finetune_last5"
    # some nice callbacks
    tensorboard = TensorBoard(log_dir=f"logs/{model_name}")
    checkpoint = ModelCheckpoint(f"results/{model_name}" + "-loss-{val_loss:.2f}-acc-{val_acc:.2f}.h5",
                                save_best_only=True,
                                verbose=1)
    # make sure results folder exist
    if not os.path.isdir("results"):
        os.mkdir("results")
    # count number of steps per epoch
    training_steps_per_epoch = np.ceil(train_generator.samples / batch_size)
    validation_steps_per_epoch = np.ceil(validation_generator.samples / batch_size)
    # train using the generators
    model.fit_generator(train_generator, steps_per_epoch=training_steps_per_epoch,
                        validation_data=validation_generator, validation_steps=validation_steps_per_epoch,
                        epochs=epochs, verbose=1, callbacks=[tensorboard, checkpoint])

Nothing fancy here: we load the data, construct the model, and then use some callbacks for tracking training and saving the best models.
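The steps-per-epoch computation is just a ceiling division: the last batch of an epoch may be smaller than batch_size, but it still counts as a step. Here is a minimal sketch with a hypothetical validation set of 731 images (the real count depends on the downloaded dataset):

```python
import math

batch_size = 32
# hypothetical number of validation images; use generator.samples in practice
validation_samples = 731

# number of batches needed to see every sample once
steps = math.ceil(validation_samples / batch_size)

# size of the final, partially filled batch
last_batch = validation_samples - (steps - 1) * batch_size

print(steps, last_batch)
```

Rounding down instead would silently skip the images in that last partial batch.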

As soon as you execute the script, the training process begins. You'll notice that not all weights are being trained:

Total params: 2,264,389
Trainable params: 418,565
Non-trainable params: 1,845,824
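These numbers can be reproduced with a bit of arithmetic. The breakdown below is a sketch under an assumption that is not stated in the summary itself: that the unfrozen part of the network is a 1x1 convolution from 320 to 1280 channels without bias, its batch normalization layer, and the new 5-unit dense layer (the ReLU and pooling layers have no parameters).

```python
# assumed final 1x1 convolution of MobileNetV2: 320 -> 1280 channels, no bias
conv_params = 320 * 1280

# batch normalization: scale and shift are trainable,
# the moving mean and variance are not
bn_trainable = 2 * 1280
bn_non_trainable = 2 * 1280

# our new classification layer: 1280 features -> 5 flower classes, plus biases
dense_params = 1280 * 5 + 5

trainable = conv_params + bn_trainable + dense_params

# total reported by model.summary() above
total = 2264389
non_trainable = total - trainable

print(trainable, non_trainable)
```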

It'll take several minutes depending on your hardware.

I used TensorBoard to experiment a little bit. For example, I tried freezing all the weights except for the last classification layer, decreasing the optimizer learning rate, and applying image augmentation such as flipping and zooming. Here is a screenshot:

MobileNetV2 is the model in which I froze all the weights (except for the last 5-unit dense layer, of course).

MobileNetV2_augmentation uses some image augmentation.

MobileNetV2_finetune_last5 is the model we're using right now, which does not freeze the last 4 layers of the MobileNetV2 model.

MobileNetV2_finetune_last5_less_lr performed best, with almost 86% accuracy. That's because once you unfreeze pretrained weights, you need to decrease the learning rate so you can slowly adjust the weights to your dataset. This run used the Adam optimizer with a learning rate of 0.0005.

Note: to modify the learning rate, you can import Adam optimizer from keras.optimizers package, and then compile the model with optimizer=Adam(lr=0.0005) parameter.

Testing the Model

# load the data generators
train_generator, validation_generator, class_names = load_data()
# constructs the model
model = create_model(input_shape=IMAGE_SHAPE)
# load the optimal weights
model.load_weights("results/MobileNetV2_finetune_last5_less_lr-loss-0.45-acc-0.86.h5")
validation_steps_per_epoch = np.ceil(validation_generator.samples / batch_size)
# print the validation loss & accuracy
evaluation = model.evaluate_generator(validation_generator, steps=validation_steps_per_epoch, verbose=1)
print("Val loss:", evaluation[0])
print("Val Accuracy:", evaluation[1])

Make sure to use the optimal weights, the ones with the lowest loss and highest accuracy.
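Because the checkpoint callback writes the validation loss and accuracy into each file name, the best file can be picked programmatically. This is a rough sketch: the list of names is hypothetical (only the second one matches the file loaded above), and parse_metrics is a helper invented for this example.

```python
import re

# hypothetical contents of the results folder
checkpoint_names = [
    "MobileNetV2_finetune_last5_less_lr-loss-0.61-acc-0.79.h5",
    "MobileNetV2_finetune_last5_less_lr-loss-0.45-acc-0.86.h5",
    "MobileNetV2_finetune_last5_less_lr-loss-0.52-acc-0.83.h5",
]

# matches the "-loss-{val_loss:.2f}-acc-{val_acc:.2f}.h5" suffix
pattern = re.compile(r"-loss-(\d+\.\d+)-acc-(\d+\.\d+)\.h5$")


def parse_metrics(name):
    """Return (loss, accuracy) parsed from a checkpoint file name."""
    match = pattern.search(name)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {name}")
    return float(match.group(1)), float(match.group(2))


# the best checkpoint is the one with the lowest validation loss
best = min(checkpoint_names, key=lambda name: parse_metrics(name)[0])
print(best)
```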

Output:

23/23 [==============================] - 6s 264ms/step
Val loss: 0.5659930361524
Val Accuracy: 0.8166894659134987

Okay, let's visualize a little bit. We are going to plot a complete batch of images with their corresponding predicted and correct labels:

# get a random batch of images
image_batch, label_batch = next(iter(validation_generator))
# turn the original labels into human-readable text
label_batch = [class_names[np.argmax(label_batch[i])] for i in range(batch_size)]
# predict the images on the model
predicted_class_names = model.predict(image_batch)
predicted_ids = [np.argmax(predicted_class_names[i]) for i in range(batch_size)]
# turn the predicted vectors to human readable labels
predicted_class_names = np.array([class_names[id] for id in predicted_ids])
# some nice plotting
plt.figure(figsize=(10,9))
for n in range(30):
    plt.subplot(6,5,n+1)
    plt.subplots_adjust(hspace = 0.3)
    plt.imshow(image_batch[n])
    if predicted_class_names[n] == label_batch[n]:
        color = "blue"
        title = predicted_class_names[n].title()
    else:
        color = "red"
        title = f"{predicted_class_names[n].title()}, correct:{label_batch[n]}"
    plt.title(title, color=color)
    plt.axis('off')
_ = plt.suptitle("Model predictions (blue: correct, red: incorrect)")
plt.show()

Once you run it, you'll get something like this:

Awesome! As you can see, out of 30 images, 25 were correctly predicted. That's a good result, as some flower images are a little ambiguous.
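The same comparison that colors the plot titles can also be summarized as a number. Here is a small NumPy sketch on six made-up labels (not the actual batch from the plot), listing the accuracy and the mismatches:

```python
import numpy as np

# made-up true and predicted labels for a tiny batch
true_labels = np.array(["roses", "tulips", "daisy", "roses", "dandelion", "sunflowers"])
predicted_labels = np.array(["roses", "tulips", "daisy", "tulips", "dandelion", "sunflowers"])

# element-wise comparison gives a boolean mask of correct predictions
correct = predicted_labels == true_labels
batch_accuracy = correct.mean()

# (predicted, true) pairs for every misclassified image
mistakes = [
    (str(p), str(t))
    for p, t, ok in zip(predicted_labels, true_labels, correct)
    if not ok
]

print(int(correct.sum()), "of", len(correct), "correct")
print(mistakes)
```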

Alright, that's it. In this tutorial, you discovered how you can use transfer learning to quickly develop and use state-of-the-art models using Tensorflow and Keras in Python.

Since training image classifiers from scratch is rarely advisable in the real world (except for unusual types of images such as human skin), I highly encourage you to try the other models mentioned above and fine-tune them too. Good luck!

Building an Image Classification Model in 10 Minutes

Build a Deep learning model in a few minutes? It’ll take hours to train! I don’t even have a good enough machine. I’ve heard this countless times from aspiring data scientists who shy away from building Deep Learning models on their own machines.

Introduction

You don’t need to be working for Google or other big tech firms to work on deep learning datasets! It is entirely possible to build your own neural network from the ground up in a matter of minutes without needing to lease out Google’s servers. Fast.ai’s students designed a model on the Imagenet dataset in 18 minutes – and I will showcase something similar in this article.

Deep learning is a vast field so we’ll narrow our focus a bit and take up the challenge of solving an Image Classification project. Additionally, we’ll be using a very simple deep learning architecture to achieve a pretty impressive accuracy score.

You can consider the Python code we’ll see in this article as a benchmark for building Image Classification models. Once you get a good grasp on the concept, go ahead and play around with the code, participate in competitions and climb up the leaderboard!

If you’re new to deep learning and are fascinated by the field of computer vision (who isn’t?!), do check out the ‘Computer Vision using Deep Learning‘ course. It’s a comprehensive introduction to this wonderful field and will set you up for what is inevitably going to be a huge job market in the near future.

Table of Contents
  1. What is Image Classification and its use cases
  2. Setting up the Structure of our Image Data
  3. Breaking Down the Model Building Process
  4. Setting up the Problem Statement and Understanding the Data
  5. Steps to Build the Image Classification Model
  6. Taking up Another Challenge

What is Image Classification?

Consider the below image:

You will have instantly recognized it – it’s a (swanky) car. Take a step back and analyze how you came to this conclusion – you were shown an image and you classified the class it belonged to (a car, in this instance). And that, in a nutshell, is what image classification is all about.

There are potentially n number of categories in which a given image can be classified. Manually checking and classifying images is a very tedious process. The task becomes near impossible when we’re faced with a massive number of images, say 10,000 or even 100,000. How useful would it be if we could automate this entire process and quickly label images per their corresponding class?

Self-driving cars are a great example of where image classification is used in the real world. To enable autonomous driving, we can build an image classification model that recognizes various objects on the road, such as vehicles, people, and other moving objects. We’ll see a couple more use cases later in this article, but there are plenty more applications around us. Use the comments section below the article to let me know what potential use cases you can come up with!

Now that we have a handle on our subject matter, let’s dive into how an image classification model is built, what are the prerequisites for it, and how it can be implemented in Python.

Setting up the Structure of our Image Data

Our data needs to be in a particular format in order to solve an image classification problem. We will see this in action in a couple of sections but just keep these pointers in mind till we get there.

You should have 2 folders, one for the train set and the other for the test set. In the training set, you will have a .csv file and an image folder:

  • The .csv file contains the names of all the training images and their corresponding true labels
  • The image folder has all the training images.

The .csv file in our test set is different from the one present in the training set. This test set .csv file contains the names of all the test images, but they do not have any corresponding labels. Can you guess why? Our model will be trained on the images present in the training set, and the label predictions will happen on the test set images.

If your data is not in the format described above, you will need to convert it accordingly (otherwise the predictions will be awry and fairly useless).

 Breaking Down the Process of Model Building

Before we deep dive into the Python code, let’s take a moment to understand how an image classification model is typically designed. We can divide this process broadly into 4 stages. Each stage requires a certain amount of time to execute:

  1. Loading and pre-processing Data – 30% time
  2. Defining Model architecture – 10% time
  3. Training the model – 50% time
  4. Estimation of performance – 10% time

Let me explain each of the above steps in a bit more detail. This section is crucial because not every model is built in the first go. You will need to go back after each iteration, fine-tune your steps, and run it again. Having a solid understanding of the underlying concepts will go a long way in accelerating the entire process.

 Stage 1: Loading and pre-processing the data

Data is gold as far as deep learning models are concerned. Your image classification model has a far better chance of performing well if you have a good amount of images in the training set. Also, the shape of the data varies according to the architecture/framework that we use.

Hence, the critical data pre-processing step (the eternally important step in any project). I highly recommend going through the ‘Basics of Image Processing in Python’ to understand more about how pre-processing works with image data.

But we are not quite there yet. In order to see how our model performs on unseen data (and before exposing it to the test set), we need to create a validation set. This is done by partitioning the training set data.

In short, we train the model on the training data and validate it on the validation data. Once we are satisfied with the model’s performance on the validation set, we can use it for making predictions on the test data.
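Partitioning the training data can be sketched with plain NumPy. The arrays below are random stand-ins for images and labels, and the 80/20 ratio is only an example; later in this article the split is done with scikit-learn's train_test_split instead.

```python
import numpy as np

rng = np.random.default_rng(42)

# random stand-ins for 100 grayscale 28x28 images and their labels
n_samples = 100
X = rng.random((n_samples, 28, 28, 1))
y = rng.integers(0, 10, size=n_samples)

# shuffle the indices once, then cut them into two disjoint groups
indices = rng.permutation(n_samples)
n_val = n_samples // 5  # hold out 20% for validation
val_idx, train_idx = indices[:n_val], indices[n_val:]

X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]

print(X_train.shape, X_val.shape)
```

Shuffling before cutting matters when the files are ordered by class; otherwise the validation set could end up containing only a few classes.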

Time required for this step: We require around 2-3 minutes for this task.

 Stage 2: Defining the model’s architecture

This is another crucial step in our deep learning model building process. We have to define how our model will look and that requires answering questions like:

  • How many convolutional layers do we want?
  • What should be the activation function for each layer?
  • How many hidden units should each layer have?

And many more. These are essentially the hyperparameters of the model which play a MASSIVE part in deciding how good the predictions will be.

How do we decide these values? Excellent question! A good idea is to pick these values based on existing research/studies. Another idea is to keep experimenting with the values until you find the best match but this can be quite a time consuming process.

Time required for this step: It should take around 1 minute to define the architecture of the model.

 Stage 3: Training the model

For training the model, we require:

  • Training images and their corresponding true labels
  • Validation images and their corresponding true labels (we use these labels only to validate the model and not during the training phase)

We also define the number of epochs in this step. For starters, we will run the model for 10 epochs (you can change the number of epochs later).

Time required for this step: Since training requires the model to learn structures, we need around 5 minutes to go through this step.

And now time to make predictions!

 Stage 4: Estimating the model’s performance

Finally, we load the test data (images) and go through the pre-processing step here as well. We then predict the classes for these images using the trained model.

Time required for this step: ~ 1 minute.

 Setting up the Problem Statement and Understanding the Data

We will be picking up a really cool challenge to understand image classification. We have to build a model that can classify a given set of images according to the apparel (shirt, trousers, shoes, socks, etc.). It’s actually a problem faced by many e-commerce retailers which makes it an even more interesting computer vision problem.

This challenge is called ‘Identify the Apparels’ and is one of the practice problems we have on our DataHack platform. You will have to register and download the dataset from the above link.

We have a total of 70,000 images (28 x 28 dimension), out of which 60,000 are from the training set and 10,000 from the test one. The training images are pre-labelled according to the apparel type with 10 total classes. The test images are, of course, not labelled. The challenge is to identify the type of apparel present in all the test images.

We will build our model on Google Colab since it provides a free GPU to train our models.

 Steps to Build our Model

Time to fire up your Python skills and get your hands dirty. We are finally at the implementation part of our learning!

  1. Setting up Google Colab
  2. Importing Libraries
  3. Loading and Preprocessing Data – (3 mins)
  4. Creating a validation set
  5. Defining the model structure – (1 min)
  6. Training the model – (5 min)
  7. Making predictions – (1 min)

Let’s look at each step in detail.

 Step 1: Setting up Google Colab

Since we’re importing our data from a Google Drive link, we’ll need to add a few lines of code in our Google Colab notebook. Create a new Python 3 notebook and write the following code blocks:

!pip install PyDrive

This will install PyDrive. Now we will import a few required libraries:

import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

Next, we will create a drive variable to access Google Drive:

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

To download the dataset, we will use the ID of the file uploaded on Google Drive:

download = drive.CreateFile({'id': '1BZOv422XJvxFUnGh-0xVeSvgFgqVY45q'})

Replace the ‘id’ in the above code with the ID of your file. Now we will download this file and unzip it:

download.GetContentFile('train_LbELtWX.zip')
!unzip train_LbELtWX.zip

You have to run these code blocks every time you start your notebook.

 Step 2 : Import the libraries we’ll need during our model building phase.

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.preprocessing import image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tqdm import tqdm

 Step 3: Recall the pre-processing steps we discussed earlier. We’ll be using them here after loading the data.

train = pd.read_csv('train.csv')

Next, we will read all the training images, store them in a list, and finally convert that list into a numpy array.

# We have grayscale images, so while loading the images we will keep grayscale=True, if you have RGB images, you should set grayscale as False
train_image = []
for i in tqdm(range(train.shape[0])):
    img = image.load_img('train/'+train['id'][i].astype('str')+'.png', target_size=(28,28,1), grayscale=True)
    img = image.img_to_array(img)
    img = img/255
    train_image.append(img)
X = np.array(train_image)

As it is a multi-class classification problem (10 classes), we will one-hot encode the target variable.

y=train['label'].values
y = to_categorical(y)
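To see what one-hot encoding produces, here is a NumPy-only sketch on four made-up labels. It mirrors the result of to_categorical for integer class labels, without claiming to be how Keras implements it.

```python
import numpy as np

# made-up integer labels for four images
labels = np.array([0, 2, 9, 2])
num_classes = 10

# row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype="float32")[labels]

# argmax turns each one-hot row back into its class index
recovered = one_hot.argmax(axis=1)

print(one_hot.shape)
print(one_hot[1])
```

Each row contains a single 1 at the position of its class, which is the target format that the categorical_crossentropy loss expects.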

 Step 4: Creating a validation set from the training data.

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.2)

 Step 5: Define the model structure.

We will create a simple architecture with 2 convolutional layers, one dense hidden layer and an output layer.

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=(28,28,1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
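It can help to trace how the image shape changes through this architecture and where the parameters sit. The sketch below assumes the Keras defaults for the layers above (stride 1 and no padding for the convolutions, stride equal to the pool size for max pooling); the two helper functions are written for this example only.

```python
def conv_output(size, kernel):
    """Side length after a convolution with stride 1 and no padding."""
    return size - kernel + 1


def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per filter of a square convolution."""
    return kernel * kernel * in_channels * out_channels + out_channels


side = conv_output(28, 3)    # after Conv2D(32, 3x3)
side = conv_output(side, 3)  # after Conv2D(64, 3x3)
side = side // 2             # after MaxPooling2D(2x2)
flat_units = side * side * 64  # after Flatten

params = {
    "conv1": conv_params(3, 1, 32),
    "conv2": conv_params(3, 32, 64),
    "dense1": flat_units * 128 + 128,
    "dense2": 128 * 10 + 10,
}
total_params = sum(params.values())

print(side, flat_units)
print(params, total_params)
```

Most of the parameters sit in the first dense layer, which is why the pooling step before Flatten makes such a difference to model size.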

Next, we will compile the model we’ve created.

model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])

 Step 6: Training the model.

In this step, we will train the model on the training set images and validate it using, you guessed it, the validation set.

model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))

 Step 7: Making predictions!

We’ll initially follow the steps we performed when dealing with the training data. Load the test images and predict their classes using the model.predict_classes() function.

download = drive.CreateFile({'id': '1KuyWGFEpj7Fr2DgBsW8qsWvjqEzfoJBY'})
download.GetContentFile('test_ScVgIM0.zip')
!unzip test_ScVgIM0.zip

Let’s import the test file:

test = pd.read_csv('test.csv')

Now, we will read and store all the test images:

test_image = []
for i in tqdm(range(test.shape[0])):
    img = image.load_img('test/'+test['id'][i].astype('str')+'.png', target_size=(28,28,1), grayscale=True)
    img = image.img_to_array(img)
    img = img/255
    test_image.append(img)
test = np.array(test_image)


# making predictions
prediction = model.predict_classes(test)

We will also create a submission file to upload on the DataHack platform page (to see how our results fare on the leaderboard).

download = drive.CreateFile({'id': '1z4QXy7WravpSj-S4Cs9Fk8ZNaX-qh5HF'})
download.GetContentFile('sample_submission_I5njJSF.csv')


# creating submission file
sample = pd.read_csv('sample_submission_I5njJSF.csv')
sample['label'] = prediction
sample.to_csv('sample_cnn.csv', header=True, index=False)

Download this sample_cnn.csv file and upload it on the contest page to generate your results and check your ranking on the leaderboard. This will give you a benchmark solution to get you started with any Image Classification problem!

You can try hyperparameter tuning and regularization techniques to improve your model’s performance further. I encourage you to check out this article to understand this fine-tuning step in much more detail – ‘A Comprehensive Tutorial to learn Convolutional Neural Networks from Scratch’.

 

Picking up a Different Challenge

Let’s test our learning on a different dataset. We’ll be cracking the ‘Identify the Digits’ practice problem in this section. Go ahead and download the dataset. Before you proceed further, try to solve this on your own. You already have the tools to solve it – you just need to apply them! Come back here to check your results or if you get stuck at some point.

In this challenge, we need to identify the digit in a given image. We have a total of 70,000 images – 49,000 labelled ones in the training set and the remaining 21,000 in the test set (the test images are unlabelled). We need to identify/predict the class of these unlabelled images.
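
To keep the numbers straight, here is a small sketch of the split arithmetic. It assumes the 20% hold-out used by the train_test_split call later in this section, and that scikit-learn rounds the held-out count up:

```python
import math

total_images = 70_000
labelled_train = 49_000
unlabelled_test = total_images - labelled_train

# test_size=0.2 holds out 20% of the labelled images for validation;
# the held-out count is rounded up (assumption about scikit-learn's rounding).
validation_size = math.ceil(labelled_train * 20 / 100)
training_size = labelled_train - validation_size

print(unlabelled_test, training_size, validation_size)
```

So the model is actually fitted on fewer than the 49,000 labelled images; the rest is used only to monitor validation accuracy.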

Ready to begin? Awesome! Create a new Python 3 notebook and run the following code:

# Setting up Colab
!pip install PyDrive


import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials


auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)


# Replace the id and filename in the below codes
download = drive.CreateFile({'id': '1ZCzHDAfwgLdQke_GNnHp_4OheRRtNPs-'})
download.GetContentFile('Train_UQcUa52.zip')
!unzip Train_UQcUa52.zip


# Importing libraries
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.preprocessing import image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tqdm import tqdm


train = pd.read_csv('train.csv')


# Reading the training images
train_image = []
for i in tqdm(range(train.shape[0])):
    img = image.load_img('Images/train/'+train['filename'][i], target_size=(28,28), color_mode='grayscale')
    img = image.img_to_array(img)
    img = img/255
    train_image.append(img)
X = np.array(train_image)


# Creating the target variable
y=train['label'].values
y = to_categorical(y)


# Creating validation set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.2)


# Define the model structure
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=(28,28,1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))


# Compile the model
model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])


# Training the model
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))


download = drive.CreateFile({'id': '1zHJR6yiI06ao-UAh_LXZQRIOzBO3sNDq'})
download.GetContentFile('Test_fCbTej3.csv')


test_file = pd.read_csv('Test_fCbTej3.csv')


test_image = []
for i in tqdm(range(test_file.shape[0])):
    img = image.load_img('Images/test/'+test_file['filename'][i], target_size=(28,28), color_mode='grayscale')
    img = image.img_to_array(img)
    img = img/255
    test_image.append(img)
test = np.array(test_image)


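# argmax over the softmax outputs (predict_classes() was removed from recent Keras releases)
prediction = np.argmax(model.predict(test), axis=-1)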


download = drive.CreateFile({'id': '1nRz5bD7ReGrdinpdFcHVIEyjqtPGPyHx'})
download.GetContentFile('Sample_Submission_lxuyBuB.csv')


sample = pd.read_csv('Sample_Submission_lxuyBuB.csv')
sample['filename'] = test_file['filename']
sample['label'] = prediction
sample.to_csv('sample.csv', header=True, index=False)

Submit this file on the practice problem page to get a pretty decent accuracy number. It’s a good start but there’s always scope for improvement. Keep playing around with the hyperparameter values and see if you can improve on our basic model.
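
Before changing hyperparameters, it helps to know where the parameters of the basic model sit. The sketch below redoes the shape and parameter arithmetic for the architecture above by hand. It assumes the Keras defaults for Conv2D (no padding, stride 1); the helper names are made up for illustration:

```python
def conv_out(size, kernel):
    # Output width/height of an unpadded ("valid") convolution with stride 1.
    return size - kernel + 1

def conv_params(kernel, in_channels, out_channels):
    # One kernel x kernel filter per (input, output) channel pair, plus biases.
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(n_in, n_out):
    # Weight matrix plus biases.
    return n_in * n_out + n_out

size = 28
size = conv_out(size, 3)   # Conv2D(32, 3x3)
size = conv_out(size, 3)   # Conv2D(64, 3x3)
size = size // 2           # MaxPooling2D(2x2)
flat = size * size * 64    # Flatten

total = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
         + dense_params(flat, 128) + dense_params(128, 10))
print(flat, total)
```

Almost all of the parameters belong to the first Dense layer, so the size of that layer and the pooling before it are natural first knobs to turn.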

 End Notes

Who said deep learning models require hours or days to train? My aim here was to showcase that you can come up with a pretty decent deep learning model in double-quick time. You should pick up similar challenges and try to code them from your end as well. There’s nothing like learning by doing!

The top data scientists and analysts have this kind of code ready before a Hackathon even begins. They use it to make early submissions before diving into a detailed analysis. Once they have a benchmark solution, they start improving their model using different techniques.


Easy Image Classification with TensorFlow 2.0

Image classification is one of the fundamental supervised tasks in machine learning. TensorFlow 2.0 provides a new development ecosystem with eager execution enabled by default. I assume most TF developers had a hard time with TF 2.0 at first, since we were so used to tf.Session and tf.placeholder that we couldn’t imagine TensorFlow without them.

Today, we start with simple image classification without using TF Keras, so that we can take a look at the new API changes in TensorFlow 2.0.

You can take a look at the Colab notebook for this story.


Let’s import the data. For simplicity, use TensorFlow Datasets.

Data pipelines can be frustrating (sometimes!).

We need to play around with the low-level TF APIs rather than input pipelines. So, we import a well-designed dataset from TensorFlow Datasets directly. We will use the Horses Or Humans dataset.

img_classify_tf2.py

import tensorflow as tf
import tensorflow_datasets as tfds

dataset_name = 'horses_or_humans'
batch_size = 32  # assumed value; the original snippet does not define it

dataset = tfds.load( name=dataset_name , split=tfds.Split.TRAIN )
dataset = dataset.shuffle( 1024 ).batch( batch_size )

We can get a number of datasets readily available with TF Datasets.
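
If shuffle( 1024 ).batch( batch_size ) looks like magic, the following pure-Python sketch mimics the idea: keep a buffer of a fixed size, emit a random element from it as new elements arrive, then group the stream into batches. It is a simplified model of the behaviour, not the tf.data implementation, and both function names are made up:

```python
import random

def shuffle_buffered(items, buffer_size, seed=0):
    # Fill a buffer, then emit one random element for every new one that arrives.
    rng = random.Random(seed)
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) == buffer_size:
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer

def batched(items, batch_size):
    # Group consecutive elements; the last batch may be smaller.
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

batches = list(batched(shuffle_buffered(range(10), buffer_size=4), batch_size=4))
print(batches)
```

A buffer smaller than the dataset only shuffles locally, which is why a reasonably large buffer size is worth choosing.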

Defining the model and related ops.

Remember what we needed for a CNN in Keras. Conv2D, MaxPooling2D, Flatten and Dense layers, right? We need to create these layers using the tf.nn module.

img_classify_tf2_1.py

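leaky_relu_alpha = 0.2
dropout_rate = 0.5
# assumed: padding is not defined in the original snippet; 'SAME' is the value
# that yields the 8192 flattened features used by the dense shapes for 300x300 inputs
padding = 'SAME'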

def conv2d( inputs , filters , stride_size ):
    out = tf.nn.conv2d( inputs , filters , strides=[ 1 , stride_size , stride_size , 1 ] , padding=padding ) 
    return tf.nn.leaky_relu( out , alpha=leaky_relu_alpha ) 

def maxpool( inputs , pool_size , stride_size ):
    return tf.nn.max_pool2d( inputs , ksize=[ 1 , pool_size , pool_size , 1 ] , padding='VALID' , strides=[ 1 , stride_size , stride_size , 1 ] )

def dense( inputs , weights ):
    x = tf.nn.leaky_relu( tf.matmul( inputs , weights ) , alpha=leaky_relu_alpha )
    return tf.nn.dropout( x , rate=dropout_rate )
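
To see what the two activation-side ops do numerically, here is a rough NumPy sketch of leaky ReLU and of inverted dropout (kept values are scaled by 1 / (1 - rate), which is how tf.nn.dropout behaves). The function names mirror the ops, but these are illustrative re-implementations:

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    # Positive values pass through; negative values are scaled by alpha.
    return np.where(x >= 0, x, alpha * x)

def dropout(x, rate, rng):
    # Zero each element with probability `rate`, scale survivors by 1 / (1 - rate)
    # so the expected value of the output matches the input.
    keep_mask = rng.random(x.shape) >= rate
    return np.where(keep_mask, x / (1.0 - rate), 0.0)

x = np.array([-2.0, -0.5, 0.0, 1.5])
activated = leaky_relu(x)

rng = np.random.default_rng(0)
dropped = dropout(np.ones(1000), rate=0.5, rng=rng)
print(activated, dropped.mean())
```

Note that the dense() helper above applies dropout on every call, so it stays active at prediction time too unless you add a switch for it.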

Also, we would require some weights. The shapes for our kernels ( filters ) need to be calculated.

img_classify_tf2_2.py

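# assumed: output_classes is undefined in the original; 3 matches depth=3 in the
# training loop (the horses_or_humans dataset itself has only two classes)
output_classes = 3
initializer = tf.initializers.glorot_uniform()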
def get_weight( shape , name ):
    return tf.Variable( initializer( shape ) , name=name , trainable=True , dtype=tf.float32 )

shapes = [
    [ 3 , 3 , 3 , 16 ] , 
    [ 3 , 3 , 16 , 16 ] , 
    [ 3 , 3 , 16 , 32 ] , 
    [ 3 , 3 , 32 , 32 ] ,
    [ 3 , 3 , 32 , 64 ] , 
    [ 3 , 3 , 64 , 64 ] ,
    [ 3 , 3 , 64 , 128 ] , 
    [ 3 , 3 , 128 , 128 ] ,
    [ 3 , 3 , 128 , 256 ] , 
    [ 3 , 3 , 256 , 256 ] ,
    [ 3 , 3 , 256 , 512 ] , 
    [ 3 , 3 , 512 , 512 ] ,
    [ 8192 , 3600 ] , 
    [ 3600 , 2400 ] ,
    [ 2400 , 1600 ] , 
    [ 1600 , 800 ] ,
    [ 800 , 64 ] ,
    [ 64 , output_classes ] ,
]

weights = []
for i in range( len( shapes ) ):
    weights.append( get_weight( shapes[ i ] , 'weight{}'.format( i ) ) )
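
Where does the 8192 in the first dense shape come from? A quick sketch of the calculation, assuming 300x300 input images (the size of the Horses Or Humans images), 'SAME' padding for the convolutions, and the 'VALID' 2x2 pooling used in maxpool():

```python
import math

def same_conv(size, stride=1):
    # 'SAME' padding keeps ceil(size / stride) positions.
    return math.ceil(size / stride)

def valid_pool(size, pool=2, stride=2):
    # 'VALID' pooling drops the leftover border.
    return (size - pool) // stride + 1

size = 300
sizes = [size]
for _ in range(6):                         # six conv-conv-pool blocks
    size = same_conv(same_conv(size))
    size = valid_pool(size)
    sizes.append(size)

flattened = size * size * 512              # 512 channels after the last block
print(sizes, flattened)
```

If you change the input size, the padding, or the number of blocks, the first dense shape has to be recomputed the same way.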

Each weight is a tf.Variable created with trainable=True. That is already the default, but it is worth spelling out: tf.GradientTape automatically watches only trainable variables, so a variable created with trainable=False would yield no gradient (None) when we differentiate the loss.

Also, in TF 2.0 we get the tf.initializers module, which makes it easier to initialize weights for neural networks. We collect all the weights in a single Python list, weights, which is later passed to the tf.optimizers.Adam optimizer for the update step.
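
For the curious, the Glorot (Xavier) uniform initializer draws values from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). The NumPy sketch below follows that formula; it is an illustrative re-implementation rather than TensorFlow's code, and the way it counts fan_in and fan_out for convolution kernels is the usual convention:

```python
import numpy as np

def glorot_uniform(shape, rng):
    # For conv kernels [h, w, in, out] the receptive field h*w multiplies both fans;
    # for dense weights [in, out] it is 1.
    receptive_field = int(np.prod(shape[:-2]))
    fan_in = shape[-2] * receptive_field
    fan_out = shape[-1] * receptive_field
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    values = rng.uniform(-limit, limit, size=shape).astype(np.float32)
    return values, limit

rng = np.random.default_rng(0)
kernel, kernel_limit = glorot_uniform([3, 3, 3, 16], rng)   # first conv kernel
dense_w, dense_limit = glorot_uniform([800, 64], rng)       # one of the dense layers
print(kernel.shape, kernel_limit, dense_w.shape, dense_limit)
```

Larger layers get a smaller limit, which keeps the scale of the activations roughly stable from layer to layer.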

Now, we assemble all the ops together to have a Keras-like model.

img_classify_tf2_3.py

def model( x ) :
    x = tf.cast( x , dtype=tf.float32 )
    c1 = conv2d( x , weights[ 0 ] , stride_size=1 ) 
    c1 = conv2d( c1 , weights[ 1 ] , stride_size=1 ) 
    p1 = maxpool( c1 , pool_size=2 , stride_size=2 )
    
    c2 = conv2d( p1 , weights[ 2 ] , stride_size=1 )
    c2 = conv2d( c2 , weights[ 3 ] , stride_size=1 ) 
    p2 = maxpool( c2 , pool_size=2 , stride_size=2 )
    
    c3 = conv2d( p2 , weights[ 4 ] , stride_size=1 ) 
    c3 = conv2d( c3 , weights[ 5 ] , stride_size=1 ) 
    p3 = maxpool( c3 , pool_size=2 , stride_size=2 )
    
    c4 = conv2d( p3 , weights[ 6 ] , stride_size=1 )
    c4 = conv2d( c4 , weights[ 7 ] , stride_size=1 )
    p4 = maxpool( c4 , pool_size=2 , stride_size=2 )

    c5 = conv2d( p4 , weights[ 8 ] , stride_size=1 )
    c5 = conv2d( c5 , weights[ 9 ] , stride_size=1 )
    p5 = maxpool( c5 , pool_size=2 , stride_size=2 )

    c6 = conv2d( p5 , weights[ 10 ] , stride_size=1 )
    c6 = conv2d( c6 , weights[ 11 ] , stride_size=1 )
    p6 = maxpool( c6 , pool_size=2 , stride_size=2 )

    flatten = tf.reshape( p6 , shape=( tf.shape( p6 )[0] , -1 ))

    d1 = dense( flatten , weights[ 12 ] )
    d2 = dense( d1 , weights[ 13 ] )
    d3 = dense( d2 , weights[ 14 ] )
    d4 = dense( d3 , weights[ 15 ] )
    d5 = dense( d4 , weights[ 16 ] )
    logits = tf.matmul( d5 , weights[ 17 ] )

    return tf.nn.softmax( logits )

Q. Why are we declaring the model as a function? Because later on we will pass a batch of data to this function and get the outputs back. We do not need a Session, as eager execution is enabled by default. See this guide.

The loss function is easy.

def loss( pred , target ):
    return tf.losses.categorical_crossentropy( target , pred )
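
What does that one-liner compute? Roughly, for each example it takes minus the log of the probability the model assigned to the true class. A NumPy sketch under that reading (the clipping constant is an assumption, and the renormalisation step Keras applies to the predictions is left out):

```python
import numpy as np

def categorical_crossentropy(target, pred, eps=1e-7):
    # Clip to avoid log(0), then sum -target * log(pred) over the class axis.
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.sum(target * np.log(pred), axis=-1)

# Two made-up examples with three classes each.
target = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
pred = np.array([[0.80, 0.10, 0.10],
                 [0.25, 0.50, 0.25]])

losses = categorical_crossentropy(target, pred)
print(losses, losses.mean())
```

The result has one loss value per example, which is why the training step later averages it with tf.reduce_mean before printing.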

Next comes the most confusing part for a beginner (for me too!). We will use tf.GradientTape for optimizing the model.

img_classify_tf2_4.py

learning_rate = 0.001  # assumed value; the original snippet does not define it
optimizer = tf.optimizers.Adam( learning_rate )

def train_step( model, inputs , outputs ):
    with tf.GradientTape() as tape:
        current_loss = loss( model( inputs ), outputs)
    grads = tape.gradient( current_loss , weights )
    optimizer.apply_gradients( zip( grads , weights ) )
    print( tf.reduce_mean( current_loss ) )
    
num_epochs = 256

for e in range( num_epochs ):
    for features in dataset:
        image , label = features[ 'image' ] , features[ 'label' ]
        train_step( model , image , tf.one_hot( label , depth=3 ) )

What’s happening here?

  1. We open a tf.GradientTape context and call model() and loss() inside it, so every operation they perform is recorded and can be differentiated during backpropagation.
  2. We obtain the gradients of the loss with respect to the weights using the tape.gradient method.
  3. We update the weights using the optimizer.apply_gradients method (earlier we used optimizer.minimize, which is still available).
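
The same three steps can be written out by hand on a toy problem. The sketch below swaps in plain gradient descent on a linear regression, with the gradient derived manually, in place of tf.GradientTape and Adam. All data and names are made up; it only illustrates the forward pass, gradient, update cycle:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = x @ true_w                       # noiseless targets for the toy problem

w = np.zeros(3)
learning_rate = 0.1
losses = []

for step in range(200):
    # 1. Forward pass and loss (what happens inside the tape's scope).
    error = x @ w - y
    loss = np.mean(error ** 2)
    # 2. Gradient of the loss w.r.t. the weights (what tape.gradient returns).
    grad = 2.0 * x.T @ error / len(x)
    # 3. Update the weights (what optimizer.apply_gradients does, here without Adam).
    w -= learning_rate * grad
    losses.append(loss)

print(losses[0], losses[-1], w)
```

With TensorFlow, step 2 is derived for you from the recorded operations, which is the whole point of the tape.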

Read more about it from here.