Complete Tutorial On LeNet-5 | Guide To Begin With CNNs

In deep learning, Convolutional Neural Networks (CNNs or ConvNets) play a major role. CNNs are widely used in computer vision, natural language processing, time series analysis and recommendation systems. A ConvNet architecture is built mainly from three types of layers: convolutional layers, pooling layers and fully connected layers. Together, these layers extract features from the input by detecting patterns through mathematical operations. Like other neural networks, CNNs are trained with the backpropagation algorithm.

To get started with CNNs, LeNet-5 is a good first model to learn because its architecture is simple. In this article, I’ll discuss the architecture of LeNet-5, one of the earliest convolutional neural networks.

What is LeNet-5?

LeNet-5 was presented in 1998 by Yann LeCun, one of the pioneers of deep learning, and his co-authors in the paper ‘Gradient-Based Learning Applied to Document Recognition’. LeNet was used by banks to recognize handwritten digits on cheques, and it is usually demonstrated on the MNIST dataset. Fully connected networks and activation functions were already well known in neural networks; LeNet-5 combined them with convolutional and pooling layers and helped popularize these two building blocks. LeNet-5 is widely regarded as the foundation for later ConvNets.

Source – Yann LeCun’s website showing LeNet-5 demo

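A convolution is a linear operation. The convolutional layer does most of the work: it slides a small set of weights (the kernel, or filter) across the input and, at each position, multiplies the kernel element-wise with the patch beneath it and sums the result.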
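To make the operation concrete, here is a minimal NumPy sketch of a single-channel convolution with stride 1 and no padding. The function name `conv2d_valid` and the toy 4 × 4 input are my own illustrations, not part of LeNet-5. Like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the patch beneath it, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Toy input (hypothetical): a 4 x 4 ramp and a 2 x 2 diagonal-difference kernel.
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
print(feature_map)        # every entry is -5.0 for this ramp image
```

A 4 × 4 input and a 2 × 2 kernel give a 3 × 3 feature map, because the kernel only fits in three positions along each axis.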

A pooling layer generally comes after a convolutional layer. It downsamples the feature maps produced by the convolutional layer, which reduces their spatial size, cuts the amount of computation and helps curb overfitting.
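As a rough illustration, average pooling with a 2 × 2 window can be sketched in NumPy as follows. The helper name `avg_pool2d` and the sample values are assumptions made for this example only.

```python
import numpy as np


def avg_pool2d(feature_map, size=2):
    """Average over non-overlapping (size x size) windows."""
    h, w = feature_map.shape
    out_h, out_w = h // size, w // size
    # Drop any ragged edge, then view the map as a grid of blocks.
    trimmed = feature_map[:out_h * size, :out_w * size]
    blocks = trimmed.reshape(out_h, size, out_w, size)
    return blocks.mean(axis=(1, 3))


fm = np.array([[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]], dtype=float)

pooled = avg_pool2d(fm)
print(pooled)
# [[ 3.5  5.5]
#  [11.5 13.5]]
```

Each output value summarizes four inputs, so the map shrinks from 4 × 4 to 2 × 2.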

Architecture

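LeNet-5 consists of 7 layers: 2 convolutional layers alternating with 2 average pooling layers, followed by 2 fully connected layers and the output layer. In the original paper the first of these fully connected layers (C5) is written as a convolution whose 5 × 5 kernels cover its entire 5 × 5 input, which makes it equivalent to a fully connected layer. The output layer uses the softmax activation in most modern implementations, whereas the original paper uses radial basis function (RBF) units.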
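The way the image size changes from layer to layer can be checked with a few lines of arithmetic. The sketch below assumes the layer settings of the 1998 paper, which this article does not list explicitly: 5 × 5 kernels with stride 1 and no padding for the convolutions (C1, C3, C5 with 6, 16 and 120 feature maps) and 2 × 2 windows for the pooling layers (S2, S4). The helper names are my own.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution over a square input."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Output width/height of non-overlapping pooling."""
    return size // window


# (layer name, operation, number of feature maps), following the 1998 paper.
layers = [
    ("C1", "conv", 6),
    ("S2", "pool", 6),
    ("C3", "conv", 16),
    ("S4", "pool", 16),
    ("C5", "conv", 120),
]

side = 32  # the padded input image is 32 x 32
trace = [("input", side, 1)]
for name, op, maps in layers:
    side = conv_out(side, kernel=5) if op == "conv" else pool_out(side)
    trace.append((name, side, maps))

for name, side_len, maps in trace:
    print(f"{name}: {maps} map(s) of {side_len} x {side_len}")
```

The side length goes 32, 28, 14, 10, 5, 1. Since C5 ends up at 1 × 1, it is just a vector of 120 values, which then feeds a layer of 84 units and finally the 10 outputs, one per digit.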

Original Image of LeNet-5 architecture

MNIST images are 28 × 28 pixels, but they are zero-padded to 32 × 32 pixels and normalized before being fed to the network. The rest of the network uses no padding, so the feature maps keep shrinking as the image moves through the network.
In the average pooling layers, each neuron computes the mean of its inputs, multiplies the result by a learnable coefficient, adds a learnable bias term and finally applies the activation function.
Most neurons in the second convolutional layer (C3 in the paper, the 3rd layer of the network) are connected to only three or four of the feature maps in the first average pooling layer (S2), rather than to all six.
In the output layer of the original paper, each neuron outputs the square of the Euclidean distance between its input vector and its weight vector. Each output measures how well the image matches a particular digit class, with smaller distances meaning a better match. Modern implementations generally use a softmax output with the cross-entropy cost function instead.
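Two of the details above, the pooling layer with its learnable coefficient and bias, and the distance-based output layer, are unusual by today's standards, so a small sketch may help. It is written in NumPy with made-up numbers. The names `subsample` and `rbf_output` are hypothetical, and using a plain `tanh` as the activation is an assumption on my part.

```python
import numpy as np


def subsample(feature_map, coeff, bias, size=2):
    """Mean over each window, scaled and shifted by learnable scalars."""
    h, w = feature_map.shape
    out_h, out_w = h // size, w // size
    blocks = feature_map[:out_h * size, :out_w * size]
    blocks = blocks.reshape(out_h, size, out_w, size)
    means = blocks.mean(axis=(1, 3))
    # Assumption: tanh stands in for the activation function.
    return np.tanh(coeff * means + bias)


def rbf_output(x, weights):
    """Squared Euclidean distance from x to each row of `weights`."""
    diff = weights - x  # broadcasts x across the rows (one row per class)
    return np.sum(diff ** 2, axis=1)


# Pooling: a constant map of 2s with coeff=0.5 and bias=0 gives tanh(1.0).
pooled = subsample(np.full((4, 4), 2.0), coeff=0.5, bias=0.0)

# Output layer: three hypothetical classes with 2-dimensional weight vectors.
x = np.array([1.0, 0.0])
weights = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [-1.0, 0.0]])
distances = rbf_output(x, weights)
print(distances)             # [0. 2. 4.]
print(np.argmin(distances))  # 0, the class with the closest weight vector
```

Note that with this kind of output the predicted class is the one with the smallest value, the opposite of a softmax layer, where the largest value wins.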

Implementation of LeNet-5

We implement LeNet-5 on the MNIST dataset for handwritten digit recognition.

Importing libraries:

import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, AveragePooling2D

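Loading MNIST, which comes already split into training and testing datasets:

# Download MNIST (cached after the first call). It is pre-split into
# 60,000 training and 10,000 test images, each 28 x 28 pixels.
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()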
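As described in the architecture section, the 28 × 28 images are zero-padded to 32 × 32 and normalized before they reach the network. One possible way to do this is sketched below with NumPy. The function name `prepare_images` is hypothetical, and scaling the pixels to the range [0, 1] is an assumption, since the exact normalization is not specified here. The sketch runs on a dummy batch so that it works without downloading the dataset.

```python
import numpy as np


def prepare_images(images):
    """Scale uint8 images to [0, 1], pad 28x28 -> 32x32, add a channel axis."""
    # Assumption: simple division by 255 is used as the normalization.
    scaled = images.astype("float32") / 255.0
    # Pad 2 pixels of zeros on each side of the height and width axes only.
    padded = np.pad(scaled, ((0, 0), (2, 2), (2, 2)), mode="constant")
    # Keras convolution layers expect a trailing channel dimension.
    return padded[..., np.newaxis]


# Dummy stand-in for a batch of MNIST images (all pixels white).
dummy = np.full((3, 28, 28), 255, dtype=np.uint8)
prepared = prepare_images(dummy)
print(prepared.shape)  # (3, 32, 32, 1)

# With the real data this would be:
# x_train = prepare_images(x_train)
# x_test = prepare_images(x_test)
```

The result has shape (number of images, 32, 32, 1), which matches the input size LeNet-5 expects.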


analyticsindiamag.com