Image classification is one of the fundamental supervised tasks in machine learning. TensorFlow's new 2.0 release provides a completely new development ecosystem with Eager Execution enabled by default. I assume most TF developers found the switch a little hard at first, since we were so used to tf.Session and tf.placeholder that we couldn't imagine TensorFlow without them.
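For readers coming from TF 1.x, here is a minimal illustrative comparison of the two styles (not from the original story); the 1.x snippet is shown only as comments since it no longer runs under eager execution.
# TF 1.x style: build a graph, then feed values through a Session.
#   x = tf.placeholder( tf.float32 )
#   y = x * 2.0
#   with tf.Session() as sess:
#       print( sess.run( y , feed_dict={ x : 3.0 } ) )
# TF 2.0 style: ops run eagerly, like ordinary Python.
import tensorflow as tf
x = tf.constant( 3.0 )
y = x * 2.0
print( y )  # tf.Tensor( 6.0 , ... ) -- no Session required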
Today, we start with simple image classification without using TF Keras, so that we can take a look at the new API changes in TensorFlow 2.0.
You can take a look at the Colab notebook for this story.
Data pipelines can be frustrating (sometimes!). In this story we want to play around with the low-level TF APIs rather than building input pipelines, so we import a well-prepared dataset directly from TensorFlow Datasets. We will use the Horses Or Humans dataset.
img_classify_tf2.py
import tensorflow as tf
import tensorflow_datasets as tfds
batch_size = 32  # assumed value; any batch size that fits in memory works
dataset_name = 'horses_or_humans'
# Each element of the dataset is a dict with 'image' and 'label' keys.
dataset = tfds.load( name=dataset_name , split=tfds.Split.TRAIN )
dataset = dataset.shuffle( 1024 ).batch( batch_size )
We can get a number of datasets readily available with TF Datasets.
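If you want to see what else is available, or peek at what our pipeline yields, here is a quick sketch (the exact image shapes depend on the dataset; Horses Or Humans uses 300 x 300 RGB images):
print( tfds.list_builders()[ :10 ] )  # names of some of the available datasets
for batch in dataset.take( 1 ):
    print( batch[ 'image' ].shape )  # e.g. ( batch_size , 300 , 300 , 3 )
    print( batch[ 'label' ].shape )  # ( batch_size , )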
Remember what we needed for a CNN in Keras: Conv2D, MaxPooling2D, Flatten and Dense layers, right? We need to create the equivalents of these layers using the tf.nn module.
img_classify_tf2_1.py
leaky_relu_alpha = 0.2
dropout_rate = 0.5
padding = 'SAME'  # assumed; 'SAME' keeps the spatial size so the flattened size used later works out

def conv2d( inputs , filters , stride_size ):
    # Convolution followed by a LeakyReLU activation.
    out = tf.nn.conv2d( inputs , filters , strides=[ 1 , stride_size , stride_size , 1 ] , padding=padding )
    return tf.nn.leaky_relu( out , alpha=leaky_relu_alpha )

def maxpool( inputs , pool_size , stride_size ):
    return tf.nn.max_pool2d( inputs , ksize=[ 1 , pool_size , pool_size , 1 ] , padding='VALID' , strides=[ 1 , stride_size , stride_size , 1 ] )

def dense( inputs , weights ):
    # Fully-connected layer: matmul + LeakyReLU, followed by dropout.
    x = tf.nn.leaky_relu( tf.matmul( inputs , weights ) , alpha=leaky_relu_alpha )
    return tf.nn.dropout( x , rate=dropout_rate )
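As a quick sanity check (purely illustrative, using random tensors), we can pass a dummy image through these helpers and watch the shapes:
dummy = tf.random.normal( [ 1 , 300 , 300 , 3 ] )   # one fake RGB image
kernel = tf.random.normal( [ 3 , 3 , 3 , 16 ] )     # 3x3 kernel, 3 -> 16 channels
out = conv2d( dummy , kernel , stride_size=1 )
print( out.shape )                                  # ( 1 , 300 , 300 , 16 ) with 'SAME' padding
out = maxpool( out , pool_size=2 , stride_size=2 )
print( out.shape )                                  # ( 1 , 150 , 150 , 16 )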
We also require some weights. The shapes for our kernels ( filters ) need to be worked out.
img_classify_tf2_2.py
output_classes = 2  # the Horses Or Humans dataset has two classes
initializer = tf.initializers.glorot_uniform()

def get_weight( shape , name ):
    return tf.Variable( initializer( shape ) , name=name , trainable=True , dtype=tf.float32 )

shapes = [
    [ 3 , 3 , 3 , 16 ] ,
    [ 3 , 3 , 16 , 16 ] ,
    [ 3 , 3 , 16 , 32 ] ,
    [ 3 , 3 , 32 , 32 ] ,
    [ 3 , 3 , 32 , 64 ] ,
    [ 3 , 3 , 64 , 64 ] ,
    [ 3 , 3 , 64 , 128 ] ,
    [ 3 , 3 , 128 , 128 ] ,
    [ 3 , 3 , 128 , 256 ] ,
    [ 3 , 3 , 256 , 256 ] ,
    [ 3 , 3 , 256 , 512 ] ,
    [ 3 , 3 , 512 , 512 ] ,
    [ 8192 , 3600 ] ,
    [ 3600 , 2400 ] ,
    [ 2400 , 1600 ] ,
    [ 1600 , 800 ] ,
    [ 800 , 64 ] ,
    [ 64 , output_classes ] ,
]

weights = []
for i in range( len( shapes ) ):
    weights.append( get_weight( shapes[ i ] , 'weight{}'.format( i ) ) )
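Just to get a feel for the size of the network, we can count the parameters we created. This is an optional check, not part of the original gist:
total_params = sum( tf.size( w ).numpy() for w in weights )
print( 'Trainable parameters:' , total_params )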
Note that each weight is a tf.Variable created with trainable=True (which is also the default). A variable that is not trainable is not watched automatically by tf.GradientTape, so its gradient would come back as None during backpropagation; in simpler words, a trainable variable is the one that gets differentiated. Also, in TF 2.0 we get the tf.initializers module, which makes it easier to initialize weights for neural networks. We collect all our weights in a weights list; this list will later be handed to tf.optimizers.Adam for optimization.
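To see why trainable matters, here is a tiny sketch: the tape only watches trainable variables automatically, so the gradient for a non-trainable one comes back as None unless we tape.watch() it explicitly.
a = tf.Variable( 3.0 , trainable=True )
b = tf.Variable( 3.0 , trainable=False )
with tf.GradientTape() as tape:
    y = a * a + b * b
grads = tape.gradient( y , [ a , b ] )
print( grads )  # [ 6.0 , None ] -- only the trainable variable gets a gradient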
Now, we assemble all the ops together to have a Keras-like model.
img_classify_tf2_3.py
def model( x ) :
    x = tf.cast( x , dtype=tf.float32 )
    c1 = conv2d( x , weights[ 0 ] , stride_size=1 )
    c1 = conv2d( c1 , weights[ 1 ] , stride_size=1 )
    p1 = maxpool( c1 , pool_size=2 , stride_size=2 )
    c2 = conv2d( p1 , weights[ 2 ] , stride_size=1 )
    c2 = conv2d( c2 , weights[ 3 ] , stride_size=1 )
    p2 = maxpool( c2 , pool_size=2 , stride_size=2 )
    c3 = conv2d( p2 , weights[ 4 ] , stride_size=1 )
    c3 = conv2d( c3 , weights[ 5 ] , stride_size=1 )
    p3 = maxpool( c3 , pool_size=2 , stride_size=2 )
    c4 = conv2d( p3 , weights[ 6 ] , stride_size=1 )
    c4 = conv2d( c4 , weights[ 7 ] , stride_size=1 )
    p4 = maxpool( c4 , pool_size=2 , stride_size=2 )
    c5 = conv2d( p4 , weights[ 8 ] , stride_size=1 )
    c5 = conv2d( c5 , weights[ 9 ] , stride_size=1 )
    p5 = maxpool( c5 , pool_size=2 , stride_size=2 )
    c6 = conv2d( p5 , weights[ 10 ] , stride_size=1 )
    c6 = conv2d( c6 , weights[ 11 ] , stride_size=1 )
    p6 = maxpool( c6 , pool_size=2 , stride_size=2 )
    flatten = tf.reshape( p6 , shape=( tf.shape( p6 )[ 0 ] , -1 ) )
    d1 = dense( flatten , weights[ 12 ] )
    d2 = dense( d1 , weights[ 13 ] )
    d3 = dense( d2 , weights[ 14 ] )
    d4 = dense( d3 , weights[ 15 ] )
    d5 = dense( d4 , weights[ 16 ] )
    logits = tf.matmul( d5 , weights[ 17 ] )
    return tf.nn.softmax( logits )
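Before training, it is worth pushing one dummy batch through the model to make sure the shapes line up (assuming 300 x 300 RGB inputs, which is what the flattened size of 8192 corresponds to):
sample = tf.random.normal( [ 4 , 300 , 300 , 3 ] )
print( model( sample ).shape )  # ( 4 , output_classes )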
Q. Why are we declaring the model as a function? Later on, we will pass a batch of data to this function and get the outputs. We do not use Session, as Eager Execution is enabled by default. See this guide.
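You can verify this yourself:
print( tf.executing_eagerly() )  # True in TF 2.0 -- ops run immediately, no Session needed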
The loss function is easy.
def loss( pred , target ):
    return tf.losses.categorical_crossentropy( target , pred )
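A tiny worked example (with illustrative numbers): a confident correct prediction gives a small loss, a confidently wrong one gives a large loss.
target = tf.constant( [ [ 0. , 1. ] ] )
good_pred = tf.constant( [ [ 0.1 , 0.9 ] ] )
bad_pred = tf.constant( [ [ 0.9 , 0.1 ] ] )
print( loss( good_pred , target ).numpy() )  # ~0.105 , i.e. -log( 0.9 )
print( loss( bad_pred , target ).numpy() )   # ~2.303 , i.e. -log( 0.1 )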
Next comes the most confusing part for a beginner (it was for me too!). We will use tf.GradientTape to optimize the model.
img_classify_tf2_4.py
learning_rate = 0.001  # assumed value for Adam
optimizer = tf.optimizers.Adam( learning_rate )

def train_step( model , inputs , outputs ):
    # Record the forward pass so that gradients can be computed.
    with tf.GradientTape() as tape:
        current_loss = loss( model( inputs ) , outputs )
    grads = tape.gradient( current_loss , weights )
    optimizer.apply_gradients( zip( grads , weights ) )
    print( tf.reduce_mean( current_loss ) )

num_epochs = 256
for e in range( num_epochs ):
    for features in dataset:
        image , label = features[ 'image' ] , features[ 'label' ]
        train_step( model , image , tf.one_hot( label , depth=output_classes ) )
What’s happening here?
1. We open a tf.GradientTape scope and call the model() and loss() functions inside it. Hence, all the operations executed in these functions are recorded and can be differentiated during backpropagation.
2. We compute the gradients of the loss with respect to our weights using the tape.gradient method.
3. We update the weights with the optimizer.apply_gradients method (earlier we used optimizer.minimize, which is still available).
Read more about it from here.
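If the tape still feels opaque, this standalone sketch (not part of the training code) shows the same three steps on a single variable: record, differentiate, update.
w = tf.Variable( 5.0 )
opt = tf.optimizers.SGD( learning_rate=0.1 )
with tf.GradientTape() as tape:
    l = w * w                               # 1. record the forward pass
grad = tape.gradient( l , [ w ] )           # 2. dl/dw = 2w = 10.0
opt.apply_gradients( zip( grad , [ w ] ) )  # 3. w becomes 5.0 - 0.1 * 10.0 = 4.0
print( w.numpy() )  # 4.0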
#TensorFlow #ai #Image