Image classification is one of the fundamental supervised tasks in machine learning. TensorFlow's new 2.0 release provides a completely new development ecosystem with Eager Execution enabled by default. I assume most TF developers found the switch a little hard at first, since we were so used to tf.Session and tf.placeholder that we couldn't imagine TensorFlow without them.
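For readers coming from TF 1.x, here is a minimal illustrative comparison of the two styles (not from the original story); the 1.x snippet is shown only as comments since it no longer runs under eager execution.
# TF 1.x style: build a graph, then feed values through a Session.
#   x = tf.placeholder( tf.float32 )
#   y = x * 2.0
#   with tf.Session() as sess:
#       print( sess.run( y , feed_dict={ x : 3.0 } ) )
# TF 2.0 style: ops run eagerly, like ordinary Python.
import tensorflow as tf
x = tf.constant( 3.0 )
y = x * 2.0
print( y )  # tf.Tensor( 6.0 , ... ) -- no Session required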
Today, we start with simple image classification without using TF Keras, so that we can take a look at the new API changes in TensorFlow 2.0.
You can take a look at the Colab notebook for this story.
Data pipelines can be frustrating (sometimes!). In this story we want to play around with the low-level TF APIs rather than building input pipelines, so we import a well-prepared dataset directly from TensorFlow Datasets. We will use the Horses Or Humans dataset.
img_classify_tf2.py
import tensorflow as tf
import tensorflow_datasets as tfds
batch_size = 32  # assumed value; any batch size that fits in memory works
dataset_name = 'horses_or_humans'
# Each element of the dataset is a dict with 'image' and 'label' keys.
dataset = tfds.load( name=dataset_name , split=tfds.Split.TRAIN )
dataset = dataset.shuffle( 1024 ).batch( batch_size )
We can get a number of datasets readily available with TF Datasets.
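If you want to see what else is available, or peek at what our pipeline yields, here is a quick sketch (the exact image shapes depend on the dataset; Horses Or Humans uses 300 x 300 RGB images):
print( tfds.list_builders()[ :10 ] )  # names of some of the available datasets
for batch in dataset.take( 1 ):
    print( batch[ 'image' ].shape )  # e.g. ( batch_size , 300 , 300 , 3 )
    print( batch[ 'label' ].shape )  # ( batch_size , )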
Remember what we needed for a CNN in Keras: Conv2D, MaxPooling2D, Flatten and Dense layers, right? We need to create the equivalents of these layers using the tf.nn module.
img_classify_tf2_1.py
leaky_relu_alpha = 0.2
dropout_rate = 0.5
padding = 'SAME'  # assumed; 'SAME' keeps the spatial size so the flattened size used later works out

def conv2d( inputs , filters , stride_size ):
    # Convolution followed by a LeakyReLU activation.
    out = tf.nn.conv2d( inputs , filters , strides=[ 1 , stride_size , stride_size , 1 ] , padding=padding )
    return tf.nn.leaky_relu( out , alpha=leaky_relu_alpha )

def maxpool( inputs , pool_size , stride_size ):
    return tf.nn.max_pool2d( inputs , ksize=[ 1 , pool_size , pool_size , 1 ] , padding='VALID' , strides=[ 1 , stride_size , stride_size , 1 ] )

def dense( inputs , weights ):
    # Fully-connected layer: matmul + LeakyReLU, followed by dropout.
    x = tf.nn.leaky_relu( tf.matmul( inputs , weights ) , alpha=leaky_relu_alpha )
    return tf.nn.dropout( x , rate=dropout_rate )
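As a quick sanity check (purely illustrative, using random tensors), we can pass a dummy image through these helpers and watch the shapes:
dummy = tf.random.normal( [ 1 , 300 , 300 , 3 ] )   # one fake RGB image
kernel = tf.random.normal( [ 3 , 3 , 3 , 16 ] )     # 3x3 kernel, 3 -> 16 channels
out = conv2d( dummy , kernel , stride_size=1 )
print( out.shape )                                  # ( 1 , 300 , 300 , 16 ) with 'SAME' padding
out = maxpool( out , pool_size=2 , stride_size=2 )
print( out.shape )                                  # ( 1 , 150 , 150 , 16 )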
We also require some weights. The shapes for our kernels ( filters ) need to be worked out.
img_classify_tf2_2.py
output_classes = 2  # the Horses Or Humans dataset has two classes
initializer = tf.initializers.glorot_uniform()

def get_weight( shape , name ):
    return tf.Variable( initializer( shape ) , name=name , trainable=True , dtype=tf.float32 )

shapes = [
    [ 3 , 3 , 3 , 16 ] ,
    [ 3 , 3 , 16 , 16 ] ,
    [ 3 , 3 , 16 , 32 ] ,
    [ 3 , 3 , 32 , 32 ] ,
    [ 3 , 3 , 32 , 64 ] ,
    [ 3 , 3 , 64 , 64 ] ,
    [ 3 , 3 , 64 , 128 ] ,
    [ 3 , 3 , 128 , 128 ] ,
    [ 3 , 3 , 128 , 256 ] ,
    [ 3 , 3 , 256 , 256 ] ,
    [ 3 , 3 , 256 , 512 ] ,
    [ 3 , 3 , 512 , 512 ] ,
    [ 8192 , 3600 ] ,
    [ 3600 , 2400 ] ,
    [ 2400 , 1600 ] ,
    [ 1600 , 800 ] ,
    [ 800 , 64 ] ,
    [ 64 , output_classes ] ,
]

weights = []
for i in range( len( shapes ) ):
    weights.append( get_weight( shapes[ i ] , 'weight{}'.format( i ) ) )
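Just to get a feel for the size of the network, we can count the parameters we created. This is an optional check, not part of the original gist:
total_params = sum( tf.size( w ).numpy() for w in weights )
print( 'Trainable parameters:' , total_params )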
Note that each weight is a tf.Variable created with trainable=True (which is also the default). A variable that is not trainable is not watched automatically by tf.GradientTape, so its gradient would come back as None during backpropagation; in simpler words, a trainable variable is the one that gets differentiated. Also, in TF 2.0 we get the tf.initializers module, which makes it easier to initialize weights for neural networks. We collect all our weights in a weights list; this list will later be handed to tf.optimizers.Adam for optimization.
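To see why trainable matters, here is a tiny sketch: the tape only watches trainable variables automatically, so the gradient for a non-trainable one comes back as None unless we tape.watch() it explicitly.
a = tf.Variable( 3.0 , trainable=True )
b = tf.Variable( 3.0 , trainable=False )
with tf.GradientTape() as tape:
    y = a * a + b * b
grads = tape.gradient( y , [ a , b ] )
print( grads )  # [ 6.0 , None ] -- only the trainable variable gets a gradient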
Now, we assemble all the ops together to have a Keras-like model.
img_classify_tf2_3.py
def model( x ) :
    x = tf.cast( x , dtype=tf.float32 )
    c1 = conv2d( x , weights[ 0 ] , stride_size=1 )
    c1 = conv2d( c1 , weights[ 1 ] , stride_size=1 )
    p1 = maxpool( c1 , pool_size=2 , stride_size=2 )
    c2 = conv2d( p1 , weights[ 2 ] , stride_size=1 )
    c2 = conv2d( c2 , weights[ 3 ] , stride_size=1 )
    p2 = maxpool( c2 , pool_size=2 , stride_size=2 )
    c3 = conv2d( p2 , weights[ 4 ] , stride_size=1 )
    c3 = conv2d( c3 , weights[ 5 ] , stride_size=1 )
    p3 = maxpool( c3 , pool_size=2 , stride_size=2 )
    c4 = conv2d( p3 , weights[ 6 ] , stride_size=1 )
    c4 = conv2d( c4 , weights[ 7 ] , stride_size=1 )
    p4 = maxpool( c4 , pool_size=2 , stride_size=2 )
    c5 = conv2d( p4 , weights[ 8 ] , stride_size=1 )
    c5 = conv2d( c5 , weights[ 9 ] , stride_size=1 )
    p5 = maxpool( c5 , pool_size=2 , stride_size=2 )
    c6 = conv2d( p5 , weights[ 10 ] , stride_size=1 )
    c6 = conv2d( c6 , weights[ 11 ] , stride_size=1 )
    p6 = maxpool( c6 , pool_size=2 , stride_size=2 )
    flatten = tf.reshape( p6 , shape=( tf.shape( p6 )[ 0 ] , -1 ) )
    d1 = dense( flatten , weights[ 12 ] )
    d2 = dense( d1 , weights[ 13 ] )
    d3 = dense( d2 , weights[ 14 ] )
    d4 = dense( d3 , weights[ 15 ] )
    d5 = dense( d4 , weights[ 16 ] )
    logits = tf.matmul( d5 , weights[ 17 ] )
    return tf.nn.softmax( logits )
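Before training, it is worth pushing one dummy batch through the model to make sure the shapes line up (assuming 300 x 300 RGB inputs, which is what the flattened size of 8192 corresponds to):
sample = tf.random.normal( [ 4 , 300 , 300 , 3 ] )
print( model( sample ).shape )  # ( 4 , output_classes )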
Q. Why are we declaring the model as a function? Later on, we will pass a batch of data to this function and get the outputs. We do not use Session, as Eager Execution is enabled by default. See this guide.
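You can verify this yourself:
print( tf.executing_eagerly() )  # True in TF 2.0 -- ops run immediately, no Session needed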
The loss function is easy.
def loss( pred , target ):
    return tf.losses.categorical_crossentropy( target , pred )
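A tiny worked example (with illustrative numbers): a confident correct prediction gives a small loss, a confidently wrong one gives a large loss.
target = tf.constant( [ [ 0. , 1. ] ] )
good_pred = tf.constant( [ [ 0.1 , 0.9 ] ] )
bad_pred = tf.constant( [ [ 0.9 , 0.1 ] ] )
print( loss( good_pred , target ).numpy() )  # ~0.105 , i.e. -log( 0.9 )
print( loss( bad_pred , target ).numpy() )   # ~2.303 , i.e. -log( 0.1 )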
Next comes the most confusing part for a beginner (it was for me too!). We will use tf.GradientTape to optimize the model.
img_classify_tf2_4.py
learning_rate = 0.001  # assumed value for Adam
optimizer = tf.optimizers.Adam( learning_rate )

def train_step( model , inputs , outputs ):
    # Record the forward pass so that gradients can be computed.
    with tf.GradientTape() as tape:
        current_loss = loss( model( inputs ) , outputs )
    grads = tape.gradient( current_loss , weights )
    optimizer.apply_gradients( zip( grads , weights ) )
    print( tf.reduce_mean( current_loss ) )

num_epochs = 256
for e in range( num_epochs ):
    for features in dataset:
        image , label = features[ 'image' ] , features[ 'label' ]
        train_step( model , image , tf.one_hot( label , depth=output_classes ) )
What’s happening here?
1. We open a tf.GradientTape scope and call the model() and loss() functions inside it. Hence, all the operations executed in these functions are recorded and can be differentiated during backpropagation.
2. We compute the gradients of the loss with respect to our weights using the tape.gradient method.
3. We update the weights with the optimizer.apply_gradients method (earlier we used optimizer.minimize, which is still available).
Read more about it from here.
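If the tape still feels opaque, this standalone sketch (not part of the training code) shows the same three steps on a single variable: record, differentiate, update.
w = tf.Variable( 5.0 )
opt = tf.optimizers.SGD( learning_rate=0.1 )
with tf.GradientTape() as tape:
    l = w * w                               # 1. record the forward pass
grad = tape.gradient( l , [ w ] )           # 2. dl/dw = 2w = 10.0
opt.apply_gradients( zip( grad , [ w ] ) )  # 3. w becomes 5.0 - 0.1 * 10.0 = 4.0
print( w.numpy() )  # 4.0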
#TensorFlow #ai #Image