It is a very weird time to be alive. Suddenly, you have so much time on your hands that you really do not know what to do with it. I am writing this blog because I am bored of procrastinating. Never thought it would come to this, but it has.

I have two tasks on my to-do list that have been pending for quite some time now: the first is to participate in an online ML competition, and the second is to write a tech blog. At the outset, I would like to thank Shia LaBeouf for his motivational pep talk. If you are lacking motivation, I suggest you check this video out. His wise words can motivate a rock to erode faster, just saying.

To cut to the chase, this blog is intended to provide some direction to newbies who have taken the ML courses by Andrew Ng, or are familiar with the concepts of ML and deep learning, but do not know where to start or how to write good code for ML. I have written this blog as a series of questions I had in mind and how I answered them when I came across the Kaggle challenge.

My objective was simply to learn how to train models the right way, organize my code using good coding practices, and hopefully achieve good accuracy in the process. I found an interesting challenge called the Plant Pathology Challenge, hosted by Cornell University. The problem statement seemed fairly simple and a good challenge to begin with. The problem's objectives as described on Kaggle read as follows:

TL;DR: Given an image of a leaf, you have to diagnose the health of the plant. Classify it into one of 4 classes: healthy, multiple diseases, rust or scab.

“Objectives of ‘Plant Pathology Challenge’ are to train a model using images of training dataset to 1) Accurately classify a given image from testing dataset into different diseased category or a healthy leaf; 2) Accurately distinguish between many diseases, sometimes more than one on a single leaf; 3) Deal with rare classes and novel symptoms; 4) Address depth perception — angle, light, shade, physiological age of the leaf; and 5) Incorporate expert knowledge in identification, annotation, quantification, and guiding computer vision to search for relevant features during learning.”

4 sample outputs from the leaf dataset

What are the first thoughts I had after reading about the challenge?

The first thought I had was that I needed to read up on state-of-the-art CNN models. At the moment, EfficientNet-B7 has achieved 84.4% top-1 accuracy on the ImageNet dataset and seems like a pretty solid ConvNet to tackle the challenge with.

How do I train a state-of-the-art ConvNet?

It is not advisable to train large ConvNets from scratch. State-of-the-art ConvNets are large and have a huge number of parameters (EfficientNet-B5 has about 30 million). It is better to load weights pretrained on a large dataset and then train for your specific task, a.k.a. transfer learning.

Think of the pretrained weights as the foundation of a house. Training from pretrained weights is like constructing the rest of the house on top of that foundation. The pretrained weights are already tuned to extract important features from images, so when you train on your custom dataset, the model does not need to learn feature extraction from scratch. Features can range from something as basic as vertical lines in an image to something as complex as a car. What the model learns to extract depends entirely on the dataset the network was pretrained on, so a model pretrained on a large dataset like ImageNet is a pretty solid starting point!
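To make this concrete, here is a minimal sketch of what that looks like with the efficientnet_pytorch package (the same EfficientNet class used in the hyperparameter snippet further below). from_pretrained downloads the ImageNet weights and replaces the final fully-connected layer with one sized for our 4 leaf classes; the exact fine-tuning setup (which layers to freeze, learning rate, and so on) is up to you.

import torch
from efficientnet_pytorch import EfficientNet

# Load ImageNet-pretrained weights and resize the classifier head to 4 classes
model = EfficientNet.from_pretrained('efficientnet-b5', num_classes=4)
model = model.to(torch.device("cuda:0" if torch.cuda.is_available() else "cpu"))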

How do I structure my code?

I have 2+ years of professional experience as a software developer. In these 2 years I have picked up some good practices for writing well-structured code. Furthermore, I went through various Jupyter notebooks published on Kaggle and multiple open-source GitHub projects. Everybody has their own way of writing code, but there is a general trend among most developers. I follow a structure that makes sense to me, and I have used it to develop the codebase for the Plant Pathology challenge. Hopefully it makes sense to you as well. :)

Here is the structure I follow:

1. Put all your hyperparameters and constants at the beginning, in a single place

import torch
from efficientnet_pytorch import EfficientNet

# Hyperparameters and constants, all defined in one place
batch_size = 16
epoch = 50                      # number of training epochs
model_name = 'efficientnet-b5'
image_size = EfficientNet.get_image_size(model_name)  # input resolution expected by this model
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

This is helpful because you can change your hyperparameters easily from one place, and keeping them together matters because you will be doing a whole lot of hyperparameter tuning.

2. Create a custom class for your dataset and define functions that cater to your needs.

It is a very Pythonic way of doing things. Basically, the idea is that you create a custom class that loads the data (say, from a CSV), applies all sorts of transformations, and outputs the transformed data, which can then be fed into your model.

For example, I have defined a class Dataset that loads the train/cross-validation/test data based on the parameters I pass to it when creating its object.
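To give a flavour of what such a class looks like, here is a simplified sketch rather than my exact class: it subclasses torch.utils.data.Dataset and implements __len__ and __getitem__, which is all a PyTorch DataLoader needs. The CSV column names image_id and label, the .jpg extension, and the image directory are just placeholder assumptions for illustration.

import os
import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset

class LeafDataset(Dataset):
    # Loads image ids and labels from a CSV, reads images from disk,
    # and applies the given transforms on the fly.
    def __init__(self, csv_path, image_dir, transform=None):
        self.df = pd.read_csv(csv_path)      # assumed columns: image_id, label
        self.image_dir = image_dir
        self.transform = transform

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        image_path = os.path.join(self.image_dir, row['image_id'] + '.jpg')
        image = Image.open(image_path).convert('RGB')
        if self.transform is not None:
            image = self.transform(image)
        label = torch.tensor(row['label'], dtype=torch.long)
        return image, label

An object of this class can then be wrapped in a torch.utils.data.DataLoader with the batch_size defined earlier.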

