Transfer learning is the process of taking a pre-trained neural network and adapting it to a new, different dataset by transferring or repurposing its learned features. For example, we can take a model trained on ImageNet and use its learned weights to initialize training on an entirely new dataset. In research published by Sebastian Thrun and his team, 129,450 clinical skin cancer images were classified using a pre-trained model, with excellent results: the approach performed on par with experts on two tasks, identifying the most common skin cancers and identifying the deadliest skin cancer (melanoma). This example shows that artificial intelligence is indeed capable of classifying skin cancer with a level of competence comparable to dermatologists. The CEO of DeepMind has this to say on transfer learning:

“I think transfer learning is the key to general intelligence. And I think the key to doing transfer learning will be the acquisition of conceptual knowledge that is abstracted away from perceptual details of where you learned it from.” - Demis Hassabis (CEO, DeepMind)
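
Concretely, the ImageNet recipe described above takes only a few lines. Here is a minimal sketch, assuming PyTorch and a recent torchvision; the model choice and class count are placeholders, not details from the research above:

```python
import torch.nn as nn
from torchvision import models

# Start from weights learned on ImageNet rather than from random initialization.
model = models.resnet18(weights="IMAGENET1K_V1")

# Repurpose the learned features: swap the 1000-way ImageNet head for a new,
# randomly initialized classifier sized for the new dataset.
num_classes = 10  # placeholder for the new dataset's class count
model.fc = nn.Linear(model.fc.in_features, num_classes)

# `model` can now be trained on the new dataset like any other network,
# starting from ImageNet features instead of random weights.
```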

Transfer learning is also particularly useful when computing resources are limited. Many state-of-the-art models take days, and in some cases weeks, to train, even on highly powerful GPU machines. Rather than repeating that long process every time, transfer learning allows us to use pre-trained weights as a starting point.

Learning from scratch is hard, and it is difficult to match the performance of a transfer learning approach. To put this in perspective, I used an AlexNet architecture to train a dog breed classifier from scratch and achieved 10% accuracy after 200 epochs. The same AlexNet pre-trained on ImageNet, used with frozen weights, achieved 69% accuracy in just 50 epochs.
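
The frozen-weights setup looks roughly like this (again assuming PyTorch/torchvision; the breed count is a placeholder, not necessarily the dataset I used):

```python
import torch.nn as nn
from torchvision import models

# Load AlexNet with its ImageNet pre-trained weights.
model = models.alexnet(weights="IMAGENET1K_V1")

# Freeze every pre-trained parameter so backpropagation cannot change it.
for param in model.parameters():
    param.requires_grad = False

# Replace the final 1000-way ImageNet layer with a dog-breed head. A freshly
# created layer has requires_grad=True by default, so only this layer learns.
num_breeds = 133  # placeholder class count
model.classifier[6] = nn.Linear(model.classifier[6].in_features, num_breeds)
```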

Transfer learning often involves keeping the pre-trained weights in the first layers, whose features are general across many datasets, and randomly initializing the last layers, which are then trained for the new classification task. In this approach, learning (backpropagation) happens only in those last, randomly initialized layers. That said, there are several approaches to transfer learning, and which one to use depends on how the new dataset relates to the dataset the model was pre-trained on. There are four main scenarios or cases of transfer learning.
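
Most frameworks make this easy to express: gradients still flow backward through the whole network, but only parameters left trainable are ever updated. The sketch below, again assuming PyTorch, shows one common variant in which only the early, general layers are frozen; the split point and class count are illustrative choices, not details from this article:

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze only the early blocks; their filters (edges, textures, simple shapes)
# transfer to almost any image dataset.
for block in (model.conv1, model.bn1, model.layer1, model.layer2):
    for param in block.parameters():
        param.requires_grad = False

# Randomly initialize a new head for the new task (5 classes as a placeholder).
model.fc = nn.Linear(model.fc.in_features, 5)

# Hand the optimizer only the parameters that still require gradients, so
# optimizer.step() updates the later blocks and the new head alone.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable, lr=0.001, momentum=0.9)
```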

#transfer-learning #deep-learning #neural-networks #machine-learning
