Machine learning and deep learning techniques have found their way into nearly every domain you can think of. With the increasing availability of digitized data and the growing computational power of modern computers, they will likely keep flourishing in the near future. As part of this trend, more and more people are taking up the task of building and training models every day. As a newcomer to the field, I too have had to build and train a few neural network models. Most of the time, my primary goal is to build a model that is both highly accurate and well generalized. To achieve that, I usually rack my brain over finding the right hyper-parameters, deciding which regularization techniques to apply, or wondering whether I need to make my model deeper, and so on. But I often forget to play around with the weight-initialization technique. And I believe this is the case for many others like me.

Wait, weight initialization? Does it matter at all? I dedicate this article to finding the answer to that question. When I searched the internet for an answer, I found an overwhelming amount of information. Some articles discuss the mathematics behind these techniques, some compare them theoretically, and some focus on deciding whether uniform initialization is better than normal initialization. In this situation, I resorted to a result-based approach: I tried to find out the impact of weight-initialization techniques through a small experiment. I applied a few of these techniques to a model and tried to visualize the training process itself, setting aside the goal of achieving high accuracy for a while.
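For concreteness, here is a minimal sketch, assuming TensorFlow/Keras, of how such an experiment can be set up: identical models are built with different `kernel_initializer` settings, and their training histories are then compared. The architecture, the particular initializers, and the dataset shape below are illustrative assumptions, not the exact setup used in the experiment.

```python
# Minimal sketch (assuming TensorFlow/Keras): build otherwise identical
# models whose only difference is the weight-initialization scheme.
import tensorflow as tf

def build_model(initializer):
    # Same architecture each time; only the kernel initializer changes.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu",
                              kernel_initializer=initializer,
                              input_shape=(784,)),
        tf.keras.layers.Dense(64, activation="relu",
                              kernel_initializer=initializer),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

# A few commonly available initializers to compare (illustrative choice).
initializers = {
    "zeros": tf.keras.initializers.Zeros(),
    "random_normal": tf.keras.initializers.RandomNormal(stddev=0.05),
    "glorot_uniform": tf.keras.initializers.GlorotUniform(),
    "he_normal": tf.keras.initializers.HeNormal(),
}

for name, init in initializers.items():
    model = build_model(init)
    model.compile(optimizer="sgd",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # With training data in hand, the per-epoch metrics in history.history
    # can be plotted to visualize how training progresses under each scheme:
    # history = model.fit(x_train, y_train, epochs=10, validation_split=0.2)
```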

#model-training #neural-networks #machine-learning #weight-initialization #deep-learning

Weight Initialization for Neural Networks — Does it matter?