In this post we’re going to dive deep into the world of optimizers for machine learning models. We’ll also explore the foundational mathematics behind these functions and discuss their use cases, merits, and demerits.

So, what are we waiting for? Let’s get started!

What is an Optimizer?

Don’t we all love when neural networks work their magic and provide us with tremendous accuracy? Let’s get to the core of this magic trick by understanding how our networks find the most optimal parameters for our model.

We know that loss functions are used to measure how well or poorly our model performs on the data provided to it. A loss function essentially aggregates the differences between the predicted and actual values over the given training samples. To train a neural network to minimize its loss, and thereby perform better, we need to tweak the weights and biases associated with the model. This is where optimizers play a crucial role.
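As a concrete illustration, here is a minimal sketch of one such loss function, mean squared error, written with NumPy (the function name `mse_loss` is just an illustrative choice, not from the original post):

```python
import numpy as np

def mse_loss(y_true, y_pred):
    """Mean squared error: the average of the squared differences
    between the actual and predicted values across all samples."""
    return np.mean((y_true - y_pred) ** 2)

# Predictions close to the targets give a small loss...
y_true = np.array([1.0, 2.0, 3.0])
print(mse_loss(y_true, np.array([1.1, 1.9, 3.2])))  # ~0.03

# ...while predictions far from the targets give a large one.
print(mse_loss(y_true, np.array([4.0, 0.0, 7.0])))  # ~9.67
```

The lower this number, the better the model fits the training data, which is exactly the quantity an optimizer tries to drive down.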

Optimizers tie the loss function and the model parameters together: they update the model, i.e. the weights and biases of each node, based on the output of the loss function.
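To make that update rule concrete, here is a toy sketch of the simplest optimizer, vanilla gradient descent, applied to the one-dimensional loss f(w) = w². The function name and the learning rate of 0.1 are illustrative assumptions, not anything prescribed by the post:

```python
import numpy as np

def gradient_descent_step(weights, gradients, learning_rate=0.01):
    """One vanilla gradient descent update: nudge each parameter
    a small step in the direction opposite its loss gradient."""
    return weights - learning_rate * gradients

# Toy example: minimize f(w) = w^2, whose gradient is 2w.
w = np.array([5.0])
for _ in range(100):
    grad = 2 * w  # gradient of the loss at the current weight
    w = gradient_descent_step(w, grad, learning_rate=0.1)

print(w)  # approaches 0.0, the minimizer of w^2
```

Every optimizer we’ll look at is a variation on this loop: compute how the loss changes with respect to each parameter, then adjust the parameters to reduce it.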
