We can see the presence of a loss function, i.e. the difference between the actual value (y) and the value predicted by the neural network (ŷ).

To reduce this loss, we use an optimizer. There are various types of optimizers, one of them being Gradient Descent.
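To make the idea of a loss concrete, here is a minimal sketch that measures the gap between actual and predicted values. It assumes mean squared error, which is only one common choice of loss. The function name `mean_squared_error` and the sample numbers are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: actual values y and network predictions y_hat.
y = [3.0, 5.0, 7.0]
y_hat = [2.5, 5.0, 8.0]

loss = mean_squared_error(y, y_hat)
print(f"loss = {loss:.4f}")
```

The closer the predictions are to the actual values, the smaller this number gets, and it is exactly zero only when every prediction matches.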

Gradient Descent is a very generic optimization algorithm capable of finding optimal solutions to a wide range of problems.

The general idea of Gradient Descent is to tweak parameters iteratively in order to minimize the loss function.

In short, Gradient Descent finds the weights (or parameters) that minimize the loss function by repeatedly adjusting all of them, using the gradients that backpropagation computes through the network.

$$w_{new} = w_{old} - \lambda \frac{\partial L}{\partial w_{old}}$$

Eq1: Modifying old weights to obtain new weights

*Here λ is known as the learning rate.*
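A single weight update can be sketched as follows, assuming Eq1 has the usual form: new weight = old weight minus the learning rate times the gradient of the loss with respect to that weight. The one-weight model `y_hat = w * x`, the squared-error loss, and all numbers are assumptions chosen only to keep the arithmetic easy to follow.

```python
def gradient_step(w_old, grad, learning_rate):
    """Apply one Gradient Descent update to a single weight."""
    return w_old - learning_rate * grad


def squared_error(w, x, y):
    """Loss of the toy model y_hat = w * x on one training example."""
    return (y - w * x) ** 2


# Hypothetical training example and starting weight.
x, y = 2.0, 8.0
w_old = 1.0
learning_rate = 0.1  # this is λ in the text

# Gradient of (y - w * x)^2 with respect to w is -2 * x * (y - w * x).
grad = -2.0 * x * (y - w_old * x)

w_new = gradient_step(w_old, grad, learning_rate)

loss_before = squared_error(w_old, x, y)
loss_after = squared_error(w_new, x, y)
print(f"w: {w_old} -> {w_new:.2f}, loss: {loss_before:.2f} -> {loss_after:.2f}")
```

In this toy setup the weight moves toward 4, the value at which `w * x` equals `y`, and the loss drops after a single step. A learning rate that is too large would overshoot that value instead.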

## How does Gradient Descent help us?

Suppose you are lost in the mountains in a dense fog and can only feel the slope of the ground below your feet. A good strategy to get to the bottom of the valley quickly is to go downhill in the direction of the steepest slope. This is exactly what Gradient Descent does: it measures the local gradient of the error function with regard to the parameter θ, and it moves in the direction of descending gradient. Once the gradient is zero, you have reached a minimum.
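The walk down the foggy mountain can be sketched as a loop that keeps stepping downhill until the slope is practically flat. This is an illustrative sketch, not a production optimizer. The error function `(θ - 3)^2`, the `tolerance` stopping threshold, and the `max_steps` cap are assumptions added for the example.

```python
def gradient_descent(grad_fn, theta, learning_rate=0.1, tolerance=1e-6, max_steps=10_000):
    """Step against the gradient until it is close to zero or max_steps is hit."""
    steps_taken = 0
    while steps_taken < max_steps:
        grad = grad_fn(theta)          # feel the local slope
        if abs(grad) < tolerance:      # flat ground: we are at a minimum
            break
        theta -= learning_rate * grad  # move in the downhill direction
        steps_taken += 1
    return theta, steps_taken


def error_gradient(theta):
    """Gradient of the toy error function (theta - 3)^2, whose minimum is at 3."""
    return 2.0 * (theta - 3.0)


theta_min, steps = gradient_descent(error_gradient, theta=0.0)
print(f"theta = {theta_min:.6f} after {steps} steps")
```

In practice the gradient rarely becomes exactly zero in floating-point arithmetic, which is why the loop stops once its magnitude falls below a small tolerance.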
