# Implementation of Stochastic Gradient Descent

The purpose of this post is to understand the maths behind gradient descent. Most of us use gradient descent in machine learning, often without looking at how it actually works. As a beginner, when I was learning stochastic gradient descent, I found it a little complex. Here, I have tried to make it simpler for anyone who wants to know how it works, with the focus on the mathematics.

First, a quick refresher: what is gradient descent?

Gradient Descent: It is an optimization technique used to find the coefficients of a function that minimize a cost function, that is, the error between the function's output and the expected values.

1. Initialize the coefficients (to 0.0 or to small random values).
2. Calculate the cost by substituting the current coefficients into the function.
3. Calculate the partial derivative of the total error (the cost) with respect to each coefficient.
4. Update each coefficient by moving it a small step, scaled by the learning rate, in the direction opposite to its partial derivative.
5. Repeat steps 2 to 4 until the cost reaches 0.0 or no further improvement in the cost can be achieved.
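To make the five steps concrete, here is a minimal sketch in Python. It rests on several assumptions that are mine, not part of the steps themselves: the function being fitted is a straight line `y = b0 + b1 * x`, the cost is the mean squared error, and the names `gradient_descent`, `learning_rate`, `max_epochs`, and `tol` are made up for illustration. Because step 3 talks about the total error, this sketch uses every data point for each update; a stochastic variant would instead update the coefficients after each individual sample.

```python
def gradient_descent(xs, ys, learning_rate=0.05, max_epochs=5000, tol=1e-12):
    """Fit y = b0 + b1 * x by minimizing mean squared error (illustrative sketch)."""
    n = len(xs)

    # Step 1: initialize the coefficients.
    b0, b1 = 0.0, 0.0
    prev_cost = float("inf")

    for _ in range(max_epochs):
        # Step 2: calculate the cost with the current coefficients.
        errors = [(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        cost = sum(e * e for e in errors) / n

        # Step 5: stop once the cost no longer improves by more than tol.
        if prev_cost - cost < tol:
            break
        prev_cost = cost

        # Step 3: partial derivatives of the cost with respect to b0 and b1.
        grad_b0 = 2.0 * sum(errors) / n
        grad_b1 = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n

        # Step 4: move each coefficient against its gradient.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1

    return b0, b1, cost


# Toy data generated from y = 1 + 2x (an assumption for this example).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
b0, b1, cost = gradient_descent(xs, ys)
print(b0, b1, cost)
```

The learning rate matters here: if it is too large for the data, the updates overshoot and the cost grows instead of shrinking, so the value 0.05 is a choice that happens to suit this toy data rather than a general recommendation.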

