The code that accompanies this article can be found here.
☞ Machine Learning Optimization (1) - Gradient Descent with Python
So far in our journey through the Machine Learning universe, we have covered several big topics. We investigated some regression algorithms, classification algorithms and algorithms that can be used for both types of problems (**SVM**, **Decision Trees** and **Random Forest**). Apart from that, we dipped our toes into unsupervised learning, saw how this type of learning can be used for clustering, and learned about several clustering techniques.
We also talked about how to quantify machine learning model performance and how to improve it with regularization. In all of these articles, we used Python for "from scratch" implementations, along with libraries like TensorFlow, PyTorch and Scikit-Learn. In the previous article, we covered Gradient Descent, the grandfather of all optimization techniques. Following that path, in this article we explore momentum-based optimizers and optimizers that scale the learning rate.
Gradient Descent uses only the gradient computed at the current step and keeps no history of past gradients, which can make it slow. If the gradient is small, meaning the slope is gentle, it takes a long time to reach the minimum. Essentially, this is the problem that the optimizers we explore here are trying to solve. These optimizers are used more often with neural networks than with classical machine learning algorithms. However, in order to simplify the explanations, we use linear regression throughout. As in the previous article, it is important to note that these techniques are not machine learning algorithms themselves. They are solvers of minimization problems in which the function to be minimized has a gradient at most points of its domain.
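To make the contrast concrete, here is a minimal sketch of the momentum update rule applied to a simple linear regression problem. The synthetic data, the hyperparameter values and the function names (`gradients`, `momentum_descent`) are illustrative assumptions for this sketch, not the implementation from the accompanying code:

```python
# A minimal sketch of momentum-based gradient descent on linear regression.
# The data and hyperparameters below are assumed for illustration only.
import numpy as np

# Synthetic linear regression data: y = 2x + 1 plus noise (assumed example)
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=100)
y = 2 * X + 1 + rng.normal(0, 0.1, size=100)

def gradients(w, b):
    """Gradients of the mean squared error with respect to w and b."""
    error = w * X + b - y
    grad_w = 2 * np.mean(error * X)
    grad_b = 2 * np.mean(error)
    return grad_w, grad_b

def momentum_descent(learning_rate=0.1, momentum=0.9, n_steps=100):
    """Momentum keeps a running 'velocity' of past gradients, so small
    but consistent gradients accumulate speed instead of crawling."""
    w, b = 0.0, 0.0
    v_w, v_b = 0.0, 0.0
    for _ in range(n_steps):
        grad_w, grad_b = gradients(w, b)
        # Velocity blends the previous velocity with the current gradient,
        # unlike plain Gradient Descent, which would use only -lr * grad.
        v_w = momentum * v_w - learning_rate * grad_w
        v_b = momentum * v_b - learning_rate * grad_b
        w += v_w
        b += v_b
    return w, b

w, b = momentum_descent()
print(f"Learned parameters: w = {w:.3f}, b = {b:.3f}")  # should approach w=2, b=1
```

Notice that plain Gradient Descent is the special case `momentum=0`: with momentum turned on, each step inherits a fraction of the previous step, which is what lets the optimizer build up speed along shallow slopes.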
#machine learning #python #artificial intelligence #data science