Optimization methods are the engines that enable neural networks to learn from data. In this lecture, DeepMind Research Scientist James Martens covers the fundamentals of gradient-based optimization methods and their application to training neural networks. Major topics include gradient descent, momentum methods, 2nd-order methods, and stochastic methods. James analyzes these methods through the interpretive framework of local 2nd-order approximations.
#machine-learning