Scott Fahlman’s idea to speed up gradient descent.
Because the vanilla back-propagation algorithms of the ’80s/’90s converged slowly, Scott Fahlman invented a learning algorithm dubbed Quickprop that is roughly based on Newton’s method. His simple idea outperformed back-propagation (with various adjustments) on problem domains like the ‘N-M-N Encoder’ task, i.e. training a de-/encoder network with N inputs, M hidden units and N outputs.
One of the problems that Quickprop specifically tackles is finding a domain-specific optimal learning rate, or rather, finding an algorithm that adjusts the learning rate appropriately and dynamically.
In this article, we’ll look at the simple mathematical idea behind Quickprop. We’ll implement the basic algorithm and some improvements that Fahlman suggests — all in Python and PyTorch.
A rough implementation of the algorithm and some background can already be found in this useful blog post by Giuseppe Bonaccorso. We are going to expand on that — both on the theory and code side — but if in doubt, have a look at how Giuseppe explains it.
_The motivation to look into Quickprop came from writing [my last article](https://towardsdatascience.com/cascade-correlation-a-forgotten-learning-architecture-a2354a0bec92) on the “Cascade-Correlation Learning Architecture”. There, I used it to train the neural network’s output and hidden neurons, which was a mistake I realized only later and which we’ll also look into here._
To follow along with this article, you should be familiar with how neural networks can be trained using back-propagation of the loss gradient (as of 2020, a widely used approach). That is, you should understand how the gradient is usually calculated and applied to the parameters of a network to try to iteratively achieve convergence of the loss to a global minimum.
We’ll start with the mathematics behind Quickprop and then look at how it can be implemented and improved step by step.
To make it easier to follow along, the equations and derivation steps are explained in more detail than in the original paper.
The commonly used back-propagation learning method for neural networks is based on the idea of iteratively ‘riding down’ the slope of a function by taking short steps in the direction opposite to its gradient.
These ‘short steps’ are the crux here. Their length usually depends on a learning rate factor, and that is kept intentionally small to not overshoot a potential minimum.
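To make the role of the learning rate concrete, here is a minimal sketch of plain gradient descent in pure Python. The quadratic toy loss, the function names (`gradient_descent`, `loss_grad`) and the two learning rates are assumptions chosen for illustration only; none of them come from Fahlman's paper.

```python
def gradient_descent(grad, x0, learning_rate, steps):
    """Take `steps` short steps against the gradient, starting at x0."""
    x = x0
    path = [x]
    for _ in range(steps):
        # Step in the direction opposite to the gradient, scaled by the learning rate.
        x = x - learning_rate * grad(x)
        path.append(x)
    return path


def loss(x):
    # Hypothetical toy loss with its minimum at x = 3.
    return (x - 3.0) ** 2


def loss_grad(x):
    # Derivative of the toy loss above.
    return 2.0 * (x - 3.0)


# A small learning rate creeps towards the minimum ...
small = gradient_descent(loss_grad, x0=0.0, learning_rate=0.1, steps=50)
# ... while a rate that is too large overshoots further on every step.
large = gradient_descent(loss_grad, x0=0.0, learning_rate=1.1, steps=50)

print("small rate ends at", small[-1], "with loss", loss(small[-1]))
print("large rate ends at", large[-1], "with loss", loss(large[-1]))
```

On this toy problem the small rate ends close to the minimum, while the large rate moves further away with each step. This is the trade-off a fixed learning rate forces on us.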