You’ve probably been hearing a lot more about Reinforcement Learning in the last few years, ever since AlphaGo, a model trained using reinforcement learning, stunned the world by beating the then-reigning world champion at the complex game of Go.
Over a series of articles, I’ll go over the basics of Reinforcement Learning (RL) and some of the most popular algorithms and deep learning architectures used to solve RL problems. We’ll focus on understanding these principles as intuitively as possible, without going too deep into mathematical theory. In this first article, I’ll introduce the fundamental concepts and terminology of RL, so that we can build solutions with them in the following articles.
Typically, when people give an overview of Machine Learning (ML), the first thing they explain is that it can be divided into two categories: Supervised Learning and Unsupervised Learning. However, there is a third category, RL, although it isn’t mentioned as often as its two more glamorous siblings.
Supervised Learning uses labeled data as input, and predicts outcomes. It receives feedback from a Loss function acting as a ‘supervisor’.
Unsupervised Learning uses unlabeled data as input and detects hidden patterns in the data such as clusters or anomalies. It receives no feedback from a supervisor.
Reinforcement Learning gathers inputs and receives feedback by interacting with the external world. It outputs the best actions that it needs to take while interacting with that world.
Rather than typical ML problems such as Classification, Regression, and Clustering, RL is most commonly used to solve a different class of real-world problems: Control or Decision tasks, where you operate a system that interacts with the real world.
It is useful for a variety of applications like:
With RL, learning happens from experience through trial and error, much as it does for a human. For example, a baby might touch fire or drink milk, and then learns from the negative or positive reinforcement that follows.
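To make the trial-and-error idea concrete, here is a minimal Python sketch of the baby example. It is only an illustration under stated assumptions, not one of the algorithms covered later in the series: the action names, the reward values (-1 for fire, +1 for milk), and the `learn_by_trial_and_error` function with its `epsilon` and `step_size` parameters are all made up for this example. The agent mostly picks whichever action currently looks best, occasionally tries a random one, and nudges its estimate of each action toward the reward it receives.

```python
import random

# Hypothetical rewards for the baby example: fire hurts, milk is pleasant.
REWARDS = {"touch_fire": -1.0, "drink_milk": 1.0}


def learn_by_trial_and_error(episodes=200, epsilon=0.1, step_size=0.1, seed=0):
    """Estimate how good each action is purely from experienced rewards."""
    rng = random.Random(seed)
    actions = list(REWARDS)
    value = {action: 0.0 for action in actions}  # current estimate per action

    for _ in range(episodes):
        # Occasionally explore a random action; otherwise exploit the best-known one.
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=value.get)

        # Feedback from the "world": negative or positive reinforcement.
        reward = REWARDS[action]

        # Move the estimate a small step toward the observed reward.
        value[action] += step_size * (reward - value[action])

    return value


values = learn_by_trial_and_error()
best_action = max(values, key=values.get)
print(values)
print("Preferred action:", best_action)
```

Nothing in the code tells the agent that fire is bad. It only finds out by trying the action and experiencing the negative reward, after which it prefers milk.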