Reinforcement Learning Made Simple: Intro to Basic Concepts and Terminology

You’ve probably started hearing a lot more about Reinforcement Learning in the last few years, ever since the AlphaGo model, which was trained using reinforcement learning, stunned the world by beating a world champion at the complex game of Go.

Over a series of articles, I’ll go over the basics of Reinforcement Learning (RL) and some of the most popular algorithms and deep learning architectures used to solve RL problems. We’ll focus on understanding these principles as intuitively as possible, without going too deep into mathematical theory. In this first article, I’ll introduce many of the fundamental concepts and terminology of RL, so that we can build solutions using them in the following articles.

Overview of RL

Where does RL fit in the world of Machine Learning?

Typically, when people provide an overview of ML, the first thing they explain is that it can be divided into two categories: Supervised Learning and Unsupervised Learning. However, there is a third category, RL, although it isn’t mentioned as often as its two more glamorous siblings.

Supervised Learning uses labeled data as input, and predicts outcomes. It receives feedback from a Loss function acting as a ‘supervisor’.

Unsupervised Learning uses unlabeled data as input and detects hidden patterns in the data such as clusters or anomalies. It receives no feedback from a supervisor.

Reinforcement Learning gathers inputs and receives feedback by interacting with the external world. It outputs the best actions that it needs to take while interacting with that world.

How is RL different from Supervised (or Unsupervised) Learning?

There is no supervisor to guide the training.

You don’t train with a large (labeled or unlabeled) pre-collected dataset. Rather, your ‘data’ is provided to you dynamically via feedback from the real-world environment with which you are interacting.

You iteratively make decisions over a sequence of time-steps. For example, in a Classification problem, you run inference once on the input data to produce an output prediction. With Reinforcement Learning, you run inference repeatedly, navigating through the real-world environment as you go.
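To make the last point concrete, here is a minimal sketch of repeated decision-making over time-steps. Everything in it is an assumption made for illustration: `CorridorEnv` is a hypothetical toy world (a short corridor with a goal at the right end), `always_right_policy` is a fixed rule rather than a trained model, and the `reset`/`step` method names are just one common way to structure such a loop. No learning happens here; the point is only that the decision function is called at every step and that each action changes what is seen next.

```python
class CorridorEnv:
    """Hypothetical toy world: positions 0..length-1, goal at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Put the agent back at the left end and return what it observes."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (-1 = left, +1 = right).

        Returns (observation, reward, done). The reward is the feedback
        from the environment: 1.0 on reaching the goal, 0.0 otherwise.
        """
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def always_right_policy(observation):
    """Stand-in for a trained model: ignores the observation, moves right."""
    return +1


def run_episode(env, policy, max_steps=20):
    """Run inference repeatedly, one decision per time-step."""
    observation = env.reset()
    history = []
    for _ in range(max_steps):
        action = policy(observation)                      # decide
        next_observation, reward, done = env.step(action)  # act, get feedback
        history.append((observation, action, reward))
        observation = next_observation
        if done:
            break
    return history


if __name__ == "__main__":
    for observation, action, reward in run_episode(CorridorEnv(), always_right_policy):
        print(f"saw position {observation}, moved {action:+d}, got reward {reward}")
```

Contrast this with a classifier, where a single call on one input produces the final prediction and nothing the model outputs affects the next input it receives.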

What problems is RL used to solve?

Rather than the typical ML problems such as Classification, Regression, Clustering and so on, RL is most commonly used to solve a different class of real-world problems, such as a Control task or Decision task, where you operate a system that interacts with the real world.

For example, a robot or drone that has to learn to pick a device from one box and put it in a container.

RL is useful for a variety of applications, such as:

Operating a drone or autonomous vehicle

Manipulating a robot to navigate the environment and perform various tasks

Managing an investment portfolio and taking trading decisions

Playing games such as Go, Chess, video games

Reinforcement Learning happens through trial and error

With RL, learning happens from experience by trial and error, much as it does for a human. For example, a baby can touch fire or milk and then learn from the negative or positive reinforcement.

The baby takes some action

Receives feedback from the environment about the result of that action

Repeats this process till it learns which actions produce favorable results and which actions produce unfavorable results.
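The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an algorithm taken from this article: the reward values (-1 for fire, +1 for milk), the function names, and the choice of a simple "mostly pick the best-looking action, occasionally try a random one" rule (often called epsilon-greedy) with running-average estimates are all picked here to keep the example small.

```python
import random

# Hypothetical feedback values for the illustration:
# touching fire hurts (negative), touching milk is pleasant (positive).
REWARDS = {"touch_fire": -1.0, "touch_milk": 1.0}


def environment_feedback(action):
    """Return the environment's feedback (reward) for the chosen action."""
    return REWARDS[action]


def learn_by_trial_and_error(num_steps=200, explore_prob=0.1, seed=0):
    """Repeat act -> receive feedback -> update, and return what was learned."""
    rng = random.Random(seed)
    actions = list(REWARDS)
    value = {a: 0.0 for a in actions}  # running average feedback per action
    count = {a: 0 for a in actions}    # how many times each action was tried

    for _ in range(num_steps):
        # 1. Take some action: usually the one that looks best so far,
        #    but sometimes a random one so every action gets tried.
        if rng.random() < explore_prob:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: value[a])

        # 2. Receive feedback from the environment.
        reward = environment_feedback(action)

        # 3. Update the running average for that action.
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]

    return value, count


if __name__ == "__main__":
    value, count = learn_by_trial_and_error()
    for action in value:
        print(f"{action}: estimated value {value[action]:+.1f}, tried {count[action]} times")
```

After enough repetitions the estimates separate the favorable action from the unfavorable one, and the favorable action ends up being chosen far more often.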
