Introduction to Reinforcement Learning: the Frozen Lake Example

In this post, we introduce the basic concepts of reinforcement learning using the Frozen Lake game as an example.

Let's understand how reinforcement learning works through a simple example: a game called Frozen Lake. Suppose you were playing frisbee with your friends in a park during winter. One of you threw the frisbee so far that it landed on a frozen lake. Your mission is to walk across the frozen lake to get the frisbee back, taking care not to fall into a hole of freezing water.

We could easily create a bot that always wins this game by writing a simple algorithm that gives the right directions to reach the frisbee. But that's not challenging or fun at all. Instead, we want to create an agent that can learn the path to the frisbee by playing the game multiple times.

In the reinforcement learning paradigm, the learning process is a loop in which the agent reads the state of the environment and then executes an action. The environment then returns its new state and a reward signal, indicating whether the action was good or bad. The process continues until the environment reaches a terminal state or a maximum number of iterations is reached.
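As a minimal sketch of this loop, the snippet below runs a random agent in the FrozenLake-v1 environment from the gymnasium package (an assumption: the package is installed, and any environment exposing reset() and step() would do). Note that Gymnasium's built-in Frozen Lake encodes the state as a single integer and returns 0 for falling into a hole, which differs slightly from the (i, j) coordinates and -1 penalty used later in this post.

```python
import gymnasium as gym  # assumption: pip install gymnasium

# is_slippery=False makes the moves deterministic, matching the description below.
env = gym.make("FrozenLake-v1", is_slippery=False)

state, info = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # a random agent, just to illustrate the loop
    state, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated      # terminal state reached or step limit exceeded
    print(f"state={state}, reward={reward}, done={done}")
env.close()
```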

Therefore, the first step to train our agent to play the Frozen Lake game is to model each aspect of the game as components of a reinforcement learning problem.

The Environment

The environment is a representation of the context that our agent will interact with. It can represent an aspect of the real world, like the stock market or a street, or it can be a completely virtual environment, like a game. In either case, the environment defines the states and rewards the agent can receive, as well as the possible actions the agent can execute in each state. In our case, the environment is the Frozen Lake game, which consists of a grid of squares that can be of two types:

  • Grey squares, representing a safe thick layer of ice that you can walk over.
  • Blue squares, representing holes in the ice.

The hero can move in four directions (up, down, left, right) inside the grid. The hero wins the game by reaching the frisbee and loses by falling into a hole.

State

A state is an observation that the agent receives from the environment; it is the way the agent gets all available information about the environment. In our example, the state of the game is simply the position of the hero in the grid, represented by a pair of coordinates (i, j).

Actions

Actions are performed by the agent and may change the state of the environment. All the rules of how an action changes the state of the environment are internal to the environment. For a given state, the agent can choose its next action, but it does not have any control over how this action will affect the environment. In the Frozen Lake example, the actions available to the agent are the four directions in which our hero can move: up, right, down and left.

Rewards

Rewards signal to the agent whether an action was good or bad. In our example, the environment returns +1 when the hero reaches the frisbee and -1 if the hero falls into a hole. Every other case is considered neutral, so the environment returns 0 (zero).
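To make the modelling concrete, here is a minimal, illustrative sketch of the Frozen Lake environment as described above: states are (i, j) coordinates, actions are the four directions, and rewards are +1 for the frisbee, -1 for a hole and 0 otherwise. The 4x4 layout, the class name and the method names are assumptions made for this sketch, not code from the original game.

```python
# Illustrative sketch of the Frozen Lake environment described above.
# 'S' = start, 'F' = frozen (safe), 'H' = hole, 'G' = goal (the frisbee).
GRID = [
    "SFFF",
    "FHFH",
    "FFFH",
    "HFFG",
]

ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

class FrozenLakeEnv:
    def reset(self):
        self.state = (0, 0)                      # the hero starts at the top-left corner
        return self.state

    def step(self, action):
        di, dj = ACTIONS[action]
        i = min(max(self.state[0] + di, 0), 3)   # stay inside the 4x4 grid
        j = min(max(self.state[1] + dj, 0), 3)
        self.state = (i, j)
        square = GRID[i][j]
        if square == "G":                        # reached the frisbee: +1, game over
            return self.state, +1, True
        if square == "H":                        # fell into a hole: -1, game over
            return self.state, -1, True
        return self.state, 0, False              # neutral move

env = FrozenLakeEnv()
env.reset()
print(env.step("right"))  # ((0, 1), 0, False)
print(env.step("down"))   # ((1, 1), -1, True) -- the hole described in the next section
```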

The Gameplay

The game starts with the hero at the position (0, 0). From this state, there are two possible actions: moving right or moving down. If the agent chooses to move to the right, the new state will be (0, 1) and the agent will receive a reward of 0 (zero), because it is a neutral move. From the position (0, 1) there are three possible moves: left, right and down. Supposing the agent chooses to move down, the new state will be the position (1, 1). However, this position is a hole in the ice, which means the agent lost the game. So the agent receives a -1 penalty and the game is over. This sequence is shown in the table below:

Current state | Action | Next state | Reward | Game over?
(0, 0)        | ➡️     | (0, 1)     | 0      | No
(0, 1)        | ⬇️     | (1, 1)     | -1     | Yes

In another possible sequence, from the state (0, 1) the hero moves to the right and reaches the state (0, 2). From there, the hero keeps moving down until it reaches the position (3, 2), and finally it moves to the right, reaching the frisbee at the position (3, 3) and receiving a +1 reward.

Current state | Action | Next state | Reward | Game over?
(0, 0)        | ➡️     | (0, 1)     | 0      | No
(0, 1)        | ➡️     | (0, 2)     | 0      | No
(0, 2)        | ⬇️     | (1, 2)     | 0      | No
(1, 2)        | ⬇️     | (2, 2)     | 0      | No
(2, 2)        | ⬇️     | (3, 2)     | 0      | No
(3, 2)        | ➡️     | (3, 3)     | +1     | Yes

Conclusion

Using a simple example, we learned the fundamental concepts of the reinforcement learning paradigm. Whenever we want to apply the reinforcement learning approach to a problem, we have to delimit the environment, identify the states and the possible actions, and set the appropriate rewards.


Artificial Intelligence vs. Machine Learning vs. Deep Learning

In this article, we are going to discuss the difference between Artificial Intelligence, Machine Learning, and Deep Learning.

Furthermore, we will address the question of why Deep Learning, as a young, emerging field, is far superior to traditional Machine Learning.

Artificial Intelligence, Machine Learning, and Deep Learning are popular buzzwords that everyone seems to use nowadays.

But still, there is a big misconception among many people about the meaning of these terms.

In the worst case, one may think that these terms describe the same thing — which is simply false.

A large number of companies nowadays claim to incorporate some kind of "Artificial Intelligence" (AI) in their applications or services.

But artificial intelligence is only a broader term that describes applications in which a machine mimics "cognitive" functions that humans associate with other human minds, such as "learning" and "problem-solving".

On a lower level, an AI can be only a programmed rule that tells the machine to behave in a certain way in certain situations. So basically, Artificial Intelligence can be nothing more than just a bunch of if-else statements.

An if-else statement is a simple rule explicitly programmed by a human. Consider a very abstract, simple example of a robot that is moving on a road. A possible programmed rule for that robot could look as follows:
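As a minimal sketch of such a hard-coded rule (the sensor reading and the thresholds below are invented for illustration, since the original rule is not reproduced here):

```python
# A hand-written rule: the behaviour is fixed by the programmer, nothing is learned.
# distance_to_obstacle is a hypothetical sensor reading, in meters.
def decide(distance_to_obstacle: float) -> str:
    if distance_to_obstacle < 1.0:
        return "stop"
    elif distance_to_obstacle < 5.0:
        return "slow down"
    else:
        return "keep driving"

print(decide(0.5))   # stop
print(decide(3.0))   # slow down
print(decide(10.0))  # keep driving
```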

Instead, when speaking of Artificial Intelligence, it is only worthwhile to consider two different approaches: Machine Learning and Deep Learning. Both are subfields of Artificial Intelligence.

Machine Learning vs Deep Learning

Now that we better understand what Artificial Intelligence means, we can take a closer look at Machine Learning and Deep Learning and draw a clearer distinction between the two.

Machine Learning incorporates "classical" algorithms for various kinds of tasks such as clustering, regression or classification. Machine Learning algorithms must be trained on data: the more data you provide to your algorithm, the better it gets.

The “training” part of a Machine Learning model means that this model tries to optimize along a certain dimension. In other words, the Machine Learning models try to minimize the error between their predictions and the actual ground truth values.

For this, we must define a so-called error function, also called a loss function or an objective function, because after all the model has an objective. This objective could be, for example, the classification of data into different categories (e.g. pictures of cats and dogs) or the prediction of the expected price of a stock in the near future.

When someone says they are working with a machine-learning algorithm, you can get to the gist of its value by asking: What’s the objective function?

At this point, you may ask: How do we minimize the error?

One way would be to compare the prediction of the model with the ground truth value and adjust the parameters of the model in a way so that next time, the error between these two values is smaller. This is repeated again and again and again.

This happens thousands and millions of times, until the parameters of the model that determine the predictions are so good that the difference between the predictions of the model and the ground truth labels is as small as possible.

In short, machine learning models are optimization algorithms. If you tune them right, they minimize their error by guessing and guessing and guessing again.
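As a minimal sketch of this guess-and-adjust loop (a made-up one-parameter example, not taken from the article), here is plain gradient descent minimizing a mean squared error:

```python
# Toy data: y is roughly 3 * x, and we want the model to discover the factor 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 5.9, 9.2, 11.8]

w = 0.0                # the single model parameter, starting from a bad guess
learning_rate = 0.01

for step in range(1000):
    # Mean squared error between predictions (w * x) and the ground truth values.
    error = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    # Gradient of the error with respect to w, used to adjust the parameter.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad

print(round(w, 2))     # ~2.99, close to the true factor
```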

Machine Learning is old…

The basic definition of machine learning is:

Algorithms that analyze data, learn from it and make informed decisions based on the learned insights.

Machine learning leads to a variety of automated tasks. It affects virtually every industry — from malware detection in IT security, to weather forecasting, to stockbrokers looking for cheap trades. Machine learning requires complex math and a lot of coding to finally get the desired functions and results.

Machine learning algorithms need to be trained on large amounts of data.
The more data you provide for your algorithm, the better it gets.

Machine Learning is a pretty old field and incorporates methods and algorithms that have been around for decades, some of them dating back as early as the sixties.

These classic algorithms include the so-called Naive Bayes classifier and Support Vector Machines; both are often used for the classification of data.

In addition to classification, there are also cluster analysis algorithms such as the well-known k-means and tree-based (hierarchical) clustering. To reduce the dimensionality of data and gain more insight into its nature, machine learning uses methods such as Principal Component Analysis (PCA) and t-SNE.
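As an illustrative sketch (the article names no library; scikit-learn is assumed here), the snippet below runs these classic algorithms on the small Iris dataset: Naive Bayes and an SVM for classification, k-means for clustering, and PCA for dimensionality reduction.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Classification with two "classic" algorithms.
print("Naive Bayes:", GaussianNB().fit(X_train, y_train).score(X_test, y_test))
print("SVM:        ", SVC().fit(X_train, y_train).score(X_test, y_test))

# Cluster analysis: group the samples into 3 clusters without using the labels.
clusters = KMeans(n_clusters=3, n_init=10).fit_predict(X)

# Dimensionality reduction: compress the 4 features down to 2 components.
X_2d = PCA(n_components=2).fit_transform(X)
print(X_2d.shape)  # (150, 2)
```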

Deep Learning — The next big Thing

Now let's focus on the essential thing at stake here: deep learning.
Deep Learning is a very young field of artificial intelligence based on artificial neural networks.

Again, deep learning can be seen as a part of machine learning, because deep learning algorithms also need data in order to learn how to solve problems. For this reason, the terms machine learning and deep learning are often used interchangeably. However, these systems have different capabilities.

Deep Learning uses a multi-layered structure of algorithms called an artificial neural network:
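Here is a minimal sketch of such a multi-layered network in Keras (an assumed library choice; the layer sizes, input dimension and number of output classes are arbitrary):

```python
# A minimal multi-layer neural network sketch (all sizes are illustrative).
# Assumes TensorFlow is installed: pip install tensorflow
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),               # raw input features
    tf.keras.layers.Dense(16, activation="relu"),    # first hidden layer
    tf.keras.layers.Dense(16, activation="relu"),    # second hidden layer
    tf.keras.layers.Dense(3, activation="softmax"),  # output layer: 3 classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```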

It can be viewed again as a subfield of Machine Learning, since Deep Learning algorithms also require data in order to learn to solve tasks. However, although methods of Deep Learning are able to perform the same tasks as classic Machine Learning algorithms, the reverse is not true.

Artificial neural networks have unique capabilities that enable Deep Learning models to solve tasks that Machine Learning models could never solve.

All recent advances in artificial intelligence are due to Deep Learning. Without Deep Learning, we would not have self-driving cars, chatbots or personal assistants like Alexa and Siri. The Google Translate app would remain primitive, and Netflix would have no idea which movies or TV series we like or dislike.

We can even go so far as to say that the new industrial revolution is driven by artificial neural networks and Deep Learning. This is the best and closest approach to true machine intelligence we have so far. The reason is that Deep Learning has two major advantages over Machine Learning.

Why is Deep Learning better than Machine Learning?

Feature Extraction

The first advantage of Deep Learning over machine learning is that it removes the need for the so-called Feature Extraction step.

Long before deep learning was used, traditional machine learning methods were popular, such as Decision Trees, SVM, Naïve Bayes Classifier and Logistic Regression. These algorithms are also called “flat algorithms”.

Flat means here that these algorithms cannot normally be applied directly to raw data (such as .csv files, images or text). They require a preprocessing step called Feature Extraction.

The result of Feature Extraction is an abstract representation of the given raw data that can now be used by these classic machine learning algorithms to perform a task. For example, the classification of the data into several categories or classes.

Feature Extraction is usually pretty complicated and requires detailed knowledge of the problem domain. This step must be adapted, tested and refined over several iterations for optimal results.

On the other side are artificial neural networks, which do not require a feature extraction step: their layers are able to learn an implicit representation of the raw data directly on their own.

Here, a more and more abstract and compressed representation of the raw data is produced over several layers of an artificial neural network. This compressed representation of the input data is then used to produce the result. The result can be, for example, the classification of the input data into different classes.

In other words, we can also say that the feature extraction step is already part of the process that takes place in an artificial neural network. During the training process, this step is optimized by the neural network as well, in order to obtain the best possible abstract representation of the input data. This means that deep learning models require little to no manual effort to perform and optimize the feature extraction process.

For example, if you want to use a machine learning model to determine whether a particular image shows a car or not, we humans first need to identify the unique features of a car (shape, size, windows, wheels, etc.), extract these features and give them to the algorithm as input data. This way, the machine learning algorithm would perform a classification of the image. That is, in machine learning, a programmer must intervene directly in the classification process.

In the case of a deep learning model, the feature extraction step is completely unnecessary. The model would recognize these unique characteristics of a car and make correct predictions, completely without the help of a human.

In fact, this applies to every other task you'll ever do with neural networks.
You just give the raw data to the neural network, and the rest is done by the model.
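The contrast can be sketched in code (an illustration only: the hand-crafted features, the random stand-in data and the tiny network below are all invented for this example): a classic pipeline first computes features chosen by a human and feeds them to a flat algorithm, while a neural network consumes the raw pixels directly.

```python
# Illustrative contrast only: random arrays stand in for real images and labels.
import numpy as np
from sklearn.svm import SVC
import tensorflow as tf

images = np.random.rand(200, 32, 32, 3)        # 200 fake 32x32 RGB "images"
labels = np.random.randint(0, 2, size=200)     # fake car / not-car labels

# Classic machine learning: hand-crafted features chosen by a human ...
def extract_features(img):
    return [img.mean(), img.std(), img[:, :, 0].mean()]  # e.g. brightness, contrast, redness

features = np.array([extract_features(img) for img in images])
svm = SVC().fit(features, labels)              # ... then a flat algorithm on those features

# Deep learning: the network receives the raw pixels and learns its own representation.
net = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
net.compile(optimizer="adam", loss="binary_crossentropy")
net.fit(images, labels, epochs=1, verbose=0)
```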

The Era of Big Data…

The second huge advantage of Deep Learning, and a key part of understanding why it is becoming so popular, is that it is powered by massive amounts of data. The "Big Data Era" of technology will provide huge amounts of opportunities for new innovations in deep learning. Andrew Ng, the chief scientist of China's major search engine Baidu and one of the leaders of the Google Brain Project, compared deep learning to a rocket.

In this analogy, the rocket engine is the deep learning models and the fuel is the huge amount of data we can feed to these algorithms.

Deep Learning models scale better with a larger amount of data

Deep Learning models tend to increase their accuracy with an increasing amount of training data, whereas traditional machine learning models such as SVM and the Naive Bayes classifier stop improving after a saturation point.

Special Announcement: We just released a free Course on Deep Learning!

I am the founder of DeepLearning Academy, an advanced Deep Learning education platform. We provide practical state-of-the-art Deep Learning education and mentoring to professionals and beginners.

Among other things, we just released a free Introductory Course on Deep Learning with TensorFlow, where you can learn how to implement Neural Networks from Scratch for various use-cases using TensorFlow.

If you are interested in this topic, feel free to check it out ;)
