Deep Q-Networks have revolutionized the field of Deep Reinforcement Learning, but the technical prerequisites for easy experimentation have barred newcomers until now.
In this post, we will investigate how easily we can train a Deep Q-Network (DQN) agent (Mnih et al., 2015) for Atari 2600 games using the Google reinforcement learning library Dopamine. While many RL libraries exist, this library is specifically designed with four essential features in mind:
We believe these principles makes __Dopamine __one of the _**_best RL learning environment available today_. Additionally, we even got the library to work on Windows, which we think is _quite a feat**!
In my view, the visualization of any trained RL agent is an absolute must in reinforcement learning! Therefore, we will (of course) include this for our own trained agent at the very end!
We will go through all the pieces of code required (which is** minimal compared to other libraries**), but you can also find all scripts needed in the following Github repo.
The general premise of deep reinforcement learning is to
“derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations.”
- Mnih et al. (2015)
As stated earlier, we will implement the DQN model by Deepmind, which only uses raw pixels and game score as input. The raw pixels are processed using convolutional neural networks similar to image classification. The primary difference lies in the objective function, which for the DQN agent is called the optimal action-value function
where_ rₜ is the maximum sum of rewards at time _t discounted by γ, obtained using a behavior policy_ π = P(a_∣s) for each observation-action pair.
There are relatively many details to Deep Q-Learning, such as Experience Replay (Lin, 1993) and an _iterative update rule. _Thus, we refer the reader to the original paper for an excellent walk-through of the mathematical details.
One key benefit of DQN compared to previous approaches at the time (2015) was the ability to outperform existing methods for Atari 2600 games using the same set of hyperparameters and only pixel values and game score as input, clearly a tremendous achievement.
This post does not include instructions for installing Tensorflow, but we do want to stress that you can use both the CPU and GPU versions.
Nevertheless, assuming you are using
Python 3.7.x, these are the libraries you need to install (which can all be installed via
tensorflow-gpu=1.15 (or tensorflow==1.15 for CPU version) cmake dopamine-rl atari-py matplotlib pygame seaborn pandas
Introduction to Q-Learning from scratch, we’ll illustrate how this technique works by introducing a robot example where a reinforcement learning agent tries to maximize points. So, let’s get to it!
Designing user experiences is a difficult art. Compared to other applications, video games provide designers a huge canvas to work with.
This paper presents a deep reinforcement learning model that learns control policies directly from high-dimensional sensory inputs.
Dummies guide to Reinforcement learning, Q learning, Bellman Equation. You’re getting bore stuck in lockdown, you decided to play computer games to pass your time.
Paper Summary: Discovering Reinforcement Learning Agents. Learning the Update Rule through Meta-Learning. One exception to this rule can be found in the field of meta-learning.