Deep Deterministic Policy Gradients (DDPG)

Deep Deterministic Policy Gradients (DDPG) is an actor-critic algorithm designed for environments with continuous action spaces. That makes it a good fit for fields like robotics, where controllers apply continuous voltages to electric motors. You’ll get a crash course in a quick lecture, followed by a live coding tutorial.

Despite being an actor-critic method, DDPG borrows a number of innovations from deep Q-learning. We’re going to use a replay memory for training our agent, as well as target actor and target critic networks for learning stability. One key difference is that DDPG uses a soft update rule for the target network parameters, which moves them a small step toward the online parameters on each update, rather than making a direct hard copy of the online networks.
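To make the two ideas above concrete, here is a minimal sketch in plain Python rather than the TensorFlow 2 code from the video. The names ReplayBuffer, soft_update, capacity and tau are illustrative assumptions, and the weights are simple lists of floats standing in for real network parameters.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once it is full.
        self.memory = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, so a batch mixes
        # transitions from different points in time.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def soft_update(target_weights, online_weights, tau):
    """Blend weights: new_target = tau * online + (1 - tau) * target.

    tau is an assumed hyperparameter name; a small value means the
    target network trails the online network slowly.
    """
    return [tau * w + (1.0 - tau) * t
            for w, t in zip(online_weights, target_weights)]
```

With tau = 1.0 the rule reduces to the hard copy used in deep Q-learning. In a Keras setup, one plausible way to apply the same rule is to pass the arrays from get_weights() through this function and hand the result to set_weights() on the target model, though the video may do it differently.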

In this tutorial we’re going to use TensorFlow 2 to implement a deep deterministic policy gradient agent in the Pendulum environment from the OpenAI Gym.

https://youtu.be/4jh32CvwKYw

#python #deep-learning #artificial-intelligence #tensorflow #machine-learning
