Policy gradient methods are a popular family of reinforcement learning (RL) algorithms. They are useful because they model the policy directly, and they work in both discrete and continuous action spaces. In this article, we will:
I assume readers understand the basics of reinforcement learning. For a refresher, take a quick look at the first section of my previous post, A Structural Overview of Reinforcement Learning Algorithms.
I previously implemented a Deep Q-Network (DQN) in TensorFlow to play CartPole. Check it out here if you are interested. :)