Q-Learning and difficulties with continuous action space

Value-based methods like DQN have achieved remarkable breakthroughs in the domain of Reinforcement Learning. However, their success is largely limited to problems with discrete action spaces, such as Atari games.

In this kind of problem, the agent chooses from a discrete set of possible actions, where each action is either taken or not taken. This limits the scope of applicability: Q-learning selects actions by maximizing Q(s, a) over all actions, which is only straightforward when the action set is finite, while a wide range of problems, arguably the majority, involve continuous action spaces.
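To make this concrete, here is a minimal sketch (plain NumPy with made-up Q-values, so purely illustrative) of how a DQN-style agent picks its action: it enumerates the Q-value of every discrete action and takes the argmax, which presupposes a finite action set.

```python
import numpy as np

# Illustrative only: for a discrete action space, the Q-function
# boils down to one value per action for the current state s.
n_actions = 2                     # e.g. 0 = "move left", 1 = "move right"
q_values = np.array([0.3, 1.2])   # made-up Q(s, a) values

# Greedy action selection: enumerate all actions and take the argmax.
best_action = int(np.argmax(q_values))
print(best_action)  # -> 1 ("move right")

# With a continuous action (e.g. a steering angle) there is no finite
# list of Q-values to enumerate, so this simple argmax no longer works.
```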

So what do these continuous action space problems look like?

Let's say our agent has to steer a car. With a discrete action space, the agent would have only two options: move right or move left.

You can imagine that steering a car with only the options “move right” and “move left” would most likely result in a strong zigzag trajectory.

However, if the agent could choose a specific steering angle, say in a range from 0° to 180°, it would produce a much smoother trajectory. This is why most real-world applications use continuous action spaces, be it for a steering angle, the control of a robot's joints, or setting the current of an electric motor.
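As an illustration, the two steering setups can be written down as action spaces. The sketch below assumes the Gymnasium library and is only meant to show the difference in how the actions are represented.

```python
import numpy as np
from gymnasium import spaces  # assumes Gymnasium is installed

# Discrete steering: the agent can only pick one of two actions.
discrete_steering = spaces.Discrete(2)  # 0 = "move left", 1 = "move right"

# Continuous steering: any angle between 0° and 180° is a valid action.
continuous_steering = spaces.Box(low=0.0, high=180.0, shape=(1,), dtype=np.float32)

print(discrete_steering.sample())    # e.g. 1
print(continuous_steering.sample())  # e.g. [97.3]
```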
