Introduction

Deep Reinforcement Learning (Deep RL) has gained considerable traction in recent years. After several landmark successes[6], it is tempting to think that we can simply instantiate the current state-of-the-art algorithm, apply it to a problem, and have it just work. It often doesn't.


Andrychowicz et al. explored one such class of failure cases in their seminal 2017 work on the Hindsight Experience Replay (HER)[1] algorithm: problems where an agent needs to reach a goal and only receives a reward on success.
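To make "only receives a reward on success" concrete, here is a minimal sketch of what such a reward signal might look like. The function name `sparse_reward`, the distance tolerance, and the 1.0/0.0 reward values are assumptions chosen for illustration; they are not taken from the HER paper or from any particular library.

```python
import math


def sparse_reward(achieved, goal, tolerance=0.05):
    """Hypothetical sparse reward: 1.0 only when the goal is reached.

    `achieved` and `goal` are coordinate tuples of equal length.
    The tolerance and the reward values are illustrative assumptions.
    """
    return 1.0 if math.dist(achieved, goal) <= tolerance else 0.0


goal = (1.0, 1.0)
far_miss = sparse_reward((0.0, 0.0), goal)
near_miss = sparse_reward((0.9, 0.9), goal)
success = sparse_reward((0.99, 1.0), goal)
print(far_miss, near_miss, success)
```

Note that the near miss and the far miss receive exactly the same reward, so the signal by itself does not tell the agent whether it is getting closer to the goal.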

This article discusses why this class of problems is particularly difficult, how the HER algorithm works, which aspects of the problem it alleviates, which aspects it doesn't address, and how we can go further to improve performance on those.


Accelerating Information Propagation in Hindsight Experience Replay