The Soft Actor Critic algorithm is a powerful tool for solving cutting-edge deep reinforcement learning problems in environments with continuous action spaces. It’s a variation of the actor-critic method that leverages a maximum entropy framework, double Q networks, and target value networks.
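As a rough sketch of that architecture, here is how those networks might be set up in TensorFlow 2. The layer sizes, state/action dimensions, and helper names below are illustrative assumptions, not the article's exact code:

```python
# A minimal sketch of the networks SAC maintains, assuming simple
# two-layer MLPs; sizes and names here are hypothetical.
from tensorflow import keras

def make_critic(state_dim, action_dim, hidden=256):
    # Q-network: maps a (state, action) pair to a scalar value.
    state = keras.Input(shape=(state_dim,))
    action = keras.Input(shape=(action_dim,))
    x = keras.layers.Concatenate()([state, action])
    x = keras.layers.Dense(hidden, activation="relu")(x)
    x = keras.layers.Dense(hidden, activation="relu")(x)
    q = keras.layers.Dense(1)(x)
    return keras.Model([state, action], q)

def make_value(state_dim, hidden=256):
    # State-value network; a slowly updated copy serves as the target network.
    state = keras.Input(shape=(state_dim,))
    x = keras.layers.Dense(hidden, activation="relu")(state)
    x = keras.layers.Dense(hidden, activation="relu")(x)
    v = keras.layers.Dense(1)(x)
    return keras.Model(state, v)

# Double Q networks plus value and target value networks
# (dimensions 8 and 2 are placeholders for a real environment's shapes).
q1, q2 = make_critic(8, 2), make_critic(8, 2)
value, target_value = make_value(8), make_value(8)
target_value.set_weights(value.get_weights())
```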

The entropy is controlled by scaling the reward, with an inverse relationship between the reward scale and the entropy of our agent. A larger reward scale means more deterministic behavior, while a smaller reward scale means more stochastic behavior.
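Concretely, the reward scale typically enters the critic's TD target as a multiplier on the reward. A minimal sketch, with hypothetical names and hyperparameter values:

```python
# Illustrative only: how the reward scale enters the critic's TD target.
# reward_scale and gamma are hyperparameters; the values here are assumptions.

def q_target(reward, done, v_next, reward_scale=2.0, gamma=0.99):
    # Scaled reward plus the discounted target value of the next state;
    # done masks out the bootstrap term at episode boundaries.
    return reward_scale * reward + gamma * v_next * (1.0 - done)

print(q_target(reward=1.0, done=0.0, v_next=5.0))  # 6.95
```

Cranking `reward_scale` up weights the reward term relative to the entropy bonus in the objective, which is why the agent behaves more deterministically.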

We’re going to implement this algorithm using the TensorFlow 2 framework and test it out on the Inverted Pendulum environment found in the PyBullet package.
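For reference, instantiating that environment usually looks like this, assuming the classic gym API and the pybullet_envs registration (exact package versions may change the details):

```python
# A quick way to spin up the Bullet inverted pendulum environment.
import gym
import pybullet_envs  # registers the Bullet environments with gym

env = gym.make('InvertedPendulumBulletEnv-v0')
obs = env.reset()
print(env.observation_space.shape, env.action_space.shape)
```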

https://youtu.be/YKhkTOU0l20

#deep-learning #machine-learning #artificial-intelligence #python #reinforcement-learning #data-science

Soft Actor Critic (SAC) in TensorFlow 2