DeepMind, in collaboration with Columbia University, recently proposed Taylor Expansion Policy Optimisation (TayPO), a policy optimisation formalism that generalises methods such as trust region policy optimisation (TRPO) and improves the performance of several state-of-the-art distributed algorithms.

Read more: https://analyticsindiamag.com/deepmind-introduces-taypo-a-policy-optimisation-framework-for-rl-algorithm/

#deepmind #algorithm #reinforcementlearning

DeepMind Introduces TayPO, A PO Framework For RL Algorithm