TensorFlow implementation of TRPO (Trust Region Policy Optimization) with GAE (Generalized Advantage Estimator) on MuJoCo
yjhong89 / trpo-gae
Trust Region Policy Optimization with Generalized Advantage Estimator