Reinforcement learning control of a biped robot
Based on the biped's kinematics and dynamics and on conservation of angular momentum, an impact map is computed to model the switching of the stance leg at each step.
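For reference, one common way to write such an impact map is the rigid, instantaneous impact model used for planar point-foot bipeds. The block below is a sketch under that assumption and is not necessarily the formulation used in this project. The symbols are assumptions: D_e is the mass matrix in extended (floating-base) coordinates q_e, E is the Jacobian of the swing-foot position p_2, F is the impulsive ground reaction, and R is the relabeling matrix that swaps the roles of the legs.

```latex
% Impulse-momentum balance over the (instantaneous) impact,
% with the new stance foot at rest afterwards:
\begin{aligned}
D_e(q_e^-)\,\bigl(\dot q_e^{+} - \dot q_e^{-}\bigr) &= E(q_e^-)^{\top} F,
\qquad E(q_e) = \frac{\partial p_2}{\partial q_e},\\
E(q_e^-)\,\dot q_e^{+} &= 0 .
\end{aligned}

% Stacked as one linear system in the unknowns \dot q_e^{+} and F:
\begin{bmatrix} D_e(q_e^-) & -E(q_e^-)^{\top} \\ E(q_e^-) & 0 \end{bmatrix}
\begin{bmatrix} \dot q_e^{+} \\ F \end{bmatrix}
=
\begin{bmatrix} D_e(q_e^-)\,\dot q_e^{-} \\ 0 \end{bmatrix}

% Configuration is unchanged by the impact; only the leg labels swap:
q^{+} = R\,q^{-}, \qquad x^{+} = \Delta(x^{-})
```

Because the impulse F acts only at the contact point, this balance is equivalent to conserving angular momentum about that point across the impact.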
A PD controller is implemented for this under-actuated system.
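As an illustration of the PD law, here is a minimal sketch on a single actuated joint. The plant (a frictionless joint with unit inertia), the gains and the function names are all assumptions chosen for the example; they are not taken from this project. In an under-actuated biped the law would apply only to the actuated joints.

```python
def pd_torque(q, dq, q_ref, dq_ref, kp, kd):
    """PD law: torque from position error plus velocity error."""
    return kp * (q_ref - q) + kd * (dq_ref - dq)


def simulate_joint(q_ref, kp=100.0, kd=20.0, inertia=1.0, dt=1e-3, t_end=3.0):
    """Drive a single frictionless joint (hypothetical plant) towards q_ref."""
    q, dq = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        u = pd_torque(q, dq, q_ref, 0.0, kp, kd)
        ddq = u / inertia
        dq += ddq * dt  # semi-implicit Euler: update velocity first
        q += dq * dt    # then position, using the new velocity
    return q


final_angle = simulate_joint(q_ref=0.5)
print(f"final joint angle: {final_angle:.4f} rad")
```

With unit inertia, kp = 100 and kd = 20 give a critically damped response, which is why these values were picked for the sketch.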
Gait metrics are then introduced, and an unconstrained optimization is carried out to determine the optimal controller parameters.
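The idea of tuning parameters by unconstrained optimization can be sketched with a simple derivative-free compass search. The cost function below is a hypothetical stand-in with a known minimum, not the gait metric used here, and compass_search is an illustrative helper rather than the optimizer used in this project.

```python
def compass_search(cost, x0, step=0.5, tol=1e-6, max_iter=10_000):
    """Derivative-free unconstrained minimization: poll +/- step on each axis."""
    x = list(x0)
    fx = cost(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                f_trial = cost(trial)
                if f_trial < fx:
                    x, fx = trial, f_trial
                    improved = True
        if not improved:
            step *= 0.5  # shrink the pattern when no neighbour improves
    return x, fx


def gait_cost(params):
    """Hypothetical stand-in for a gait metric (assumed, minimum at kp=3, kd=1)."""
    kp, kd = params
    return (kp - 3.0) ** 2 + 2.0 * (kd - 1.0) ** 2 + 0.5


best_params, best_cost = compass_search(gait_cost, x0=[0.0, 0.0])
print(best_params, best_cost)
```

In practice the cost would be evaluated by simulating one or more steps of the biped with the candidate parameters, which makes each evaluation far more expensive than this toy function.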
The Reinforcement Learning Toolbox is used to implement the environment and the agent, and a reward function is designed to guide training.
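To make the reward design concrete, here is a minimal sketch of a per-step reward of the kind often used for walking tasks. Every term and weight is an assumption for illustration (forward progress, torso pitch penalty, effort penalty, alive bonus, fall penalty); it is not the reward function used in this project.

```python
def step_reward(forward_velocity, torso_pitch, torques, fallen,
                w_vel=1.0, w_pitch=2.0, w_effort=0.01,
                alive_bonus=0.1, fall_penalty=100.0):
    """Hypothetical per-step reward for a walking biped (all weights assumed)."""
    if fallen:
        return -fall_penalty  # a fall yields a large negative reward
    effort = sum(u * u for u in torques)
    return (w_vel * forward_velocity       # encourage forward progress
            - w_pitch * torso_pitch ** 2   # keep the torso upright
            - w_effort * effort            # discourage large torques
            + alive_bonus)                 # small bonus for each step survived


upright = step_reward(0.8, 0.05, [1.0, -2.0], fallen=False)
fell = step_reward(0.8, 0.05, [1.0, -2.0], fallen=True)
print(upright, fell)
```

The relative size of the weights decides which behaviour the agent favours, which is one reason reward design tends to need several iterations.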
The episode reward is monitored during training.
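Episode rewards are usually noisy, so training curves are often read through a running average. The sketch below shows one way to compute the per-episode return and a trailing average; the window length and the sample rewards are made up for the example.

```python
def episode_return(step_rewards):
    """Undiscounted sum of the rewards collected in one episode."""
    return sum(step_rewards)


def moving_average(values, window):
    """Trailing average over up to `window` most recent values."""
    averaged = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        averaged.append(running / min(i + 1, window))
    return averaged


# Made-up step rewards for four short episodes.
episodes = [[1.0, 2.0], [3.0], [0.5, 0.5, 2.0], [6.0]]
episode_rewards = [episode_return(r) for r in episodes]
smoothed = moving_average(episode_rewards, window=2)
print(episode_rewards, smoothed)
```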
From the training results, it can be seen that:
Pros: can achieve better performance than the other controllers, thanks to learning
Cons: difficult to find a good reward function; training is very slow
Bonus: fairly robust against perturbations