diegoferigo / phd-thesis

Simulation Architectures for Reinforcement Learning applied to Robotics

Home Page: http://diegoferigo.github.io/phd-thesis

License: GNU General Public License v2.0

Simulation Architectures
for
Reinforcement Learning applied to Robotics

University of Manchester

2022

Abstract

There is no doubt that we are living in the age of data. In the last two decades, the scientific community has been able to produce systems with superhuman capabilities through the combination of modern hardware advancements, novel learning algorithms and architectures, and advances in software frameworks. Such progress revolutionised domains like computer vision and language processing, showing performance previously out of reach. One might expect these results to transfer straightforwardly to other fields like robotics, until one realises that domain-specific characteristics and limitations hinder the potential of these learning methods. Generating enough data from real-world robots is often too expensive, or not even possible at the desired scale. Data sampled from robots has a sequential nature, and not all families of learning algorithms are effective in this context. Furthermore, most algorithms that excel in this sequential setting, such as those belonging to the Reinforcement Learning (RL) family, learn by a trial-and-error process, which could lead to trajectories that damage either the robots or their surroundings.

In this thesis, we attempt to answer the question, "How can modern technology help us generate synthetic data for humanoid robot planning and control?".

Motivated by the advancements in hardware accelerators that are revolutionising scientific computing, we limit our analysis to the simulation realm. In this context, we first introduce a software architecture for structuring learning environments for robotics that can be adopted to train and run RL policies regardless of the simulated or real-world setting. With its underlying simulation technology and exploiting a scheme based on reward shaping, we validate the architecture by using RL to train a push-recovery controller capable of synthesising whole-body references for the humanoid robot iCub. Then, motivated by overcoming the bottlenecks related to the poor sampling performance of traditional rigid-body simulators, we present a new physics engine in reduced coordinates that can simulate robots interacting with a ground surface on hardware accelerators like GPUs and TPUs. To this end, we present a contact-aware continuous state-space representation describing the dynamical evolution of floating-base robots that can be numerically integrated for simulation purposes. We adopt the new general-purpose Gazebo Sim simulator as our first solution to sample synthetic data, and exploit JAX and its hardware support to scale the sampling performance for highly parallel problems. Furthermore, we implement and benchmark common Rigid Body Dynamics Algorithms that are part of the proposed physics engine on hardware accelerators and assess their scalability properties on different GPUs. These pieces of technology help to lower the computational barriers that are still among the main bottlenecks for obtaining intelligent agents, democratising the applicability of this family of learning-based methods.

Citing

@phdthesis{ferigo_phd_thesis_2022,
  title = {Simulation Architectures for Reinforcement Learning applied to Robotics},
  author = {Ferigo, Diego},
  school = {University of Manchester},
  type = {PhD Thesis},
  month = {July},
  year = {2022},
  url = {https://github.com/diegoferigo/phd-thesis/releases/latest/download/thesis.pdf},
}
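
As a usage note, the entry above can be cited from a LaTeX document as in the minimal sketch below. It assumes the entry has been saved in a file named references.bib next to the document; that file name and the plain bibliography style are illustrative choices, not something prescribed by this repository.

```latex
% Minimal sketch: assumes the BibTeX entry above is stored in a
% hypothetical file named references.bib in the same directory.
\documentclass{article}

\begin{document}

% The citation key matches the one defined in the entry above.
Simulation architectures for robot learning are discussed
in~\cite{ferigo_phd_thesis_2022}.

% Any standard bibliography style can be used here; plain is an example.
\bibliographystyle{plain}
\bibliography{references}

\end{document}
```

Compiling this with the usual pdflatex, bibtex, pdflatex, pdflatex sequence should resolve the citation.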

Contributing

If you have any questions or want to report an error, please open an issue.

If you want to fix the document yourself, please open a PR against the main branch (see the branching details below). The Continuous Integration pipeline implemented in this repository will compile the LaTeX sources with your contribution and upload the resulting PDF as a workflow artifact for inspection.

Branching

This repository has two branches:

  • overleaf is the branch connected to my personal Overleaf project.
  • main is the branch associated with external contributions and releases.

The Overleaf Git system does not currently support branching. For this reason, I cannot set main as the default branch of the repository, even though it effectively serves as one.

If you want to contribute with a new PR, please target the main branch.
