Giter Site home page Giter Site logo

smart_excavator's Introduction

SmartExcavator

How it works?

Slides: https://docs.google.com/presentation/d/1zghz1NY8mvjboHIhBuQBRlYX4yq5YN74vpu2CCoh68M/edit?usp=sharing Demonstration video: https://drive.google.com/open?id=1xaPrbcaeuSkuyYewuuIVlgGE9vkiKVN0

Requirements

  1. Python 3.6.8 or higher
  2. Mevea Simulation Software
  3. Excavator Model.

Installation

  1. Clone the repository:
git clone https://github.com/mizolotu/SmartExcavator
  1. From "SmartExcavator" directory, install required python packages (it is obviously recommended to create a new virtualenv and install everything there):
python -m pip install -r requirements.txt
  1. Open Mevea Modeller and load Excavator Model. Go to Scripting -> Scripts, create a new script object, select "env_backend.py" from "SmartExcavator" directory as the file name in the object view. Go to I/O -> Inputs -> Input_Slew (or any other input component) and select the object just created as the script name.

  2. In terminal, navigate to the Mevea Software resources directory (default: C:\Program Files (x86)\Mevea\Resources\Python\Bin) and install numpy and requests:

python -m pip install numpy requests

Demo

In terminal, navigate to "SmartExcavator" directory and run:

python excavator_demo.py -m path_to_the_excavator_mvs

Once the solver has started, use joysticks or keyboard to grab some soil with the excavator and put it to the dumper. After that, the reinforcement learning agent kicks in and attempts to complete the work.

Continue training

Both models have been learned for several days so far. To continue training, substitute "policy_name" with either "residual" or "pure" (default: residual) and specify number of environments (default: 2), for example:

python excavator_demo.py -m path_to_the_excavator_mvs -t train -p residual -n 2

Both pure and residual policies are learned using OpenAI's implementation of proximal policy optimization (PPO2) algorithm. The difference is that in case of pure reinforcement learning, demonstration data is only used to navigate the excavator to the target, after that the agent starts exploring the environment from scratch. In case of residual learning, the resulting policy is equal to sum of the agent policy and the best (in terms of the amount of soil grabbed) demonstration example. According to https://arxiv.org/pdf/1906.05841.pdf, residual learning outperforms other forms of learning from demonstration.

The amount of trainable variables of the neural network can be increased in "baselines/common/models.py". Find the "mlp" model and increase the number of layers and the number of neurons in the hidden layers, e.g. num_layers=3 and num_hidden=256. Bigger numbers of parameters require more iterations to train, but in theory the resulting model should perform better. Regardless of the neural network architecture, at least one week of training is required to see some progress. In the figure below, the red line is the reward caclulated for the demonstration examples and the blue line is the evolution of the RL agent reward (pure) during first two weeks: Reward

Other scripts

plot_progress.py - plots mean, minimal and maximal rewards obtained during the training and stored in "progress.csv" file. Usage:

python plot_progress.py policy_name

select_demonstration.py - simulates the excavator inputs using the demonstration data and selects the best example in terms of the mass of soil grabbed and amount of soil lost during the turn to the dumper. The trajectories found are stored in files "data/dig.txt" and "data/emp.txt" respectively. Usage:

python select_demonstration.py path_to_excavator_mvs

These trajectories are the ones used by the agent to learn its policy, not the ones that the user inputs during the inference stage. This can be easily modified in such a way that the trajectories are built based on the user's input, but it would not make much sense, because an unexperienced user would certainly like to use the demonstration data recorded by a professional operator, instead of his/her own.

Known issues

  1. Excavator may get stuck if the slew angle is about 135 degrees, becasue of the collison betweeen the bucket and the corner of the excavator platform.
  2. Excavator always rotates in the same direction that has been specified during the user's input, because it cannot read minds.
  3. Spawing several instances of the solver during the training not only slows down the training, but may also negatively affect the policy learned, if the speed loss is significant. We recommend to run a single training iteration using only one solver, then restart with two solvers and compare the time spent in both of those cases. If the increase of the iteration duration is not significant, you can try to increment the number of environments.
  4. Running "excavator_demo.py" in the train mode results in file "progress.csv" being overwritten. For this reason, executing "plot_progress.py" only makes sense immediately after (or during) the training.
  5. Communication between excavator gym environment specified in "excavator_env.py" and "env_backend.py" running by the solver instance is carried out via http (flask), which is probably overkill.

smart_excavator's People

Contributors

mizolotu avatar

Stargazers

Benjamin avatar  avatar  avatar  avatar  avatar YuyingShen avatar Riceball LEE avatar  avatar

Watchers

James Cloos avatar  avatar paper2code - bot avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.