RELESAS

"REinforcement LEarning Signal And Speed control"


RELESAS provides an easy-to-use interface for traffic control based on Multi-Agent Reinforcement Learning (MARL) with SUMO.

RELESAS started out as a reimplementation of SUMO-RL and went a step further by integrating a few more features. In summary, this repo offers the following core functionalities:

  1. Traffic light control ‒ either based solely on lane statistics (as in SUMO-RL), or additionally utilizing information on lane-leading vehicles (V2I*),
  2. Lane-based vehicle speed control. That is, selectively reducing traffic speed to prevent queueing and thus preserve traffic flow. Speed control can be applied either to an entire lane, or just to its leading vehicle.
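The speed-control idea above can be sketched in a few lines. This is a hypothetical illustration, not RELESAS's actual logic: the function name, the queue-ratio input and the thresholds are all assumptions.

```python
# Hypothetical sketch: derive a reduced speed limit from lane congestion.
# Names and thresholds are illustrative, not taken from RELESAS.

def reduced_speed(max_speed_ms: float, queue_ratio: float) -> float:
    """Scale the allowed speed down as the downstream queue grows.

    queue_ratio: fraction of the lane occupied by halted vehicles (0..1).
    """
    # Keep at least 30 % of the nominal speed so traffic never fully stalls.
    factor = max(0.3, 1.0 - queue_ratio)
    return max_speed_ms * factor
```

In a SUMO setting, such a value would then be applied per lane (or per leading vehicle) via TraCI, e.g. with `traci.lane.setMaxSpeed(lane_id, speed)`.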

Most of the modules reside in the environment folder. A few of the classes it contains are:

  • The BaseEnv handles low-level tasks, such as starting/resetting/quitting SUMO and obtaining trip info of arrived vehicles. When implementing your own environments, it serves as a useful starting point to inherit from.
  • GymEnv directly inherits from BaseEnv and implements the higher-level logic, i.e. handling the enclosed actuator instances and implementing the gym interface.
  • RLLibEnv forms a thin wrapper around GymEnv so that it can be interfaced by RLLib.
  • The actuators form an abstraction for the various agent types that may act within the environment. Actuators implemented so far are TrafficLight, Lane and LaneCompound. More information on them is provided below.
  • MetricsWrapper wraps the RLLibEnv and periodically writes stats into a set of YAML and CSV files. Using these files, one can later plot the results of finished training runs (see below).
  • env_config_templates.py contains multiple functions, each providing a template GymEnv.Config object pointing to a standard SUMO scenario. Most of these scenarios stem from SUMO-RL and RESCO.
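Conceptually, such a template function returns a pre-filled config object pointing at a shipped scenario. The following is an illustrative stand-in, not the real GymEnv.Config API; the field names and paths are assumptions.

```python
# Illustrative sketch in the style of env_config_templates.py.
# EnvConfig is a stand-in for GymEnv.Config; fields/paths are assumed.
from dataclasses import dataclass


@dataclass
class EnvConfig:
    net_file: str            # SUMO network definition
    route_file: str          # SUMO route/demand definition
    episode_duration_s: int = 3600


def single_intersection_config() -> EnvConfig:
    """Template pointing at a (hypothetical) shipped SUMO scenario."""
    return EnvConfig(
        net_file="scenarios/some-scenario/network.net.xml",
        route_file="scenarios/some-scenario/routes.rou.xml",
    )
```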

Furthermore, this repo implements a MetricsCallback to report training progress to Tensorboard.

To make training results comparable, this framework implements RESCO's trip delay metric: the mean vehicle time loss (in seconds), computed each time an episode ends.
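The metric boils down to averaging the per-vehicle timeLoss values that SUMO reports in its trip info for arrived vehicles. A minimal sketch, assuming trip infos arrive as dictionaries:

```python
# Sketch of the RESCO-style trip delay metric: mean "timeLoss" (seconds)
# over all vehicles that arrived during an episode. SUMO reports this
# value per vehicle in its tripinfo output.

def mean_trip_delay(trip_infos: list[dict]) -> float:
    """Average 'timeLoss' across arrived vehicles; 0.0 if none arrived."""
    if not trip_infos:
        return 0.0
    return sum(t["timeLoss"] for t in trip_infos) / len(trip_infos)
```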


*V2I, short for Vehicle-To-Infrastructure: Vehicles constantly report their speeds/positions/etc. to the infrastructure.

Actuator types

The actuator types present so far serve two purposes: traffic light control and vehicle speed control.

While the former is represented by the TrafficLight actuator, the latter is implemented as Lane and LaneCompound. Why the split into these two?

The reason is computational resources. While Lane was the first to be implemented (and works fine!), it soon became evident that providing an agent for each controlled lane imposed too much computational load: training became too slow, especially on larger scenarios. LaneCompound mitigates this by combining multiple spatially close lanes into one agent. As a result, training times were reduced by 50 % or more while maintaining equal performance. Note that the usage of Lane and LaneCompound is mutually exclusive!

Implementing new actuators
Through its actuator interface, this framework can easily be extended with new agent types. To do so, follow these steps:

  1. Place the actuator code into a new file within the actuators folder.
  2. Create a new class that inherits from BaseActuator and implement its abstract functions.
  3. Extend GymEnv to instantiate the new actuator class and return its instances from the _actuators property.
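The steps above can be sketched as follows. This is a hypothetical example: the real BaseActuator interface lives in the actuators folder, and the method names used here are assumptions, not the actual API.

```python
# Hypothetical sketch of step 2. BaseActuator below is a stand-in for the
# real base class; its abstract method names are assumptions.
from abc import ABC, abstractmethod


class BaseActuator(ABC):
    @abstractmethod
    def get_observation(self) -> list[float]: ...

    @abstractmethod
    def apply_action(self, action: int) -> None: ...


class MyActuator(BaseActuator):
    """A new agent type. GymEnv would have to instantiate this class and
    return its instances from the _actuators property (step 3)."""

    def __init__(self) -> None:
        self._last_action = 0

    def get_observation(self) -> list[float]:
        return [float(self._last_action)]

    def apply_action(self, action: int) -> None:
        self._last_action = action
```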

Install

Installing SUMO

sudo add-apt-repository ppa:sumo/stable
sudo apt-get update
sudo apt-get install sumo sumo-tools sumo-doc 

This framework was developed with SUMO version 1.15.0-1.

After installing SUMO, the SUMO_HOME environment variable must be set (the default SUMO installation path is /usr/share/sumo):

echo 'export SUMO_HOME="/usr/share/sumo"' >> ~/.bashrc
source ~/.bashrc

Important
Setting the environment variable LIBSUMO_AS_TRACI=1 is not recommended, as this framework heavily utilizes multiprocessing. See Libsumo limitations here.

Downloading this framework

I explicitly decided against publishing this work on PyPI because, in order to start trainings, a few lines in the enclosed train.py need to be changed. Also, having the code at hand makes it easier to experiment and implement your own features.

To download RELESAS, enter the following lines

cd ~/path/to/folder
git clone https://github.com/robvoe/RELESAS

Build Conda environment

This step requires Conda to be installed.

cd ~/path/to/RELESAS
conda env create -f environment.yml 

Unpack the trained sample models for scenario SingleIntersection

cd ~/path/to/RELESAS/outputs
7z x examples-single-intersection.7z.001

The shipped evaluation scripts refer to these trainings in order to provide a few examples.

Training on GPU
To conduct GPU-enabled trainings, include the commented-out cudatoolkit line in environment.yml before building the Conda env. In addition, train.py needs a minor modification accordingly. Nevertheless, the benefit of using a GPU is expected to be limited, as the networks are too small for the speedup to outweigh the copy-to-GPU cost.

Run trainings

Trainings are started from train.py. I explicitly decided against building an argparse monster, since I find them difficult to extend, understand and maintain. Instead, the user is expected to parametrize trainings within train.py itself.

The only parameters that have to be supplied to a training are experiment_name and env_name. While the former can be an arbitrarily chosen name, the latter follows the naming of the functions found in env_config_templates.py.

A training can be started with the following lines from shell:

cd ~/path/to/RELESAS
conda activate RELESAS
python train.py  --env_name sumo_rl_single_intersection  --experiment_name training-si.1

Single trials & compound trials
Since RL (or rather, DL in general) is strongly stochastic, each training run may look a bit different from others, even when started with exactly the same parameter set. To make the training results more reliable, train.py by default starts so-called trial compounds: a number of parallel, equally-parametrized single trials (usually 5).

Training outputs
Right after starting a training, a new subfolder is created within outputs, which in case of a trial compound contains N single-trial directories. Inside each single-trial folder, one can find checkpoints, Tensorboard logs, metrics logs and more.

Checkpointing & early stopping
During training, each time a model is found to be better than the previous best (see RESCO trip delay above), a model checkpoint is saved to the folder outputs/trial-compound/single-trial/checkpoint. As soon as a model stops improving, the trial ends prematurely. Note that other parallel trials may keep running.
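The checkpointing and early-stopping behavior described above can be sketched roughly like this. It is a simplified stand-in, assuming "lower trip delay is better"; the patience value is illustrative, not RELESAS's actual setting.

```python
# Simplified sketch of "checkpoint on improvement, stop on stagnation".
# The patience value is illustrative.

class EarlyStopper:
    def __init__(self, patience: int = 10) -> None:
        self.best = float("inf")   # best (lowest) trip delay seen so far
        self.bad_epochs = 0        # evaluations without improvement
        self.patience = patience

    def update(self, trip_delay: float) -> bool:
        """Return True if a new checkpoint should be written."""
        if trip_delay < self.best:
            self.best = trip_delay
            self.bad_epochs = 0
            return True            # improvement: save a checkpoint
        self.bad_epochs += 1
        return False

    @property
    def should_stop(self) -> bool:
        return self.bad_epochs >= self.patience
```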

Observing running trainings using Tensorboard
During an ongoing training, one can observe the model convergence using Tensorboard.

cd ~/path/to/RELESAS
conda activate RELESAS
tensorboard --logdir outputs
tensorboard --logdir_spec label1:outputs/trial-compound1,label2:outputs/trial-compound2

While the first command shows all contents of the outputs folder, the second can be used to show only certain trainings.

Plotting training results
Statistics collected during training can be plotted using plot_episode_end_metrics.py. See the function test_plot_exemplary_trainings() on how to plot the shipped exemplary trainings. The uncertainty bands stem from the three parallel, equally-parametrized trainings of each compound.
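The uncertainty bands can be thought of as a per-step mean and standard deviation across the parallel trials of a compound. A minimal sketch of that aggregation, not the actual code from plot_episode_end_metrics.py:

```python
# Sketch: aggregate a metric across equally-parametrized parallel trials
# into a mean curve plus a standard-deviation band.
import statistics


def band(values_per_trial: list[list[float]]) -> tuple[list[float], list[float]]:
    """values_per_trial[i][t]: metric of trial i at training step t.

    Returns per-step means and (population) standard deviations.
    """
    means, stds = [], []
    for step_values in zip(*values_per_trial):
        means.append(statistics.mean(step_values))
        stds.append(statistics.pstdev(step_values))
    return means, stds
```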

Exemplary plot of two finished trial compounds

Loading model checkpoints and evaluation
The module util/evaluation.py offers logic to load checkpointed models and run evaluation on them. Examples of how to use it can be found in the same file.

Running trainings using Singularity
Singularity is a container platform that is often found in high-performance computing (HPC) environments. Since users there are usually not able to install software (in this case SUMO), trainings have to take place within containers. This repo provides a definition file for such a Singularity container, including SUMO and a pre-installed Conda environment.

Enter the following lines to build the container. Note that sudo permissions are necessary, but it is often enough to build the container locally and transfer the resulting .sif file to the HPC afterwards.

cd ~/path/to/RELESAS/singularity
sudo singularity build train-container.sif train-container.def

Later on, trainings can be run as follows, without sudo permissions.

cd ~/path/to/RELESAS
singularity run ~/path/to/RELESAS/singularity/train-container.sif python ~/path/to/RELESAS/train.py --env_name sumo_rl_single_intersection --experiment_name training-si.1

Scenarios

RELESAS ships common scenarios from RESCO and SUMO-RL. These can be found in the folder scenarios. Note that all scenarios are distributed under their original licenses; see the respective subfolders.


RESCO's results on the respective scenarios can be found in their paper.

Influences from SUMO-RL

RELESAS is loosely based on SUMO-RL. This mainly means that a few implementation details were adopted from SUMO-RL's TrafficSignal class, along with a few details of SUMO application handling.

In contrast to SUMO-RL, it is not this repo's goal to offer compatibility with various MARL frameworks. The only framework supported so far is RLLib. Nevertheless, adapting GymEnv to another MARL framework shouldn't be a big deal!
