
svs-distributeddriving's Introduction

DISTRIBUTED DEEP REINFORCEMENT LEARNING FOR AUTONOMOUS DRIVING

This work explores the potential of distributed reinforcement learning algorithms for autonomous driving.

INFO

This work is based on the following research:

http://www.mitchellspryn.com/content/Autonomous-Driving-With-Deep-Reinforcement-Learning/DistributedRlForAd.pdf

A reference implementation (utilized as a starting point for this work) is available at the following address:

https://github.com/microsoft/AutonomousDrivingCookbook/tree/master/DistributedRL

ENVIRONMENT PREPARATION

0 - Install Anaconda

1 - Fix SSL errors, if they occur: https://github.com/conda/conda/issues/8273

2 - Create the virtual environment

conda create --prefix=./envs python=3.6

3 - Activate the newly created environment

conda activate ./envs

4 - Install the dependencies

python ./install_dependencies.py

5 - Deactivate the virtual environment when finished

conda deactivate

6 - To delete the environment, use

conda env remove -p ./envs

SIMULATOR PREPARATION

1 - Download the simulator

azcopy copy 'https://airsimtutorialdataset.blob.core.windows.net/e2edl/AD_Cookbook_AirSim.7z' './'

2 - Start the simulator

.\AD_Cookbook_Start_AirSim.ps1 neighborhood -window

COORDINATOR NODE

To start the coordinator agent, use the following command (replace the placeholder parameters):

python src\manage.py runserver ip:port data_dir={0} experiment_name={1} batch_update_frequency={2} weights_path={3} train_conv_layers={4} per_iter_epsilon_reduction={5} min_epsilon={6}

Example:

python .\manage.py runserver 0.0.0.0:7777 data_dir='C:\\Users\\peppe_000\\Documents\\MyProjects\\SmartVehicularSystems\\DistributedRL\\data' experiment_name='experiment_refactored_1' batch_update_frequency=200 weights_path='C:\\Users\\peppe_000\\Documents\\MyProjects\\SmartVehicularSystems\\DistributedRL\\data\\pretrain_model_weights.h5' train_conv_layers='false' per_iter_epsilon_reduction=0.003 min_epsilon=0.1

WORKER NODE

To start a worker agent, use the following command (replace the placeholder parameters):

python src\app\distributed_agent.py data_dir={0} max_epoch_runtime_sec={1} batch_size={2} replay_memory_size={3} experiment_name={4} weights_path={5} train_conv_layers={6} airsim_path={7} airsim_simulation_name={8} coordinator_address={9}

Example:

python .\distributed_agent.py data_dir='C:\\Users\\peppe_000\\Documents\\MyProjects\\SmartVehicularSystems\\DistributedRL\\data' max_epoch_runtime_sec=30 batch_size=32 replay_memory_size=1500 experiment_name='experiment_refactored_1' weights_path='C:\\Users\\peppe_000\\Documents\\MyProjects\\SmartVehicularSystems\\DistributedRL\\data\\pretrain_model_weights.h5' train_conv_layers='false' airsim_path='D:\\AirSim\\AD_Cookbook_AirSim' airsim_simulation_name='neighborhood' coordinator_address='192.168.1.6:7777'
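Both commands pass their settings as key=value tokens on the command line. The sketch below shows one way such tokens could be collected into a dictionary. It is an illustration only: the function name parse_key_value_args is hypothetical, and the actual parsing in manage.py and distributed_agent.py may differ (for example in how quotes or types are handled).

```python
def parse_key_value_args(argv):
    """Collect 'key=value' tokens into a dict; tokens without '=' are ignored."""
    params = {}
    for token in argv:
        if "=" not in token:
            continue
        # Split only on the first '=' so values may themselves contain '='.
        key, value = token.split("=", 1)
        # Assumption: surrounding quotes are not part of the value.
        params[key.strip()] = value.strip().strip("'\"")
    return params


# Hypothetical argv, shortened from the worker example above.
example_argv = [
    "distributed_agent.py",
    "max_epoch_runtime_sec=30",
    "batch_size=32",
    "train_conv_layers='false'",
    "coordinator_address='192.168.1.6:7777'",
]
params = parse_key_value_args(example_argv)
print(params)
```

All values come back as strings in this sketch, so numeric settings such as batch_size would still need an explicit int() or float() conversion.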

PARAMETERS

batch_update_frequency: This is how often the weights from the actively trained network get copied to the target network. It is also how often the model gets saved to disk. For more details on how this works, check out the Deep Q-learning paper.
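As a rough illustration of what batch_update_frequency controls, the sketch below copies the active weights into the target weights once every N updates. The class name TargetSync and the use of plain lists as weights are hypothetical, and it assumes the frequency is counted in trained minibatches; the project's own code may count and copy differently.

```python
import copy


class TargetSync:
    """Decide when the target network should receive the active weights."""

    def __init__(self, batch_update_frequency):
        self.batch_update_frequency = batch_update_frequency
        self.batches_seen = 0

    def step(self, active_weights, target_weights):
        """Count one trained minibatch and return (target_weights, synced)."""
        self.batches_seen += 1
        if self.batches_seen % self.batch_update_frequency == 0:
            # Deep copy so later updates to the active weights do not leak through.
            return copy.deepcopy(active_weights), True
        return target_weights, False


sync = TargetSync(batch_update_frequency=3)
active = [0.0]  # stand-in for the actively trained network's weights
target = [0.0]  # stand-in for the target network's weights
synced_at = []
for batch in range(1, 8):
    active[0] += 1.0  # stand-in for one gradient update
    target, synced = sync.step(active, target)
    if synced:
        synced_at.append(batch)  # a real agent would also save the model here
print(synced_at, target, active)
```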

max_epoch_runtime_sec: This is the maximum runtime for each epoch. If the car has not reached a terminal state after this many seconds, the epoch will be terminated and training will begin.
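The time limit can be pictured as a loop that stops either on a terminal state or when the clock runs out. This is a minimal sketch under stated assumptions: run_epoch and step_fn are hypothetical names, and step_fn stands in for one simulator step that returns True when the car reaches a terminal state.

```python
import time


def run_epoch(step_fn, max_epoch_runtime_sec, clock=time.monotonic):
    """Run steps until step_fn() reports a terminal state or time runs out.

    Returns (number_of_steps, reason), where reason is 'terminal' or 'timeout'.
    """
    start = clock()
    steps = 0
    while clock() - start < max_epoch_runtime_sec:
        steps += 1
        if step_fn():
            return steps, "terminal"
    return steps, "timeout"


# A step that ends the epoch immediately, e.g. a collision on the first frame.
steps, reason = run_epoch(lambda: True, max_epoch_runtime_sec=30)
print(steps, reason)
```

In either case, training on the collected experience would begin once the function returns.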

per_iter_epsilon_reduction: The agent uses an epsilon greedy linear annealing strategy while training. This is the amount by which epsilon is reduced each iteration.

min_epsilon: The minimum value for epsilon. Once reached, the epsilon value will not decrease any further.

batch_size: The minibatch size to use for training.
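The epsilon schedule described by per_iter_epsilon_reduction and min_epsilon can be sketched as follows. The function names and the starting value of 1.0 are assumptions made for illustration; they are not taken from the project's code.

```python
import random


def anneal_epsilon(epsilon, per_iter_epsilon_reduction, min_epsilon):
    """Linearly reduce epsilon by a fixed amount, never going below min_epsilon."""
    return max(min_epsilon, epsilon - per_iter_epsilon_reduction)


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise take the best action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


epsilon = 1.0  # assumed starting value
for _ in range(400):
    epsilon = anneal_epsilon(epsilon, per_iter_epsilon_reduction=0.003, min_epsilon=0.1)
print(epsilon)
```

With a reduction of 0.003 per iteration, epsilon reaches the floor of 0.1 after roughly 300 iterations and stays there.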

replay_memory_size: The number of examples to keep in the replay memory. The replay memory is a FIFO buffer used to reduce the effects of nearby states being correlated. Minibatches are generated by randomly selecting examples from the replay memory.
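A FIFO replay memory with random minibatch sampling can be sketched with a bounded deque. The class name ReplayMemory and its methods are hypothetical and only illustrate the idea; integers stand in for real experience tuples.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size FIFO buffer: the oldest examples are dropped when full."""

    def __init__(self, replay_memory_size):
        self.buffer = deque(maxlen=replay_memory_size)

    def add(self, experience):
        self.buffer.append(experience)

    def sample(self, batch_size, rng=random):
        """Draw a minibatch without replacement, capped at the buffer length."""
        count = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), count)

    def __len__(self):
        return len(self.buffer)


memory = ReplayMemory(replay_memory_size=5)
for i in range(8):
    memory.add(i)  # integers stand in for (state, action, reward, next_state)
batch = memory.sample(batch_size=3)
print(list(memory.buffer), batch)
```

Sampling at random, rather than taking the most recent examples, is what breaks up the correlation between consecutive states.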

weights_path: If we are doing transfer learning and using pretrained weights for the model, they will be loaded from this path.

train_conv_layers: If we are using pretrained weights, we may prefer to freeze the convolutional layers to speed up training.

airsim_path: Location of the AirSim executable (AD_Cookbook_Start_AirSim.ps1)

airsim_simulation_name: Simulation scenario. The default AirSim distribution contains the following configurations: 'city', 'landscape', 'neighborhood', 'coastline', 'hawaii'.

coordinator_address: The address of the coordinator (master) node in the form 'IP:PORT'
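A small validation helper for the 'IP:PORT' form might look like the sketch below. The function name parse_coordinator_address is hypothetical; the worker script may handle the address differently.

```python
def parse_coordinator_address(address):
    """Split 'IP:PORT' into (host, port), raising ValueError on malformed input."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected 'IP:PORT', got {address!r}")
    return host, int(port)


host, port = parse_coordinator_address("192.168.1.6:7777")
print(host, port)
```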

Example parameters:

batch_update_frequency = 300
max_epoch_runtime_sec = 30
per_iter_epsilon_reduction = 0.003
min_epsilon = 0.1
batch_size = 32
replay_memory_size = 2000
weights_path = 'Z:\\data\\pretrain_model_weights.h5'
train_conv_layers = 'false'
airsim_path = 'Z:\\AirSim'
airsim_simulation_name = 'neighborhood'
coordinator_address = '192.168.1.5:7777'

TEST

To run the simulator using a given model use the following command:

python .\tester.py 'model_path' 'isH5file'

model_path: the path from where the model will be loaded.

isH5file: True/False. True if the model to load is an H5 file

Example:

python .\tester.py 'C:\\Users\\peppe_000\\Documents\\MyProjects\\SmartVehicularSystems\\DistributedRL\\data\\checkpoint\\experiment_refactored_1\\3444.json' 'False'

The simulator must be launched manually before running the tester.
