faizansana / intersection-driving

Dockerized Container Architecture for Parallel Training of CARLA Gym Environments

License: MIT License

Dockerfile 10.70% Python 82.22% Shell 7.08%
carla carla-reinforcement-learning carla-simulator docker reinforcement-learning stable-baselines3

intersection-driving's Introduction

Training Architecture for CARLA-based Reinforcement Learning Environments

Containerized DRL training architecture for Gymnasium-based CARLA Simulator environments. It is designed primarily for the intersection carla gym repository.

Getting Started

DRL Algorithms Supported

System Requirements

The following are the requirements for running this repository using the provided Docker files:

  • Operating System: Linux (tested on Ubuntu 20.04/22.04)
  • NVIDIA GPU with CUDA support (tested on NVIDIA GeForce RTX 3060/3080/3090/4080/4090)

Setup

  1. Clone the repository

    git clone https://github.com/faizansana/intersection-driving.git
    
  2. Run the dev_config.sh file to set the environment variables for Docker.

    bash dev_config.sh
    
  3. From within the working directory, open the .env file to change any settings such as the CARLA version or the CUDA version. The following are the default configurations:

    Variable                     Description                            Default Value
    FIXUID                       UID of current user                    (UID of current user)
    FIXGID                       GID of current user                    (GID of current user)
    CARLA_VERSION                Version of CARLA                       0.9.10.1
    CARLA_QUALITY                Quality setting for CARLA              Low
    GPU_ID_CARLA_MAIN            GPU ID for CARLA main                  0
    GPU_ID_CARLA_DEBUG           GPU ID for CARLA debug                 0
    GPU_ID_MAIN_CONTAINER        GPU ID for main container              0
    CARLA_SERVER_REPLICAS        Number of CARLA server replicas        5
    CARLA_DEBUG_SERVER_REPLICAS  Number of CARLA debug server replicas  0
    CUDA_VERSION                 Version of CUDA                        12.0.0

    Note: The GPU IDs are automatically set by checking the least used GPUs on the system.

  4. Pull the already built containers from Docker Hub if available.

    docker compose pull
    
  5. After the containers have been pulled, start them using the following command.

    docker compose up -d
    
  6. Open the main_container, and attach it to VS Code using the Remote Explorer extension.
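
The note in step 3 says that GPU IDs are chosen by checking for the least-used GPUs. How dev_config.sh does this is not shown here. The following is a minimal Python sketch of one way such a selection could work, assuming "least used" is judged by used memory as reported by nvidia-smi. The function names are hypothetical and not taken from the repository.

```python
import subprocess


def parse_gpu_memory(csv_text: str) -> dict[int, int]:
    """Parse 'index, memory.used' CSV rows into {gpu_index: used_MiB}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        index, used = (field.strip() for field in line.split(","))
        usage[int(index)] = int(used)
    return usage


def least_used_gpus(usage: dict[int, int], count: int = 1) -> list[int]:
    """Return the `count` GPU indices with the lowest used memory."""
    # Ties are broken by the lower GPU index.
    return sorted(usage, key=lambda idx: (usage[idx], idx))[:count]


def query_nvidia_smi() -> str:
    """Ask nvidia-smi for per-GPU used memory (requires an NVIDIA driver)."""
    return subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
```

On a machine with a GPU, `least_used_gpus(parse_gpu_memory(query_nvidia_smi()))` would give a candidate value for variables such as GPU_ID_CARLA_MAIN in the .env file.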

Scripts Usage (from within main container)

The following are the scripts developed for use (found within src folder):

  1. multi_retrain.py: Retrain multiple DRL models using a yaml file with their locations.

    Example usage:

    python multi_retrain.py -f file_with_model_paths.yaml -t number_of_timesteps_to_train

  2. multi_testmodel.py: Test multiple models based on the performance metrics defined in test_model.py.

    Example usage:

    python multi_testmodel.py

    Note: Modify the model_paths list in the script to select the model paths

  3. multi_train.py: Train multiple DRL algorithms in parallel in different CARLA instances

    Example usage:

    python multi_train.py -t number_of_timesteps_to_train

  4. test_model.py: Test a single DRL model.

    Example usage:

    python test_model.py -m path_to_model -v verbosity_level -c carla_host --episodes number_of_episodes -d display_or_not --config-file path_to_environment_config

  5. train.py: Train a single DRL model or retrain a model.

    Example usage:

    python train.py -m name_of_model -v verbosity_level -c carla_host --episodes number_of_episodes -d display_or_not --config-file path_to_environment_config -p carla_port

intersection-driving's Issues

Update launch.json for args based on file

Currently, a launch configuration set up for a specific file, for example train.py, runs the debugger only on that file. Check the VS Code documentation for how to set up the configuration so that the debugger works for any file, while the provided arguments are still applied when a specific file such as train.py is run.

CARLA ROS Bridge Container only works for versions <0.9.12

For CARLA versions >=0.9.12, the rviz dependency is not installed, for reasons not yet identified. Because carla-rviz depends on rviz, catkin build then fails.

The installation works when done manually inside an interactive Docker container, so the cause is likely an incorrectly set environment variable.

Unify the `--model` argument in train.py

Currently, there are two arguments to pass in a model.

The --model argument takes the name of the model, while --model-path takes the path to a model to be retrained. To unify them, remove the --model-path argument, check whether the --model value is a path or a model name, and handle both cases directly within the script.
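
One way to tell a path from a model name is sketched below. This is only an illustration of the idea in this issue, not code from train.py: the function name and the set of algorithm names are hypothetical, and treating any value ending in .zip as a saved model is an assumption.

```python
from pathlib import Path

# Hypothetical set of algorithm names; the real list lives in train.py.
KNOWN_ALGORITHMS = {"DDPG", "PPO", "SAC"}


def resolve_model_argument(value: str) -> tuple[str, str]:
    """Classify a --model value as ("path", ...) or ("name", ...)."""
    candidate = Path(value).expanduser()
    # Treat anything that ends in .zip or exists on disk as a saved model.
    if candidate.suffix == ".zip" or candidate.is_file():
        return "path", str(candidate)
    if value.upper() in KNOWN_ALGORITHMS:
        return "name", value.upper()
    raise ValueError(
        f"--model is neither a saved model nor a known algorithm: {value!r}"
    )
```

The script could then branch on the first element of the result: load and retrain for "path", create a fresh model for "name".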

Improve Training logging and naming of models

Currently, models are saved under a name based on the time of training.

Do the following:

  • Save the model under an experiment name set either in the config file or in the args.
  • For logging, save the config parameters used to train that model.
  • Improve the logging for each file in multi_train.py by creating a unique log each time.
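
The first two points could be combined roughly as follows. This is a minimal sketch under stated assumptions: the directory layout, the config.json file name, and the function name are hypothetical, and the timestamp format simply mirrors the one visible in existing model paths.

```python
import json
from datetime import datetime
from pathlib import Path


def create_run_directory(base_dir, experiment_name, config, now=None):
    """Create <base_dir>/<experiment_name>/<timestamp>/ and store the config in it."""
    now = now or datetime.now()
    run_dir = Path(base_dir) / experiment_name / now.strftime("%Y-%m-%d_%H-%M-%S")
    # exist_ok=False makes a name clash fail loudly instead of overwriting a run.
    run_dir.mkdir(parents=True, exist_ok=False)
    # Persist the training parameters next to the model and log files.
    (run_dir / "config.json").write_text(json.dumps(config, indent=2, sort_keys=True))
    return run_dir
```

Writing each process's log file into its own run directory would also give every run started by multi_train.py a unique log.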

Fix Recurrent PPO model testing

In the case of Recurrent PPO, since we are using LSTMs, the recurrent states need to be stored and passed back to the model as arguments for the next prediction.
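
A test loop that carries the state forward might look like the sketch below. It assumes the model exposes a predict method with state and episode_start keyword arguments, as sb3-contrib's RecurrentPPO does, and a Gymnasium-style environment. The function name is hypothetical, and the plain list used for episode_start keeps the sketch dependency-free; with the real library you would likely pass a NumPy boolean array instead.

```python
def run_recurrent_episode(model, env, deterministic=True):
    """Roll out one episode, carrying the LSTM state between predictions."""
    obs, _info = env.reset()
    lstm_state = None        # None asks the policy to start from a fresh state
    episode_start = [True]   # signals that the recurrent state must be reset
    total_reward, done = 0.0, False
    while not done:
        # predict() returns the action and the updated recurrent state,
        # which is fed back in on the next call.
        action, lstm_state = model.predict(
            obs,
            state=lstm_state,
            episode_start=episode_start,
            deterministic=deterministic,
        )
        obs, reward, terminated, truncated, _info = env.step(action)
        done = terminated or truncated
        episode_start = [done]
        total_reward += reward
    return total_reward
```

The key difference from testing a non-recurrent model is that the second return value of predict is kept rather than discarded.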

When retraining, error thrown "No data found in the saved file"

When training fails due to a segmentation fault or similar and retraining is then started, latest_model.zip sometimes contains no data.

Env running on server intersection-driving-carla_server-1
connecting to Carla server...
Carla server port 2000 connected!
Loading model from /home/docker/src/src/Training/Models/DDPG/2024-02-23_22-52-16/latest_model.zip
------ custom_carla_gym ------
------ 1,500,000 ------
No data found in the saved file

Possible solutions:

  • Use best_model.zip if that happens
  • Make saving of latest_model.zip more robust so that a copy is always saved even if an error occurs, and improve exception handling.
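
The first option could be sketched as a check that runs before loading. This assumes that a usable checkpoint is a valid zip archive containing an entry named data, which is how the error message above is interpreted here; the function names are hypothetical.

```python
import zipfile
from pathlib import Path


def is_loadable_checkpoint(path) -> bool:
    """Return True if `path` is a readable zip that contains a 'data' entry."""
    path = Path(path)
    if not path.is_file() or not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        # testzip() returns None when every member passes its CRC check.
        return "data" in archive.namelist() and archive.testzip() is None


def choose_checkpoint(run_dir):
    """Prefer latest_model.zip, fall back to best_model.zip, else return None."""
    for name in ("latest_model.zip", "best_model.zip"):
        candidate = Path(run_dir) / name
        if is_loadable_checkpoint(candidate):
            return candidate
    return None
```

For the second option, a common pattern is to save to a temporary file in the same directory and then move it over latest_model.zip with os.replace, so that an interrupted save never leaves a truncated file under the final name.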

Fix callback model naming

Currently, the callback saves the best model every time a new high reward is achieved, but it overwrites the previous best model. Change this by taking the time at the current timestep and saving the model under that name.
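
A small helper for generating such names is sketched below. It is an illustration only: the class name and the file name pattern (timestep plus timestamp) are assumptions, not the callback's actual implementation.

```python
from datetime import datetime


class BestModelNamer:
    """Track the best mean reward and produce a unique file name per improvement."""

    def __init__(self):
        self.best_reward = float("-inf")

    def filename_if_improved(self, mean_reward, timestep, now=None):
        """Return a new file name when `mean_reward` beats the best so far, else None."""
        if mean_reward <= self.best_reward:
            return None
        self.best_reward = mean_reward
        stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H-%M-%S")
        return f"best_model_{timestep}_{stamp}.zip"
```

Inside the callback, a non-None result would be used as the save path, so earlier best models stay on disk instead of being overwritten.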
