
Towards All-Weather Autonomous Driving



This repository contains the implementation, source code, pre-trained models, and scripts for our paper "Deep learning-based robust positioning for all-weather autonomous driving", published in Nature Machine Intelligence.

In this work, we propose GRAMME, a deep learning-based ego-motion estimation system that generalizes to diverse conditions such as day, night, rain, fog, and snow. GRAMME is geometry-aware, multi-modal, modular, interpretable, and self-supervised.

GRAMME involves the following architectures:

  • Self-supervised monocular depth and ego-motion
  • Self-supervised stereo depth and ego-motion
  • Self-supervised lidar ego-motion
  • Self-supervised radar ego-motion
  • Self-supervised depth and ego-motion from camera and lidar
  • Self-supervised depth and ego-motion from camera and radar

Overview of the GRAMME conceptual framework and architecture:

Example Results

Generalization to adverse conditions and depth prediction performance.

Computational Hardware and Software

The experiments were conducted on an Ubuntu 20.04 LTS machine with Intel Xeon CPUs and NVIDIA RTX 3090 GPUs (cuDNN-accelerated), using Python 3.8.

Getting started

  • Clone this repo:
git clone https://github.com/yasinalm/gramme
cd gramme

Prerequisites

We recommend creating a virtual environment with Python 3.8: conda create -n gramme python=3.8 anaconda. With a fresh Anaconda distribution, you can install the dependencies with:

conda install pytorch=1.8.2 torchvision=0.9.1 cudatoolkit=10.2 -c pytorch
conda install -c conda-forge tensorboardx=2.4
pip install --user colour-demosaicing # needed to process datasets

You can install the remaining dependencies as follows:

conda install --file requirements.txt

Datasets

GRAMME currently supports the following datasets and sensors. To make use of our preprocessing code, keep the directory structure the same as in the original datasets. Please follow the licensing terms of each dataset.

  • Oxford Robotcar dataset
    • Bumblebee XB3 stereo camera (main data source for the experiments in the paper)
    • Grasshopper2 monocular camera
    • SICK LD-MRS 3D LIDAR
    • NovAtel SPAN-CPT ALIGN inertial and GPS navigation system
  • Oxford Radar Robotcar dataset
    • Bumblebee XB3
    • Grasshopper2
    • NovAtel SPAN-CPT ALIGN inertial and GPS navigation system
    • Navtech CTS350-X radar
    • Velodyne HDL-32E lidar
  • RADIATE dataset
    • ZED stereo camera
    • Velodyne HDL-32E lidar
    • Navtech CTS350-X radar
    • Advanced Navigation Spatial Dual GPS/IMU

To test our provided pre-trained models, you can download the small-size sample sequences.

In addition, you can quickly adapt GRAMME to custom datasets by extending the data loader classes located in the ./datasets/ folder.
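
If you adapt GRAMME to a custom dataset, a loader along the following lines can serve as a starting point. This is a minimal sketch only: the class name, the .npy file layout, and the returned tuple are illustrative assumptions, not the interface of the loaders in ./datasets/, which you should mirror instead.

# Hypothetical custom sequence loader; the actual base classes in ./datasets/
# may differ in naming and in the fields they return.
import glob

import numpy as np
import torch
from torch.utils.data import Dataset


class CustomSequenceFolder(Dataset):
    """Yields (target frame, reference frames) tuples from a custom recording."""

    def __init__(self, root, sequence_length=3, skip_frames=1):
        self.files = sorted(glob.glob(f"{root}/*.npy"))  # assumed file layout
        self.sequence_length = sequence_length
        self.skip = skip_frames

    def __len__(self):
        return max(0, len(self.files) - (self.sequence_length - 1) * self.skip)

    def __getitem__(self, idx):
        # Target frame plus (sequence_length - 1) temporally adjacent references,
        # mirroring the --sequence-length / --skip-frames training options.
        frames = [np.load(self.files[idx + i * self.skip])
                  for i in range(self.sequence_length)]
        tgt, refs = frames[0], frames[1:]
        return (torch.from_numpy(tgt).float(),
                [torch.from_numpy(r).float() for r in refs])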

Preprocessing

Although GRAMME supports on-the-fly processing of the input files, we recommend preprocessing the datasets offline using the provided scripts. The scripts process the datasets and save the resulting files to disk. The preprocessing includes colour demosaicing of the Bayer images and rectification. We also optionally crop the bottom part of the Robotcar images, which is occluded by the bonnet. You can use the following scripts for preprocessing:

For the Robotcar dataset:

cd preprocess/
python undistort_robotcar.py /path/to/robotcardataset/

For the RADIATE dataset:

cd preprocess/
python rectify_radiate.py /path/to/radiatedataset/

The scripts create a sub-directory named stereo_undistorted in the dataset folder and save the processed files there. You can tell the training and inference scripts to use these files by passing the --with-preprocessed flag. See the documentation of the scripts for more details.
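
For reference, the core image preprocessing (Bayer demosaicing plus an optional bottom crop) can be sketched as follows. This is a simplified illustration, not the scripts' exact code: it assumes 8-bit Bayer images, the Bayer pattern and crop height are placeholders to adjust for your camera, and rectification/undistortion (which requires the camera models shipped with the dataset SDKs) is omitted.

# Minimal sketch of the demosaicing + cropping step; undistort_robotcar.py and
# rectify_radiate.py remain the reference implementations.
import numpy as np
from PIL import Image
from colour_demosaicing import demosaicing_CFA_Bayer_bilinear


def demosaic_and_crop(path, pattern="GBRG", crop_bottom=160):
    # Assumed 8-bit Bayer input; pattern and crop_bottom are camera-specific.
    bayer = np.asarray(Image.open(path), dtype=np.float64) / 255.0
    rgb = demosaicing_CFA_Bayer_bilinear(bayer, pattern=pattern)  # H x W x 3 in [0, 1]
    if crop_bottom:  # optionally drop the bonnet region at the bottom of the image
        rgb = rgb[:-crop_bottom]
    return (np.clip(rgb, 0.0, 1.0) * 255.0).astype(np.uint8)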

Experiments

You can use the following commands to train and test the models.

Training

After downloading and preprocessing the data, you can run the following commands. For ease of use, we provide separate scripts to train/test each modality.

Radar-based experiments

To train a radar-based model together with the camera, run the following command:

python train.py /path/to/dataset/ --dataset dataset_name --name experiment_name

The code can be configured with the following arguments:

usage: train.py [-h] [--sequence-length N] [--skip-frames N] [-j N]
                [--epochs N] [--train-size N] [--val-size N] [-b N] [--lr LR]
                [--momentum M] [--beta M] [--weight-decay W] [--print-freq N]
                [--ckpt-freq N] [--seed SEED] [--log-summary PATH]
                [--log-full PATH] [--log-output LOG_OUTPUT]
                [--resnet-layers {18,50}] [--num-scales W] [--img-height W]
                [--img-width W] [-p W] [-f W] [-s W] [-c W]
                [--with-ssim WITH_SSIM] [--with-mask WITH_MASK]
                [--with-auto-mask WITH_AUTO_MASK] [--with-masknet]
                [--masknet {convnet,resnet}] [--with-vo]
                [--cam-mode {mono,stereo}] [--pretrained-disp PATH]
                [--pretrained-vo-pose PATH] [--with-pretrain]
                [--dataset {hand,robotcar,radiate,cadcd}]
                [--with-preprocessed WITH_PREPROCESSED]
                [--pretrained-mask PATH] [--pretrained-pose PATH]
                [--pretrained-fusenet PATH] [--pretrained-optim PATH] --name
                NAME [--padding-mode {zeros,border}] [--gt-file DIR]
                [--radar-format {cartesian,polar}] [--range-res W]
                [--angle-res W] [--cart-res W] [--cart-pixels W]
                DIR

Unsupervised Geometry-Aware Ego-motion Estimation for radars and cameras.

positional arguments:
  DIR                   path to dataset

optional arguments:
  -h, --help            show this help message and exit
  --sequence-length N   sequence length for training (default: 3)
  --skip-frames N       gap between frames (default: 1)
  -j N, --workers N     number of data loading workers (default: 4)
  --epochs N            number of total epochs to run (default: 200)
  --train-size N        manual epoch size (will match dataset size if not
                        set) (default: 0)
  --val-size N          manual validation size (will match dataset size if
                        not set) (default: 0)
  -b N, --batch-size N  mini-batch size (default: 4)
  --lr LR, --learning-rate LR
                        initial learning rate (default: 0.0001)
  --momentum M          momentum for sgd, alpha parameter for adam (default:
                        0.9)
  --beta M              beta parameters for adam (default: 0.999)
  --weight-decay W, --wd W
                        weight decay (default: 0)
  --print-freq N        print frequency (default: 10)
  --ckpt-freq N         checkpoint saving frequency in terms of epochs
                        (default: 1)
  --seed SEED           seed for random functions, and network initialization
                        (default: 0)
  --log-summary PATH    csv where to save per-epoch train and valid stats
                        (default: progress_log_summary.csv)
  --log-full PATH       csv where to save per-gradient descent train stats
                        (default: progress_log_full.csv)
  --log-output LOG_OUTPUT
                        will log dispnet outputs at validation step (default:
                        1)
  --resnet-layers {18,50}
                        number of ResNet layers for depth estimation.
                        (default: 18)
  --num-scales W, --number-of-scales W
                        the number of scales (default: 1)
  --img-height W        resized mono image height (default: 192)
  --img-width W         resized mono image width (default: 320)
  -p W, --photo-loss-weight W
                        weight for photometric loss (default: 1)
  -f W, --fft-loss-weight W
                        weight for FFT loss (default: 0.0003)
  -s W, --ssim-loss-weight W
                        weight for SSIM loss (default: 1)
  -c W, --geometry-consistency-weight W
                        weight for depth consistency loss (default: 1.0)
  --with-ssim WITH_SSIM
                        with ssim or not (default: 1)
  --with-mask WITH_MASK
                        with the mask for moving objects and occlusions or
                        not (default: 1)
  --with-auto-mask WITH_AUTO_MASK
                        with the mask for stationary points (default: 1)
  --with-masknet        with the masknet for multipath and noise (default:
                        False)
  --masknet {convnet,resnet}
                        MaskNet type (default: convnet)
  --with-vo             with VO fusion (default: False)
  --cam-mode {mono,stereo}
                        the dataset to train (default: stereo)
  --pretrained-disp PATH
                        path to pre-trained DispResNet model (default: None)
  --pretrained-vo-pose PATH
                        path to pre-trained VO Pose net model (default: None)
  --with-pretrain       with or without imagenet pretrain for resnet
                        (default: False)
  --dataset {hand,robotcar,radiate,cadcd}
                        the dataset to train (default: hand)
  --with-preprocessed WITH_PREPROCESSED
                        use the preprocessed undistorted images (default: 1)
  --pretrained-mask PATH
                        path to pre-trained masknet model (default: None)
  --pretrained-pose PATH
                        path to pre-trained Pose net model (default: None)
  --pretrained-fusenet PATH
                        path to pre-trained PoseFusionNet model (default:
                        None)
  --pretrained-optim PATH
                        path to pre-trained optimizer state (default: None)
  --name NAME           name of the experiment, checkpoints are stored in
                        checpoints/name (default: None)
  --padding-mode {zeros,border}
                        padding mode for image warping (default: zeros)
  --gt-file DIR         path to ground truth validation file (default: None)
  --radar-format {cartesian,polar}
                        Range-angle format (default: polar)
  --range-res W         Range resolution of FMCW radar in meters (default:
                        0.0432)
  --angle-res W         Angular azimuth resolution of FMCW radar in radians
                        (default: 0.015708)
  --cart-res W          Cartesian resolution of FMCW radar in meters/pixel
                        (default: 0.25)
  --cart-pixels W       Cartesian size in pixels (used for both height and
                        width) (default: 512)
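
The four loss weights (-p, -f, -s, -c) scale the self-supervised objectives before they are summed into the total training loss. Conceptually, with placeholder loss terms and the defaults listed above (a sketch, not the repository's training loop):

# Hypothetical combination of the loss terms controlled by -p, -f, -s and -c.
def total_loss(photo, fft, ssim, geom,
               w_photo=1.0, w_fft=3e-4, w_ssim=1.0, w_geom=1.0):
    # Weighted sum matching the defaults of --photo-loss-weight, --fft-loss-weight,
    # --ssim-loss-weight and --geometry-consistency-weight.
    return w_photo * photo + w_fft * fft + w_ssim * ssim + w_geom * geom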

Lidar-based experiments

To train a lidar-based model together with the camera, run the following command:

python train_lidar.py /path/to/dataset/ --dataset dataset_name --name experiment_name

The code can be configured with the following arguments:

usage: train_lidar.py [-h] [--sequence-length N] [--skip-frames N] [-j N]
                      [--epochs N] [--train-size N] [--val-size N] [-b N]
                      [--lr LR] [--momentum M] [--beta M] [--weight-decay W]
                      [--print-freq N] [--ckpt-freq N] [--seed SEED]
                      [--log-summary PATH] [--log-full PATH]
                      [--log-output LOG_OUTPUT] [--resnet-layers {18,50}]
                      [--num-scales W] [--img-height W] [--img-width W]
                      [-p W] [-f W] [-s W] [-c W] [--with-ssim WITH_SSIM]
                      [--with-mask WITH_MASK]
                      [--with-auto-mask WITH_AUTO_MASK] [--with-masknet]
                      [--masknet {convnet,resnet}] [--with-vo]
                      [--cam-mode {mono,stereo}] [--pretrained-disp PATH]
                      [--pretrained-vo-pose PATH] [--with-pretrain]
                      [--dataset {hand,robotcar,radiate}]
                      [--with-preprocessed WITH_PREPROCESSED]
                      [--pretrained-mask PATH] [--pretrained-pose PATH]
                      [--pretrained-fusenet PATH] [--pretrained-optim PATH]
                      --name NAME [--padding-mode {zeros,border}]
                      [--gt-file DIR] [--cart-res W] [--cart-pixels W]
                      DIR

Unsupervised Geometry-Aware Ego-motion Estimation for LIDARs and cameras.

positional arguments:
  DIR                   path to dataset

optional arguments:
  -h, --help            show this help message and exit
  --sequence-length N   sequence length for training (default: 3)
  --skip-frames N       gap between frames (default: 1)
  -j N, --workers N     number of data loading workers (default: 4)
  --epochs N            number of total epochs to run (default: 200)
  --train-size N        manual epoch size (will match dataset size if not
                        set) (default: 0)
  --val-size N          manual validation size (will match dataset size if
                        not set) (default: 0)
  -b N, --batch-size N  mini-batch size (default: 4)
  --lr LR, --learning-rate LR
                        initial learning rate (default: 0.0001)
  --momentum M          momentum for sgd, alpha parameter for adam (default:
                        0.9)
  --beta M              beta parameters for adam (default: 0.999)
  --weight-decay W, --wd W
                        weight decay (default: 0)
  --print-freq N        print frequency (default: 10)
  --ckpt-freq N         checkpoint saving frequency in terms of epochs
                        (default: 1)
  --seed SEED           seed for random functions, and network initialization
                        (default: 0)
  --log-summary PATH    csv where to save per-epoch train and valid stats
                        (default: progress_log_summary.csv)
  --log-full PATH       csv where to save per-gradient descent train stats
                        (default: progress_log_full.csv)
  --log-output LOG_OUTPUT
                        will log dispnet outputs at validation step (default:
                        1)
  --resnet-layers {18,50}
                        number of ResNet layers for depth estimation.
                        (default: 18)
  --num-scales W, --number-of-scales W
                        the number of scales (default: 1)
  --img-height W        resized mono image height (default: 192)
  --img-width W         resized mono image width (default: 320)
  -p W, --photo-loss-weight W
                        weight for photometric loss (default: 1)
  -f W, --fft-loss-weight W
                        weight for FFT loss (default: 0.0003)
  -s W, --ssim-loss-weight W
                        weight for SSIM loss (default: 1)
  -c W, --geometry-consistency-weight W
                        weight for depth consistency loss (default: 1.0)
  --with-ssim WITH_SSIM
                        with ssim or not (default: 1)
  --with-mask WITH_MASK
                        with the mask for moving objects and occlusions or
                        not (default: 1)
  --with-auto-mask WITH_AUTO_MASK
                        with the mask for stationary points (default: 1)
  --with-masknet        with the masknet for multipath and noise (default:
                        False)
  --masknet {convnet,resnet}
                        MaskNet type (default: convnet)
  --with-vo             with VO fusion (default: False)
  --cam-mode {mono,stereo}
                        the dataset to train (default: stereo)
  --pretrained-disp PATH
                        path to pre-trained DispResNet model (default: None)
  --pretrained-vo-pose PATH
                        path to pre-trained VO Pose net model (default: None)
  --with-pretrain       with or without imagenet pretrain for resnet
                        (default: False)
  --dataset {hand,robotcar,radiate}
                        the dataset to train (default: robotcar)
  --with-preprocessed WITH_PREPROCESSED
                        use the preprocessed undistorted images (default: 1)
  --pretrained-mask PATH
                        path to pre-trained masknet model (default: None)
  --pretrained-pose PATH
                        path to pre-trained Pose net model (default: None)
  --pretrained-fusenet PATH
                        path to pre-trained PoseFusionNet model (default:
                        None)
  --pretrained-optim PATH
                        path to pre-trained optimizer state (default: None)
  --name NAME           name of the experiment, checkpoints are stored in
                        checpoints/name (default: None)
  --padding-mode {zeros,border}
                        padding mode for image warping (default: zeros)
  --gt-file DIR         path to ground truth validation file (default: None)
  --cart-res W          Cartesian resolution of LIDAR in meters/pixel
                        (default: 0.25)
  --cart-pixels W       Cartesian size in pixels (used for both height and
                        width) (default: 512)

Camera-based experiments

To train a camera-based model, run the following command:

python train_stereo.py /path/to/dataset/ --dataset dataset_name --name experiment_name # for stereo camera
python train_mono.py /path/to/dataset/ --dataset dataset_name --name experiment_name # for monocular camera

Details of these commands can be displayed with:

python train_stereo.py -h # for stereo camera
python train_mono.py -h # for monocular camera

Pretrained models

We provide pretrained GRAMME models for different datasets and modalities, listed below. You can use them with the inference scripts described in the next section.

Inference and Test

You can use the following command to generate the depth maps:

python test_disp.py /path/to/dataset/ --dataset dataset_name --pretrained-disp /path/to/pretrained_model --results-dir /path/to/save/results

To test the odometry predictions and generate trajectories:

python test_ro.py /path/to/dataset/ --dataset dataset_name --pretrained-disp /path/to/pretrained_model --results-dir /path/to/save/results # for lidar and radar odometry
python test_stereo.py /path/to/dataset/ --dataset dataset_name --pretrained-disp /path/to/pretrained_model --results-dir /path/to/save/results # for stereo odometry
python test_mono.py /path/to/dataset/ --dataset dataset_name --pretrained-disp /path/to/pretrained_model --results-dir /path/to/save/results # for monocular odometry

GPU Usage

The code supports training on multiple GPUs by extending PyTorch's DataParallel class. We observed floating-point precision anomalies during training on the recently introduced Ampere GPU architecture; therefore, we disable the auto-casting feature on NVIDIA RTX 3090 GPUs. You can set the environment variable CUDA_VISIBLE_DEVICES to select the GPUs:

CUDA_VISIBLE_DEVICES=2 python train.py ... # Use GPU#2
CUDA_VISIBLE_DEVICES=1,2 python train.py ... # Use GPU#1 and GPU#2
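
For reference, the multi-GPU wrapping and the Ampere autocast workaround can be sketched as follows. This is a simplified illustration assuming a standard nn.Module, not the repository's exact training code (which extends DataParallel):

# Sketch: wrap a model for multi-GPU training and disable mixed precision on
# Ampere-class GPUs (e.g. RTX 3090), where we observed precision anomalies.
import torch
from torch import nn


def prepare_model(model: nn.Module):
    model = model.cuda()
    if torch.cuda.device_count() > 1:
        # Visible devices are controlled by CUDA_VISIBLE_DEVICES, as above.
        model = nn.DataParallel(model)
    ampere = any(torch.cuda.get_device_capability(i)[0] >= 8
                 for i in range(torch.cuda.device_count()))
    use_amp = not ampere
    return model, use_amp

# Usage inside a training step (sketch):
#   with torch.cuda.amp.autocast(enabled=use_amp):
#       loss = model(batch)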

Reference

If you find our work useful in your research or if you use parts of this code, please consider citing our paper:

Almalioglu, Y., Turan, M., Trigoni, N., and Markham, A. Deep learning-based robust positioning for all-weather autonomous driving. Nature Machine Intelligence (2022). https://doi.org/10.1038/s42256-022-00520-5

Acknowledgments

We thank the authors of the datasets used in our experiments for sharing their data and SDKs.
