Giter Site home page Giter Site logo

cmu-datf's Introduction

Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding

This code is PyTorch implementation of Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding.

Model Diagram

Requirements

Python 3 and the following external packages.

Downloading Dataset

You will have to download Argoverse and nuScenes dataset into data/Argoverse and data/nuScenes, then verify the subdirectory of each dataset. Dataset should have structure same as:

-Argoverse
  |- test_obs/data/*.csv
  |- train/data/*.csv
  |- val/data/*.csv

-nuScenes
  |- sweeps/"Sensor Name (e.g., LIDAR_TOP)"/*
  |- trainval_meta
      |- maps/*.json
      |- v1.0-trainval/*.json

Preprocessing Dataset

Both Argoverse and nuScenes dataset have to be preprocessed prior to testing this model. To do so, you may simply run python scripts preprocess_argoverse.py and preprocess_nuscenes.py. Each script will generate distance transform of drivable area map and partition train, train_val, and val (plus, test_obs for Argoverse) splits for training. Note that these names do not represent train/validation/test regime in general machine learning literature. Instead, they are derived from the naming done by authors for each dataset. In this model, overall, the train split corresponds to the train set, the train_val split corresponds to the validation set, and the val split corresponds to the test set. The scripts will additionally generate symlinks to form train_all split, which is the union of train and train_val to be used for finetuning this model.

In addition, preprocess_nuscenes.py will perform Kalman smoothing with constant velocity model to generate trajectories from tracklets since the original nuScenes dataset is not geared to trajectories task (UPDATE: nuScenes now has the prediction task open).

The preprocessed datasets will be saved at data/Preprocessed/.

Initial Training

This model is initially trained using train split with learning rate decay on plateau depending on performances measured using train_val split. See comments and values in main.py for the hyperparameter details. To train the proposed method under a minimal set of options, run the following commands.

nuScenes

python main.py --model_type=AttGlobal_Scene_CAM_NFDecoder --dataset=nuscenes \
--train_partition=train --train_cache=./train.nusc.pkl \
--val_partition=train_val --val_cache=./train_val.nusc.pkl \
--tag=attgscam.nusc --batch_size=8 

Argoverse

python main.py --model_type=AttGlobal_Scene_CAM_NFDecoder --dataset=argoverse \
--train_partition=train --train_cache=./train.argo.pkl \
--val_partition=train_val --val_cache=./train_val.argo.pkl \
--tag=attgscam.argo --batch_size=8 

Fine-tuning

After the initial training is done either due to reaching the maximum set epochs or the maximum set learning rate decays, one must perform finetuning to achieve the best performance in testing time. To do so, set the training split to train_all and provide the location of the initial training log and the restoring epoch to arguments --restore_path and --restore_epoch. In addition, set a small learning rate value (e.g., 1e-5) which should be much smaller than that used during the initial training. Also, the learning rate decay on plateau must be turned off or the model would cheat by adapting its training schedule to achieve a good performance on val split. Usually, an epoch of fine-tuning suffices to achieve the best performance; fine-tuning for more than two epochs may harm the performance.

To fine-tune the proposed method under a minimal set of options, run the following commands.

nuScenes

python main.py --model_type=AttGlobal_Scene_CAM_NFDecoder --dataset=nuscenes \
--train_partition=train_all --train_cache=./train_all.nusc.pkl \
--val_partition=val --val_cache=./val.nusc.pkl \
--tag=attgscam.finetune.nusc --batch_size=8 --init_lr=1e-5 --lr_decay=0 \
--restore_path=./experiment/attgscam.nusc* --restore_epoch=*

Argoverse

python main.py --model_type=AttGlobal_Scene_CAM_NFDecoder --dataset=argoverse \
--train_partition=train_all --train_cache=./train_all.argo.pkl \
--val_partition=val --val_cache=./val.argo.pkl \
--tag=attgscam.finetune.argo --batch_size=8 --init_lr=1e-5 --lr_decay=0 \
--restore_path=./experiment/attgscam.argo* --restore_epoch=*

Testing

Testing can be performed by assigning checkpoint to argument --test_ckpt. In the default setting, the test will output the mean and the standard deviation for ADEs/FDEs/rF/DAC/DAO after 10 testing trials on val split.

nuScenes

python main.py --model_type=AttGlobal_Scene_CAM_NFDecoder --dataset=nuscenes \
--test_partition=val --test_cache=./val.nusc.pkl \
--tag=attgscam.finetune.nusc --batch_size=8 \
--test_ckpt=./experiment/attgscam.finetune.nusc*/*.pth.tar

Argoverse

python main.py --model_type=AttGlobal_Scene_CAM_NFDecoder --dataset=argoverse \
--train_partition=train_all --train_cache=./train_all.argo.pkl \
--test_partition=val --test_cache=./val.argo.pkl \
--tag=attgscam.finetune.argo --batch_size=8 \
--test_ckpt=./experiment/attgscam.finetune.argo*/*.pth.tar

Example Outputs

Initial Training

python main.py --model_type=AttGlobal_Scene_CAM_NFDecoder --dataset=nuscenes \
--train_partition=train --train_cache=./train.nusc.pkl \
--val_partition=train_val --val_cache=./train_val.nusc.pkl \
--tag=attgscam.nusc --batch_size=8

Experiment Path experiment/attgscam.nusc__31_March__22_17_.
Set random seed 88245.
.
.
.
Loading nuScenes dataset.
Found train set cache train.nusc.pkl with 15691 samples.
Data Loading Complete!
Found train_val set cache train_val.nusc.pkl with 6139 samples.
Data Loading Complete!
Train Examples: 15691
Valid Examples: 6139
Trainer Initialized!

Init LR: 1.000e-4, ReduceLROnPlateau: On
Decay patience: 5, Decay factor: 0.50, Num decays: 3
TRAINING START .....
==========================================================================================
------------------------------------------------------------------------------------------
| Epoch: 00 | Train Loss: 13.803186 | Train minADE[2/3]: 1.4996 / 2.4273 | Train minFDE[2/3]: 2.3533 / 4.2484 | Train avgADE[2/3]: 3.5996 / 5.3038 | Train avgFDE[2/3]: 6.0856 / 9.5970
| Epoch: 00 | Valid Loss: 8.407707 | Valid minADE[2/3]: 0.8077 / 1.2822 | Valid minFDE[2/3]: 1.1757 /2.1056 | Valid avgADE[2/3]: 2.0120 / 2.9221 | Valid avgFDE[2/3]: 3.3554 /5.2128 | Scheduler Metric: 3.3879 | Learning Rate: 1.000e-4
.
.
.
==========================================================================================
------------------------------------------------------------------------------------------
| Epoch: 66 | Train Loss: -7.240098 | Train minADE[2/3]: 0.3152 / 0.5650 | Train minFDE[2/3]: 0.5148 / 1.0795 | Train avgADE[2/3]: 0.6551 / 1.0646 | Train avgFDE[2/3]: 1.1580 / 2.1559
| Epoch: 66 | Valid Loss: 28.089247 | Valid minADE[2/3]: 0.3160 / 0.5695 | Valid minFDE[2/3]: 0.5239 /1.0990 | Valid avgADE[2/3]: 0.6499 / 1.0646 | Valid avgFDE[2/3]: 1.1571 /2.1697 | Scheduler Metric: 1.6685 | Learning Rate: 0.062e-4

Halt training since the lr decayed below 1.25e-05.
Training Complete!

Finetuning

python main.py --model_type=AttGlobal_Scene_CAM_NFDecoder --dataset=nuscenes --
train_partition=train_all --train_cache=./train_all.nusc.pkl --val_partition=val --val_cache=./val.nusc.pkl --tag=attgscam.finetune.nusc
--batch_size=8 --init_lr=1e-5 --lr_decay=0 --restore_path=./experiment/attgscam.nusc__31_March__22_17_/ --restore_epoch=66

Experiment Path experiment/attgscam.finetune.nusc__01_April__17_03_.
Set random seed 88245.
.
.
.
Loading nuScenes dataset.
Found train_all set cache train_all.nusc.pkl with 21830 samples.
Data Loading Complete!
Found val set cache val.nusc.pkl with 4669 samples.
Data Loading Complete!
Train Examples: 21830
Valid Examples: 4669
Loading checkpoint from experiment/attgscam.nusc__31_March__22_17_/ck_66_22.9269_51.6231_0.5695_1.0990.pth.tar
Trainer Initialized!

Init LR: 0.100e-4, ReduceLROnPlateau: Off
TRAINING START .....
==========================================================================================
------------------------------------------------------------------------------------------
| Epoch: 67 | Train Loss: -1.602181 | Train minADE[2/3]: 0.3196 / 0.5684 | Train minFDE[2/3]: 0.5112 / 1.0631 | Train avgADE[2/3]: 0.6848 / 1.1082 | Train avgFDE[2/3]: 1.2039 / 2.2368
| Epoch: 67 | Valid Loss: 6.696390 | Valid minADE[2/3]: 0.3364 / 0.6064 | Valid minFDE[2/3]: 0.5441 /1.1487 | Valid avgADE[2/3]: 0.7248 / 1.1825 | Valid avgFDE[2/3]: 1.2856 /2.4029 | Scheduler Metric: 1.7551 | Learning Rate: 0.100e-4

KeyboardInterrupt

Testing

python main.py --model_type=AttGlobal_Scene_CAM_NFDecoder --dataset=nuscenes \
--test_partition=val --test_cache=./val.nusc.pkl \
--tag=attgscam.finetune.nusc --batch_size=8 \
--test_ckpt=./experiment/attgscam.finetune.nusc__01_April__17_03_/ck_67_1.4858_52.1059_0.6064_1.1487.pth.tar

Test Path tests/attgscam.finetune.nusc__01_April__17_41_.
Set random seed 88245.
.
.
.
Loading nuScenes dataset.
Found val set cache val.nusc.pkl with 4669 samples.
Data Loading Complete!
Test Examples: 4669
Tester Initialized!

Model Type: AttGlobal_Scene_CAM_NFDecoder
Velocity Const.: 0.50, Detach_Output: Off
Batchsize: 8
TESTING START .....
==========================================================================================
Working on test 1/10, batch 584/584... Complete.
minADE3: 0.609, minFDE3: 1.154
avgADE3: 1.186, avgFDE3: 2.408
minMSD: 0.609, avgMSD: 1.186
rF: 2.087, DAO: 22.189, DAC: 0.934
.
.
.
Working on test 10/10, batch 584/584... Complete.
minADE3: 0.606, minFDE3: 1.151
avgADE3: 1.185, avgFDE3: 2.407
minMSD: 0.606, avgMSD: 1.185
rF: 2.091, DAO: 22.162, DAC: 0.934
--Final Performane Report--
minADE3: 0.608±0.00136, minFDE3: 1.152±0.00204
avgADE3: 1.184±0.00091, avgFDE3: 2.405±0.00232
minMSD: 0.608±0.00136, avgMSD: 1.184±0.00091
rF: 2.087±0.00311, DAO: 22.168±0.01221, DAC: 0.934±0.00019

Things to do

  • Re-implement preprocess_argoverse.py and preprocess_nuscenes.py.
  • Re-organize implementations for MATF/R2P2/Desire/CSP/SimpleEncoderDecoder
  • Re-organize comments for model definition and miscellaneous scripts.
  • Note hyperparameters for training/fine-tuning at README.md.

Citation

Please cite the original publication;

@inproceedings{park2020diverse,
  title={Diverse and admissible trajectory forecasting through multimodal context understanding},
  author={Park, Seong Hyeon and Lee, Gyubok and Seo, Jimin and Bhat, Manoj and Kang, Minseok and Francis, Jonathan and Jadhav, Ashwin and Liang, Paul Pu and Morency, Louis-Philippe},
  booktitle={European Conference on Computer Vision},
  pages={282--298},
  year={2020},
  organization={Springer}
}

License

This code is published under the General Public License version 2.0.

cmu-datf's People

Contributors

argyroneta-aquatica avatar kami93 avatar manojbhat09 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

cmu-datf's Issues

How to generate the '.pkl' files from the source dataset?

After I download the argoverse and nuscence dataset, I still cant run the code. I found all samples are saved in '.pkl' files in the code, but there are no these files in the source dataset.
So, how can I get these files that can be used in the code from the source argoverse motion forecasting dataset and nuscence dataset?

Number of hypotheses - Argoverse

Hi,
I have a question regarding the (Generalizability across datasets) section in the paper:

There you said that you are reporting the results on different dataset namely argoverse, where you used 6 hypotheses. Are the argoverse results in Table 5 computed for 6 hypotheses?

Also in the text of the same section, you said that your full model achieves 0.915 MinFDE while a baseline MFP3 achieves 1.399, but both of these numbers are not listed in Table 5?

And by the way, where is the MFP3 baseline results?

Thanks,

Cannot repreduce Argoverse result

Hello!
I tried to train a model on Argoverse using the default arguments in the readme (only change the dataset from nuscenes to argoverse)
--tag='Global_Scene_CAM_NFDecoder' --model_type='Global_Scene_CAM_NFDecoder'
--dataset=argoverse --batch_size=4 --num_epochs=100 --gpu_devices=3
--map_version '2.0'
In total the model converged well after 35 epochs and I stopped training at 48th epoch. However, there is a big performance gap between my reproducing (minADE:1.112, minFDE:1.802, validation set) and the number I found in the paper (minADE: 0.806 minFDE: 1.252). I guess there are something I missed. Is it my argument setting correct? Thank you!

What is the test data for Argoverse

Hello,
I notice that we don't have ground truth prediction data in argoverse test set. Do the numbers in paper come from the validation set? Thank you!

Data preprocessing of the Argoverse dataset

Hello!
I tried to train a model on Argoverse using the default arguments in the readme (only change the dataset from nuscenes to argoverse). But the error appears on line 284 and 316 in argoverse.py (functions: get_data() and extract_directory()).

Not A Directory Error: [Errno 20] Not a directory: './data/argoverse/train/1.csv/observation'

The structure of Argoverse data on my computer is:

-data
  |- argoverse
      |- train
         |- 1.csv
         |- 2.csv
         |- ...
      |- val
      |- test

Can you show the structure of Argoverse data on your computer? What can I do to solve this problem?

Thank you!

Multi-gpu Training

Hi,

Thanks for the nice work, I have a question regarding the multi-gpu training.
According to the training arguments, you specify --gpu_devices 3. But I could not find in the code the place for setting the model to be trained on multi-gpu.

Question about drivable area in Argoverse dataset

Thanks for sharing your great work.
I have some questions about the description of the drivable area in the paper.

  1. what is the distance transform of the binary mask? Is the binary mask the binary drivable area matrix?

  2. How can I calculate the distance between one predicted location and drivable area?

Running other baselines

First of all, thanks for the great work and for sharing that.

I ran your code and could achieve the results in your paper. Now I want to run your code for other baselines that you used in your work (R2P2_MA, desire, etc) but there is no dataset file for them in your repo and it's not clear what is the inputs for these models. More specifically, in "utils.py" inside the function "train_single_epoch", when you unpack the input batch, there are
different unpacked values compared with what we had in the proposed folder.

Does there exist a dataset file for other models or we should use the same dataloader? Can you elaborate on that, please?

thank you,

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.