
DiffMOT (CVPR2024)

DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction (arXiv paper)

Teaser

Framework


News

  • We have now uploaded the trained motion model.
  • 2024-02-27: This work is accepted by CVPR-2024.

Tracking performance

Benchmark Evaluation

Dataset HOTA IDF1 AssA MOTA DetA Weight Results
DanceTrack 62.3 63.0 47.2 92.8 82.5 download DanceTrack_Results
SportsMOT 76.2 76.1 65.1 97.1 89.3 download SportsMOT_Results
MOT17 64.5 79.3 64.6 79.8 64.7 download MOT17_Results
MOT20 61.7 74.9 60.5 76.7 63.2 download MOT20_Results

Results on the DanceTrack test set with different detectors

Detector HOTA IDF1 MOTA FPS
YOLOX-S 53.3 56.6 88.4 30.3
YOLOX-M 57.2 58.6 91.2 25.4
YOLOX-L 61.5 61.7 92.0 24.2
YOLOX-X 62.3 63.0 92.8 22.7

The tracking speed (including both detection and tracking) was tested on an RTX 3090 GPU. Smaller detectors achieve higher FPS, which indicates that DiffMOT can flexibly pair with different detectors for various real-world application scenarios. With YOLOX-S, the entire system reaches 30.3 FPS.

Video demos

I. Installation.

  • Install torch
conda create -n diffmot python=3.9
conda activate diffmot
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
  • Install other packages
pip install -r requirement.txt
  • Install external dependencies
cd external/YOLOX/
pip install -r requirements.txt && python setup.py develop
cd ../deep-person-reid/
pip install -r requirements.txt && python setup.py develop
cd ../fast_reid/
pip install -r docs/requirements.txt

II. Prepare Data.

The file structure should look like:

  • DanceTrack
{DanceTrack ROOT}
|-- dancetrack
|   |-- train
|   |   |-- dancetrack0001
|   |   |   |-- img1
|   |   |   |   |-- 00000001.jpg
|   |   |   |   |-- ...
|   |   |   |-- gt
|   |   |   |   |-- gt.txt            
|   |   |   |-- seqinfo.ini
|   |   |-- ...
|   |-- val
|   |   |-- ...
|   |-- test
|   |   |-- ...
  • SportsMOT
{SportsMOT ROOT}
|-- sportsmot
|   |-- splits_txt
|   |-- scripts
|   |-- dataset
|   |   |-- train
|   |   |   |-- v_1LwtoLPw2TU_c006
|   |   |   |   |-- img1
|   |   |   |   |   |-- 000001.jpg
|   |   |   |   |   |-- ...
|   |   |   |   |-- gt
|   |   |   |   |   |-- gt.txt
|   |   |   |   |-- seqinfo.ini         
|   |   |   |-- ...
|   |   |-- val
|   |   |   |-- ...
|   |   |-- test
|   |   |   |-- ...
  • MOT17/20 (we train on MOT17 and MOT20 together)
{MOT17/20 ROOT}
|-- mot
|   |-- train
|   |   |-- MOT17-02
|   |   |   |-- img1
|   |   |   |   |-- 000001.jpg
|   |   |   |   |-- ...
|   |   |   |-- gt
|   |   |   |   |-- gt.txt            
|   |   |   |-- seqinfo.ini
|   |   |-- ...
|   |   |-- MOT20-01
|   |   |   |-- img1
|   |   |   |   |-- 000001.jpg
|   |   |   |   |-- ...
|   |   |   |-- gt
|   |   |   |   |-- gt.txt            
|   |   |   |-- seqinfo.ini
|   |   |-- ...
|   |-- test
|   |   |-- ...
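
The gt.txt files in the trees above follow the MOTChallenge annotation format: one comma-separated row per box, `frame,id,left,top,width,height,conf,class,visibility`. A minimal parser sketch (`parse_gt_line` is a hypothetical helper for illustration, not part of the repo):

```python
def parse_gt_line(line):
    """Parse one MOTChallenge gt.txt row into a dict.
    Fields: frame, id, left, top, width, height, conf, class, visibility."""
    f = line.strip().split(",")
    return {
        "frame": int(f[0]),
        "track_id": int(f[1]),
        "box": (float(f[2]), float(f[3]), float(f[4]), float(f[5])),  # ltwh
        "conf": float(f[6]),
        "cls": int(f[7]),
        "visibility": float(f[8]),
    }

rec = parse_gt_line("1,1,912,484,97,109,1,1,0.8")
# rec["frame"] -> 1, rec["box"] -> (912.0, 484.0, 97.0, 109.0)
```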

and run:

python dancetrack_data_process.py
python sports_data_process.py
python mot_data_process.py

III. Model Zoo.

Detection Model

We provide some trained YOLOX weights in download for DiffMOT. Some of them are inherited from ByteTrack, DanceTrack, and MixSort.

ReID Model

Our ReID models for MOT17/MOT20 are the same as those in BoT-SORT; you can download them from MOT17-SBS-S50 and MOT20-SBS-S50. The ReID model for DanceTrack is the same as in Deep-OC-SORT; you can download it from Dance-SBS-S50. The ReID model for SportsMOT is trained by ourselves; you can download it from Sports-SBS-S50.

Notes:

  • MOT20-SBS-S50 is trained by Deep-OC-SORT, because the weight from BoT-SORT is corrupted. Refer to the Issue.
  • The ReID model for SportsMOT is trained by ourselves.

Motion Model (D$^2$MP)

Refer to models. We train on DanceTrack and MOT17/20 for 800 epochs, and train on SportsMOT for 1200 epochs.

IV. Training.

Train the detection model

Train the ReID model

Train the motion model (D$^2$MP)

  • Change the data_dir in config
  • Train on DanceTrack, SportsMOT, and MOT17/20:
python main.py --config ./configs/dancetrack.yaml
python main.py --config ./configs/sportsmot.yaml
python main.py --config ./configs/mot.yaml
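
The `data_dir` key comes from the step above; the other keys in this fragment are illustrative, since the exact schema of the repo's config files is not shown here:

```yaml
# configs/dancetrack.yaml (fragment). Only data_dir is taken from the
# instructions above; the remaining keys are hypothetical placeholders.
data_dir: /path/to/DanceTrack
epochs: 800         # per the Model Zoo section (DanceTrack/MOT17/20)
batch_size: 2048    # hypothetical; adjust to your hardware
```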

V. Tracking.

Prepare detections

Prepare ReID embeddings

Track on DanceTrack

  • Change the info_dir, and save_dir in config.
  • high_thres is set to 0.6, low_thres to 0.4, w_assoc_emb to 2.2, and aw_param to 1.7.
python main.py --config ./configs/dancetrack_test.yaml
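
The thresholds above would live in the test config. A hypothetical fragment (key names are assumed lower-cased versions of the parameters listed above and are not verified against the repo):

```yaml
# configs/dancetrack_test.yaml (illustrative fragment; key names assumed)
info_dir: /path/to/info
save_dir: /path/to/results
high_thres: 0.6     # first-stage association threshold
low_thres: 0.4      # second-stage (low-score) association threshold
w_assoc_emb: 2.2    # appearance-embedding association weight
aw_param: 1.7       # adaptive weighting parameter
```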

Track on SportsMOT

  • Change the info_dir, and save_dir in config.
  • high_thres is set to 0.6, low_thres to 0.4, w_assoc_emb to 2.0, and aw_param to 1.2.
python main.py --config ./configs/sportsmot_test.yaml

Track on MOT17

  • Change the info_dir, and save_dir in config.
  • high_thres is set to 0.6, low_thres to 0.1, w_assoc_emb to 2.2, and aw_param to 1.7.
python main.py --config ./configs/mot17_test.yaml

Track on MOT20

  • Change the info_dir, and save_dir in config.
  • high_thres is set to 0.4, low_thres to 0.1, w_assoc_emb to 2.2, and aw_param to 1.7.
python main.py --config ./configs/mot20_test.yaml

Contact

If you have any questions, please contact [email protected].

Acknowledgement

A large part of the code is borrowed from DDM-Public and Deep-OC-SORT. Thanks for their wonderful work.

Citation

@article{lv2024diffmot,
  title={DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction},
  author={Lv, Weiyi and Huang, Yuhang and Zhang, Ning and Lin, Ruei-Sung and Han, Mei and Zeng, Dan},
  journal={arXiv preprint arXiv:2403.02075},
  year={2024}
}


diffmot's Issues

Testing error

Where can I get det_dir, info_dir, and reid_dir? Thanks.


License

Hi 👋🏻 I noticed that the repository does not contain a license. What is the license of this tracker?

About the detection results on SportsMOT

Thank you for your excellent work.
I would like to ask: in the detection results you provided for the SportsMOT dataset, only the validation set and test set results are available. Would it be possible for you to provide the detection results for the training set as well?

Performance -FPS

How does DiffMOT compare to OC-SORT in terms of inference speed (FPS)? Is it faster or slower?
According to the OC-SORT paper
"The inference speed is ~28FPS by a RTX 2080Ti GPU. If the detections are provided, the inference speed of OC-SORT association is 700FPS by a i9-3.0GHz CPU."

What is the inference speed of DiffMOT if the detections are provided?

Thanks

Track on custom data

Hi, I'd like to try Diffmot on a custom dataset with pretrained re-id and a newly trained detector. How do I go about it?

Training time

Hi, thanks for your excellent work!

I'm interested in the training cost of DiffMOT.
I see that you train DiffMOT on 4 3090 GPUs with batch size 2048 for 800 epochs.
Could you please tell us how much time it takes?

In comparison, DiffusionTrack (AAAI2024) is trained on 8 3090 GPUs for 30 hours.

Thanks!

Demo

Hi,
Thanks for your wonderful work. Would it be possible to add a demo.py script for testing on videos?

Thanks

Different from the results in your paper

I submitted the MOT17_Results you provided to the MOTChallenge official website, but got results that are quite different from those in your paper.

Performance issue with YOLOX detection in SportsMOT

Thank you for sharing the source code. I used the yolox-x model trained on SportsMOT to infer on SportsMOT test data, but the detection accuracy is very poor. What could be the reason?

Here is the execution script:

python -m yolox.tools.demo image \
    --save_result \
    --path ../../data/SportsMOT/test/v_-6Os86HzwCs_c009/img1 \
    -f exps/sportsmot/yolox_x_sportsmot.py \
    -n  yolox-x-sportsmot \
    -c ../../pretrained_models/SportsMOT_yolox_x.tar \
    --fuse

About "x_0" and "x_next" in the diffusion.py

x0 = self.pred_x0_from_xt(x_t, noise_pred, C_pred, cur_time)
x0.clamp_(-1., 1.)
C_pred = -1 * x0
x_next = self.pred_xtms_from_xt(x_t, noise_pred, C_pred, cur_time, s)

Why is traj[t-1] = x_next.detach() rather than traj[t-1] = x0.detach()?
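
For context on the question above: in a reverse-diffusion loop, x0 is only a one-shot clean estimate used to steer the current step, while x_next is the partially denoised state that must be fed into the next iteration. A toy sketch (pure Python; `sample` and its update rule are simplified stand-ins for illustration, not DiffMOT's actual pred_xtms_from_xt):

```python
def sample(x_T, steps):
    """Toy reverse-diffusion loop. The trajectory stores x_next (the state
    fed into the next step), not the one-shot estimate x0_est: at early
    steps x0_est is still a coarse guess, and the loop refines it gradually."""
    x_t = x_T
    traj = [x_t]
    for t in range(steps, 0, -1):
        x0_est = 0.0                        # toy "clean" prediction (clamped)
        x_next = x_t + (x0_est - x_t) / t   # move a fraction toward x0_est
        traj.append(x_next)
        x_t = x_next                        # x_next seeds the next iteration
    return x_t, traj

final, traj = sample(1.0, 4)
# traj -> [1.0, 0.75, 0.5, 0.25, 0.0]; jumping straight to x0_est would
# discard the intermediate refinement steps.
```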

Different from the results in your paper on DanceTrack val/test sets

Thank you for your excellent work.
I used the DanceTrack model weights you provided to obtain tracking results on the DanceTrack val/test sets. However, my results differ from those reported in your paper. Specifically, I obtained a HOTA of 59.169 on the DanceTrack validation set (compared to 55.7 in Table 5 of your paper) and 61.719 on the DanceTrack test set (compared to 62.3 in your paper). Of course, by submitting the DanceTrack_Results you provided, I can reproduce the HOTA reported in the paper.

