MixFormerV2

The official implementation of the NeurIPS 2023 paper: MixFormerV2: Efficient Fully Transformer Tracking.

Model Framework

[Figure: MixFormerV2 model framework]

Distillation Training Pipeline

[Figure: distillation training pipeline]

News

  • [Sep 22, 2023] MixFormerV2 was accepted to NeurIPS 2023! 🎉

  • [May 31, 2023] We released two versions of the pretrained model, which can be accessed on Google Drive.

  • [May 26, 2023] Code is available now!

Highlights

✨ Efficient Fully Transformer Tracking Framework

MixFormerV2 is a unified, fully transformer-based tracking model with no dense convolutional operations and no complex score prediction module. We propose four key prediction tokens to capture the correlation between the target template and the search area.

✨ A New Distillation-based Model Reduction Paradigm

To further improve efficiency, we present a new distillation paradigm for tracking models, consisting of a dense-to-sparse stage and a deep-to-shallow stage.
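The README does not spell out the distillation losses, so the following is only a generic illustration of knowledge distillation, not the paper's exact objective. It assumes the teacher and the student each output logits over the same set of bins and that the student is trained to match the teacher's distribution via KL divergence. All function names and the temperature parameter are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Convert logits into a probability distribution (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """Assumed generic loss: how far the student's distribution is from the teacher's."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return kl_divergence(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. In an actual training setup the same idea would be applied to tensors inside the training framework.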

✨ Strong Performance and Fast Inference Speed

MixFormerV2 performs well across benchmarks, achieving 70.6% AUC on LaSOT and 57.4% AUC on TNL2k while running at 165 FPS on GPU. To the best of our knowledge, MixFormerV2-S is the first transformer-based one-stream tracker to run in real time on CPU.
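The README does not define AUC. In single-object tracking benchmarks it usually denotes the area under the success plot, that is, the success rate averaged over a range of overlap thresholds. The sketch below works under that assumption. The helper names (iou, success_auc), the (x, y, w, h) box format, and the 21 evenly spaced thresholds are illustrative choices, not taken from this repository.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(pred: Sequence[Box], gt: Sequence[Box], num_thresholds: int = 21) -> float:
    """Mean success rate over evenly spaced IoU thresholds in [0, 1]."""
    if not pred or len(pred) != len(gt):
        raise ValueError("pred and gt must be non-empty and of equal length")
    overlaps: List[float] = [iou(p, g) for p, g in zip(pred, gt)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    # A frame counts as a success at threshold t when its IoU strictly exceeds t.
    success = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(success) / len(success)
```

Because the comparison is strict, even a perfect tracker scores slightly below 1.0 under this definition: no overlap can exceed the final threshold of 1.0.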

Install the environment

Use Anaconda:

conda create -n mixformer2 python=3.6
conda activate mixformer2
bash install_requirements.sh

Data Preparation

Put the tracking datasets in ./data. It should look like:

   ${MixFormerV2_ROOT}
    -- data
        -- lasot
            |-- airplane
            |-- basketball
            |-- bear
            ...
        -- got10k
            |-- test
            |-- train
            |-- val
        -- coco
            |-- annotations
            |-- train2017
        -- trackingnet
            |-- TRAIN_0
            |-- TRAIN_1
            ...
            |-- TRAIN_11
            |-- TEST
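Before training, it can help to confirm that the directory tree above is in place. The following is a minimal sketch of such a check and is not part of the repository. The function name missing_dirs is hypothetical, the elided TrackingNet entries are assumed to be TRAIN_2 through TRAIN_10, and the LaSOT class folders are not checked individually because the listing above elides them.

```python
from pathlib import Path
from typing import Dict, List

# Expected sub-directories per dataset, taken from the layout above.
# Assumption: the elided TrackingNet chunks are TRAIN_2 ... TRAIN_10.
EXPECTED_LAYOUT: Dict[str, List[str]] = {
    "lasot": [],  # class folders (airplane, basketball, ...) are not enumerated here
    "got10k": ["test", "train", "val"],
    "coco": ["annotations", "train2017"],
    "trackingnet": ["TRAIN_{}".format(i) for i in range(12)] + ["TEST"],
}


def missing_dirs(data_root: str) -> List[str]:
    """Return the expected directories that do not exist under data_root."""
    root = Path(data_root)
    missing: List[str] = []
    for dataset, subdirs in EXPECTED_LAYOUT.items():
        dataset_dir = root / dataset
        if not dataset_dir.is_dir():
            # Report the dataset once instead of listing every sub-directory.
            missing.append(dataset)
            continue
        for sub in subdirs:
            if not (dataset_dir / sub).is_dir():
                missing.append("{}/{}".format(dataset, sub))
    return missing
```

Calling missing_dirs("./data") from the project root returns an empty list when every expected directory is present.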

Set project paths

Run the following command to set the paths for this project:

python tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir .

After running this command, you can also modify the paths by editing these two files:

lib/train/admin/local.py  # paths about training
lib/test/evaluation/local.py  # paths about testing
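If the project will be launched from different working directories, one option is to pass absolute paths to the script. The sketch below only builds the argument list, using the three flags shown above. The helper name build_path_setup_command is hypothetical, and it is an assumption that the script handles absolute paths the same way as relative ones.

```python
import sys
from pathlib import Path
from typing import List


def build_path_setup_command(
    workspace_dir: str = ".",
    data_dir: str = "./data",
    save_dir: str = ".",
) -> List[str]:
    """Build the argv for tracking/create_default_local_file.py with absolute paths."""
    return [
        sys.executable,  # the interpreter of the currently active environment
        "tracking/create_default_local_file.py",
        "--workspace_dir", str(Path(workspace_dir).resolve()),
        "--data_dir", str(Path(data_dir).resolve()),
        "--save_dir", str(Path(save_dir).resolve()),
    ]
```

From the project root, the result can be passed to subprocess.run(build_path_setup_command(), check=True) to execute the script.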

Train MixFormerV2

Train with multiple GPUs using DDP. Other training settings can be found in tracking/train_mixformer.sh.

bash tracking/train_mixformer.sh

Test and evaluate MixFormerV2 on benchmarks

  • LaSOT/GOT10k-test/TrackingNet/OTB100/UAV123/TNL2k. Test settings can be found in tracking/test_mixformer.sh.

bash tracking/test_mixformer.sh

TODO

  • Progressive eliminating version of training.
  • Fast version of test forwarding.

Contact

Tianhui Song: [email protected]

Yutao Cui: [email protected]

Citation

@misc{mixformerv2,
      title={MixFormerV2: Efficient Fully Transformer Tracking}, 
      author={Yutao Cui and Tianhui Song and Gangshan Wu and Limin Wang},
      year={2023},
      eprint={2305.15896},
      archivePrefix={arXiv}
}
