This project forked from swc-17/sparsedrive

SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation

License: MIT License

Shell 0.34% C++ 0.84% Python 96.86% Cuda 1.96%

News

  • 24 June, 2024: We reorganized the code for better readability. Code and models are released.
  • 31 May, 2024: We released the SparseDrive paper on arXiv. Code and models will be released in June, 2024. Please stay tuned!

Introduction

SparseDrive is a Sparse-Centric paradigm for end-to-end autonomous driving.

  • We explore sparse scene representation for end-to-end autonomous driving and propose a Sparse-Centric paradigm named SparseDrive, which unifies multiple tasks with a sparse instance representation.
  • We revisit the strong similarity between motion prediction and planning, which leads to a parallel design for the motion planner. We further propose a hierarchical planning selection strategy that incorporates a collision-aware rescore module to boost planning performance.
  • On the challenging nuScenes benchmark, SparseDrive surpasses previous SOTA methods on all metrics, especially the safety-critical collision rate, while achieving much higher training and inference efficiency.
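The idea behind a collision-aware rescore can be illustrated with a minimal sketch. This is not the SparseDrive implementation: the function names `collides` and `select_plan`, the point-distance collision test with a fixed `radius`, and the choice to zero out the score of colliding candidates are all assumptions made for illustration. The hierarchical selection step (for example, filtering by driving command first) is omitted.

```python
import numpy as np

def collides(traj, obstacles, radius):
    """Return True if any ego waypoint is within `radius` of an agent at the same timestep.

    traj:      (T, 2) candidate ego waypoints
    obstacles: (N, T, 2) predicted agent positions (hypothetical input format)
    """
    if obstacles.shape[0] == 0:
        return False
    dists = np.linalg.norm(obstacles - traj[None, :, :], axis=-1)  # (N, T)
    return bool((dists < radius).any())

def select_plan(trajs, scores, obstacles, radius=1.0):
    """Zero the score of colliding candidates, then pick the best remaining one.

    trajs:  (M, T, 2) candidate ego trajectories
    scores: (M,) confidence of each candidate
    """
    rescored = scores.astype(float).copy()
    for m, traj in enumerate(trajs):
        if collides(traj, obstacles, radius):
            rescored[m] = 0.0  # assumption: colliding candidates are fully suppressed
    return int(np.argmax(rescored)), rescored

# Toy scene: one stationary agent at (0, 3) for all three timesteps.
straight = np.stack([np.zeros(3), np.array([1.0, 2.0, 3.0])], axis=-1)
swerve = np.stack([np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 3.0])], axis=-1)
trajs = np.stack([straight, swerve])          # (2, 3, 2)
scores = np.array([0.7, 0.3])
obstacles = np.array([[[0.0, 3.0], [0.0, 3.0], [0.0, 3.0]]])  # (1, 3, 2)

best, rescored = select_plan(trajs, scores, obstacles, radius=1.0)
print(best, rescored)
```

In this toy scene the higher-scored straight trajectory ends on the agent, so its score is suppressed and the swerving candidate is selected.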


Overview of SparseDrive. SparseDrive first encodes multi-view images into feature maps, then learns a sparse scene representation through symmetric sparse perception, and finally performs motion prediction and planning in parallel. An instance memory queue is used for temporal modeling.


Model architecture of symmetric sparse perception, which unifies detection, tracking and online mapping in a symmetric structure.


Model structure of the parallel motion planner, which performs motion prediction and planning simultaneously and outputs a safe planning trajectory.

Results in paper

  • Comprehensive results for all tasks on nuScenes.
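| Method | NDS | AMOTA | minADE (m) | L2 (m) Avg | Col. (%) Avg | Training Time (h) | FPS |
|---|---|---|---|---|---|---|---|
| UniAD | 0.498 | 0.359 | 0.71 | 0.73 | 0.61 | 144 | 1.8 |
| SparseDrive-S | 0.525 | 0.386 | 0.62 | 0.61 | 0.08 | 20 | 9.0 |
| SparseDrive-B | 0.588 | 0.501 | 0.60 | 0.58 | 0.06 | 30 | 7.3 |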
  • Open-loop planning results on nuScenes.
| Method | L2 (m) 1s | L2 (m) 2s | L2 (m) 3s | L2 (m) Avg | Col. (%) 1s | Col. (%) 2s | Col. (%) 3s | Col. (%) Avg | FPS |
|---|---|---|---|---|---|---|---|---|---|
| UniAD | 0.45 | 0.70 | 1.04 | 0.73 | 0.62 | 0.58 | 0.63 | 0.61 | 1.8 |
| VAD | 0.41 | 0.70 | 1.05 | 0.72 | 0.03 | 0.19 | 0.43 | 0.21 | 4.5 |
| SparseDrive-S | 0.29 | 0.58 | 0.96 | 0.61 | 0.01 | 0.05 | 0.18 | 0.08 | 9.0 |
| SparseDrive-B | 0.29 | 0.55 | 0.91 | 0.58 | 0.01 | 0.02 | 0.13 | 0.06 | 7.3 |
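
As a rough reading of the efficiency columns in the comprehensive results table, the snippet below computes the ratios between UniAD and SparseDrive-S from the reported training time and FPS. The numbers are copied from that table; the dictionary layout and variable names are only illustrative.

```python
# Reported values from the comprehensive results table (UniAD vs. SparseDrive-S).
uniad = {"train_h": 144, "fps": 1.8}
sparsedrive_s = {"train_h": 20, "fps": 9.0}

# Ratio of training hours (higher means SparseDrive-S trains faster).
train_speedup = round(uniad["train_h"] / sparsedrive_s["train_h"], 1)
# Ratio of inference throughput in frames per second.
fps_speedup = round(sparsedrive_s["fps"] / uniad["fps"], 1)

print(f"training: {train_speedup}x, inference: {fps_speedup}x")
```

With the reported numbers this gives roughly 7.2x less training time and 5.0x higher FPS for SparseDrive-S relative to UniAD.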

Results of released checkpoint

We found that some collision cases were not taken into account in our previous code, so we re-implemented the collision rate evaluation metric in the released code and provide updated results below.

Main results

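| Model | config | ckpt | log | det: NDS | mapping: mAP | track: AMOTA | track: AMOTP | motion: EPA_car | motion: minADE_car | motion: minFDE_car | motion: MissRate_car | planning: CR | planning: L2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Stage1 | cfg | ckpt | log | 0.5260 | 0.5689 | 0.385 | 1.260 | - | - | - | - | - | - |
| Stage2 | cfg | ckpt | log | 0.5257 | 0.5656 | 0.372 | 1.248 | 0.492 | 0.61 | 0.95 | 0.133 | 0.097% | 0.61 |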

Detailed results for planning

| Method | L2 (m) 1s | L2 (m) 2s | L2 (m) 3s | L2 (m) Avg | Col. (%) 1s | Col. (%) 2s | Col. (%) 3s | Col. (%) Avg |
|---|---|---|---|---|---|---|---|---|
| UniAD | 0.45 | 0.70 | 1.04 | 0.73 | 0.66 | 0.66 | 0.72 | 0.68 |
| UniAD-wo-post-optim | 0.32 | 0.58 | 0.94 | 0.61 | 0.17 | 0.27 | 0.42 | 0.29 |
| VAD | 0.41 | 0.70 | 1.05 | 0.72 | 0.03 | 0.21 | 0.49 | 0.24 |
| SparseDrive-S | 0.30 | 0.58 | 0.95 | 0.61 | 0.01 | 0.05 | 0.23 | 0.10 |
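
The Avg columns in the detailed planning table appear to be the arithmetic mean of the 1s, 2s and 3s values. The short check below tests that reading against the reported numbers, allowing for two-decimal rounding. It only inspects the published figures and says nothing about how the evaluation code computes them; the `rows` layout and the `consistent` helper are illustrative.

```python
# (L2 at 1s/2s/3s, reported L2 Avg, Col. at 1s/2s/3s, reported Col. Avg)
rows = {
    "UniAD": ([0.45, 0.70, 1.04], 0.73, [0.66, 0.66, 0.72], 0.68),
    "UniAD-wo-post-optim": ([0.32, 0.58, 0.94], 0.61, [0.17, 0.27, 0.42], 0.29),
    "VAD": ([0.41, 0.70, 1.05], 0.72, [0.03, 0.21, 0.49], 0.24),
    "SparseDrive-S": ([0.30, 0.58, 0.95], 0.61, [0.01, 0.05, 0.23], 0.10),
}

def consistent(values, reported, tol=0.006):
    """True if the mean of `values` matches `reported` within rounding tolerance."""
    return abs(sum(values) / len(values) - reported) < tol

checks = {
    name: consistent(l2, l2_avg) and consistent(col, col_avg)
    for name, (l2, l2_avg, col, col_avg) in rows.items()
}
print(checks)
```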

Quick Start

Citation

If you find SparseDrive useful in your research or applications, please consider giving us a star 🌟 and citing it using the following BibTeX entry.

@article{sun2024sparsedrive,
  title={SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation},
  author={Sun, Wenchao and Lin, Xuewu and Shi, Yining and Zhang, Chuang and Wu, Haoran and Zheng, Sifa},
  journal={arXiv preprint arXiv:2405.19620},
  year={2024}
}

Acknowledgement
