- Medium Blog | Zhihu (in Chinese)
- CVPR 2023 Autonomous Driving Challenge - Occupancy Track
- Point of contact: [email protected]
[2023/08/04] OpenScene v1.0 released
- Highlights
- Task and Evaluation Metric
- Ecosystem and Leaderboard
- TODO
- Getting Started
- License and Citation
- Related Resources
As we quote from OccNet:

> Occupancy serves as a general representation of the scene and could facilitate perception and planning in the full-stack of autonomous driving. 3D occupancy is a geometry-aware representation of the scene. Compared to the formulations of 3D bounding boxes and BEV segmentation, 3D occupancy can capture the fine-grained details of critical obstacles in the driving scene.
Driving behavior on a sunny day does not apply to driving through dancing snowflakes. For machine learning, data is the must-have food. To this end, we build OpenScene on top of nuPlan, covering over 120 hours of occupancy labels collected in various cities, from Boston, Pittsburgh, and Las Vegas to Singapore.
The stats of the dataset are summarized below.
| Dataset | Original Database | Sensor Data (hr) | Flow | Semantic Category |
|---|---|---|---|---|
| MonoScene | NYUv2 / SemanticKITTI | 5 / 6 | ✗ | 10 / 19 |
| Occ3D | nuScenes / Waymo | 5.5 / 5.7 | ✗ | 16 / 14 |
| Occupancy-for-nuScenes | nuScenes | 5.5 | ✗ | 16 |
| SurroundOcc | nuScenes | 5.5 | ✗ | 16 |
| OpenOccupancy | nuScenes | 5.5 | ✗ | 16 |
| SSCBench | KITTI-360 / nuScenes / Waymo | 1.8 / 4.7 / 5.6 | ✗ | 19 / 16 / 14 |
| OccNet | nuScenes | 5.5 | ✗ | 16 |
| OpenScene | nuPlan | 🔥 120 | ✔️ | TODO |
- The time span of LiDAR frames accumulated for each occupancy annotation is 20 seconds.
- Flow: the annotation of motion direction and velocity for each occupancy grid.
- TODO: full semantic labels of the grids will be released in a future version.
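To make the annotation described above concrete, here is a toy sketch of a voxel grid carrying a semantic label and a flow vector per cell. The grid resolution, free-space label, class id, and array layout are illustrative assumptions, not the released file format:

```python
import numpy as np

# Illustrative resolution and free-space label; the released format may differ.
H, W, Z = 200, 200, 16
FREE = 255

semantics = np.full((H, W, Z), FREE, dtype=np.uint8)  # one class id per voxel
flow = np.zeros((H, W, Z, 2), dtype=np.float16)       # (vx, vy) in m/s per voxel

# Mark a toy "car" occupying a block of voxels, moving at 5 m/s along +x.
CAR = 4                                               # hypothetical class id
semantics[100:104, 80:88, 0:3] = CAR
flow[100:104, 80:88, 0:3] = (5.0, 0.0)

occupied = semantics != FREE
print(occupied.sum())  # 4 * 8 * 3 = 96 occupied voxels
```

Storing flow densely alongside semantics like this is what lets a model reason about both geometry and motion from a single grid.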
🔥 OpenScene: Empowering DriveAGI in the Era of Foundation Models
Which formulation is best suited to modeling autonomous driving scenarios? We posit that incorporating the motion information of occupancy flow can help bridge the gap between decision-making and scene representation. Besides, the OpenScene dataset provides a semantic label for each foreground grid, a crucial first step toward achieving DriveAGI.
Disclaimer: The following task (or title) is prone to change as we are shaping the 2024 edition of the Autonomous Driving Challenge.
Given massive amounts of images from multiple cameras in OpenScene, the goal is to predict the current occupancy state and semantics of each voxel grid in the scene. In this task, we use the mean intersection-over-union (mIoU) over all classes to evaluate model performance.
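As a reference for how such a per-voxel mIoU can be computed, here is a minimal sketch; the function name, `ignore_index` convention, and class handling are assumptions for illustration, not the official evaluation code:

```python
import numpy as np

def voxel_miou(pred, gt, num_classes, ignore_index=255):
    """Mean IoU over semantic classes for two voxel-label volumes.

    pred, gt: integer arrays of identical shape, one class id per voxel.
    Voxels labeled ignore_index in gt are excluded; classes absent from
    both prediction and ground truth are skipped in the mean.
    """
    pred, gt = pred.ravel(), gt.ravel()
    keep = gt != ignore_index
    pred, gt = pred[keep], gt[keep]
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```

For example, `voxel_miou(np.array([0, 1, 1, 1]), np.array([0, 0, 1, 1]), 2)` averages an IoU of 1/2 for class 0 and 2/3 for class 1, giving roughly 0.583.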
Here we provide a naive baseline for large-scale occupancy prediction on the OpenScene mini set, trained with 8 Tesla A100 GPUs.
| Backbone | mIoU (%) | IoU@Car (%) | Precision (%) | Recall (%) | Memory (MB/GPU) | Time (hr) |
|---|---|---|---|---|---|---|
| ResNet-50 | 7.5 (not fully trained) | 21.4 | 24.4 | 65.3 | 9260 | 43 |
| VoVNet-99 | 14.4 (not fully trained) | 35.9 | 46.7 | 76.1 | 14537 | 81 |

mIoU, IoU@Car, Precision, and Recall are evaluated on 20% of the OpenScene mini set. Memory and Time are recorded as a reference for resource consumption during training.
In this task, given arbitrary data and architectures, we aim to build a unified backbone (aka a foundation model) that effectively addresses multifaceted downstream tasks. The OpenScene metric (OSM) is adopted to evaluate the effectiveness of such a foundation model in all aspects. To train the large model, you can use OpenScene or any other solution at your discretion.
| Downstream Task | KITTI | nuScenes | Waymo | Scene Diversity | OSM |
|---|---|---|---|---|---|
| 3D Detection | | ✔️ | | downtown, crowded | NDS |
| Semantic Segmentation | | ✔️ | | downtown, crowded | mIoU |
| Scene Completion | | ✔️ | | downtown, crowded | mIoU |
| Map Construction | | ✔️ | | downtown, crowded | mAP |
| Object Tracking | | | ✔️ | suburb, nighttime, rainy | MOTA |
| Depth Estimation | ✔️ | | | countryside, highway | SILog |
| Visual Odometry | ✔️ | | | countryside, highway | Translation |
| Flow Estimation | ✔️ | | | countryside, highway | Fl-all |
| 3D Lane Detection | | | ✔️ | suburb, nighttime, rainy | F1-Score |
- We consolidate the above metrics into OSM by computing a weighted sum.
- The listed datasets and tasks are tentative. Please refer to the AD24 challenge (TBA) for details.
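A consolidation of this kind can be sketched as follows; the weights, the score normalization, and the handling of lower-is-better metrics (e.g. SILog) are placeholders, since the official OSM definition is yet to be announced:

```python
# Placeholder weights; the official OSM weighting is TBA.
WEIGHTS = {"NDS": 1.0, "mIoU": 1.0, "SILog": 1.0}

def osm(scores, weights=WEIGHTS):
    """Weighted sum of per-task scores, normalized by total weight.

    Assumes every score is already mapped to [0, 1] with higher = better;
    error metrics such as SILog would first need to be inverted/normalized.
    """
    total = sum(weights.values())
    return sum(weights[task] * scores[task] for task in weights) / total

print(osm({"NDS": 0.6, "mIoU": 0.3, "SILog": 0.9}))  # ≈ 0.6
```

With uniform weights this reduces to a plain average; the point of the weighted form is that the organizers can emphasize some downstream tasks over others.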
We plan to release a trailer version of the upcoming challenge. Please stay tuned for more details in late August.
- Challenge website: AD24Challenge
- Please submit your great work, as we regularly maintain this leaderboard!
- Challenge website: AD23Challenge
- OpenScene v1.0
- Full-stack annotation update: background label and camera-view mask
- Official Announcement for Autonomous Driving Challenge 2024
Our dataset is based on the nuPlan Dataset; we therefore distribute the data under the Creative Commons Attribution-NonCommercial-ShareAlike license and the nuPlan Dataset License Agreement for Non-Commercial Use. You are free to share and adapt the data, but you must give appropriate credit and may not use the work for commercial purposes. All code within this repository is under the Apache License 2.0.
Please consider citing our paper if the project helps your research, with the following BibTeX:
@misc{openscene2023,
title = {OpenScene: The Largest Up-to-Date 3D Occupancy Prediction Benchmark in Autonomous Driving},
author = {OpenScene Contributors},
howpublished={\url{https://github.com/OpenDriveLab/OpenScene}},
year = {2023}
}
@article{sima2023_occnet,
title={Scene as Occupancy},
author={Chonghao Sima and Wenwen Tong and Tai Wang and Li Chen and Silei Wu and Hanming Deng and Yi Gu and Lewei Lu and Ping Luo and Dahua Lin and Hongyang Li},
year={2023},
eprint={2306.02851},
archivePrefix={arXiv},
primaryClass={cs.CV}
}