deepphysicvision / awesome-papers-world-models-autonomous-driving

This project forked from chaytonmin/awesome-papers-world-models-autonomous-driving

Awesome Papers about World Models in Autonomous Driving

Introduction

World models represent an agent's spatio-temporal knowledge of its environment by predicting future changes.

Two main types of world models in autonomous driving aim to reduce driving uncertainty: the World Model as Neural Driving Simulator and the World Model for End-to-end Driving.

In real-world settings, methods such as GAIA-1 and Copilot4D use generative models to build neural simulators that produce 2D or 3D future scenes, which improves predictive capability.

In simulation, methods such as MILE and TrafficBots build on reinforcement or imitation learning to improve decision-making and future prediction, paving the way to end-to-end autonomous driving.
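To make the idea of "predicting future changes" concrete, here is a minimal toy sketch of the predict, roll out, and plan loop that world-model-based driving relies on. Everything in it is an assumption for illustration: the `State` class, the hand-written `predict_next` dynamics (a stand-in for a learned network), the cost function, and the obstacle position are hypothetical and do not come from any of the papers listed below.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class State:
    """Toy ego state: longitudinal position (m) and speed (m/s)."""
    position: float
    speed: float


def predict_next(state: State, accel: float, dt: float = 0.1) -> State:
    """Hypothetical transition model: predicts the state one step ahead.

    A real world model would be a learned network; this uses fixed kinematics.
    """
    speed = max(0.0, state.speed + accel * dt)  # the car cannot reverse here
    return State(state.position + speed * dt, speed)


def rollout(state: State, actions: Sequence[float], dt: float = 0.1) -> List[State]:
    """Imagine a future trajectory by applying the model step by step."""
    trajectory = []
    for accel in actions:
        state = predict_next(state, accel, dt)
        trajectory.append(state)
    return trajectory


def plan(
    state: State,
    candidates: Sequence[Sequence[float]],
    cost: Callable[[List[State]], float],
    dt: float = 0.1,
) -> Sequence[float]:
    """Return the candidate action sequence whose imagined rollout costs least."""
    return min(candidates, key=lambda actions: cost(rollout(state, actions, dt)))


OBSTACLE_POSITION = 10.0  # assumed obstacle location in metres


def collision_cost(trajectory: List[State]) -> float:
    """Large penalty for reaching the obstacle, otherwise reward progress."""
    final = trajectory[-1]
    if final.position >= OBSTACLE_POSITION:
        return 1000.0
    return -final.position


start = State(position=0.0, speed=8.0)
candidates = [
    [0.0] * 20,   # keep speed
    [-2.0] * 20,  # brake gently
    [-5.0] * 20,  # brake hard
]
best = plan(start, candidates, collision_cost)
print("chosen acceleration per step:", best[0])
```

The planner never touches the real environment: it compares candidate actions purely on imagined futures, which is the property both the neural-simulator and the end-to-end lines of work build on.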

Neural Driving Simulator based on World Models

2D Scene Generation

  • (2023 Arxiv) GAIA-1: A generative world model for autonomous driving [Paper][Blog] (Wayve)
  • (2023 CVPR Workshop) [Video] (Tesla)
  • (2023 Arxiv) DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving [Paper][Code] (GigaAI)
  • (2023 Arxiv) ADriver-I: A General World Model for Autonomous Driving [Paper] (MEGVII)
  • (2023 Arxiv) DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model [Paper] (Baidu)
  • (2023 Arxiv) Panacea: Panoramic and Controllable Video Generation for Autonomous Driving [Paper][Code] (MEGVII)
  • (2024 CVPR) Drive-WM: Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving [Paper][Code] (CASIA)
  • (2023 Arxiv) WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation [Paper] (Fudan)
  • (2024 Arxiv) DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation [Paper][Code] (GigaAI)
  • (2024 CVPR) GenAD: Generalized Predictive Model for Autonomous Driving [Paper][Code] (Shanghai AI Lab)
  • (2024 Arxiv) SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control [Paper] (MEGVII)

3D Scene Generation

  • (2024 ICLR) Copilot4D: Learning unsupervised world models for autonomous driving via discrete diffusion [Paper] (Waabi)
  • (2023 Arxiv) OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving [Paper][Code] (THU)
  • (2023 Arxiv) MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations [Paper] (KIT)
  • (2024 Arxiv) LidarDM: Generative LiDAR Simulation in a Generated World [Paper][Code] (MIT)

4D Pre-training for Autonomous Driving

  • (2023 Arxiv) UniWorld: Autonomous Driving Pre-training via World Models [Paper] (PKU)
  • (2024 CVPR) ViDAR: Visual Point Cloud Forecasting enables Scalable Autonomous Driving [Paper][Code] (Shanghai AI Lab)
  • (2024 CVPR) DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving [Paper] (PKU)

End-to-end Driving based on World Models

  • (2022 NeurIPS) Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models [Paper] (SJTU)
  • (2022 NeurIPS) MILE: Model-Based Imitation Learning for Urban Driving [Paper][Code] (Wayve)
  • (2022 NeurIPS Deep RL Workshop) SEM2: Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model [Paper] (HIT & THU)
  • (2023 ICRA) TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction [Paper] (ETH Zurich)
  • (2024 Arxiv) Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2) [Paper] (SJTU)

Others

  • (1989) Using Occupancy Grids for Mobile Robot Perception and Navigation [Paper]

Citation

If you find our survey useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

@article{generalworldmodelsurvey,
  title={Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond},
  author={Zheng Zhu and Xiaofeng Wang and Wangbo Zhao and Chen Min and Nianchen Deng and Min Dou and Yuqi Wang and Botian Shi and Kai Wang and Chi Zhang and Yang You and Zhaoxiang Zhang and Dawei Zhao and Liang Xiao and Jian Zhao and Jiwen Lu and Guan Huang}, 
  journal={arXiv preprint arXiv:TODO},
  year={2024}
}
