
This project forked from thu-ml/prolificdreamer

Official implementation of ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation (NeurIPS 2023 Spotlight)

Home Page: https://ml.cs.tsinghua.edu.cn/prolificdreamer/

License: Apache License 2.0

Shell 0.30% C++ 0.47% Python 79.47% C 1.31% Cuda 18.45%

ProlificDreamer

Official implementation of ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation, published in NeurIPS 2023 (Spotlight).

Installation

The codebase is built on stable-dreamfusion. To install the dependencies, run:

pip install -r requirements.txt

Training

ProlificDreamer trains in three stages for high-fidelity text-to-3D generation.

# --------- Stage 1 (NeRF, VSD guidance) --------- #
# Requires approximately 27 GB of GPU memory at a rendering resolution of 512x512.
CUDA_VISIBLE_DEVICES=0 python main.py --text "A pineapple." --iters 25000 --lambda_entropy 10 --scale 7.5 --n_particles 1 --h 512  --w 512 --workspace exp-nerf-stage1/
# If the result looks foggy, increase --lambda_entropy. For example:
CUDA_VISIBLE_DEVICES=0 python main.py --text "A pineapple." --iters 25000 --lambda_entropy 100 --scale 7.5 --n_particles 1 --h 512  --w 512 --workspace exp-nerf-stage1/
# Generate with multiple particles. Note that multiple particles are supported only in Stage 1.
CUDA_VISIBLE_DEVICES=0 python main.py --text "A pineapple." --iters 100000 --lambda_entropy 10 --scale 7.5 --n_particles 4 --h 512  --w 512 --t5_iters 20000 --workspace exp-nerf-stage1/

# --------- Stage 2 (Geometry Refinement) --------- #
# Requires less than 20 GB of GPU memory.
CUDA_VISIBLE_DEVICES=0 python main.py --text "A pineapple." --iters 15000 --scale 100 --dmtet --mesh_idx 0  --init_ckpt /path/to/stage1/ckpt --normal True --sds True --density_thresh 0.1 --lambda_normal 5000 --workspace exp-dmtet-stage2/
# If the result has many floaters, increase --density_thresh. Note that --density_thresh must have the same value in Stage 2 and Stage 3.
CUDA_VISIBLE_DEVICES=0 python main.py --text "A pineapple." --iters 15000 --scale 100 --dmtet --mesh_idx 0  --init_ckpt /path/to/stage1/ckpt --normal True --sds True --density_thresh 0.4 --lambda_normal 5000 --workspace exp-dmtet-stage2/

# --------- Stage 3 (Texturing, VSD guidance) --------- #
# texturing with 512x512 rasterization
CUDA_VISIBLE_DEVICES=0 python main.py --text "A pineapple." --iters 30000 --scale 7.5 --dmtet --mesh_idx 0  --init_ckpt /path/to/stage2/ckpt --density_thresh 0.1 --finetune True --workspace exp-dmtet-stage3/
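
Because --density_thresh must match between Stage 2 and Stage 3, it can help to build both commands from one value. The following is a minimal Python sketch, not part of the repository: the helper name stage2_and_stage3_commands is hypothetical, and the flags and values are copied from the commands above. It only builds and prints the command lines and does not launch training.

```python
import shlex


def stage2_and_stage3_commands(text, stage1_ckpt, stage2_ckpt, density_thresh=0.1):
    """Build Stage 2 and Stage 3 argument lists that share one --density_thresh.

    Hypothetical helper: flags are taken from the README commands above.
    """
    # Arguments common to both DMTet stages, including the shared threshold.
    common = [
        "python", "main.py",
        "--text", text,
        "--dmtet", "--mesh_idx", "0",
        "--density_thresh", str(density_thresh),
    ]
    # Stage 2: geometry refinement, initialized from the Stage 1 checkpoint.
    stage2 = common + [
        "--iters", "15000", "--scale", "100",
        "--init_ckpt", stage1_ckpt,
        "--normal", "True", "--sds", "True",
        "--lambda_normal", "5000",
        "--workspace", "exp-dmtet-stage2/",
    ]
    # Stage 3: texturing, initialized from the Stage 2 checkpoint.
    stage3 = common + [
        "--iters", "30000", "--scale", "7.5",
        "--init_ckpt", stage2_ckpt,
        "--finetune", "True",
        "--workspace", "exp-dmtet-stage3/",
    ]
    return stage2, stage3


if __name__ == "__main__":
    s2, s3 = stage2_and_stage3_commands(
        "A pineapple.",
        "/path/to/stage1/ckpt",
        "/path/to/stage2/ckpt",
        density_thresh=0.4,
    )
    # Print shell-quoted commands; prefix with CUDA_VISIBLE_DEVICES=... as needed.
    print(shlex.join(s2))
    print(shlex.join(s3))
```

Changing density_thresh in one place updates both stages, which avoids the mismatch described in the Stage 2 comment.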

We also provide a script that runs all three stages automatically.

bash run.sh gpu_id text_prompt

For example,

bash run.sh 0 "A pineapple."

Limitations: (1) Our work utilizes the original Stable Diffusion without any 3D data, so the multi-face Janus problem is prevalent in the results. Using a text-to-image diffusion model that has been fine-tuned on multi-view images will alleviate this problem. (2) If the results are not satisfactory, try different seeds. This helps when the results are of good quality but suffer from the multi-face Janus problem.
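
Trying different seeds can be scripted. The sketch below generates one Stage 1 command per seed. It rests on assumptions that this README does not confirm: that main.py accepts a --seed flag (as stable-dreamfusion does), and that a separate workspace per seed is desirable. The helper name stage1_commands_for_seeds and the workspace naming are hypothetical. Check python main.py --help before relying on it.

```python
import shlex


def stage1_commands_for_seeds(text, seeds, gpu_id=0):
    """Return one Stage 1 shell command per seed.

    Assumption: main.py accepts --seed; this is not stated in the README.
    """
    commands = []
    for seed in seeds:
        args = [
            "python", "main.py",
            "--text", text,
            "--iters", "25000",
            "--lambda_entropy", "10",
            "--scale", "7.5",
            "--n_particles", "1",
            "--h", "512", "--w", "512",
            "--seed", str(seed),  # assumed flag
            # Hypothetical naming: one workspace per seed so runs do not overwrite each other.
            "--workspace", f"exp-nerf-stage1-seed{seed}/",
        ]
        commands.append(f"CUDA_VISIBLE_DEVICES={gpu_id} " + shlex.join(args))
    return commands


if __name__ == "__main__":
    for command in stage1_commands_for_seeds("A pineapple.", seeds=[0, 1, 2]):
        print(command)
```

The printed commands can be pasted into a shell or written to a script and run one at a time, since each Stage 1 run needs roughly the GPU memory noted above.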

TODO List

  • Release our code.
  • Combine MVDream with VSD to alleviate the multi-face problem.

BibTeX

If you find our work useful for your project, please consider citing the following paper.

@inproceedings{wang2023prolificdreamer,
  title={ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation},
  author={Zhengyi Wang and Cheng Lu and Yikai Wang and Fan Bao and Chongxuan Li and Hang Su and Jun Zhu},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2023}
}

Contributors

thuwzy, luchengthu, yikaiw
