Hash3D: Training-free Acceleration for 3D Generation



Training-free Acceleration for 3D Generation 🏎️💨

Introduction

This repository contains the official implementation for our paper

Hash3D: Training-free Acceleration for 3D Generation

🥯[Project Page] 📝[Paper] </>[code]

Xingyi Yang, Xinchao Wang

National University of Singapore

pipeline

We present Hash3D, a universal solution to accelerate score distillation sampling (SDS) based 3D generation. By hashing and reusing the diffusion model's feature maps across neighboring timesteps and camera angles, Hash3D avoids much of the redundant computation and thereby accelerates the diffusion model's inference in 3D generation tasks.

What we offer:

  • โญ Compatiable to Any 3D generation method using SDS.
  • โญ Inplace Accerlation for 1.3X - 4X.
  • โญ Training-Free.

Results Visualizations

Image-to-3D Results

Input Image | Zero-1-to-3 | Hash3D + Zero-1-to-3 (Speed X4.0)
baby_phoenix_on_ice (1) | phoenix_zero123.mp4 | phoenix_hash_zero123.mp4
grootplant_rgba (1) | grootplant_zero123.mp4 | grootplant_hash_zero123.mp4

Text-to-3D Results

Prompt | Gaussian-Dreamer | Hash3D + Gaussian-Dreamer (Speed X1.5)
A bear dressed as a lumberjack | a.bear.dressed.as.a.lumberjack.mp4 | a.bear.dressed.as.a.lumberjack_hash.mp4
A train engine made out of clay | a.train.engine.made.out.of.clay.mp4 | a.train.engine.made.out.of.clay_hash.mp4

Project Structure

The repository is organized into three main directories, each targeting a different codebase that Hash3D can be applied to:

  1. threestudio-hash3d: Contains the implementation of Hash3D tailored for use with threestudio.
  2. dreamgaussian-hash3d: Integrates Hash3D with DreamGaussian for image-to-3D generation.
  3. gaussian-dreamer-hash3d: Applies Hash3D to GaussianDreamer for faster text-to-3D generation.

What we add?

The core implementation is in the guidance_loss used for each SDS loss computation.

See hash3D/threestudio-hash3d/threestudio/models/guidance/zero123_unified_guidance_cache.py for an example. The hash table implementation is in hash3D/threestudio-hash3d/threestudio/utils/hash_table.py.
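
As a rough illustration of the idea, the sketch below buckets a timestep and camera pose into a grid cell and reuses whatever was stored for that cell. It is a minimal sketch under stated assumptions, not the code in hash_table.py: the `FeatureCache` class, the `grid_key` function, and the grid sizes are hypothetical names and values chosen for this example, and a plain Python dict stands in for the hash table.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical bucket widths: how coarsely each coordinate is quantized.
GRID_TIMESTEP = 20.0    # diffusion timesteps per cell
GRID_ELEVATION = 10.0   # degrees per cell
GRID_AZIMUTH = 10.0     # degrees per cell


def grid_key(timestep: float, elevation: float, azimuth: float) -> Tuple[int, int, int]:
    """Quantize a (timestep, camera angle) query into a grid cell index."""
    return (
        int(timestep // GRID_TIMESTEP),
        int(elevation // GRID_ELEVATION),
        int((azimuth % 360.0) // GRID_AZIMUTH),  # wrap azimuth into [0, 360)
    )


class FeatureCache:
    """Stores one value per grid cell and counts hits and misses."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[int, int, int], Any] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(
        self,
        timestep: float,
        elevation: float,
        azimuth: float,
        compute: Callable[[], Any],
    ) -> Any:
        """Return the cached value for this cell, computing it only on a miss."""
        key = grid_key(timestep, elevation, azimuth)
        if key in self._table:
            self.hits += 1
            return self._table[key]
        self.misses += 1
        value = compute()  # stands in for the expensive feature extraction
        self._table[key] = value
        return value
```

With these bucket widths, two queries such as (500, 12, 33) and (505, 15, 38) fall into the same cell, so the second one reuses the stored value instead of calling `compute` again. Coarser cells give more reuse at the cost of reusing features from less similar queries.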

Getting Started

Installation

Navigate to each of the specific directories for environment-specific installation instructions.

Usage

Refer to the README within each directory for detailed usage instructions tailored to each environment.

For example, to run Zero123+SDS with Hash3D:

cd threestudio-hash3d
python launch.py --config configs/stable-zero123_hash3d.yaml --train --gpu 0 data.image_path=load/images/dog1_rgba.png

Evaluation

  1. Image-to-3D: GSO dataset GT meshes and renderings can be found online. With the renderings of the reconstructed 3D objects in pred_dir and the ground-truth renderings in gt_dir, run
python eval_nvs.py --gt $gt_dir --pr $pred_dir
  2. Text-to-3D: Run all the prompts in assets/prompt.txt, then compute the CLIP score between the text and the rendered images with
python eval_clip_sim.py "$gt_prompt" $pred_dir --mode text
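
For readers unfamiliar with the metric, a CLIP score of this kind is commonly a cosine similarity between a text embedding and image embeddings. The sketch below shows only that arithmetic. It assumes the embeddings have already been produced by some CLIP model and averages over the rendered views; both function names are hypothetical, and eval_clip_sim.py may compute or aggregate the score differently.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_clip_similarity(
    text_embedding: Sequence[float],
    image_embeddings: Sequence[Sequence[float]],
) -> float:
    """Average text-image cosine similarity over all rendered views."""
    scores = [cosine_similarity(text_embedding, e) for e in image_embeddings]
    return sum(scores) / len(scores)
```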

Acknowledgement

We borrow part of the code from DeepCache for feature extraction from diffusion models. We also thank the authors of threestudio, DreamGaussian, and Gaussian-Dreamer for their implementations, and @FlorinShum and @Horseee for the valuable discussions.

Citation

@misc{yang2024hash3d,
      title={Hash3D: Training-free Acceleration for 3D Generation}, 
      author={Xingyi Yang and Xinchao Wang},
      year={2024},
      eprint={2404.06091},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}


hash3d's Issues

What version of diffusers do you use in the threestudio directory?

/DeepCache/sd/unet_2d_condition.py", line 25, in
from diffusers.models.attention_processor import (
ImportError: cannot import name 'ADDED_KV_ATTENTION_PROCESSORS' from 'diffusers.models.attention_processor' (/workspace/assethub-ml-server/venv_lgm/lib/python3.10/site-packages/diffusers/models/attention_processor.py)
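
An ImportError of this form means the installed diffusers release does not define the requested name in that module. A small diagnostic sketch follows; the helper `module_exposes` is a hypothetical name for this example. It only reports whether the name exists in the current environment and does not say which diffusers version the repository expects.

```python
import importlib


def module_exposes(module_name: str, attribute: str) -> bool:
    """Return True if the module imports cleanly and defines the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing package and a missing submodule.
        return False
    return hasattr(module, attribute)


if __name__ == "__main__":
    # The module and name are taken from the error message above.
    print(
        module_exposes(
            "diffusers.models.attention_processor",
            "ADDED_KV_ATTENTION_PROCESSORS",
        )
    )
```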

KeyError: 'zero123-unified-guidance-cache'

I ran this command:

python launch.py --config configs/stable-zero123_hash3d.yaml --train --gpu 0 data.image_path=load/images/dog1_rgba.png

I got this error:

$ python launch.py --config configs/stable-zero123_hash3d.yaml --train --gpu 0 data.image_path=load/images/dog1_rgba.png
Global seed set to 0
find:  single-image-datamodule
find:  zero123-system
find:  implicit-volume
find:  diffuse-with-point-light-material
find:  solid-color-background
find:  nerf-volume-renderer
[INFO] ModelCheckpoint(save_last=True, save_top_k=-1, monitor=None) will duplicate the last checkpoint saved.
[INFO] GPU available: True (cuda), used: True
[INFO] TPU available: False, using: 0 TPU cores
[INFO] IPU available: False, using: 0 IPUs
[INFO] HPU available: False, using: 0 HPUs
[INFO] You are using a CUDA device ('NVIDIA GeForce RTX 4090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
[INFO] single image dataset: load image load/images/dog1_rgba.png torch.Size([1, 128, 128, 3])
[INFO] single image dataset: load image load/images/dog1_rgba.png torch.Size([1, 128, 128, 3])
[INFO] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[INFO] 
  | Name       | Type                          | Params
-------------------------------------------------------------
0 | geometry   | ImplicitVolume                | 12.6 M
1 | material   | DiffuseWithPointLightMaterial | 0     
2 | background | SolidColorBackground          | 0     
3 | renderer   | NeRFVolumeRenderer            | 0     
-------------------------------------------------------------
12.6 M    Trainable params
0         Non-trainable params
12.6 M    Total params
50.450    Total estimated model params size (MB)
[INFO] Validation results will be saved to outputs/zero123-sai-hash3d/[64, 128, 256]_dog1_rgba.png@20240418-085259/save
find:  zero123-unified-guidance-cache
Traceback (most recent call last):
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/launch.py", line 301, in <module>
    main(args, extras)
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/launch.py", line 244, in main
    trainer.fit(system, datamodule=dm, ckpt_path=cfg.resume)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in fit
    call._call_and_handle_interrupt(
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 961, in _run
    call._call_lightning_module_hook(self, "on_fit_start")
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 146, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/threestudio/systems/zero123.py", line 40, in on_fit_start
    self.guidance = threestudio.find(self.cfg.guidance_type)(self.cfg.guidance)
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/threestudio/__init__.py", line 33, in find
    return __modules__[name]
KeyError: 'zero123-unified-guidance-cache'

What's the problem?
