
Hash3D: Training-free Acceleration for 3D Generation


Training-free Acceleration
for 3D Generation 🏎️💨

Introduction

This repository contains the official implementation for our paper

Hash3D: Training-free Acceleration for 3D Generation

🥯[Project Page] 📝[Paper] </>[code]

Xingyi Yang, Xinchao Wang

National University of Singapore

We present Hash3D, a universal solution to accelerate score distillation sampling (SDS) based 3D generation. By hashing and reusing the diffusion model's feature maps across neighboring timesteps and camera angles, Hash3D avoids redundant calculations and thereby accelerates the diffusion model's inference in 3D generation tasks.

What we offer:

⭐ Compatible with any 3D generation method that uses SDS.
⭐ In-place acceleration of 1.3X - 4X.
⭐ Training-Free.

Results Visualizations

Image-to-3D Results

Input Image	Zero-1-to-3	Hash3D + Zero-1-to-3 (Speed X4.0)
	phoenix_zero123.mp4	phoenix_hash_zero123.mp4
	grootplant_zero123.mp4	grootplant_hash_zero123.mp4

Text-to-3D Results

Prompt	Gaussian-Dreamer	Hash3D + Gaussian-Dreamer (Speed X1.5)
A bear dressed as a lumberjack	a.bear.dressed.as.a.lumberjack.mp4	a.bear.dressed.as.a.lumberjack_hash.mp4
A train engine made out of clay	a.train.engine.made.out.of.clay.mp4	a.train.engine.made.out.of.clay_hash.mp4

Project Structure

The repository is organized into three main directories, each targeting a different codebase that Hash3D can be applied to:

threestudio-hash3d: Contains the implementation of Hash3D tailored for use with threestudio.
dreamgaussian-hash3d: Integrates Hash3D with DreamGaussian for image-to-3D generation.
gaussian-dreamer-hash3d: Applies Hash3D to GaussianDreamer for faster text-to-3D tasks.

What do we add?

The core implementation is in the guidance_loss used for each SDS loss computation.

See hash3D/threestudio-hash3d/threestudio/models/guidance/zero123_unified_guidance_cache.py for an example. The code for the hash table implementation is in hash3D/threestudio-hash3d/threestudio/utils/hash_table.py.
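To illustrate the idea of hashing and reusing features across nearby camera angles and timesteps, here is a minimal sketch of a grid-based cache. It is not the code in hash_table.py: the class name `GridFeatureCache`, the bin sizes, and the key layout are all hypothetical, and a fixed grid is used only for simplicity.

```python
from typing import Any, Callable, Dict, Tuple

Key = Tuple[int, int, int]


class GridFeatureCache:
    """Toy cache keyed by quantized (elevation, azimuth, timestep).

    Hypothetical illustration only; not the repository's hash table.
    """

    def __init__(self, angle_bin_deg: float = 10.0, timestep_bin: int = 50) -> None:
        self.angle_bin_deg = angle_bin_deg
        self.timestep_bin = timestep_bin
        self._table: Dict[Key, Any] = {}
        self.hits = 0
        self.misses = 0

    def _key(self, elevation_deg: float, azimuth_deg: float, timestep: int) -> Key:
        # Wrap azimuth into [0, 360) so that 370 degrees lands with 10 degrees.
        azimuth = azimuth_deg % 360.0
        # Floor-divide each coordinate by its bin size to get a grid cell index.
        return (
            int(elevation_deg // self.angle_bin_deg),
            int(azimuth // self.angle_bin_deg),
            int(timestep // self.timestep_bin),
        )

    def get_or_compute(
        self,
        elevation_deg: float,
        azimuth_deg: float,
        timestep: int,
        compute: Callable[[], Any],
    ) -> Any:
        """Return the cached feature for this grid cell, computing it on a miss."""
        key = self._key(elevation_deg, azimuth_deg, timestep)
        if key in self._table:
            self.hits += 1
            return self._table[key]
        self.misses += 1
        feature = compute()
        self._table[key] = feature
        return feature
```

In this sketch, two queries that fall into the same grid cell, such as (12°, 31°, t=510) and (14°, 38°, t=540) with the default bins, share one cached entry, so the expensive computation runs only once for the pair.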

Getting Started

Installation

Navigate to each of the specific directories for environment-specific installation instructions.

Usage

Refer to the README within each directory for detailed usage instructions tailored to each environment.

For example, to run Zero123+SDS with Hash3D:

cd threestudio-hash3d
python launch.py --config configs/stable-zero123_hash3d.yaml --train --gpu 0 data.image_path=https://adamdad.github.io/hash3D/load/images/dog1_rgba.png

Evaluation

Image-to-3D: Ground-truth (GT) meshes and renderings for the GSO dataset can be found online. With the renderings of the reconstructed 3D objects in pred_dir and the GT renderings in gt_dir, run:

python eval_nvs.py --gt $gt_dir --pr $pred_dir
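Novel-view evaluation of this kind compares predicted and ground-truth renderings pixel by pixel. As a sketch of one widely used metric, the snippet below computes PSNR over flattened pixel values in [0, 1]. Whether eval_nvs.py reports PSNR is an assumption, and the function name `psnr` is hypothetical.

```python
import math
from typing import Sequence


def psnr(pred: Sequence[float], gt: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between two equally sized pixel sequences."""
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the predicted rendering is closer to the ground truth.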

Text-to-3D: Run all the prompts in assets/prompt.txt, then compute the CLIP score between each prompt and its rendered images with:

python eval_clip_sim.py "$gt_prompt" $pred_dir --mode text
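A CLIP score is commonly defined as the cosine similarity between a text embedding and an image embedding, averaged over the rendered views. The sketch below shows only that arithmetic and assumes the embeddings have already been produced by a CLIP model (not shown). The function names are hypothetical and not taken from eval_clip_sim.py.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def mean_text_image_similarity(
    text_embedding: Sequence[float],
    image_embeddings: Sequence[Sequence[float]],
) -> float:
    """Average text-to-image cosine similarity over all rendered views."""
    if not image_embeddings:
        raise ValueError("need at least one image embedding")
    total = sum(cosine_similarity(text_embedding, e) for e in image_embeddings)
    return total / len(image_embeddings)
```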

Acknowledgement

We borrow part of the code from DeepCache for feature extraction from diffusion models. We also thank the authors of threestudio, DreamGaussian, and Gaussian-Dreamer for their implementations, and @FlorinShum and @Horseee for valuable discussions.

Citation

@misc{yang2024hash3d,
      title={Hash3D: Training-free Acceleration for 3D Generation}, 
      author={Xingyi Yang and Xinchao Wang},
      year={2024},
      eprint={2404.06091},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}


hash3d's Issues

What version of diffusers do you use in the threestudio directory?

/DeepCache/sd/unet_2d_condition.py", line 25, in
from diffusers.models.attention_processor import (
ImportError: cannot import name 'ADDED_KV_ATTENTION_PROCESSORS' from 'diffusers.models.attention_processor' (/workspace/assethub-ml-server/venv_lgm/lib/python3.10/site-packages/diffusers/models/attention_processor.py)
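An ImportError like this usually means the installed diffusers release does not export the name that the DeepCache code expects. The version this repository was developed against is not stated here, so the following sketch only reports what is installed and whether a given name is importable. The helper names are hypothetical.

```python
import importlib
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def exports_name(module_name: str, attr: str) -> bool:
    """Return True if the module can be imported and defines the given attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)
```

Calling `installed_version("diffusers")` and `exports_name("diffusers.models.attention_processor", "ADDED_KV_ATTENTION_PROCESSORS")` in the affected environment shows whether the symbol from the traceback is available in that installation.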

KeyError: 'zero123-unified-guidance-cache'

I ran this command:

python launch.py --config configs/stable-zero123_hash3d.yaml --train --gpu 0 data.image_path=load/images/dog1_rgba.png

and got this error:

$ python launch.py --config configs/stable-zero123_hash3d.yaml --train --gpu 0 data.image_path=load/images/dog1_rgba.png
Global seed set to 0
find:  single-image-datamodule
find:  zero123-system
find:  implicit-volume
find:  diffuse-with-point-light-material
find:  solid-color-background
find:  nerf-volume-renderer
[INFO] ModelCheckpoint(save_last=True, save_top_k=-1, monitor=None) will duplicate the last checkpoint saved.
[INFO] GPU available: True (cuda), used: True
[INFO] TPU available: False, using: 0 TPU cores
[INFO] IPU available: False, using: 0 IPUs
[INFO] HPU available: False, using: 0 HPUs
[INFO] You are using a CUDA device ('NVIDIA GeForce RTX 4090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
[INFO] single image dataset: load image load/images/dog1_rgba.png torch.Size([1, 128, 128, 3])
[INFO] single image dataset: load image load/images/dog1_rgba.png torch.Size([1, 128, 128, 3])
[INFO] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[INFO] 
  | Name       | Type                          | Params
-------------------------------------------------------------
0 | geometry   | ImplicitVolume                | 12.6 M
1 | material   | DiffuseWithPointLightMaterial | 0     
2 | background | SolidColorBackground          | 0     
3 | renderer   | NeRFVolumeRenderer            | 0     
-------------------------------------------------------------
12.6 M    Trainable params
0         Non-trainable params
12.6 M    Total params
50.450    Total estimated model params size (MB)
[INFO] Validation results will be saved to outputs/zero123-sai-hash3d/[64, 128, 256]_dog1_rgba.png@20240418-085259/save
find:  zero123-unified-guidance-cache
Traceback (most recent call last):
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/launch.py", line 301, in <module>
    main(args, extras)
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/launch.py", line 244, in main
    trainer.fit(system, datamodule=dm, ckpt_path=cfg.resume)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in fit
    call._call_and_handle_interrupt(
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 961, in _run
    call._call_lightning_module_hook(self, "on_fit_start")
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 146, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/threestudio/systems/zero123.py", line 40, in on_fit_start
    self.guidance = threestudio.find(self.cfg.guidance_type)(self.cfg.guidance)
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/threestudio/__init__.py", line 33, in find
    return __modules__[name]
KeyError: 'zero123-unified-guidance-cache'

What's the problem?

ImportError: cannot import name 'GridBasedHashTable_Key' from 'threestudio.utils.hash_table'

It seems that GridBasedHashTable_Key is missing.

Also, I had to manually add 'zero123_unified_guidance_cache' to threestudio/guidance/__init__.py.
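The traceback in the KeyError report shows `find` doing a plain dictionary lookup (`__modules__[name]`), so the error means nothing was registered under that name. The sketch below shows this kind of registry under the assumption of a decorator-based `register`. The class name `Zero123UnifiedGuidanceCache` and the helper `load_guidance_module` are placeholders. It shows why a class only becomes findable once the module that defines it has been imported, which is what adding the import to the package's __init__.py achieves.

```python
from typing import Callable, Dict

# Name-to-class registry, mirroring the lookup seen in the traceback.
__modules__: Dict[str, type] = {}


def register(name: str) -> Callable[[type], type]:
    """Class decorator that records the class under `name` (assumed design)."""
    def decorator(cls: type) -> type:
        __modules__[name] = cls
        return cls
    return decorator


def find(name: str) -> type:
    """Look up a registered class; raises KeyError if nothing was registered."""
    return __modules__[name]


def load_guidance_module() -> None:
    """Stands in for importing the module that defines the guidance class.

    The decorator runs only when this code executes, just as a real
    registration runs only when its module is imported.
    """
    @register("zero123-unified-guidance-cache")
    class Zero123UnifiedGuidanceCache:  # placeholder class
        pass
```

Before `load_guidance_module()` runs, `find("zero123-unified-guidance-cache")` raises KeyError. Afterwards it returns the class.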

No file at https://adamdad.github.io/hash3D/load/images/dog1_rgba.png

We can't use this file:
https://adamdad.github.io/hash3D/load/images/dog1_rgba.png

You need to include this image in the source code.