idea-research / dreamwaltz

[NeurIPS 2023] Official implementation of the paper "DreamWaltz: Make a Scene with Complex 3D Animatable Avatars".

Home Page: https://idea-research.github.io/DreamWaltz/

License: Other

Python 67.20% C++ 0.64% Cuda 30.18% C 1.63% Shell 0.35%

dreamwaltz's Issues

[bug] A personalized SD 1.5 model trained with DreamBooth LoRA shows no personalized effect in stage 1

Great job.
My code base: threestudio-dreamwaltz
I want to try a Civitai model. You mention that a personalized avatar can be made by loading LoRA weights, but I found that this doesn't work.

# guidance/multi_controlnet_guidance.py
# Whether I download or train the LoRA weights, I load them the diffusers way, as shown below
self.pipe = StableDiffusionControlNetPipeline.from_pretrained(
    self.cfg.pretrained_model_name_or_path,
    **pipe_kwargs,
).to(self.device)

# then load the LoRA weights into the pipeline
self.pipe.load_lora_weights(self.cfg.lora_weights_path, weight_name="pytorch_lora_weights.safetensors")

To verify that the LoRA weights take effect, you can test them with the following code:

from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
from diffusers.utils import load_image
import numpy as np
import torch

import cv2
from PIL import Image

# download an image
image = load_image(
    "https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
)
image = np.array(image)

# get canny image
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

# load control net and stable diffusion v1-5
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
)
# load the DreamBooth LoRA weights into the pipeline
pipe.load_lora_weights("pretrained_models/s15_girl_character_lora_v3_c1500", weight_name="pytorch_lora_weights.safetensors")
# speed up diffusion process with faster scheduler and memory optimization
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# remove following line if xformers is not installed
pipe.enable_xformers_memory_efficient_attention()

pipe.enable_model_cpu_offload()

# generate image
generator = torch.manual_seed(0)
image = pipe(
    "a zoomed out DSLR photo of sks female anime character", num_inference_steps=200, generator=generator, image=canny_image
).images[0]
image.save("infer_test_lora.png")

The result is shown below; the LoRA takes effect here:
[image: infer_test_lora-1500]

[attachment: pytorch_lora_weights.zip]

By the way, I found that stage 1 does work when I fully fine-tune SD instead of fine-tuning it with LoRA.
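Since this issue involves both a self-trained DreamBooth LoRA and Civitai downloads, one cheap check is which components a LoRA file actually targets and which key naming it uses. The following is only a minimal sketch: `summarize_lora_keys` and the `PREFIXES` table are hypothetical helpers, not part of DreamWaltz or diffusers, and the prefix list is an assumption about common key naming (diffusers-style `unet.` / `text_encoder.`, kohya-style `lora_unet_` / `lora_te`) that may not cover every file.

```python
from collections import Counter
from typing import Iterable

# Assumed key prefixes and a human-readable label for each.
# These are illustrative; a given LoRA file may use other naming.
PREFIXES = (
    ("lora_unet_", "unet (kohya-style)"),
    ("lora_te", "text_encoder (kohya-style)"),
    ("unet.", "unet (diffusers-style)"),
    ("text_encoder.", "text_encoder (diffusers-style)"),
)


def summarize_lora_keys(keys: Iterable[str]) -> Counter:
    """Count LoRA tensor names per targeted component, based on key prefix."""
    counts = Counter()
    for key in keys:
        for prefix, label in PREFIXES:
            if key.startswith(prefix):
                counts[label] += 1
                break
        else:
            # No known prefix matched this tensor name.
            counts["other"] += 1
    return counts
```

The keys could be read from the weight file with `safetensors.torch.load_file(path).keys()` and passed to the helper. A file whose keys all fall under "other", or that only targets the text encoder, would be worth a closer look before debugging the guidance code itself.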

Training Time

How many GPU hours does it take to train your model on a single NVIDIA RTX 3090?

Program stuck

[two images]

While I was running DreamWaltz, the program hung. After debugging, I found that it hangs at the code marked by the red box. Do you know what causes this?

Where do I get the pretrained ControlNet?

Thanks for your great work! I ran into a problem when trying to download the pretrained models: I cannot find the models below on Hugging Face.

MODEL_CARDS = {
    'pose': "fusing/stable-diffusion-v1-5-controlnet-openpose",
    'depth': "fusing/stable-diffusion-v1-5-controlnet-depth",
    'canny': "fusing/stable-diffusion-v1-5-controlnet-canny",
    'seg': "fusing/stable-diffusion-v1-5-controlnet-seg",
    'normal': "fusing/stable-diffusion-v1-5-controlnet-normal",
}
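One way to work around missing repo ids is to keep a list of candidate ids for each condition type and pick the first one that exists. The sketch below is an assumption-laden illustration, not the repository's code: `CANDIDATES` and `resolve_model_card` are hypothetical names, and the `lllyasviel/sd-controlnet-*` ids are guesses modeled on the `lllyasviel/sd-controlnet-canny` id used in the LoRA test script in the first issue, so each should be verified on the Hub before use.

```python
from typing import Callable, Dict, List, Optional

# Candidate repo ids per condition type, tried in order.
# The "fusing/..." ids come from MODEL_CARDS above; the "lllyasviel/..."
# ids are assumptions that should be checked on the Hugging Face Hub.
CANDIDATES: Dict[str, List[str]] = {
    "pose": ["fusing/stable-diffusion-v1-5-controlnet-openpose", "lllyasviel/sd-controlnet-openpose"],
    "depth": ["fusing/stable-diffusion-v1-5-controlnet-depth", "lllyasviel/sd-controlnet-depth"],
    "canny": ["fusing/stable-diffusion-v1-5-controlnet-canny", "lllyasviel/sd-controlnet-canny"],
    "seg": ["fusing/stable-diffusion-v1-5-controlnet-seg", "lllyasviel/sd-controlnet-seg"],
    "normal": ["fusing/stable-diffusion-v1-5-controlnet-normal", "lllyasviel/sd-controlnet-normal"],
}


def resolve_model_card(
    kind: str,
    exists: Callable[[str], bool],
    candidates: Dict[str, List[str]] = CANDIDATES,
) -> Optional[str]:
    """Return the first candidate repo id for `kind` that `exists` accepts, else None."""
    for repo_id in candidates.get(kind, []):
        if exists(repo_id):
            return repo_id
    return None
```

Here `exists` is left to the caller. It could be, for example, a thin wrapper that calls `huggingface_hub.model_info(repo_id)` and returns False when the lookup raises, or a check for a local directory when the weights were downloaded manually.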

SMPL Initialization

Why is a depth map used to initialize the NeRF of the canonical SMPL? I would expect a color map rendered from the mesh to be more efficient.
