postech-ami / paint-it

[CVPR'24] Official PyTorch Implementation of "Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering"

Home Page: https://kim-youwang.github.io/paint-it

License: MIT License

Python 61.42% Cuda 19.00% C 5.13% C++ 14.45%

paint-it's Issues

Code human meshes?

Hello,
Great work!

When is the expected release date for the code related to 3D human meshes?

Thanks.

How can I get the normal texture with the ks, kd?

When you import the mesh and the generated texture maps in Blender, you have to define the same shading pipeline as NVDiffrast (that we used to render & train our texture).

You can refer to this #5 (comment). Here, you can find the Blender Python script to import your mesh and texture, similar to the NVDiffrast.

Please let us know if you have more questions.
Thanks.

Originally posted by @Youwang-Kim in #7 (comment)

how to get the depth image

Hi, I want to use the depth ControlNet as guidance. How can I get the depth from the rendered material? I want to get an image like this one:
[image: iter_100_depth]
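
One possible way to get such an image, sketched under assumptions rather than taken from the repository: nvdiffrast's rasterize output stores (u, v, z/w, triangle_id) per pixel, with triangle_id equal to 0 where nothing was rasterized, so a depth buffer and a coverage mask can be read from it. The depth can then be normalized over the covered pixels and inverted so that near surfaces are bright, which is the convention of the MiDaS-style maps commonly fed to depth ControlNets. The function name depth_to_control_image and its arguments are hypothetical.

```python
import numpy as np

def depth_to_control_image(depth, mask, eps=1e-8):
    """Normalize a depth buffer to an 8-bit RGB image with near = bright.

    depth: (H, W) float array; larger value = farther from the camera (assumed).
    mask:  (H, W) bool array; True where a triangle covers the pixel.
    """
    depth = np.asarray(depth, dtype=np.float32)
    mask = np.asarray(mask, dtype=bool)
    out = np.zeros(depth.shape, dtype=np.float32)
    if mask.any():
        d = depth[mask]
        d_min, d_max = d.min(), d.max()
        # Map covered pixels to [0, 1], then invert so the nearest point is 1.
        out[mask] = 1.0 - (d - d_min) / max(d_max - d_min, eps)
    # Uncovered (background) pixels stay 0, i.e. black.
    img = (out * 255.0).round().astype(np.uint8)
    # Replicate to three channels, since ControlNet inputs are usually RGB.
    return np.stack([img] * 3, axis=-1)
```

With this mapping the farthest covered pixel also becomes black, the same as the background. If that matters, the normalized range could be offset slightly, which is a design choice left open here.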

replace the uv random noise

Hi, if I take this input uv and replace the random noise with a texture that is locally close to the 3D template model, what other parameters do I need to change to make the result more robust?

CODE?

Would love to try this, do you plan on releasing the code?

How to change the size if the UV size is 1024 or 2048

The DC-PBR representation uses U-Net as its architecture. Thus, if you increase the resulting texture map resolution, you would have to modify some architectural hyperparameters such as num_channels_down, num_channels_up, num_channels_skip, filter_size_up, filter_size_down. You can adjust those parameters here.

Paint-it/paint_it.py

Lines 81 to 87 in 407c55b

net = skip(input_depth, 9,
           num_channels_down=[128] * 5,
           num_channels_up=[128] * 5,
           num_channels_skip=[128] * 5,
           filter_size_up=3, filter_size_down=3,
           upsample_mode='nearest', filter_skip_size=1,
           need_sigmoid=True, need_bias=True, pad='reflection', act_fun='LeakyReLU').type(torch.cuda.FloatTensor)

Please let us know if you have more questions.
Thanks.

Originally posted by @Youwang-Kim in #8 (comment)
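
To reason about which texture resolutions fit a given network depth, here is a rough sketch. It assumes each encoder level halves the spatial size, as in the Deep Image Prior skip network with stride-2 downsampling, so the length of the num_channels_down list sets the number of levels. The helper bottleneck_size is hypothetical and only does arithmetic; it does not touch the repository code.

```python
def bottleneck_size(texture_res, num_levels=5):
    """Return the spatial size at the deepest level of the encoder-decoder.

    Assumes every level halves the resolution, so a side divisible by
    2 ** num_levels round-trips through the network without a size mismatch.
    """
    factor = 2 ** num_levels
    if texture_res % factor != 0:
        raise ValueError(f"{texture_res} is not divisible by {factor}")
    return texture_res // factor

for res in (512, 1024, 2048):
    print(res, "->", bottleneck_size(res))
```

With five levels, a 1024 texture reaches a 32x32 bottleneck and a 2048 texture a 64x64 one. Whether a higher resolution needs extra levels (for example lists of length 6) or wider channels is an empirical question that this arithmetic does not settle.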

Viewing in Blender?

Although the results look great with the nvdiff_renderer, they look much worse inside Blender: greens appear fluorescent, for example. I suspect this has to do with enabling the correct options inside Blender, or perhaps it isn't using one of the textures. Has anybody else seen this and know how to fix it?

Will there be any issues increasing the resolution of the texture

I haven't tried this technique yet, but the biggest issue I've run into with text-to-texture so far is not being able to generate fine enough textures (detailed 2048x2048 or larger, for example). I expect I'll run into memory issues, but is there any fundamental limitation? Do you see any issues with doing this using Paint-it?

Running out of VRAM

I'm trying to run the code on a cloud machine with an NVIDIA A10 GPU (24 GB VRAM) and I get a CUDA out-of-memory error. Do you have any suggestions for running this with less GPU memory usage?

Here is the full error:
Traceback (most recent call last):
  File "/app/paint-it/paint_it.py", line 320, in <module>
    main(args, guidance)
  File "/app/paint-it/paint_it.py", line 226, in main
    sd_loss = guidance.batch_train_step(text_embedding, obj_image,
  File "/app/paint-it/sd.py", line 135, in batch_train_step
    noise_pred = self.unet(latent_model_input, tt, encoder_hidden_states=text_embeddings).sample
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/unets/unet_2d_condition.py", line 1121, in forward
    sample, res_samples = downsample_block(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/unets/unet_2d_blocks.py", line 1199, in forward
    hidden_states = attn(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/transformers/transformer_2d.py", line 391, in forward
    hidden_states = block(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py", line 400, in forward
    ff_output = self.ff(norm_hidden_states, scale=lora_scale)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py", line 672, in forward
    hidden_states = module(hidden_states, scale)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/diffusers/models/activations.py", line 103, in forward
    return hidden_states * self.gelu(gate)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 160.00 MiB (GPU 0; 23.73 GiB total capacity; 20.89 GiB already allocated; 53.62 MiB free; 21.23 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
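
The error message itself names one knob, the max_split_size_mb option of PYTORCH_CUDA_ALLOC_CONF. Below is a minimal sketch of setting it from Python, assuming it runs before PyTorch makes its first CUDA allocation. The helper name set_max_split_size and the value 128 are illustrative, not recommendations from the authors.

```python
import os

def set_max_split_size(mb=128):
    """Set the caching-allocator split limit via PYTORCH_CUDA_ALLOC_CONF.

    Must run before PyTorch makes its first CUDA allocation, so call it at
    the very top of the entry script (before `import torch` is safest).
    """
    value = f"max_split_size_mb:{mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value

set_max_split_size(128)

# Gap between reserved and allocated memory reported in the error above (GiB).
reserved_gib, allocated_gib = 21.23, 20.89
print(f"reserved - allocated = {reserved_gib - allocated_gib:.2f} GiB")
```

In this particular report the gap between reserved and allocated memory is only about 0.34 GiB, so fragmentation is probably not the main cause and this setting alone may not be enough. Reducing the memory the optimization itself needs is likely the more relevant direction.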

Default fp16 inference and potentially missing arguments

Hi, thank you for releasing the code. I ran into a few issues during testing.
First, could you please cast the model weights and inputs to fp16 by default, so that the model runs on consumer GPUs with around 20 GB of memory? That would be friendlier for users without NVIDIA AI cards.
Second, it seems that some arguments are not set when running paint_it.py, such as objaverse_id (is it the same thing as obj_id?) and learn_lights. Have you tested the program from a clean install of the git repo, especially given that some files are excluded from the upload?

About generated textures

Hi, could you please explain how the RGB channels in the Ks texture should be interpreted? If I understand the paper correctly, that texture contains roughness and metallic, but I am not sure which channel stores which property. In addition, is any processing needed before these textures can be used as albedo, roughness, and metallic inputs for shaders in common software like Blender or Unity?
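
A hedged sketch, not an answer from the authors: renderers in the nvdiffrec family store the specular map ks as (occlusion, roughness, metallic), so R is occlusion (often unused), G is roughness, and B is metallic. Assuming Paint-it follows that convention, which should be checked against the repository's rendering code, the channels could be split as below. The function split_ks and the (H, W, 3) array layout are hypothetical.

```python
import numpy as np

def split_ks(ks):
    """Split an (H, W, 3) Ks texture into roughness and metallic maps.

    ASSUMPTION: channels follow the (occlusion, roughness, metallic) order;
    verify this against the renderer before relying on it.
    """
    ks = np.asarray(ks)
    if ks.ndim != 3 or ks.shape[-1] < 3:
        raise ValueError("expected an (H, W, 3) array")
    roughness = ks[..., 1]  # G channel
    metallic = ks[..., 2]   # B channel
    return roughness, metallic
```

Under the same assumption, the two maps would feed the Roughness and Metallic inputs of a metallic-roughness shader as non-color data, with kd as the base color.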

Some questions about Text-to-Texture task

Very nice work!
Resolution: What is the resolution of the UV map? When I zoom in, I observe some blurry results. Does this come from a low-resolution UV map, from the limited capacity of Stable Diffusion v1.5, or from something else (I cannot find which model was used in the paper)? I think SDXL could improve this a lot, so I'm looking forward to the code.
SDS vs TEXTure: Is SDS a good choice for the text-to-texture task? Although the quality of the official TEXTure is not good enough, this text-to-image-to-texture approach has been optimized in industry, for example by Meshy, which can generate very high-quality textures in 2 minutes. In contrast, SDS needs 15-30 min on an A6000. In my experiments, Fantasia3D+SDXL can generate better results than Meshy in some cases, but it takes 36 min on an A6000.
