threedle / text2mesh
3D mesh stylization driven by a text input in PyTorch
Home Page: https://threedle.github.io/text2mesh/
License: MIT License
Instead of using the pip/yml installation method, install CLIP and kaolin following the instructions on their homepages; just make sure that clip 1.0 and kaolin 0.12.0 show up in the conda list.
1. Install clip 1.0:
pip install git+https://github.com/openai/CLIP.git
2. Install kaolin 0.12.0:
vi ~/.bashrc
export CUDA_HOME=/usr/local/cuda
source ~/.bashrc
conda activate text2mesh
git clone --recursive https://github.com/NVIDIAGameWorks/kaolin
cd kaolin
git checkout v0.12.0
python setup.py develop
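After both installs, a quick sanity check from inside the text2mesh env (a minimal sketch; exact version strings may vary):

```python
# Verify that both packages import at the expected versions
import clip, kaolin
print(clip.available_models())  # lists CLIP backbones such as "ViT-B/32"
print(kaolin.__version__)       # expected: 0.12.0
```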
Please refer to the following URLs:
clip 1.0
kaolin 0.12.0
For more details, please refer to my blog:
My Blog
Not an issue per se, but wow; this is an impressive result. Great work!
Hi, thank you for the nice work!
Can I stylize a source mesh using 2D images or meshes with your code?
I keep getting this error:
```
error Traceback (most recent call last)
Cell In[6], line 16
     14 plt.figure(figsize=(20, 4))
     15 plt.axis("off")
---> 16 plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
     17 plt.show()

error: OpenCV(4.6.0) C:\b\abs_74oeeuevib\croots\recipe\opencv-suite_1664548340488\work\modules\imgproc\src\color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cv::cvtColor'
```
from this section of code:
```python
#@title export the results
import matplotlib.pyplot as plt
import importlib
import PIL
importlib.reload(PIL.TiffTags)
import cv2
import os

frames = []
for i in range(0, n_iter, 100):
    img = cv2.imread(os.path.join(output_dir, f"iter_{i}.jpg"))
    frames.append(img)

plt.figure(figsize=(20, 4))
plt.axis("off")
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.show()
```
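A likely cause, for reference: cv2.imread returns None when a file is missing or unreadable, and cvtColor then fails with exactly this !_src.empty() assertion. A minimal guard, reusing the names from the snippet above (output_dir and n_iter are assumed to be defined as in the notebook):

```python
# Skip frames that failed to load instead of crashing on cvtColor
for i in range(0, n_iter, 100):
    path = os.path.join(output_dir, f"iter_{i}.jpg")
    img = cv2.imread(path)  # returns None if the file does not exist
    if img is None:
        print(f"missing frame: {path}")
        continue
    frames.append(img)
```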
Hello,
I have no errors when I run the conda env create --file text2mesh.yml and conda activate text2mesh commands.
But when I execute the ./demo/run_candle.sh file, I get this error:
Traceback (most recent call last):
  File "C:\Users\matth\text2mesh\main.py", line 3, in <module>
    import kaolin.ops.mesh
ModuleNotFoundError: No module named 'kaolin'
I also get the same error with the clip module.
I am supposed to be in the env with everything properly installed.
Please help me.
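One hedged diagnostic for this kind of ModuleNotFoundError: confirm that the interpreter running main.py is actually the env's Python, since python setup.py develop only installs kaolin into whichever interpreter ran it:

```python
# Run inside the activated text2mesh env
import sys
print(sys.executable)  # should point into ...\envs\text2mesh\...
import kaolin          # if this also fails, kaolin was installed into a different interpreter
```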
Thanks for your excellent work!
When I run your demo, I notice that the input points are sent to progressive encoding. If I disable this module, the final result doesn't change much. So why do we need this module?
Here are the two result images (prompt: an image of a shoe made of cactus, obj: shoe): the first is WITH progressive encoding, the second is WITHOUT progressive encoding.
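For context, progressive encoding gates the positional-encoding frequencies coarse-to-fine over training, so it mostly shapes how optimization proceeds rather than guaranteeing a visibly different final image. A rough sketch of the idea (illustrative only; the repo's ProgressiveEncoding differs in detail):

```python
import torch
import torch.nn as nn

class ProgressiveMaskSketch(nn.Module):
    """Illustrative coarse-to-fine mask over Fourier features (not the repo's exact code)."""
    def __init__(self, n_freqs, T, d=3):
        super().__init__()
        self.n, self.T, self.d = n_freqs, T, d
        self.t = 0  # training-iteration counter

    def forward(self, feats):
        # Band k ramps from 0 to 1 starting at iteration k*T/n, so low
        # frequencies are active first and high frequencies fade in later.
        k = torch.arange(self.n, dtype=torch.float32)
        alpha = ((self.t - k * self.T / self.n) * self.n / self.T).clamp(0.0, 1.0)
        mask = torch.cat([torch.ones(self.d), alpha.repeat_interleave(2)])
        self.t += 1
        return feats * mask  # feats: [..., d + 2*n] (raw xyz plus sin/cos pairs)
```

With an easy prompt or enough iterations, disabling the mask may indeed change little; its effect tends to show up more in geometric displacements that can otherwise latch onto high frequencies early.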
Hi! Do you know, maybe, how to run the code on a GPU with CUDA 11.3 or 11.2? AFAIU, one would need PyTorch 1.10 for this, but then kaolin asks for PyTorch <= 1.9.
Hi,
Thanks for the amazing work! I am working on an extension based on text2mesh and found that the current save_model function in neural_style_field.py doesn't quite work. This is because the intermediate variables of the ProgressiveEncoding and FourierFeatureTransform layers are not saved. These variables have a critical impact on model behavior and will lead to inconsistent results if not saved.
The solution I found is pretty simple: just register _t in ProgressiveEncoding and _B in FourierFeatureTransform as nn.Parameters. They will then be saved correctly by PyTorch's model.state_dict(). It works for me when I load a saved model and run inference now.
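For illustration, a minimal sketch of the proposed change (constructor signatures are assumed, not copied from neural_style_field.py):

```python
import torch
import torch.nn as nn

class ProgressiveEncoding(nn.Module):
    def __init__(self):
        super().__init__()
        # Previously a plain tensor attribute, so model.state_dict() skipped it
        self._t = nn.Parameter(torch.tensor(0.0), requires_grad=False)

class FourierFeatureTransform(nn.Module):
    def __init__(self, num_input_channels, mapping_size, scale):
        super().__init__()
        # Previously: self._B = torch.randn((num_input_channels, mapping_size)) * scale
        self._B = nn.Parameter(torch.randn(num_input_channels, mapping_size) * scale,
                               requires_grad=False)
```

An equivalent alternative is self.register_buffer('_t', ...) / self.register_buffer('_B', ...), which also lands in state_dict() without treating the tensors as trainable parameters.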
I think it is worth sharing here in case anyone has met similar issues. If the above makes sense, I can open a pull request to fix this.
Love to hear your thoughts. Thanks!
Hi! Thanks for sharing this amazing project. I wonder if there is a way to improve the quality or the resolution of the final mesh?
Is there a way to export a texture map for the final .obj? I would like to use these objects in 3D rendering software.
I did not see a Discussion section, so I am asking here in Issues: (1) has anyone published a Google Colab notebook? And (2) how was the animation of the vase created?
For (1), I can whip one up, assuming the CUDA version matches what Colab provides; otherwise this needs tweaking.
For (2), it looks like you are varying the seed and/or the text prompt and then perhaps creating keyframes which are then interpolated? It is a cool effect. Another direction would be to explore 3D printing of things like the "cactus shoe".
Hello, thank you for sharing this amazing tool.
I wonder if I can use my own model as the initial mesh. I tried mine in a Kaggle notebook; here is my version. Is there any requirement (like vertex count) for a model...?
my notebook
The following issue came up when running main.py:
The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 2
Full error:
/opt/conda/lib/python3.7/site-packages/clip/clip.py:23: UserWarning: PyTorch version 1.7.1 or higher is recommended
warnings.warn("PyTorch version 1.7.1 or higher is recommended")
ModuleList(
(0): FourierFeatureTransform()
(1): Linear(in_features=515, out_features=256, bias=True)
(2): ReLU()
(3): Linear(in_features=256, out_features=256, bias=True)
(4): ReLU()
(5): Linear(in_features=256, out_features=256, bias=True)
(6): ReLU()
(7): Linear(in_features=256, out_features=256, bias=True)
(8): ReLU()
(9): Linear(in_features=256, out_features=256, bias=True)
(10): ReLU()
)
ModuleList(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
(3): ReLU()
(4): Linear(in_features=256, out_features=3, bias=True)
)
ModuleList(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=256, bias=True)
(3): ReLU()
(4): Linear(in_features=256, out_features=1, bias=True)
)
0%| | 0/750 [00:00<?, ?it/s]
/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py:1628: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
0%| | 0/750 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 481, in <module>
    run_branched(args)
  File "main.py", line 167, in run_branched
    update_mesh(mlp, network_input, prior_color, sampled_mesh, vertices)
  File "main.py", line 407, in update_mesh
    sampled_mesh.faces)
RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 2
My model is here:
ufo.zip
Vertices: 90600
Faces: 90598
Triangles: 181196
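Not definitive, but the mismatch at dimension 2 (3 vs 4) together with Faces ≠ Triangles suggests the OBJ contains quad faces, while the pipeline expects pure triangles. One illustrative workaround is to triangulate the mesh first (assuming trimesh is available; ufo.obj is a placeholder name for the file inside the zip):

```python
import trimesh

# trimesh triangulates on load: Trimesh.faces is always (n, 3)
mesh = trimesh.load("ufo.obj", force="mesh")
mesh.export("ufo_triangulated.obj")  # pass this file to --obj_path instead
```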
Hello,
thank you for sharing this incredible project. I am testing text2mesh and do not see any textures in the output besides some .pt files. How should I go about adding textures to the final .obj? We are using Blender or similar 3D software.
Thank you so much for your help!
Mattia
I have Anaconda correctly installed and an RTX 3090, but the conda command just does not work :(
This problem always occurs when installing kaolin. Can anyone help me solve it?
I removed --prompt from the settings in run_shoe.sh and set values for --no_prompt and --image, and the quality of the results was really bad. Details are as follows:
case1
run_shoe.sh
python main.py --run branch --obj_path data/source_meshes/shoe.obj --output_dir results/demo/shoe/texture/brick --no_prompt --image data/target_texture/brick_texture.jpg --sigma 5.0 --clamp tanh --n_normaugs 4 --n_augs 1 --normmincrop 0.1 --normmaxcrop 0.1 --geoloss --colordepth 2 --normdepth 2 --frontview --frontview_std 4 --clipavg view --lr_decay 0.9 --clamp tanh --normclamp tanh --maxcrop 1.0 --save_render --seed 11 --n_iter 1500 --learning_rate 0.0005 --normal_learning_rate 0.0005 --background 1 1 1 --frontview_center 0.5 0.6283
case2
run_shoe.sh
python main.py --run branch --obj_path data/source_meshes/shoe.obj --output_dir results/demo/shoe/texture2/cactus --no_prompt --image data/target_texture/cactus_texture.jpg --sigma 5.0 --clamp tanh --n_normaugs 4 --n_augs 1 --normmincrop 0.1 --normmaxcrop 0.1 --geoloss --colordepth 2 --normdepth 2 --frontview --frontview_std 4 --clipavg view --lr_decay 0.9 --clamp tanh --normclamp tanh --maxcrop 1.0 --save_render --seed 11 --n_iter 1500 --learning_rate 0.0005 --normal_learning_rate 0.0005 --background 1 1 1 --frontview_center 0.5 0.6283
Did I set the parameters wrong? Or is there something in main.py that needs to be modified? Or is it a randomness issue in the optimization?
When --no_prompt is set and --image is set to an image path in main.py, I don't understand how the loss code corresponds to the 'local to global' and 'local to displacement' losses in the paper. Should I change this part?
Hi! This is really exciting work, and thanks for your effort on this implementation.
Unfortunately, I cannot recreate the results on my own server, which is equipped with an NVIDIA A100 (requiring CUDA >= 11.1). Since text2mesh.yml would prepare an environment with CUDA 10.2, which is unavailable on this hardware, I set up the environment myself with the following commands:
conda create -n text2mesh python=3.7
conda activate text2mesh
export TORCH_CUDA_ARCH_LIST="8.6"
cd kaolin/
git checkout v0.9.1
python setup.py develop
pip install git+https://github.com/openai/CLIP.git
pip install matplotlib
However, when I follow the instructions and run ./demo/run_vase.sh, I cannot observe any non-trivial results: every 100 iterations produces the same intermediate render, and vase_final.obj doesn't show any styled details.
Is there any workaround? Thanks!
Is there a simple way to export the final mesh from the Colab notebook?
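One hedged approach, assuming the notebook writes the final OBJ somewhere under its output directory (the path below is a placeholder):

```python
# In a Colab cell: download the exported mesh to your machine
from google.colab import files
files.download("results/demo/vase/vase_final.obj")  # adjust to your actual output path
```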
When I run the sh script, I get an error:
./demo/run_alien_cobble.sh
Traceback (most recent call last):
  File "/home/zqx/text2mesh/main.py", line 499, in <module>
    run_branched(args)
  File "/home/zqx/text2mesh/main.py", line 140, in run_branched
    encoded_text = clip_model.encode_text(prompt_token)
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/clip/model.py", line 348, in encode_text
    x = self.transformer(x)
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/clip/model.py", line 203, in forward
    return self.resblocks(x)
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/clip/model.py", line 190, in forward
    x = x + self.attention(self.ln_1(x))
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/clip/model.py", line 187, in attention
    return self.attn(x, x, x, need_weights=False, attn_mask=self.attn_mask)[0]
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/torch/nn/modules/activation.py", line 1153, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/torch/nn/functional.py", line 5066, in multi_head_attention_forward
    q, k, v = _in_projection_packed(query, key, value, in_proj_weight, in_proj_bias)
  File "/home/zqx/miniconda3/envs/text2mesh/lib/python3.9/site-packages/torch/nn/functional.py", line 4745, in _in_projection_packed
    return linear(q, w, b).chunk(3, dim=-1)
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)
If anyone knows how to fix it, please tell me.
Great work!
I would like to know what kinds of sentences are reasonable and valid for CLIP. Are there any specific prompt rules for style sentences in your paper?
I tried 'an image of a car of wood' for an input car.
final:
But it turned out that the geometry of the car became disorganized and even self-intersecting, the shape of the original geometry was invisible, and the texture did not take on the look of wood. May I ask what the problem is?
I hope you can answer my two questions: the prompt rules and the effect. Thank you very much.
Best.
I trained with the default script: 'python main.py --run branch --obj_path data/source_meshes/person.obj --output_dir results/demo/people/hulk --prompt "a 3D rendering of the Hulk in unreal engine" --sigma 12.0 --clamp tanh --n_normaugs 4 --n_augs 1 --normmincrop 0.1 --normmaxcrop 0.4 --geoloss --colordepth 2 --normdepth 2 --frontview --frontview_std 4 --clipavg view --lr_decay 0.9 --clamp tanh --normclamp tanh --maxcrop 1.0 --save_render --seed 23 --n_iter 1500 --learning_rate 0.0005 --normal_learning_rate 0.0005 --standardize --no_pe --symmetry --background 1 1 1'
Finally, I got the following result, which is different from the paper: the face is very unclear, the pants are green, and the overall clarity is not as good as in the paper. Where did I go wrong, and what should I do?
Best
When I run this command from the directory: conda env create --file text2mesh.yml
I get the error below. I have a CUDA GPU machine and it's enabled. I'm using Anaconda. How do I fix this?
==========================
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
I want to keep the geometry of my input fixed so that only the color is optimized, not the geometry. Specifically, how should I modify the code?
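A hedged sketch of one way to do this: zero out the predicted per-vertex displacement before it is applied, so only the color branch has any effect (the variable names below are illustrative, not the exact ones in main.py):

```python
import torch

# Inside the update step: keep the color prediction, discard the geometry change
pred_rgb, pred_displ = mlp(network_input)  # hypothetical MLP outputs
pred_displ = torch.zeros_like(pred_displ)  # vertices stay at their input positions
```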