
gsgen3d / gsgen


[CVPR 2024] Text-to-3D using Gaussian Splatting

Home Page: https://arxiv.org/abs/2309.16585

License: MIT License

Python 82.89% CMake 0.11% Shell 0.01% C++ 3.68% C 10.57% Cuda 2.74%

gsgen's People

Contributors

heheyas, phylliida, wangfeng18


gsgen's Issues

W.R.T. training speed

Hi, thanks for the great effort on gsgen. I'm wondering how the speed of the reimplemented GS compares with the original GS implementation. I've noticed that the concurrent works train quite fast.

For example, DreamGaussian takes about 2 minutes, since it uses only 500 GS training iterations in its first stage; at that per-step rate, 15k steps would take roughly an hour. GaussianDreamer takes about 20 minutes to train. In contrast, gsgen takes about 2 hours for 15k steps. So I'm wondering what causes the speed difference: for example, what percentage of gsgen's time is spent on rendering, on optimization, etc.?
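One way to measure such a breakdown yourself (a minimal sketch; CudaTimer and the wrapped calls are hypothetical helpers, not gsgen API): time the rasterization call and the guidance call in the training loop separately, with explicit synchronization since CUDA kernels launch asynchronously.

    import time
    import torch

    # Minimal per-component timer. Wrapping the render call and the SDS/
    # guidance call separately yields a rendering-vs-optimization split;
    # torch.cuda.synchronize() is required for accurate timings because
    # CUDA kernels launch asynchronously.
    class CudaTimer:
        def __init__(self, name):
            self.name = name

        def __enter__(self):
            torch.cuda.synchronize()
            self.t0 = time.perf_counter()

        def __exit__(self, *exc):
            torch.cuda.synchronize()
            dt = (time.perf_counter() - self.t0) * 1e3
            print(f"{self.name}: {dt:.1f} ms")

    # Hypothetical usage inside a training step:
    # with CudaTimer("render"):
    #     image = renderer(camera)
    # with CudaTimer("guidance"):
    #     loss = guidance(image, text_embedding)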

Will GSGEN support other prompts?

Thank you for your wonderful work. Although the title describes this as a text-to-3D method, I'd like to know whether this repo will support other kinds of prompts in the future, such as images.

I'd like to test this awesome text-to-3D but I can't install ipython version 8.14.0

Hello, while running pip install -r requirements.txt I get this error message:

    ERROR: Could not find a version that satisfies the requirement ipython==8.14.0 (from -r requirements.txt (line 39)) (from versions: 0.10, 0.10.1, 0.10.2, 0.11, 0.12, 0.12.1, 0.13, 0.13.1, 0.13.2, 1.0.0, 1.1.0, 1.2.0, 1.2.1, 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.3.1, 2.4.0, 2.4.1, 3.0.0, 3.1.0, 3.2.0, 3.2.1, 3.2.2, 3.2.3, 4.0.0b1, 4.0.0, 4.0.1, 4.0.2, 4.0.3, 4.1.0rc1, 4.1.0rc2, 4.1.0, 4.1.1, 4.1.2, 4.2.0, 4.2.1, 5.0.0b1, 5.0.0b2, 5.0.0b3, 5.0.0b4, 5.0.0rc1, 5.0.0, 5.1.0, 5.2.0, 5.2.1, 5.2.2, 5.3.0, 5.4.0, 5.4.1, 5.5.0, 5.6.0, 5.7.0, 5.8.0, 5.9.0, 5.10.0, 6.0.0rc1, 6.0.0, 6.1.0, 6.2.0, 6.2.1, 6.3.0, 6.3.1, 6.4.0, 6.5.0, 7.0.0b1, 7.0.0rc1, 7.0.0, 7.0.1, 7.1.0, 7.1.1, 7.2.0, 7.3.0, 7.4.0, 7.5.0, 7.6.0, 7.6.1, 7.7.0, 7.8.0, 7.9.0, 7.10.0, 7.10.1, 7.10.2, 7.11.0, 7.11.1, 7.12.0, 7.13.0, 7.14.0, 7.15.0, 7.16.0, 7.16.1, 7.16.2, 7.16.3, 7.17.0, 7.18.0, 7.18.1, 7.19.0, 7.20.0, 7.21.0, 7.22.0, 7.23.0, 7.23.1, 7.24.0, 7.24.1, 7.25.0, 7.26.0, 7.27.0, 7.28.0, 7.29.0, 7.30.0, 7.30.1, 7.31.0, 7.31.1, 7.32.0, 7.33.0, 7.34.0, 8.0.0a1, 8.0.0b1, 8.0.0rc1, 8.0.0, 8.0.1, 8.1.0, 8.1.1, 8.2.0, 8.3.0, 8.4.0, 8.5.0, 8.6.0, 8.7.0, 8.8.0, 8.9.0, 8.10.0, 8.11.0, 8.12.0, 8.12.1, 8.12.2, 8.12.3, 8.13.0)
    ERROR: No matching distribution found for ipython==8.14.0 (from -r requirements.txt (line 39))
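A guess at the cause (not verified): the version list pip prints stops at 8.13.0, which suggests either an outdated pip/package index or a Python interpreter below ipython 8.14.0's minimum. Upgrading pip with python -m pip install -U pip and checking python --version would be the first things to try.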


Results with too many Floaters

Congrats on your great work! I have tried the corgi sample using your code, but there are too many floaters in the results (screenshot attached). I didn't change anything; can you give me some advice?

How can I switch to a new SD checkpoint model?

Dear authors,
Thanks for your awesome work, it's very helpful. I am working on some artistic generation. Although the vanilla-setting outputs are splendid, I still want to push further. I changed pretrained_model_name_or_path in conf/base.yaml (line 97), replacing it (runwayml/sd15) with my own Hugging Face model. It works, but I'm not sure about its validity since the output still looks a little generic (I chose a stylized ckpt file). Would you mind offering some insights?
Best regards
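In case it helps, a quick way to check the checkpoint outside gsgen (a sketch; the model id below is a placeholder for your own checkpoint, assumed to be in diffusers format): if the pipeline loads and samples in the target style, the same path should be valid for pretrained_model_name_or_path.

    import torch
    from diffusers import StableDiffusionPipeline

    # Sanity check: load the custom checkpoint and sample one image.
    # "your-username/your-stylized-sd15" is a hypothetical model id.
    pipe = StableDiffusionPipeline.from_pretrained(
        "your-username/your-stylized-sd15",
        torch_dtype=torch.float16,
    ).to("cuda")
    image = pipe("a cottage in your target style").images[0]
    image.save("style_check.png")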

2D SDS

Hi, I'm wondering if you ran experiments on 2D SDS using Gaussian splatting (such as using a single camera and a random initialization of gsplats). I'm trying to mimic some of the SDS experiments the DreamFusion authors ran, using Gaussian splatting as the representation, and it's not really working. I wonder if you had any luck with the same. Thanks!
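For reference, a single-camera 2D SDS skeleton (a sketch of the standard DreamFusion objective, not gsgen's code; render and predict_noise are hypothetical stand-ins for a differentiable gsplat rasterizer and a frozen text-conditioned diffusion UNet):

    import torch

    # One SDS optimization step: render, add noise at a random timestep,
    # query the frozen denoiser, and backpropagate the SDS gradient into
    # the renderer parameters via a surrogate loss.
    def sds_step(params, render, predict_noise, alphas_cumprod, optimizer):
        img = render(params)                               # (1, C, H, W), differentiable w.r.t. params
        t = torch.randint(20, 980, (1,), device=img.device)
        noise = torch.randn_like(img)
        a = alphas_cumprod[t].view(-1, 1, 1, 1)
        noisy = a.sqrt() * img + (1.0 - a).sqrt() * noise  # forward diffusion
        with torch.no_grad():
            eps_pred = predict_noise(noisy, t)             # frozen model, no grad
        w = (1.0 - alphas_cumprod[t]).view(-1, 1, 1, 1)    # common SDS weighting
        grad = w * (eps_pred - noise)
        loss = (grad.detach() * img).sum()                 # surrogate: d(loss)/d(img) = grad
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()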

[Request] Preview

Key to Stable Diffusion is trying over and over until a picture is good enough to be used for further manipulation.

Here, every try takes 3 hours, even on Colab.

Yes, there is the wandb preview, but it also takes some time. It would be better to keep generating pictures until the user accepts one, then create a low-res 3D model for the user to approve, and only then start the 3-hour run.

VSD

Hello, I'm curious whether you've tried VSD from ProlificDreamer instead of SDS for guidance?

RuntimeError: p1 must be a CUDA tensor.

When I try to export mesh using this script:
python utils/export.py checkpoints/a_high_quality_photo_of_a_corgi/2023-10-24/140914/ckpts/step_14000.pt --type mesh --batch_size 65536 --reso 256 --K 200 --thresh 0.1
I meet this error:

  File "/mnt/sda//gsgen/utils/export.py", line 138, in to_mesh
    density_val_grid, L = get_density_val_grid_from_ckpt(
  File "/mnt/sda//gsgen/utils/export.py", line 94, in get_density_val_grid_from_ckpt
    _, nn_idx, dist = K_nearest_neighbors(
  File "/mnt/sda//anaconda3/envs/gsgen/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/sda//gsgen/utils/ops.py", line 129, in K_nearest_neighbors
    dist, idx, nn = knn_points(query[None, ...], mean[None, ...], K=K, return_nn=True)
  File "/mnt/sda//anaconda3/envs/gsgen/lib/python3.9/site-packages/pytorch3d/ops/knn.py", line 187, in knn_points
    p1_dists, p1_idx = _knn_points.apply(
  File "/mnt/sda//anaconda3/envs/gsgen/lib/python3.9/site-packages/pytorch3d/ops/knn.py", line 72, in forward
    idx, dists = _C.knn_points_idx(p1, p2, lengths1, lengths2, norm, K, version)
RuntimeError: p1 must be a CUDA tensor.
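A workaround that typically fixes this class of error (an assumption, since the checkpoint-loading code isn't shown here): the error says knn_points expects a CUDA tensor, so the inputs are presumably still on the CPU. Moving both point sets to the GPU before K_nearest_neighbors is called should resolve it. A sketch:

    import torch

    # Illustration of the likely fix (shapes are placeholders): put both
    # point sets on the GPU before the pytorch3d k-NN query.
    mean = torch.rand(100_000, 3)   # stands in for the checkpoint's Gaussian means
    query = torch.rand(4_096, 3)    # stands in for the density-grid query points
    mean, query = mean.cuda(), query.cuda()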

[Question] Reproduction problem about Compactness-based Densification

Dear authors, thanks for your great work, but I met problems while training with the Compactness-based Densification proposed in Section 4.2 of your paper, as in the screenshot below:

[screenshot of the degraded training result]

I found that in your conf/base.yaml:

renderer:
  ...
  densify:
    enabled: true
    type: official
    ...
    use_legacy: true

which means that in the function densify() in gs/gaussian_splatting.py:

    def densify(self, step, verbose=True):
        # check if densify is enabled and triggered, if true, do densify
        if not self.densify_cfg.enabled:
            return
        if step < self.densify_cfg.warm_up or step > self.densify_cfg.end:
            return

        if step_check(step, self.densify_cfg.period, True):
            if self.densify_cfg.use_legacy:
                self.densify_legacy(step, verbose)
                if "compatness" in self.densify_cfg.type:
                    self.densify_by_compatness(K=self.densify_cfg.get("K", 3))

the function self.densify_by_compatness() will never be executed (the default type is "official", which does not contain "compatness") and only self.densify_legacy() will run.

I then set renderer.densify.type="compatness" to enable it, but the performance is not very good with the prompt "a zoomed out photo of a 3D model of an adorable cottage with a thatched roof" and initial prompt "a cottage", at iteration 7000 (after enabling self.densify_by_compatness(), training stopped at iteration 7000):

  • with compatness
    house_with_compatness_iter7000

  • without compatness
    house_without_compatness_iter7000

There might be several reasons; the first that I came up with is that, by default, the pruning operation is disabled in the config. I will report the result after adding pruning.

But the authors' explanation matters most, since you are the most familiar with the code. Could you kindly provide some advice? We are also working on text-to-3D tasks, and Compactness-based Densification really attracted us.

Thanks! 😄

Cuda Debug

Hi there,
As I'm new to CUDA, may I ask how you debugged on the CUDA side? Although I call printf and added cudaDeviceSynchronize(), it still prints nothing. Also, in case you've encountered the same issue: there are NaN values resulting from scale rendering; do you happen to know the reason? Thank you.

About training results

Excellent project.
The quality of the build is good, but the results don't seem quite normal: in the results I got, the corgi has 5 legs (screenshot attached). Is this normal?

Command used:
python main.py --config-name=base prompt.prompt="A high quality photo of a corgi" init.prompt="a corgi"

About loss function in code

Hi there. Great work! I'm wondering if you can point out all the loss functions in the code as written in your paper? Thanks in advance.

When training, it looks like it's sleeping; I don't know what the process should look like

I use python main.py --config-name=base prompt.prompt="a white dog in the river" to train, but the process looks asleep: GPU usage is 0% and CPU usage is also very low.

Using Point-E on device: cuda
creating base model...
100%|████████████████████████████████████████| 890M/890M [01:10<00:00, 13.3MiB/s]
creating upsample model...
downloading base checkpoint...
100%|████████████████████████████████████████| 161M/161M [01:20<00:00, 2.00MiB/s]
downloading upsampler checkpoint...
100%|████████████████████████████████████████| 162M/162M [01:12<00:00, 2.23MiB/s]
will align the point cloud to the x axis
will use random color

A small piece of advice about init.prompt in conf/base.yaml

The init.prompt in conf/base.yaml should be set to None instead of "a human face".

Current text-to-3D algorithms are sensitive to initialization, and for 3D Gaussians, Point-E is often used to provide it. Here is an example with "a DSLR photo of a corgi wearing a green suit and a top hat" (see the command-line override after the examples below):

init.prompt = None
corgi_0

init.prompt = a dog
corgi_dog

init.prompt = a corgi
corgi_corgi
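For reference, following the Hydra override syntax used elsewhere in these issues, the initialization prompt can be set from the command line, for example:

python main.py --config-name=base prompt.prompt="a DSLR photo of a corgi wearing a green suit and a top hat" init.prompt="a corgi"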

LICENSE file?

I recently came across gsgen and was really impressed with its results. I can see that you have put a lot of effort into it, and I think it would be a valuable addition to a project I'm currently working on. However, I'm curious whether this code is viable for commercial or research use. Could you please clarify the terms of use by adding a LICENSE file to this repository? Thanks!

Running build.sh, getting an error

Hi, I'm trying to run build.sh but get the following error.
My environment: Ubuntu 18.04, RTX 3090

cub_home: None
/root/szd/gsgen/gs/setup.py:32: UserWarning: The environment variable `CUB_HOME` was not found.Installation will fail if your system CUDA toolkit version is less than 11.NVIDIA CUB can be downloaded from `https://github.com/NVIDIA/cub/releases`. You can unpack it to a location of your choice and set the environment variable `CUB_HOME` to the folder containing the `CMakeListst.txt` file.
  warnings.warn(
invalid command name 'clean'
ERROR: . is not a valid editable requirement. It should either be a path to a local project or a VCS URL (beginning with bzr+http, bzr+https, bzr+ssh, bzr+sftp, bzr+ftp, bzr+lp, bzr+file, git+http, git+https, git+ssh, git+git, git+file, hg+file, hg+http, hg+https, hg+ssh, hg+static-http, svn+ssh, svn+http, svn+https, svn+svn, svn+file).

Process for exporting OBJ/GLTF

Hey there! I love the work and am extremely impressed. I'm just curious: is there a way I can process the rendering, or some form of post-processing, to convert the models into GLTF or OBJ? I recently used Shap-E for research and was able to process its output into OBJ format. Is there a way to do that here? Maybe render a GIF instead of an MP4 and post-process the GIF? Just asking; any feedback would be greatly appreciated!
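A possible route (a sketch, not the authors' pipeline): utils/export.py can write a mesh, as in the export command quoted in another issue above, and trimesh can then convert it. The paths below are placeholders, and trimesh is assumed to be installed.

    import trimesh

    # Convert an exported mesh to OBJ and binary glTF.
    mesh = trimesh.load("exported_mesh.ply")  # hypothetical export.py output
    mesh.export("model.obj")                  # Wavefront OBJ
    mesh.export("model.glb")                  # binary glTF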

Issue of Consistency.

I am profoundly grateful for your contribution; you have not only implemented the integration of 3D Gaussian into the AIGC pipeline but also resolved the issue I previously perceived, where the performance of 3D Gaussian in AIGC was inferior to NeRF.

I have a query regarding a detail in the paper. It is mentioned in the paper that, to ensure the generated consistency, Point-E, a pre-trained large-scale model for text-to-point-cloud, is additionally utilized to optimize the position of point clouds in 3D Gaussians. My question is, how is the result of Point-E ensured to be consistent with the result of 2D Diffusion?

Looking forward to the author's reply!

Training error

When I run python main.py --config-name=base prompt.prompt="A DSLR photo of a car made out of sushi.", I get many "................................................." lines in trainer.py (at self.guidance = get_guidance(cfg.guidance)) and it never reaches trainer.train_loop(). Can you help me, please?

Python versioning issue during installation: it seems like some versions are unresolvable

Hi all, I'm installing it on Windows. When I have a Python version >= 3.10 it says:

ERROR: Ignored the following versions that require a different python version: 1.6.2 Requires-Python >=3.7,<3.10; 1.6.3 Requires-Python >=3.7,<3.10; 1.7.0 Requires-Python >=3.7,<3.10; 1.7.1 Requires-Python >=3.7,<3.10

And when I downgraded to 3.9:

ERROR: Ignored the following versions that require a different python version: 8.19.0 Requires-Python >=3.10; 8.20.0 Requires-Python >=3.10

Could anyone help?

Severe Janus problem still occurs

After installing gsgen and running the command

python main.py --config-name=base prompt.prompt="a cute corgi"

here is what I got (screenshot attached), and below is my run summary. Why is there still a severe Janus problem?

Run summary:
wandb:               auxiliary/total 0.0
wandb:              data/azimuth_max 90.0
wandb:              data/azimuth_min -90.0
wandb:            data/elevation_max 90.0
wandb:            data/elevation_min -20.0
wandb:                data/focal_max 1.35
wandb:                data/focal_min 0.75
wandb:                     data/reso 512.0
wandb:                   global_step 0
wandb:             guidance/max_step 500.0
wandb:             guidance/min_step 20.0
wandb:                      loss/sds 14181.02441
wandb:                    loss/total 1418.10242
wandb:              loss_weights/sds 0.1
wandb:                      lr/alpha 0.003
wandb:                         lr/bg 0.003
wandb:                      lr/color 0.01
wandb:                       lr/mean 3e-05
wandb:                       lr/qvec 0.003
wandb:                       lr/svec 0.001
wandb:       renderer/alpha/grad_max 0.47007
wandb:       renderer/alpha/grad_min 0.0
wandb:            renderer/alpha/max 0.99894
wandb:           renderer/alpha/mean 0.7428
wandb:            renderer/alpha/min 0.00393
wandb:       renderer/color/grad_max 0.636
wandb:       renderer/color/grad_min 0.0
wandb:            renderer/color/max 1.0
wandb:           renderer/color/mean 0.55208
wandb:            renderer/color/min 0.0
wandb:        renderer/mean/grad_max 406.59451
wandb:        renderer/mean/grad_min 0.0
wandb:             renderer/mean/max 1.39963
wandb:            renderer/mean/mean 0.02452
wandb:             renderer/mean/min 0.0
wandb: renderer/n_gaussians_with_dub 707780.0
wandb:        renderer/num_gaussians 249447.0
wandb:        renderer/qvec/grad_max 12.67941
wandb:        renderer/qvec/grad_min 0.0
wandb:             renderer/qvec/max 2.72277
wandb:            renderer/qvec/mean 0.22615
wandb:             renderer/qvec/min 0.0
wandb:        renderer/svec/grad_max 6.686
wandb:        renderer/svec/grad_min 0.0
wandb:             renderer/svec/max 0.04836
wandb:            renderer/svec/mean 0.01166
wandb:             renderer/svec/min 0.00083

[Question] Run viewer on a private IP rather than localhost

Hi,

I am having an issue viewing my rendering of the prompt using the viewer script when I run it on my server. Since I cannot access the localhost link through the server's console, I forward it with ngrok and view it on my local machine (MacBook Pro). However, the checkpoint doesn't render (screenshot attached). So I am wondering if there is a way to run the viewer on a private IP rather than localhost.

I am able to look at the GIFs used for eval in the wandb folder, so I know the model trained properly, but there seems to be some issue with the viewing.

If using the splat viewer is a better option, how and what do I upload to view it in the browser?

[BUG???] Correct way to calculate distance_to_gaussian_surface

In script: utils/ops.py#L153
Inside function distance_to_gaussian_surface(mean, svec, rotmat, query)
It calculates the squared Mahalanobis distance by:

    r2 = svec[..., 2] ** 2 * cos_theta**2 + d2**2 * sin_theta**2

But according to the Mahalanobis distance formula it should be:

    r2 = svec[..., 2] ** 2 * cos_theta**2 + d2 * sin_theta**2

Reference:
https://en.wikipedia.org/wiki/Mahalanobis_distance
https://math.stackexchange.com/questions/18776/mean-distance-between-a-fixed-point-and-a-gaussian-distributed-random-variable
http://www.open3d.org/docs/latest/tutorial/geometry/distance_queries.html

I have tested both formulas, and both seem to work fine. (Note that d2 already carries squared scale factors, so squaring it again looks dimensionally inconsistent.)
Is this merely a trivial bug, or is there a reason that I missed for your specific formula?
Please let me know :)

ControlNet Openpose?

When generating 3D models of characters, instead of modifying the prompt for generation at different angles, it makes more sense to use ControlNet OpenPose to directly control the character pose. At a minimum, just using the poses from here, but it could be made more flexible than that.

About the installation of CLIP

I successfully installed CLIP, and CLIP 1.0 is also present in pip list. But when I execute the training script, I get the following error:

/home/twx/anaconda3/envs/gsgen/lib/python3.9/site-packages/torch/utils/tensorboard/__init__.py:4: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  if not hasattr(tensorboard, "__version__") or LooseVersion(
/home/twx/anaconda3/envs/gsgen/lib/python3.9/site-packages/torch/utils/tensorboard/__init__.py:6: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  ) < LooseVersion("1.15"):
/home/twx/anaconda3/envs/gsgen/lib/python3.9/site-packages/networkx/utils/backends.py:135: RuntimeWarning: networkx backend defined more than once: nx-loopback
  backends.update(_get_backends("networkx.backends"))
/home/twx/anaconda3/envs/gsgen/lib/python3.9/site-packages/lightning_utilities/core/imports.py:14: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
/home/twx/anaconda3/envs/gsgen/lib/python3.9/site-packages/pkg_resources/__init__.py:2871: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('mpl_toolkits').
Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
/home/twx/anaconda3/envs/gsgen/lib/python3.9/site-packages/torchmetrics/utilities/imports.py:24: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  _PYTHON_LOWER_3_8 = LooseVersion(_PYTHON_VERSION) < LooseVersion("3.8")
/home/twx/anaconda3/envs/gsgen/lib/python3.9/site-packages/torchmetrics/utilities/imports.py:24: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  _PYTHON_LOWER_3_8 = LooseVersion(_PYTHON_VERSION) < LooseVersion("3.8")
Traceback (most recent call last):
  File "/media/twx/ch/code/generate/gsgen-main/main.py", line 3, in <module>
    from trainer import Trainer
  File "/media/twx/ch/code/generate/gsgen-main/trainer.py", line 43, in <module>
    from guidance import get_guidance
  File "/media/twx/ch/code/generate/gsgen-main/guidance/__init__.py", line 27, in <module>
    from .make_it_3d import MakeIt3DGuidance
  File "/media/twx/ch/code/generate/gsgen-main/guidance/make_it_3d.py", line 17, in <module>
    import clip
ModuleNotFoundError: No module named 'clip'
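A likely fix (an assumption, since the installed package isn't shown): the import targets OpenAI's CLIP, which is not the package that pip installs under the name "clip". Installing it from the official repository provides the clip module:

pip install git+https://github.com/openai/CLIP.git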

Two-stage or single-stage training?

Hello, may I ask about the two-stage training proposed in the paper: are the two stages trained separately (and if so, how many steps in each stage), or is the training unified?
Thank you.

utils/export.py in wrong folder and cannot access libraries

Hi,

While trying to convert the .pt file to a .ply file using the script utils/export.py, I encountered a few import-related errors.

Traceback (most recent call last):
  File "/home/jovyan/gsgen/utils/export.py", line 2, in <module>
    import numpy as np
  File "/opt/conda/envs/gsgen/lib/python3.10/site-packages/numpy/__init__.py", line 144, in <module>
    from . import lib
  File "/opt/conda/envs/gsgen/lib/python3.10/site-packages/numpy/lib/__init__.py", line 25, in <module>
    from . import index_tricks
  File "/opt/conda/envs/gsgen/lib/python3.10/site-packages/numpy/lib/index_tricks.py", line 12, in <module>
    import numpy.matrixlib as matrixlib
  File "/opt/conda/envs/gsgen/lib/python3.10/site-packages/numpy/matrixlib/__init__.py", line 4, in <module>
    from . import defmatrix
  File "/opt/conda/envs/gsgen/lib/python3.10/site-packages/numpy/matrixlib/defmatrix.py", line 12, in <module>
    from numpy.linalg import matrix_power
  File "/opt/conda/envs/gsgen/lib/python3.10/site-packages/numpy/linalg/__init__.py", line 73, in <module>
    from . import linalg
  File "/opt/conda/envs/gsgen/lib/python3.10/site-packages/numpy/linalg/linalg.py", line 20, in <module>
    from typing import NamedTuple, Any
  File "/home/jovyan/gsgen/utils/typing.py", line 12, in <module>
    from typing import (
ImportError: cannot import name 'Any' from partially initialized module 'typing' (most likely due to a circular import) (/home/jovyan/gsgen/utils/typing.py)
Traceback (most recent call last):
  File "/home/jovyan/gsgen/utils/export.py", line 6, in <module>
    from gs.gaussian_splatting import GaussianSplattingRenderer
  File "/home/jovyan/gsgen/utils/gs/gaussian_splatting.py", line 6, in <module>
    from utils.camera import PerspectiveCameras
ModuleNotFoundError: No module named 'utils'

Moving export.py to /gsgen and then installing the plyfile library seemed to fix this issue.
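A less invasive fix (a guess based on the first traceback): running the script directly puts utils/ at the front of sys.path, so utils/typing.py shadows the standard-library typing module that numpy imports. Running it as a module from the repository root avoids this, e.g.:

python -m utils.export checkpoints/.../step_14000.pt --type mesh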

Windows Support

Currently my build of gs errors on Windows saying these symbols are not defined, so I need to replace them in gs/src/include/common.h with defines like this, and then it compiles OK:

#define EPS 1e-6
#define MIN_RADIAL -85.0f
#define gs_coeff_3d 0.06349363593424097f // 1 / (2 * pi) ** (3/2)
#define gs_coeff_2d 0.15915494309189535f // 1 / (2 * pi)

#define MAX_N_FLOAT_SM 12000 // 48KB
#define MIN_RENDER_ALPHA 0.00392156862745098f // 1 / 255.0f

About 3D-SDS loss

Hello, I am curious why the 3D SDS loss does not have a term similar to the partial derivative in the 2D SDS formula on the previous line, e.g. ∂p/∂θ.

OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'checkpoints\\a_high_quality_photo_of_a_rabbit\\2024-04-10\\172943\\code\\"main - \\345\\211\\257\\346\\234'

Using Point-E on device: cuda
creating base model...
creating upsample model...
downloading base checkpoint...
downloading upsampler checkpoint...
will align the point cloud to the x axis
will use random color
unet\diffusion_pytorch_model.safetensors not found
Loading pipeline components...: 100%|████████████████████████████████████████| 4/4 [00:43<00:00, 10.88s/it]
[side]:[A high quality photo of a rabbit, side view]
[front]:[A high quality photo of a rabbit, front view]
[back]:[A high quality photo of a rabbit, back view]
[overhead]:[A high quality photo of a rabbit, overhead view]
['A high quality photo of a rabbit', 'A high quality photo of a rabbit, side view', 'A high quality photo of a rabbit, front view', 'A high quality photo of a rabbit, back view', 'A high quality photo of a rabbit, overhead view']
wandb: Currently logged in as: westbrook (motion-magic). Use wandb login --relogin to force relogin
wandb: wandb version 0.16.6 is available! To upgrade, please run:
wandb: $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.15.8
wandb: Run data is saved locally in D:\AI\CG4\gsgen\wandb\run-20240410_170241-z1ati0p3
wandb: Run wandb offline to turn off syncing.
wandb: Syncing run 0|164816|2024-04-10|a_high_quality_photo_of_a_rabbit
wandb: View project at https://wandb.ai/motion-magic/gsgen
wandb: View run at https://wandb.ai/motion-magic/gsgen/runs/z1ati0p3
Error executing job with overrides: ['prompt.prompt=A high quality photo of a rabbit']
Traceback (most recent call last):
  File "D:\AI\CG4\gsgen\main.py", line 25, in main
    trainer = Trainer(cfg)
  File "D:\AI\CG4\gsgen\trainer.py", line 222, in __init__
    self.save_code_snapshot()
  File "D:\AI\CG4\gsgen\trainer.py", line 277, in save_code_snapshot
    dst.parent.mkdir(parents=True, exist_ok=True)
  File "D:\AI\Anaconda3\envs\gsgenPy3.10\lib\pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'checkpoints\a_high_quality_photo_of_a_rabbit\2024-04-10\164816\code\"main - \345\211\257\346\234'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
wandb: Waiting for W&B process to finish... (failed 1). Press Ctrl-C to abort syncing.
wandb:
wandb: Run history:
wandb: global_step โ–
wandb:
wandb: Run summary:
wandb: global_step 0
wandb:
wandb: View run 0|164816|2024-04-10|a_high_quality_photo_of_a_rabbit at: https://wandb.ai/motion-magic/gsgen/runs/z1ati0p3
wandb: View job at https://wandb.ai/motion-magic/gsgen/jobs/QXJ0aWZhY3RDb2xsZWN0aW9uOjE2MTIwNjk0Mw==/version_details/v0
wandb: Synced 6 W&B file(s), 0 media file(s), 3 artifact file(s) and 2 other file(s)
wandb: Find logs at: .\wandb\run-20240410_170241-z1ati0p3\logs
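A guess at the cause (not confirmed by the authors): save_code_snapshot copies every git-listed file into the checkpoints\...\code directory, and the failing path ends in '"main - \345\211\257\346\234', i.e. a stray local file whose name contains a double quote followed by the start of 副本 ("copy" in Chinese). Windows forbids quote characters in file names, so mkdir fails with WinError 123; renaming or deleting that copied file in the repository should let the run proceed.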

Error during training

I run main.py with the following command:
python main.py --config-name=base prompt.prompt="xx"
and it reports the following error:

  File "/data0/gsgen/main.py", line 19, in main
    trainer = Trainer(cfg)
  File "/data0/gsgen/trainer.py", line 222, in __init__
    self.save_code_snapshot()
  File "/data0/gsgen/trainer.py", line 274, in save_code_snapshot
    files = get_file_list()
  File "/data0/gsgen/utils/misc.py", line 281, in get_file_list
    subprocess.check_output(
  File "/home/miniconda3/envs/gsgen/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/home/miniconda3/envs/gsgen/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'git ls-files -- ":!:load/*"' returned non-zero exit status 128.

็ปๆฃ€ๆŸฅๅบ”่ฏฅๆ˜ฏgegen/utils/misc.pyๆ–‡ไปถไธญ็š„ๅฆ‚ไธ‹ไปฃ็ ๅœจ่ฟ่กŒไธญๅ‡บไบ†้—ฎ้ข˜๏ผš

def get_file_list():
    return [
        b.decode()
        for b in set(
            subprocess.check_output(
                'git ls-files -- ":!:load/*"', shell=True
            ).splitlines()
        )
        | set(  # hard code, TODO: use config to exclude folders or files
            subprocess.check_output(
                "git ls-files --others --exclude-standard", shell=True
            ).splitlines()
        )
    ]

ๆ€€็–‘ๅฏ่ƒฝๆ˜ฏๆŸไธชๆ–‡ไปถๆฒก่ƒฝๆˆๅŠŸไธ‹่ฝฝ๏ผŒ่ฏท้—ฎไฝœ่€…่ฟ™็งๆƒ…ๅ†ตไธ‹ๅบ”่ฏฅๆ€Žไนˆๅค„็†ๅ‘€๏ผŒ่ฐข่ฐข๏ผ๏ผ

How long does a training session take?

Great job! But I have a question: how long does a training session take? It's been three days since I started training, and the output is still stuck at:

"will align the point cloud to the x axis.
will use random color."

It's not clear to me what's going on.

<frozen importlib._bootstrap>:488

Hello, when I run the code with the command
python main.py --config-name=base prompt.prompt="" -W ignore
(note that -W ignore is an interpreter option and only takes effect when placed before main.py), it shows these issues:
<frozen importlib._bootstrap>:488: DeprecationWarning: Type google.protobuf.pyext._message.ScalarMapContainer uses PyType_Spec with a metaclass that has custom tp_new. This is deprecated and will no longer be allowed in Python 3.14.
<frozen importlib._bootstrap>:488: DeprecationWarning: Type google.protobuf.pyext._message.MessageMapContainer uses PyType_Spec with a metaclass that has custom tp_new. This is deprecated and will no longer be allowed in Python 3.14.
I haven't found a solution for this. Does anybody know about it?

Explanation of distance_to_gaussian_surface

Hi, I was going through the codebase and came across the following function, which computes the distance between points and the Gaussian surface. Is there any derivation or explanation available for this code?

For instance:

  • Why rotate the difference from the mean position of the 3D Gaussians?
  • Why not compute the distance between Gaussians as the simple Euclidean distance between their means (as visualized in the paper)?
def distance_to_gaussian_surface(mean, svec, rotmat, query):

    # mean - N x 3 - torch.Tensor (mean position)
    # svec - N x 3 - torch.Tensor (scale vector)
    # rotmat - N x 3 x 3 - torch.Tensor (rotation matrix)
    # query - N x 3 - torch.Tensor (query points)

    xyz = query - mean
    # TODO: check here
    # breakpoint()

    xyz = torch.einsum("bij,bj->bi", rotmat.transpose(-1, -2), xyz)
    xyz = F.normalize(xyz, dim=-1)

    z = xyz[..., 2]
    y = xyz[..., 1]
    x = xyz[..., 0]

    r_xy = torch.sqrt(x**2 + y**2 + 1e-10)
    
    cos_theta = z
    sin_theta = r_xy
    cos_phi = x / r_xy
    sin_phi = y / r_xy

    d2 = svec[..., 0] ** 2 * cos_phi**2 + svec[..., 1] ** 2 * sin_phi**2
    r2 = svec[..., 2] ** 2 * cos_theta**2 + d2**2 * sin_theta**2

    return torch.sqrt(r2 + 1e-10)
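A quick way to probe the two formulas discussed in the [BUG???] issue above (a sketch; assumes the function as pasted is in scope, with torch and torch.nn.functional imported): for an axis-aligned Gaussian, a query along a principal axis should return exactly the matching scale component.

    import torch

    # Axis-aligned Gaussian: rotmat is the identity.
    mean = torch.zeros(1, 3)
    svec = torch.tensor([[0.5, 1.0, 2.0]])
    rotmat = torch.eye(3).unsqueeze(0)

    # Along +z both variants agree: prints ~2.0 (= svec_z), since the d2
    # term vanishes when sin_theta is ~0.
    print(distance_to_gaussian_surface(mean, svec, rotmat,
                                       torch.tensor([[0.0, 0.0, 5.0]])))

    # Along +x the current d2**2 formula prints 0.25 (= svec_x ** 2), while
    # the d2 * sin_theta**2 variant from the issue above gives 0.5 (= svec_x).
    print(distance_to_gaussian_surface(mean, svec, rotmat,
                                       torch.tensor([[5.0, 0.0, 0.0]])))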

Errors when setting up the environment

When I try to run ./build.sh, this error occurs:

RuntimeError:
The detected CUDA version (12.1) mismatches the version that was used to compile
PyTorch (11.7). Please make sure to use the same CUDA versions.

note: This error originates from a subprocess, and is likely not a problem with pip.

How can I solve it? Thanks!
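One common resolution (an assumption about the setup, not project guidance): either install a CUDA 11.7 toolkit to match the existing PyTorch build, or reinstall PyTorch compiled against CUDA 12.1, e.g.:

pip install torch --index-url https://download.pytorch.org/whl/cu121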
