
banmo's Issues

Question for Nerfies experiment

Thanks for your great work!
I have a question about the Nerfies experiment mentioned in your paper.
Nerfies uses COLMAP for camera registration and scene-related calculations (scene scale and scene center), but banmo doesn't use COLMAP.
I would also like to run the experiment mentioned in your paper. Has the related code been released? If not, could you give any guidance?

near_far, obj_scale, bound

Hi, thanks again for your awesome work!

Could you please explain the meanings of and relationships between the parameters 'near_far', 'obj_bound', 'obj_scale', 'bound', and 'bound_factor'? I am quite confused by them.

Also, I understand that you place the object center at z=0.3 in world space and design the 'warmup_shape' stage to initialize the object as a small sphere by training the SDF. But why is near_far (initialized to 0-0.6) a learnable parameter (reset_nf) instead of a fixed hyperparameter?
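To make sure I am asking about the right thing, my mental model of how near_far is used is the standard NeRF-style stratified sampling along each ray, roughly as in the sketch below (names and shapes are my assumptions, not the actual banmo code):

import torch

def sample_along_ray(rays_o, rays_d, near, far, n_samples=64):
    """Stratified depth sampling between per-ray near/far bounds.
    rays_o, rays_d: (N, 3) ray origins and directions
    near, far:      (N, 1) per-ray depth bounds (what I believe near_far stores)
    """
    # evenly spaced depths in [near, far], one row per ray
    t = torch.linspace(0., 1., n_samples, device=rays_o.device)        # (S,)
    z_vals = near + (far - near) * t[None, :]                          # (N, S)
    # 3D sample points along each ray
    pts = rays_o[:, None, :] + rays_d[:, None, :] * z_vals[..., None]  # (N, S, 3)
    return pts, z_vals

If that reading is correct, I can see that a learnable near_far lets the sampled depth range adapt during optimization, but I would still like to understand why that is preferred over a fixed range.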

ext_utils not found problem

Hello there! Thanks for sharing the amazing work!

I have a problem with the file ext_utils: I can't find it in the repo, so I can't run the code. Could you let me know where to get this file? Any response will be greatly appreciated!

some problems when I try pre-optimized models according to the suggestion

When I enter “bash scripts/render_nvs.sh 0 $seqname tmp/cat-pikachiu.pth 5 0”, the following log comes out and the error causes the command to exit abnormally. I am sorry to bother you again.

Logs:
[screenshot of the logs]

I have uploaded the full logs as an attachment: Untitled 2.odt

Command:
[screenshot of the command]

Problems about synthetic data

Hi, I'm trying to use the synthetic data generator. What are the units of the focal length and depth in render_synthetic.py?

some bugs when I render the results

Thanks for your great work.
There are some bugs when I render the results using pyrender.
My Ubuntu version is 20.04, and I installed all libraries according to your banmo.yml.
The traceback of the error is:

libEGL warning: DRI2: failed to create dri screen
libEGL warning: DRI2: failed to create dri screen
Traceback (most recent call last):
  File "scripts/visualize/render_vis.py", line 537, in <module>
    main()
  File "scripts/visualize/render_vis.py", line 329, in main
    r = OffscreenRenderer(img_size, img_size)
  File "/home/linxia/anaconda3/envs/banmo/lib/python3.8/site-packages/pyrender/offscreen.py", line 31, in __init__
    self._create()
  File "/home/linxia/anaconda3/envs/banmo/lib/python3.8/site-packages/pyrender/offscreen.py", line 149, in _create
    self._platform.init_context()
  File "/home/linxia/anaconda3/envs/banmo/lib/python3.8/site-packages/pyrender/platforms/egl.py", line 177, in init_context
    assert eglInitialize(self._egl_display, major, minor)
  File "/home/linxia/anaconda3/envs/banmo/lib/python3.8/site-packages/OpenGL/platform/baseplatform.py", line 402, in __call__
    return self( *args, **named )
  File "/home/linxia/anaconda3/envs/banmo/lib/python3.8/site-packages/OpenGL/error.py", line 228, in glCheckError
    raise GLError(
OpenGL.error.GLError: GLError(
        err = 12289,
        baseOperation = eglInitialize,
        cArguments = (
                <OpenGL._opaque.EGLDisplay_pointer object at 0x7fad0deeb040>,
                c_long(0),
                c_long(0),
        ),
        result = 0
)

I don't know how to fix it.
Changing 'egl' to 'osmesa' avoids the error, but that is too slow as it is CPU-only.
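For reference, this is roughly how I select the backend before creating the renderer (PYOPENGL_PLATFORM and EGL_DEVICE_ID are environment variables honored by PyOpenGL/pyrender; the sizes are placeholders):

import os
# must be set before pyrender / PyOpenGL are imported
os.environ['PYOPENGL_PLATFORM'] = 'egl'      # or 'osmesa' for the CPU-only fallback
os.environ.setdefault('EGL_DEVICE_ID', '0')  # pick which GPU EGL should use

from pyrender import OffscreenRenderer

r = OffscreenRenderer(viewport_width=512, viewport_height=512)
r.delete()

With 'osmesa' the renderer is created fine, but as mentioned it is CPU-only and very slow.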

Generation of synthetic datasets

Hi Gengshan,

Thanks for your great work!

I have a couple of questions about the synthetic datasets (Eagle).

  • How do you generate the 120-frame obj sequence from the single-instance model?
  • Is there a motion sequence used?
  • I want to build a synthetic cat dataset to evaluate results. Have you tried this, and do you have such datasets available?

Question about result : a difference between the results in the paper and my results

Thank you for the nice work.

I have a question about the results in the paper.
I just followed your instructions for data processing, training, rendering, and evaluation, as described in scripts/README.md.

The desired results should be the same as in the paper:
[image]

But my models show noticeably worse results, which I describe below.

AMA

  1. Train a model on T_swing and T_samba simultaneously --> evaluation
    :: avg. 11.4 chamfer distance on T_swing
    :: avg. 10.7 chamfer distance on T_samba
  2. Download the files with wget https://www.dropbox.com/sh/n9eebife5uovg2m/AAA1BsADDzCIsTSUnJyCTRp7a -O tmp.zip --> evaluation (same as your instructions)
    :: avg. 9.1 chamfer distance on T_swing
  3. Train a model on only the T_swing videos --> evaluation (same process for T_samba)
    :: avg. 8.6 chamfer distance on T_swing
    :: avg. 8.2 chamfer distance on T_samba

Synthetic (hands, eagle)
:: avg. 6.4 chamfer distance on eagle
:: avg. 5.3 chamfer distance on hands

I used 2x A100 GPUs for training and the same conda environment as yours.

As you can see, all of my results are worse than those reported.
My question is: how can I get the same results as described in the paper?
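For reference, my understanding of the reported metric is a symmetric Chamfer distance between points sampled from the predicted and ground-truth meshes, roughly as in this sketch (not the official evaluation code; the official script may also align the meshes and report the values in cm):

import torch

def chamfer_distance(pred_pts, gt_pts):
    """Symmetric Chamfer distance between point clouds pred_pts (N,3) and gt_pts (M,3)."""
    d = torch.cdist(pred_pts, gt_pts)                  # (N, M) pairwise distances
    # average nearest-neighbor distance in both directions
    return 0.5 * (d.min(dim=1).values.mean() + d.min(dim=0).values.mean())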

Best regards.

Canonical embeddings matching

Hi, thanks for your impressive work.

I am trying to understand the 2D-3D matching part. However, when I print the learned matching matrix (prob_vol), all the elements seem very close to each other. As far as I understand, this could mean that BANMo does not learn a matching between 2D and 3D features. So what does this part actually do? Thanks!
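To clarify what I checked: I compute the soft matching from per-pixel 2D features to canonical 3D features roughly as in the toy sketch below (tensor names and the temperature are hypothetical), and on my trained model the rows of prob_vol have entropy close to the uniform value log(M):

import torch
import torch.nn.functional as F

# toy stand-ins for the learned embeddings:
# feat_2d: (N, C) per-pixel features, feat_3d: (M, C) canonical surface features
feat_2d = F.normalize(torch.randn(1000, 16), dim=-1)
feat_3d = F.normalize(torch.randn(2048, 16), dim=-1)

temperature = 0.1  # hypothetical
prob_vol = F.softmax(feat_2d @ feat_3d.t() / temperature, dim=-1)  # (N, M)

# rows that are nearly uniform have entropy close to log(M)
entropy = -(prob_vol * prob_vol.clamp_min(1e-12).log()).sum(-1).mean()
print(entropy.item(), torch.log(torch.tensor(2048.)).item())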

Question regarding using pre-trained models.

Dear GengShan,
Hi! I am trying to use your pre-optimized models for evaluation. Unfortunately, when I call the scripts, I get:

FATAL Flags parsing error: ERROR:: Unable to open flagfile: [Errno 2] No such file or directory: 'tmp/cat-pikachiu/opts.log'

I don't know how argument parsing works with the absl system. Do you have any suggestions for bypassing the need for opts.log?
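For context, my current understanding is that opts.log is just a saved flagfile that the scripts pass back to absl via --flagfile, along the lines of this sketch (the flag name here is hypothetical):

from absl import app, flags

flags.DEFINE_string('model_path', '', 'checkpoint to load')  # hypothetical flag
FLAGS = flags.FLAGS

def main(argv):
    print('model_path =', FLAGS.model_path)

if __name__ == '__main__':
    # absl expands --flagfile=<path> into the flags saved in that file, e.g.
    #   python demo.py --flagfile=tmp/cat-pikachiu/opts.log
    # passing the same flags directly on the command line should also work.
    app.run(main)

So my question is really whether I can reconstruct or skip that flagfile when using the released checkpoints.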

Thanks.

Training BANMo using my own videos

Hi, thanks for your work!

I am trying to train BANMo on my own videos, which I preprocessed as you instructed. However, when I look at the segmentation results in DAVIS/Annotations, they do not seem to be preprocessed like adult7 and cat-pikachiu: as in the image below, there are different colors on the cat, while in my own videos there is only one color and the detection result is not good. Did I miss something? Thanks!
[image: vis-00097]

left multiply or right multiply?

Thanks for your great work!

May I ask why you sometimes left-multiply by Rmat and sometimes right-multiply? Why not use the same convention? What is the difference? I'm really confused by this.

Question about the Calculation of the Mahalanobis Distance

Hello, thanks for your great work!
I have one question.
In

mdis = mdis*100*log_scale.exp() # TODO accound for scaled near-far plane

and
mdis = (-10 * mdis.sum(3)) # bs,N,B

when you are calculating the Mahalanobis distance, what exactly are you computing here?
Why does mdis need to be multiplied by 100 and log_scale.exp() first, and then by -10 after summing over the axes (mdis.sum(3))?
What is the log_scale from skin_aux?
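To make the question concrete, my current reading of those two lines (which may be wrong) is that the skinning logit of a point x for bone b with center c_b, orientation R_b, axis scales s_b, and the global log_scale sigma is, in LaTeX notation:

W_b(x) = -10 \sum_{k=1}^{3} 100\, e^{\sigma}\, s_{b,k} \big(R_b\,(c_b - x)\big)_k^2
       = -1000\, e^{\sigma} \sum_{k=1}^{3} s_{b,k} \big(R_b\,(c_b - x)\big)_k^2

i.e. the 100 and -10 seem to fold into a single precision factor of 1000 e^{\sigma}, which is where my confusion about the separate constants comes from.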

Thanks! :)

Sharing pretrained weights on Hugging Face

Hello there!

First of all, thank you for open-sourcing your work! I really enjoyed reading your paper and learning about your work, plus I'm a big fan of the coauthors Pikachiu, Tetres, Haru, Coco, and Socks 🐕🐈 Would you be interested in sharing your models on the Hugging Face Hub?

The Hub makes it easy to freely download and upload models, and it can make models more accessible and visible to the rest of the ML community. It's a good way to share useful metadata and metrics, and we also support features like TensorBoard visualizations and PapersWithCode integrations. Since models are hosted as Git repos, they're also automatically versioned with a commit history and diffs. You could even upload them to the already-existing Facebook AI organization.

We have a step-by-step guide that explains the process for uploading the model to the Hub, in case you're interested. We also have a library for programmatic access to uploading and downloading models, which includes features like caching for downloaded models.
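In case it's useful, here is a minimal sketch of the programmatic route with the huggingface_hub library (the repo id and file paths below are hypothetical placeholders):

from huggingface_hub import HfApi, hf_hub_download

api = HfApi()
# upload a checkpoint to a model repo you control
# (assumes you are logged in, e.g. via `huggingface-cli login`, and the repo exists)
api.upload_file(
    path_or_fileobj="tmp/cat-pikachiu.pth",
    path_in_repo="cat-pikachiu.pth",
    repo_id="facebook/banmo-cat-pikachiu",  # hypothetical repo id
)

# anyone can then download it, with local caching handled automatically
ckpt = hf_hub_download(repo_id="facebook/banmo-cat-pikachiu",
                       filename="cat-pikachiu.pth")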

Please let us know if you have any questions, and we'd be happy to guide you through the process!

Nima and the Hugging Face team

cc @osanseviero @lhoestq

Training pipeline of PoseNet

Hello there!

Thanks a lot for sharing your work!

I have a couple of questions:

1. What is the dataset you used to train the PoseNet for root pose initialization?
2. What is the occ output of the optical flow model? It seems to be loaded in the dataloader but not used anywhere in training.

Pre-trained model?

Hi Gengshan,

First of all, great work! Do you plan to release pre-trained models for the demo sequences? I was trying to understand your code better by running some demos (for example, NVS on cat-pikachiu as suggested on the README page). It would make it much easier to use your code!

Thank you!

Details on bone initialization, skinning function and training schedule

Hi Gengshan,

Following up on my last issue, I have some trouble understanding the way you initialize the "bones" and your skinning function. It would be wonderful if you could shed some light on a few details.

  1. Bone initialization
    I am looking at your generate_bones method, and it seems that bones are always initialized at the origin. Is that intended? If so, how do you center/scale the original cat-pikachiu scene? In other words, what is the "unit" of the PoseNet output? My understanding is that it is in object space, since PoseNet is trained on synthetic data with 360-degree camera sampling.

  2. Bone reinitialization
    It seems that bones are reinitialized at 2/3 of num_epochs, where they are resampled by K-means based on the canonical shape. I am wondering how you reinitialize the bones for each video frame. Thank you!

  3. Skinning function
    I am a bit confused by the skinning function here:

    banmo/nnutils/geom_utils.py, lines 226 to 236 in ff5df1d:

    mdis = center.view(bs,1,B,3) - pts.view(bs,N,1,3) # bs,N,B,3
    if True:#B<50:
        mdis = axis_rotate(orient.view(bs,1,B,3,3), mdis[...,None])
        #mdis = orient.view(bs,1,B,3,3).matmul(mdis[...,None]) # bs,N,B,3,1
        mdis = mdis[...,0]
        mdis = scale.view(bs,1,B,3) * mdis.pow(2)
    else:
        # for efficiency considerations
        mdis = mdis.pow(2)
    mdis = mdis*100*log_scale.exp() # TODO accound for scaled near-far plane
    mdis = (-10 * mdis.sum(3)) # bs,N,B

Specifically, I am wondering why there is a 100 and a -10 in the blending. Is it effectively the same if we initialize log_scale to log(1000) and keep only a -1 factor for the RBF weight?
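To make sure I am reading those lines correctly, here is my paraphrase of the computation as a standalone sketch (shapes follow the snippet above; this is my reading, not the actual banmo code):

import torch

def gaussian_skinning_logits(pts, center, orient, scale, log_scale):
    """pts: (bs,N,3); center: (bs,B,3); orient: (bs,B,3,3); scale: (bs,B,3);
    log_scale: scalar tensor (from skin_aux)."""
    bs, N, _ = pts.shape
    B = center.shape[1]
    # offset from each bone center to each point
    mdis = center.view(bs,1,B,3) - pts.view(bs,N,1,3)             # bs,N,B,3
    # rotate the offset into each bone's local frame
    mdis = orient.view(bs,1,B,3,3).matmul(mdis[..., None])[..., 0] # bs,N,B,3
    # per-axis squared distance weighted by the bone's axis scales
    mdis = scale.view(bs,1,B,3) * mdis.pow(2)
    # global temperature: 100 * exp(log_scale), then -10 after summing the axes
    mdis = mdis * 100 * log_scale.exp()
    return -10 * mdis.sum(3)                                       # bs,N,B

If this is right, the 100 and -10 fold into a single factor of -1000 * exp(log_scale) multiplying the Mahalanobis distance, which is why I am asking whether initializing log_scale differently would be equivalent.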

  4. Training schedule
    Thank you for kindly sharing your training script. I would really appreciate it if you could explain it a bit more -- I cannot find a description in the original paper. Here is what I am looking at:

# mode: line load
savename=${model_prefix}-init
bash scripts/template-mgpu.sh $gpus $savename \
$seqname $addr --num_epochs $num_epochs \
--pose_cnn_path $pose_cnn_path \
--warmup_shape_ep 5 --warmup_rootmlp \
--lineload --batch_size $batch_size\
--${use_symm}symm_shape \
--${use_human}use_human
# mode: pose correction
# 0-80% body pose with proj loss, 80-100% gradually add all loss
# freeze shape/feature etc
loadname=${model_prefix}-init
savename=${model_prefix}-ft1
num_epochs=$((num_epochs/4))
bash scripts/template-mgpu.sh $gpus $savename \
$seqname $addr --num_epochs $num_epochs \
--pose_cnn_path $pose_cnn_path \
--model_path logdir/$loadname/params_latest.pth \
--lineload --batch_size $batch_size \
--warmup_steps 0 --nf_reset 1 --bound_reset 1 \
--dskin_steps 0 --fine_steps 1 --noanneal_freq \
--freeze_proj --proj_end 1\
--${use_symm}symm_shape \
--${use_human}use_human
# mode: fine tunning without pose correction
loadname=${model_prefix}-ft1
savename=${model_prefix}-ft2
num_epochs=$((num_epochs/2))
bash scripts/template-mgpu.sh $gpus $savename \
$seqname $addr --num_epochs $num_epochs \
--pose_cnn_path $pose_cnn_path \
--model_path logdir/$loadname/params_latest.pth \
--lineload --batch_size $batch_size \
--warmup_steps 0 --nf_reset 0 --bound_reset 0 \
--dskin_steps 0 --fine_steps 0 --noanneal_freq \
--${use_symm}symm_shape \
--${use_human}use_human
# mode: final tunning with larger rgb loss wt and reset beta
loadname=${model_prefix}-ft2
savename=${model_prefix}-ft3
bash scripts/template-mgpu.sh $gpus $savename \
$seqname $addr --num_epochs $num_epochs \
--pose_cnn_path $pose_cnn_path \
--model_path logdir/$loadname/params_latest.pth \
--lineload --batch_size $batch_size \
--warmup_steps 0 --nf_reset 0 --bound_reset 0 \
--dskin_steps 0 --fine_steps 0 --noanneal_freq \
--img_wt 1 --reset_beta --eikonal_loss \
--${use_symm}symm_shape \
--${use_human}use_human

Why are there four stages, and what exactly does each stage do? Specifically, I would like to understand how the bones are reinitialized in this context. I can roughly see that you are fine-tuning the root motion, cameras, and bones, respectively.
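For reference, my current mental model of the bone reinitialization mentioned in point 2 above is something like the following sketch, i.e. K-means over the canonical mesh vertices (function and argument names are hypothetical, and this may well differ from the actual implementation):

import numpy as np
from sklearn.cluster import KMeans

def reinit_bone_centers(canonical_verts, num_bones=25, seed=0):
    """canonical_verts: (V, 3) numpy array of vertices of the current canonical mesh."""
    km = KMeans(n_clusters=num_bones, random_state=seed, n_init=10)
    km.fit(canonical_verts)
    return km.cluster_centers_   # (num_bones, 3) new bone centers

In particular, I am unsure how the per-frame bone transformations are updated after the centers are resampled.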

I realize these are a lot of questions to answer (sorry!), but I guess other folks might share similar confusion since it is quite a chunk of code :) I have not managed to run the code so far, so all the questions come purely from reading it -- apologies if I have missed anything!

Thank you!!!
Hang

Visualization of articulated shape

Many thanks for your great work!

Could you tell me how to generate dynamic 3D models like the "Articulated shape" result for cat-pikachiu on this page?

What is the warmup stage doing exactly?

Hi, thanks for releasing the code!

I attempted training on the human video and saw that nerf_coarse is trained for 5 warmup epochs using an SMPL mesh loaded from the mesh_material folder.

However, when I visualized tmp/smpl_27554.obj in Blender, it just seemed to be an ellipsoid [image below]. Is this expected?

Appreciate your thoughts, thanks!

[image]

Adaptation to a new video

Hi, thanks for your work!

I am trying to adapt a pre-trained model to a new video. However, if I don't use template-prior-model.sh and just load the pre-trained model, I cannot load the cameras from init-cam, and the camera parameters end up wrong. Do you have any suggestions that avoid retraining the pre-trained models? Thanks!

Questions about the synthetic datasets

Hi Gengshan,

Thanks for the great work!

I have a couple of questions regarding the synthetic datasets (Eagle and Hands) and the other results on your website:

  1. The instructions for the synthetic datasets use the ground-truth camera poses in training. However, the paths to the rtk files are commented out in the config. If I use this config directly, it won't use the ground-truth camera poses in training, right?

  2. I followed the same instructions for Eagle dataset preparation, but they do not save the rtk files to the locations specified in the config. Should I manually change the paths?

  3. Have you tried running BANMo optimization on Eagle and Hands without the ground-truth camera poses? If so, how are the results visually and quantitatively (in terms of Chamfer distance and F-score)?

  4. I noticed that you have results for more objects, such as Penguins and Robot-Laikago, on your website. Where can I get access to these datasets as well?

Does the rest shape have to fit a specific frame of the video?

Hello, thank you for your great work.

I am wondering whether the rest shape has to fit the pose in a specific frame of the video, e.g., the first or last frame.
If so, could you point me to the part of the source code that does that?
If not, how do you control the rest shape so that it is in a rest pose, e.g., a human or cat standing straight as in the visualization?

Question about CSE and root pose

Hi there! Thanks for sharing this amazing work!

I have been reading your work for many days and I'm still confused about CSE:

  1. You have shown good results on hands, the eagle, and the Laikago robot. Did you apply the CSE model trained for quadruped animals to these non-quadruped objects? If not, how did you train the CSE model for them (especially for the hands)? Did you modify the hand dataset so that it could be used to train CSE?
  2. Following what you said in "CSE embedding for other object categories" and "banmo for reconstruction of cars": does this mean that if I want to reconstruct a rigid object such as a cup or a car, I could get good results as long as a good enough initial root pose is given? In other words, are CSE and PoseNet unnecessary as long as I can obtain a good root pose by other means?

Any response will be greatly appreciated!

Minimum number of cameras / videos?

Hello! Thanks a lot for sharing this!
I'm very curious: what is the minimum number of cameras / videos needed to recreate an animated human shape? (I really hope it is okay to use 4x GoPro.)

RuntimeError: std::bad_alloc

Optimization fails even though memory is available.
Our setup: 2x RTX 2080 Ti, Intel(R) Core(TM) i9-10900X CPU @ 3.70GHz, 32 GB memory.
The log file is attached.

/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/mnt/banmo/main.py", line 42, in <module>
    app.run(main)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/mnt/banmo/main.py", line 39, in main
    trainer.train()
  File "/mnt/banmo/nnutils/train_utils.py", line 684, in train
    self.train_one_epoch(epoch, log)
  File "/mnt/banmo/nnutils/train_utils.py", line 922, in train_one_epoch
    total_loss,aux_out = self.model(batch)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/banmo/nnutils/banmo.py", line 650, in forward_default
    mesh_rest = pytorch3d.structures.meshes.Meshes(
  File "/mnt/banmo/third_party/pytorch3d/pytorch3d/structures/meshes.py", line 406, in __init__
    if len(self._num_faces_per_mesh.unique()) == 1:
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/_tensor.py", line 530, in unique
    return torch.unique(self, sorted=sorted, return_inverse=return_inverse, return_counts=return_counts, dim=dim)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/_jit_internal.py", line 422, in fn
    return if_false(*args, **kwargs)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/_jit_internal.py", line 422, in fn
    return if_false(*args, **kwargs)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/functional.py", line 821, in _return_output
    output, _, _ = _unique_impl(input, sorted, return_inverse, return_counts, dim)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/functional.py", line 735, in _unique_impl
    output, inverse_indices, counts = torch._unique2(
RuntimeError: std::bad_alloc
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 32696 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 1 (pid: 32697) of binary: /home/hyang/.conda/envs/banmo-cu113/bin/python
Traceback (most recent call last):
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/distributed/run.py", line 710, in run
    elastic_launch(
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/hyang/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
log.log

Canonical mesh degenerates after epoch 1

Hi Gengshan,

Thanks for your great work!

In the last training stage, I noticed that after epoch 1 the canonical mesh always breaks down and needs a few epochs to (roughly) get back to the correct shape. This happens when training on all the datasets (cat, AMA, etc.). Do you know what causes this effect at epoch 1? Thanks!

For example, these are the canonical meshes after epochs 0 to 3 of the last training stage on the AMA human dataset (notice that the mesh degenerates after epoch 1):

[image mesh_rest-0: epoch 0 canonical mesh in the last training stage]
[image mesh_rest-1: epoch 1 canonical mesh in the last training stage]
[image mesh_rest-2: epoch 2 canonical mesh in the last training stage]
[image mesh_rest-3: epoch 3 canonical mesh in the last training stage]

technical discussion

I want to apply banmo to car modeling, but I find it takes too much time to retrain for every reconstruction, even if there is only a slight difference between the two cars. Is there a way to train banmo only once and then run inference on various car videos to get 3D models of different cars? I can provide a car video; I am hoping for a category-level, end-to-end 3D reconstruction method. If you have relevant research papers, could you share them? Thank you.

batch load or line load

Hi Gengshan,

Thank you so much for your great work.

I noticed that BANMo uses batch_load for evaluation and line_load for training. I want to learn from whole images, so I tried batch_load during training, but the performance degrades compared to the original line_load training. Do you have any suggestions? Thanks a lot!

No "from pytorch3d import _C"

I have run your Colab demos, but it seems there is no _C under /content.banmo/third_party/pytorch3d/pytorch3d/.

Can you please tell me how to fix that? Thanks.

update_delta_rts

Hi, I have a question about the following function, update_delta_rts.

def update_delta_rts(self, rays):

Why do you compute rays['bone_rts'] as (bone_rts_rst)^(-1) * bone_rts_fw in the correct_rest_pose function?
Is it right that you intend to apply bone_rts_fw first and then the inverse of the bone rest transformation?

What I understand is that you first transform the bones using bone_rts_rst, which yields the rest-pose bones:
(restpose_bones = bone_rts_rst * bones)

Then, you build the time-t transformation from bone_rts_fw and bone_rts_rst in correct_rest_pose:
(bone_rts_fw = (bone_rts_rst)^(-1) * bone_rts_fw)

As shown in the rendering code, bones_dfm is computed from those two values:
(bones_dfm = bone_rts_fw * bones_rest
           = (bone_rts_rst)^(-1) * bone_rts_fw * bone_rts_rst * bones)

What I'm curious about is why the inverse of bone_rts_rst and bone_rts_rst are multiplied before and after the bone_rts_fw matrix. What is the meaning of this conjugation?

Hope to hear from you soon.
Thank you.

About the 2D keypoint transfer metric

Hi, thanks for publishing this wonderful code.

I want to ask whether it is possible to evaluate the reconstructed 3D model using the 2D keypoint transfer metric defined in ViSER and LASR. Since BANMo can register correspondences between frames, I thought it should be possible. But because the main paper uses Chamfer distance instead of 2D keypoint transfer, I am also wondering why the 2D keypoint transfer metric can't be used; is it inappropriate in this case?
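For concreteness, the metric I have in mind is the keypoint transfer accuracy from LASR/ViSER, roughly as sketched below (I am assuming a threshold proportional to the square root of the ground-truth bounding-box area; please correct me if the setup is different):

import numpy as np

def keypoint_transfer_accuracy(kps_transferred, kps_gt, bbox_area, ratio=0.2):
    """kps_transferred, kps_gt: (K, 2) pixel coordinates in the target frame."""
    dist = np.linalg.norm(kps_transferred - kps_gt, axis=1)   # (K,) transfer errors
    thresh = ratio * np.sqrt(bbox_area)                       # per-frame distance threshold
    return float((dist < thresh).mean())                      # fraction of correct transfers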
Thank you very much.

Questions when using pre-optimized models

Dear GengShan,

Hi! Many thanks for your awesome work! I am trying to use your pre-optimized models for evaluation.

bash scripts/render_mgpu.sh 0 human-cap logdir/human-cap/human-cap.pth "0" 64

However, the results look very strange, as shown below, in both the viewing directions and the model motions.

[image: human-cap-{0}-all]
Could you please tell me what is going wrong here?

Thanks!

Questions on the evaluation on root pose prediction

Hi Gengshan,

Thanks for the great work!

I was wondering how the rotation errors in degrees are computed in Table 4 of Appendix C.1. The caption says "Rotations are aligned to the ground-truth by a global rotation under chordal L2 distances"; does this mean that the rotation error reported in the table is the angle (about a single axis) of the "global rotation" from prediction to ground truth?

It would be super helpful if you could point me to the script that computes this rotation error. Thanks!

no camera path and keypoints.json in database

In io.py, the variables 'rtklist' and 'kplist' contain paths like 'database/DAVIS/Cameras/Full-Resolution/cat-pikachiu00/00000.txt' and 'database/DAVIS/KP/Full-Resolution/cat-pikachiu00/00000_keypoints.json', but no such files exist after unzipping the Dropbox files.

some problems when I run ./script/template.sh

Thanks for your great work.
When I run your code to fit my video, there is a problem in train_one_epoch().
I debugged it and found that the error happens at

self._num_faces_per_mesh.unique() == 1 (line 403 in pytorch3d/meshes.py)

I get a RuntimeError: std::bad_alloc.
I don't know why; the whole environment was installed according to your yaml.

I use 2x RTX 3090 GPUs when running the code.

Question about mesh of Viser in Banmo

Thank you for your extremely good work.
I have a question about reproducing ViSER: the cat mesh I generate is smoother but more poorly shaped than the mesh you include in banmo (compare below; left is the mesh in banmo, right is the mesh I regenerated with ViSER).
[image]
The same happens for ama-female, where the mesh we generate tends to have no hands (as shown in the comparison below).
[image]
The images above all use n_faces=8000 in the code and are fully trained with the official ViSER script.

Data preprocess

Thanks for your excellent work!
An error is displayed during data preprocessing. The environment is configured according to the "banmo.yml" you provided. I tested with a video of my own data, but it could not be preprocessed; processing the provided video ("cat-pikachiu.mov") also reports an error.
The command and the error traceback are:
bash preprocess/preprocess.sh "cat-pikachiu" .MOV n 10
[screenshot of the error]

bones_dfm or bones_rst when using gauss_mlp_skinning?

Hi, many thanks for your excellent work!

May I ask: (1) why is 'bones_dfm' used instead of 'bones_rst' in the line below? (2) Is 'skin_backward' the LBS weight applied to sampled points in the camera coordinate frame to transfer them to the root coordinate frame? And (3) is 'xyz_coarse_sampled' the sampled points in the root coordinate frame? If so, it seems like using 'bones_rst' here would make more sense.

bones_dfm, time_embedded, nerf_skin, skin_aux=skin_aux)

technical problem

Hello! I'm having some problems reproducing the work, specifically the "Example: Motion retargeting" demo.


Traceback (most recent call last):
  File "preprocess/img2lines.py", line 111, in <module>
    app.run(main)
  File "/opt/anaconda3/envs/banmo/lib/python3.8/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/opt/anaconda3/envs/banmo/lib/python3.8/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "preprocess/img2lines.py", line 53, in main
    data_info = trainer.init_dataset()
  File "/tmp/pycharm_project_829/./nnutils/train_utils.py", line 129, in init_dataset
    self.dataloader = frameloader.data_loader(opts_dict)
  File "/tmp/pycharm_project_829/./dataloader/frameloader.py", line 38, in data_loader
    data_inuse = config_to_dataloader(opts_dict)
  File "/tmp/pycharm_project_829/./utils/io.py", line 311, in config_to_dataloader
    dataset = torch.utils.data.ConcatDataset(datalist)
  File "/opt/anaconda3/envs/banmo/lib/python3.8/site-packages/torch/utils/data/dataset.py", line 199, in __init__
    assert len(datasets) > 0, 'datasets should not be an empty iterable'  # type: ignore
AssertionError: datasets should not be an empty iterable


opts_dict={} in "./nnutils/train_utils.py" seems to require adding the paths manually.

When I modify the path, the following error occurs:


Traceback (most recent call last):
  File "preprocess/img2lines.py", line 111, in <module>
    app.run(main)
  File "/opt/anaconda3/envs/banmo/lib/python3.8/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/opt/anaconda3/envs/banmo/lib/python3.8/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "preprocess/img2lines.py", line 53, in main
    data_info = trainer.init_dataset()
  File "/tmp/pycharm_project_829/./nnutils/train_utils.py", line 113, in init_dataset
    opts_dict['n_data_workers'] = '1'
TypeError: 'str' object does not support item assignment


How should the path be modified?

Question about rendering

Hello,

Thank you for your brilliant work and for sharing your code.

I am trying to run your project on my own computer. Up through optimization it works well in the auto-built environment, but in the rendering step I get lots of "no mesh found" messages and the error No such file or directory: 'logdir/cat-pikachiu-e120-b256-ft3/opts.log'.

I can see that the problem is that the rendering command bash scripts/render_mgpu.sh 0 $seqname logdir/$seqname-e120-b256-ft3/params_latest.pth "0 1 2 3 4 5 6 7 8 9 10" 256 needs the folder with the "-ft3" suffix, but only the "-ft1" and "-ft2" folders were generated in my optimization step; according to template.sh, the "-ft3" folder seems not to be generated.

By the way, searching globally in the whole project, "-ft3" only appears in scripts/abalations/template-human-noactive.sh and scripts/abalations/template-human-nosymm.sh. Since I don't fully understand your code yet, I am not sure when and where the "ft3" folder would be generated. If it is convenient, could you briefly explain what "ft" means?

Could you please give me some advice on this rendering problem? Thank you!

About the fraction occupied

Hi, thanks for your great work.

I want to ask about the meaning of "fraction occupied". Could you explain the relationship between the fraction occupied and the quality of the reconstructed result? It doesn't look like smaller is always better. Thanks!

CSE embedding for other object categories

Hi, thanks for sharing the code. I really like your work. I want to try BANMo on objects that are neither humans nor four-legged animals, but it seems like BANMo assumes one of those categories. Would the results degenerate if the CSE embeddings are not pre-trained? I noticed that Tab. 5 mentions the "pre-trained embedding" and suggests that the pre-training is not too important as long as the initial pose is good.

Bone reinitialization

Hi, thank you for sharing your code.

I have a question about the following line, correct_bones, in the bone reinitialization process.

bones,_ = correct_bones(model, bones, inverse=True)

Why do you correct the bones using the inverse of the rest pose transformation?
After that, are the bones in the rest pose or in some other state?

As far as I know, you later transform the bones into rest-pose bones again using the rest pose transformation:

bones_rst, bone_rts_rst = correct_bones(self, self.nerf_models['bones'])

I'm wondering what the purpose of the rest pose is.
Also, why didn't you use the default bones (before multiplying by the inverse of the rest pose transformation) as the rest pose?

Thank you.
Hope to hear from you soon!
