
sanghunhan92 / 2k2k
217.0 13.0 5.0 22.16 MB

Official Code and Dataset for "High-fidelity 3D Human Digitization from Single 2K Resolution Images" (CVPR 2023 Highlight)

Home Page: https://sanghunhan92.github.io/conference/2K2K/

License: Other

Dockerfile 0.53% Python 99.47%
3d-reconstruction cvpr2023 human-data

2k2k's People

Contributors

sanghunhan92, zhenhuil1n


2k2k's Issues

Rendering memory limit

Hello!
I am trying to render the 2K2K dataset via render.py, but the script runs out of CPU memory even though I have about 80 GB of free RAM. If I reduce the image resolution to 256, everything works. Could you tell me how much CPU memory is needed to run the script at a resolution of 2048?

Question about keypoints

Hello,
Your paper is great work that can generate high-quality human meshes and textures! I have just finished rendering the images and want to know how to obtain the keypoints. Should I use OpenPose, or download the keypoints from https://github.com/ketiVision/2K2K? By the way, only 909 .mat files were downloaded from the Dropbox.

Seeking assistance

Hi,

Here are my training settings for phase 1 & phase 2:

image: shading images of THuman2.0 + 2K2K dataset
lr: 1e-5
epoch: 30

I turned off background augmentation and image_blur in ReconDataset.py (with background augmentation, the model predicts black down normal maps during phase 1 training, so I turned them off as you mentioned in #11). The colorjitter setting in transform_human in fetch_data is the same as the default.

After training phase 2, most of the predictions are not ideal.
To make the inference comparable, I changed the colorjitter from (0.85, 0.85) to (0.7, 1.0), which is the same as in training, but the point cloud of one of the training images looks like this:
[screenshot: with colorjitter]

If I turn off the colorjitter, the result is better, but some points on the left chest are abnormal:
[screenshot: without colorjitter]
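
For reference, the difference between the two settings is roughly the following (a minimal torchvision sketch; the exact ColorJitter arguments used in transform_human in fetch_data may differ):

from PIL import Image
from torchvision import transforms

# (0.85, 0.85) applies a fixed 0.85 factor to every image at test time,
# while (0.7, 1.0) draws a random factor per image, matching training.
jitter_fixed = transforms.ColorJitter(brightness=(0.85, 0.85), contrast=(0.85, 0.85))
jitter_random = transforms.ColorJitter(brightness=(0.7, 1.0), contrast=(0.7, 1.0))

img = Image.open("input.png")       # placeholder path
out_fixed = jitter_fixed(img)       # deterministic darkening
out_random = jitter_random(img)     # stochastic, as during training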

I think these points are related to the region in the red circle of the depth map during training.
[screenshot: depth map]

So I am confused by two questions:
(1) Why does the result with colorjitter turned off look better than the result with the training colorjitter setting?
(2) What causes the inaccurate depth prediction in the red-circled region of the depth map?

I would appreciate your suggestions and expertise.

Memory Error

I got an error when I ran render.py to process the dataset. I ran the program with 16 GB of RAM and a single RTX 3060 GPU, and it doesn't work in Colab with a free T4 GPU either.
Do I need a more powerful machine?

Error: "numpy.core._exceptions._ArrayMemoryError: Unable to allocate 2.63 GiB for an array with shape (117683045, 3) and data type float64"

Render Dataset 2k2k

Could you share the code for rendering images and depth maps on the 2K2K dataset?

Training is ok, but there is a problem during inference

Hi
I am training the model on the 2K2K dataset using a black background only. It looks fine in TensorBoard during phase 1.
[screenshot]
upper normal:
[screenshot]
During inference, it works well when I set model.train(), but the model output fails with model.eval().
[screenshot]
Could you give me some advice? Thank you!
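
A minimal diagnostic sketch for a train/eval gap like this, assuming the backbone uses BatchNorm (an assumption, not the repository's official guidance):

import torch.nn as nn

def eval_except_batchnorm(model):
    # Put everything in eval mode, then switch BatchNorm layers back to
    # train mode so they use batch statistics instead of running stats.
    model.eval()
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.train()
    return model

# If inference quality recovers with this, the running statistics are likely
# stale (e.g. from a very small batch size) and may need re-estimation.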

2K2K dataset rendering

Hello, thanks for sharing your code and dataset.

I have downloaded the 2K2K dataset and ran the rendering code.

For the back-side image and depth generation, the code appears to assign the back-side color to the silhouette from the front view's mask.

But in an actual perspective setting, I think that if the back side were rendered from a back camera view, the front and back images might not match in their silhouettes.

Why did you not render the back-side image from a separate back camera view, instead of generating it with the implementation provided in render_util.py?

If the two sides' images are not aligned, does the back normal generation not work?

How to get the 2K2K dataset keypoints files (.npy)?

Hello, thank you for your nice work! I have some questions about the keypoints files:

  1. There are 2,000 training meshes, but only 1,000 keypoints files (.mat) are provided in https://github.com/ketiVision/2K2K.
  2. I found in the ReconDataset code that I need to load the keypoints files in .npy format, but I did not find code for converting the keypoints files from .mat to .npy format.

I want to know when the rest of the dataset will be available, and could you give me some tips on how to convert the .mat files into .npy files? Thank you!
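
A minimal conversion sketch, assuming each .mat file holds a single keypoint array (the actual variable name inside the file and the layout ReconDataset expects are assumptions here):

import numpy as np
from scipy.io import loadmat

mat = loadmat("00001.mat")                        # placeholder filename
keys = [k for k in mat if not k.startswith("__")] # skip MATLAB metadata keys
keypoints = mat[keys[0]]                          # assumed: an (N, 3) keypoint array
np.save("00001.npy", keypoints)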

Question for Texture

Hello, I already ran these two commands and got the result, but the .ply or .obj file seems to have no texture. Did I do something wrong?

python test_02_model.py --load_ckpt {checkpoint_file_name} --save_path {result_save_folder}
python test_03_model.py --save_path {result_save_folder}

Results
[screenshot]
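
One quick way to check whether the exported mesh actually carries per-vertex colors (a minimal trimesh sketch; the output filename is a placeholder):

import trimesh

mesh = trimesh.load("result.obj", process=False)   # placeholder output path
print(mesh.visual.kind)          # 'vertex' if per-vertex colors are present
if mesh.visual.kind == "vertex":
    print(mesh.visual.vertex_colors[:5])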

Getting a 3d model from a photo from behind

Hello!
Thank you for your wonderful work!
Could you please tell me whether I can somehow get a 3D model of a person from a photo taken from behind? I want to try combining the results from two photos of one person (front and back) to get more complete information for my 3D model.

Time required for rendering

Thanks a lot for your work! I have a question about the rendering time.
Does "It takes about 2-3 days to render 2048×2048 resolution images" mean that rendering one group of images and depth maps for a single .obj file may take 2-3 days?

How to align 2K2K scans with SMPL-X?

Hello,

Thanks for updating the dataset. I've tried the latest SMPL-X parameters in the dataset repo.

I used the following code to create the body mesh

import json
import os

import torch
from smplx import SMPLX

# MODEL_PATH, smplx_folder and smpl_file are defined elsewhere in my script.
body_model = SMPLX(model_path=os.path.join(MODEL_PATH, 'smplx'),
                   use_face_contour=True, use_pca=False, gender='neutral')
smpl_data = json.load(open(os.path.join(smplx_folder, smpl_file)))
output = body_model(
    global_orient=torch.tensor(smpl_data['global_orient']).unsqueeze(0).contiguous(),
    transl=torch.tensor(smpl_data['transl']).unsqueeze(0).contiguous(),
    body_pose=torch.tensor(smpl_data['body_pose']).unsqueeze(0).contiguous(),
    betas=torch.tensor(smpl_data['betas']).unsqueeze(0).contiguous(),
    jaw_pose=torch.tensor(smpl_data['jaw_pose']).unsqueeze(0).contiguous(),
    left_hand_pose=torch.tensor(smpl_data['left_hand_pose']).unsqueeze(0).contiguous(),
    right_hand_pose=torch.tensor(smpl_data['right_hand_pose']).unsqueeze(0).contiguous(),
    expression=torch.tensor(smpl_data['expression']).unsqueeze(0).contiguous(),
    return_verts=True,
)

However, there are still translation and rotation offsets between the PLY scans and the SMPLX body mesh.

[screenshot]

Also, it seems that the scale (0.986) is not correct. Could you please describe how you registered the body model parameters?
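
A rough sanity check in cases like this is to compare centroids and bounding-box extents of the scan and the posed SMPL-X mesh (a minimal sketch; the scan filename is a placeholder and `output` is the SMPL-X output from the code above):

import numpy as np
import trimesh

scan = trimesh.load("00001.ply", process=False)          # placeholder scan path
smplx_verts = output.vertices.detach().cpu().numpy()[0]  # (10475, 3)

# Centroid offset and overall scale ratio between scan and SMPL-X mesh;
# large values here suggest a translation/scale still needs to be applied.
offset = scan.vertices.mean(axis=0) - smplx_verts.mean(axis=0)
scale = scan.extents.max() / np.ptp(smplx_verts, axis=0).max()
print("centroid offset:", offset, "scale ratio:", scale)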

Thank you very much.

3D obj texture

Thank you for your excellent work. I have two questions:

  1. Does the exported obj come with texture? Or does the network inference include a texture part?
  2. When will your pre-trained model be available for testing?

Can we use custom data?

Hi, can I use my own data to test your model? If so, what are the requirements for my data?

Another question: can your model output a texture and apply it to the .obj?

Thanks.

Troubleshooting issues with Model Training and Unexpected Results in Phase 1

Description:
Hi, thank you for your valuable work on this model. I have encountered an issue while training the provided model on the 2K2K dataset during phase 1: the expected results are not being achieved, and I am seeking assistance in troubleshooting the problem.
Examples of normal images after training phase 1 for 30 epochs (on 13,812 training samples generated by render.py; ~200 samples were excluded due to the inv_affine failure issue):
[normal map images: 000116_front, 000116_back, son_front, son_back]

Details:

  • No modifications have been made to the model architecture for training.
  • I fixed the orientation of the models in the dataset that were in top view, ran render.py to generate color and depth maps, used OpenPose to get the keypoints, and then used the op_json_to_numpy() function in the test_02_model.py script to convert the keypoints to numpy arrays, which are saved in the keypoints folder under PERS (a minimal sketch of this conversion is shown after this list).
  • I noticed that around 200 out of 14,000 samples in the train list caused failures in the inv_affine function, so I excluded these problematic samples from training.
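
The conversion mentioned above is roughly the following (a minimal sketch assuming OpenPose's BODY_25 JSON output; the repository's op_json_to_numpy() is the authoritative version and its exact output layout may differ):

import json
import numpy as np

with open("person_keypoints.json") as f:          # placeholder OpenPose output
    data = json.load(f)

# BODY_25 keypoints come flattened as [x0, y0, c0, x1, y1, c1, ...]
kp = np.array(data["people"][0]["pose_keypoints_2d"], dtype=np.float32)
kp = kp.reshape(-1, 3)                            # (25, 3): x, y, confidence
np.save("person_keypoints.npy", kp)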

Training Command:
NCCL_P2P_LEVEL=NVL OMP_NUM_THREADS=1 NCCL_SHM_DISABLE=1 CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.run --nnodes=1 --nproc_per_node=4 --rdzv_backend=c10d train.py --use_ddp=True --data_path ./data --phase 1 --local_rank 0 --batch_size 1 --is_master True --num_epoch 30 --load_ckpt '' --exp_name exp1 --workers 0

Loss on final epoch: 1.2074

Possible Reasons for the Issue?
While I have identified the problematic samples and excluded them from training, I am uncertain whether other factors might be contributing to the unexpected results beyond the ~200 excluded samples. Could you please provide insight into other potential reasons for the discrepancy between the expected and observed results during phase 1 training?

Side note: training phase 2 from the phase 1 checkpoint produces a 'checkpoint.pth.tar' file that is only ~300 MB, compared to the ~900 MB pretrained checkpoint that ships with this repository and the ~900 MB checkpoint from phase 1. Can you also provide some insight into why this happens?

Your guidance and expertise will be greatly appreciated in resolving this issue and achieving the desired training outcomes. Thank you for your assistance.

dataset request

Hi, I sent the email with the "2K2K Dataset Release Agreement" attached a few days ago. How long will it take for me to get access to the dataset? Thanks!

Retrained model cannot achieve very good quality

Hi,

I retrained the model on the 2K2K dataset and finished stage 1 (30 epochs) and stage 2 (10 epochs) with the same settings described in the paper. The results are not as good as the ones provided, and I have a few questions about the retraining.

  1. Firstly, I want to ask whether you used background augmentation in both phase 1 and phase 2 during training. When I add background augmentation, the model cannot predict good results (as seen in TensorBoard). The result when I add background augmentation in phase 2 is shown below:
    [screenshot]

  2. I trained the model without background augmentation; some of the test results are good, but some of them are very bad. Did you experience this kind of artifact before?

    I found that the results for most of the captured photos are not very good; there are artifacts, as shown in the pictures below. They are mostly located on the edge of the human body, and the points stick out of the image plane when viewed from the front.
    [screenshots]

    The RenderPeople and THuman testing results are good, and Hanni is good for some reason.

[screenshots]

  3. I only trained the model with non-shading renderings; does training the model with shading help?

down_normal_back_pred and down_normal_front_pred are black images

Hi,

I am retraining 2K2K using the 2K2K dataset and monitoring the training in TensorBoard. Everything works fine except that down_normal_front_pred and down_normal_back_pred are black images, as shown here, while the ground truth for them looks fine. Did you encounter this problem during training?

[screenshots]

Training settings in phase1 and phase 2

Hi,
Thanks for sharing your great work!
As mentioned in the paper, the learning rate, number of epochs, and batch size are 1e-4, 30, and 2. So do you use the same settings in phase 1 and phase 2?
