
Comments (8)

zerchen commented on July 2, 2024

Hi Eric,

Thanks for your interest.
The scale factor is used to scale the hand-object meshes into a unit cube; marching cubes then operates on a grid of points inside this cube. To compute the scale, you first need to create your SDF training data. Then, over the training set, compute the maximum distance from any negative point (a point inside the mesh) to the origin of your defined coordinate system. The inverse of this maximum distance is the desired scale factor.
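For concreteness, a minimal sketch of this computation could look like the following (illustrative names, not the exact code from this repo):

import numpy as np

def compute_sdf_scale_factor(neg_points_per_sample):
    # neg_points_per_sample: list of (N_i, 3) arrays of negative-SDF (inside-mesh)
    # points, already expressed in the defined (e.g. wrist-relative) coordinate system
    max_dist = max(np.linalg.norm(pts, axis=1).max() for pts in neg_points_per_sample)
    # the scale factor is the inverse of the farthest negative point's distance
    return 1.0 / max_dist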
Hope it helps.

Best,
Zerui

Eric-Gty commented on July 2, 2024

Hi Zerui,

Thanks very much for your reply. However, I'm still quite confused about how to calculate the fixed scale factor for each dataset.

I have already finished constructing the SDF training data. You mentioned computing "the max distance from any negative points to the origin of your defined coordinate system". So, within your dataloader, after this line of code:

hand_samples[:, 0:3] = hand_samples[:, 0:3] / scale - offset

the SDF samples should be recovered to the original scale of the mesh used to create the SDF training data. To me, this process looks like this answer: marian42/mesh_to_sdf#23 (comment). In this case, if we jointly visualize the SDF samples with the mesh, it should look like the following:

After recovering the SDF scale and transforming the samples into root-relative coordinates (suppose we define the wrist joint as the origin), I think the SDF samples are already in the "defined coordinate system" you mentioned. I would like to know if I understand this correctly.

If this is correct, then based on your description the calculation should be: iterate over all negative samples, compute the L2 norm of their recovered (x, y, z) points, and finally take the inverse of the maximum as the SDFScaleFactor for the dataset.

If the above description is wrong, could you provide your script for calculating the SDFScaleFactor on any of the datasets as a reference?

Hope this won't bother you too much.

Best regards,
Eric

zerchen commented on July 2, 2024

Hi Eric,

Thanks for your detailed description. Your understanding is correct! I also attach my code (it may not be compatible with this codebase) as a reference for computing it. In the code, I think scale_hand is what you want.

Best,
Zerui

import os
import pickle

import numpy as np
from tqdm import tqdm


def data_analysis(dataset):
    data_dir = f'data/{dataset}/train/'
    norm_dir = data_dir + 'norm/'
    meta_dir = data_dir + 'meta/'
    hand_dir = data_dir + 'sdf_hand/'
    obj_dir = data_dir + 'sdf_obj/'

    # obman/ho3d annotations use a camera convention with flipped y and z axes
    if 'obman' in dataset or 'ho3d' in dataset:
        cam_extr = np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]])
    else:
        cam_extr = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

    dist_hand_points = []
    dist_obj_points = []
    sample_idx = []
    filenames = os.listdir(norm_dir)
    for filename in tqdm(filenames):
        sample_idx.append(filename.split('.')[0])

        # per-sample normalization parameters stored when the SDF data was generated
        norm_data = np.load(os.path.join(norm_dir, filename))
        scale = norm_data['scale']
        offset = norm_data['offset']

        hand_data = np.load(os.path.join(hand_dir, filename))
        hand_pos_xyz = hand_data['pos'][:, :3]
        hand_neg_xyz = hand_data['neg'][:, :3]

        obj_data = np.load(os.path.join(obj_dir, filename))
        obj_pos_xyz = obj_data['pos'][:, :3]
        obj_neg_xyz = obj_data['neg'][:, :3]

        # undo the per-sample normalization to transform all points into camera space
        hand_pos_xyz_cam = hand_pos_xyz / scale - offset
        hand_neg_xyz_cam = hand_neg_xyz / scale - offset
        obj_pos_xyz_cam = obj_pos_xyz / scale - offset
        obj_neg_xyz_cam = obj_neg_xyz / scale - offset

        with open(os.path.join(meta_dir, filename.replace('npz', 'pkl')), 'rb') as f:
            meta_data = pickle.load(f)

        # joint 0 is the wrist, i.e. the origin of the hand-relative coordinate system
        cam_joints = np.dot(cam_extr, meta_data['coords_3d'].transpose(1, 0)).transpose(1, 0)
        hand_neg_dist_wrist = np.linalg.norm(hand_neg_xyz_cam - cam_joints[0], axis=1)
        obj_neg_dist_wrist = np.linalg.norm(obj_neg_xyz_cam - cam_joints[0], axis=1)

        dist_hand_points.append(np.max(hand_neg_dist_wrist))
        dist_obj_points.append(np.max(obj_neg_dist_wrist))

        # scale_hand is the inverse of the farthest negative (inside-mesh) point's
        # distance to the wrist for this sample
        np.savez(os.path.join(norm_dir, filename), scale=scale, offset=offset,
                 scale_hand=1 / np.max(hand_neg_dist_wrist))

Eric-Gty commented on July 2, 2024

Hi Zerui,

Thanks for your detailed reply; it's a great help to me. I'll try it and check whether the samples stay bounded on my own dataset.

Thanks again :)))

Best regards,
Eric

zerchen commented on July 2, 2024

You are welcome!

Eric-Gty commented on July 2, 2024

Hi Zerui,

Sorry for disturbing you again after a few days. After finishing building the SDF training data, I tried to train the SDF network on my own dataset but failed to reconstruct the hand shape.

So I set up the ObMan dataset and ran your codebase to compare its training process with mine. However, after visualizing the results, I observed some problems and hope to receive some valuable feedback from you.

I followed the original experimental setup defined in https://github.com/zerchen/AlignSDF/blob/master/experiments/obman/30k_1e2d_mlp5.json. Since I only care about the hand part, I simply ignore the object branch; I think this shouldn't have a major influence on the reconstruction quality of the hand. As a result, I ran two experiments on ObMan: one uses only the hand SDF decoder, and the other also includes the MANO decoder, i.e. the (a) and (b) ablation experiments defined in your paper as below:

To save time, I tried the reconstruction on the test set after 110 epochs of training. As a result, all reconstructed samples look very similar to each other, with non-plausible hands as below:

I would like to know whether this is caused by the limited number of training epochs, or by the lack of the MANO prior (even in ablation (b), the MANO prior is not embedded into the SDF decoder; it only adds three extra loss terms: 3D joints, beta, and theta). I even tried to overfit the training samples to see whether the model could produce a reasonable hand shape, but it still failed and the shape is similar to the one above.

During your experiments, did you observe a relatively satisfactory hand shape with the SDF decoder only? My guess is that the problem is caused either by a wrong experimental setup or by the lack of the MANO prior. I hope to receive an empirical answer from you.

Another question is regarding the data construction. After we multiply by the SDFScaleFactor, the SDF samples are further divided by 2 as follows:

hand_samples[:, 0:5] = hand_samples[:, 0:5] / 2

This is a very minor question, but I'm just curious why we do this. Is it because we want to further bound the xyz coordinates within (-0.5, 0.5), so that the samples form a proper unit cube around the origin?

Thanks a lot for your time :)

zerchen commented on July 2, 2024

Hi,

I think the main issue producing such a blurry hand is the limited number of training epochs. Using the MANO prior could alleviate this issue and may produce a relatively clearer hand at an early training stage (but 110 epochs may still not be enough). In my experiments, it is also difficult to produce a clear hand without the MANO prior.
Yes, you are right. The reason I divide the SDF samples by 2 is to scale the points into the unit cube (since I want to bring all object negative points into the unit cube as well). If you only want to reconstruct hands, I think you don't need to divide hand_samples by 2; you could give it a try.
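To make the chain of transforms explicit, here is a minimal sketch (random stand-in points and illustrative names, not the actual dataloader code):

import numpy as np

# illustrative inputs: camera-space points of one sample and its wrist joint
xyz_cam = np.random.randn(1000, 3)
wrist_joint = xyz_cam.mean(axis=0)

# step 1: express the points relative to the wrist (the defined origin)
xyz_wrist = xyz_cam - wrist_joint
# step 2: multiply by the per-sample scale factor, so the farthest point
# lies at distance 1 from the wrist
scale_hand = 1.0 / np.linalg.norm(xyz_wrist, axis=1).max()
xyz_scaled = xyz_wrist * scale_hand
# step 3: divide by 2, bounding the coordinates within (-0.5, 0.5), i.e. a
# unit cube of side length 1 centred at the origin
xyz_unit = xyz_scaled / 2
assert np.abs(xyz_unit).max() <= 0.5 + 1e-9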
Hope it answers your question.

Best,
Zerui

Eric-Gty commented on July 2, 2024

Hi Zerui,

Thanks very much for your answer. The reason I asked about the training setup is that I currently have limited computational resources. After receiving your feedback, I continued training and obtained promising results. Thanks again for this.

Regarding reconstructing the hand alone: on my own dataset, I set the wrist as the root joint and bounded the samples within the unit cube. However, whether or not I divide by 2, the reconstruction just fails; it's not even a hand. The augmentation I used simply rotates the hand sample by a random angle around the wrist joint in the x-y plane (z kept unchanged), so the points should always stay within the unit cube regardless of the rotation angle (see the sketch below).
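To be concrete, this is roughly what I mean by that augmentation (a rough numpy sketch with stand-in data and made-up names, not my actual pipeline):

import numpy as np

# wrist-relative sample points of one hand, shape (N, 3); stand-in data
# rescaled so they lie inside a radius-0.5 ball around the wrist
xyz_wrist = np.random.randn(1000, 3)
xyz_wrist *= 0.5 / np.linalg.norm(xyz_wrist, axis=1).max()

# random rotation about the z-axis, i.e. a rotation in the x-y plane
theta = np.random.uniform(0.0, 2.0 * np.pi)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
xyz_aug = xyz_wrist @ rot_z.T

# the rotation preserves each point's distance to the wrist, so the max-distance
# bound used for the scale factor is unchanged and the points stay in the unit cube
assert np.allclose(np.linalg.norm(xyz_aug, axis=1), np.linalg.norm(xyz_wrist, axis=1))
assert np.abs(xyz_aug).max() <= 0.5 + 1e-9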

I'm just curious whether you ran into similar problems when you first worked on SDF reconstruction (sometimes the reconstruction result is even a flat plane). I have verified that the data should all be correct, so the problem may be that the model fails to learn the shape information.

I'll continue debugging this, but if you have met similar problems before, please kindly let me know about possible mistakes :))

Thanks again for your valuable time in this conversation; it has helped me a lot!

Best regards,
Eric
