Comments (8)
Hi Eric,
Thanks for your interest.
The scale factor is used to scale the hand-object meshes into a unit cube; marching cubes then operates on a grid of points in this unit cube. To compute the scale, you first need to create your SDF training data. Then, over the training dataset, compute the maximum distance from any negative point (a point inside the mesh) to the origin of your defined coordinate system. The inverse of this maximum distance can be used as the scale factor.
Hope it helps.
Best,
Zerui
from alignsdf.
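As a minimal sketch of the idea described above (not the repository's own code), the snippet below computes a scale factor from the negative (inside-the-mesh) sample points of a single shape. The function name `sdf_scale_factor` and the toy points are hypothetical, and the points are assumed to be expressed in the chosen coordinate system already.

```python
import numpy as np


def sdf_scale_factor(neg_xyz, origin=None):
    """Return 1 / (max distance from any negative point to the origin)."""
    neg_xyz = np.asarray(neg_xyz, dtype=float)
    if origin is None:
        origin = np.zeros(3)
    dists = np.linalg.norm(neg_xyz - origin, axis=1)
    return 1.0 / dists.max()


# Toy negative points, made up for illustration.
neg_points = np.array([[0.03, 0.00, 0.04],
                       [0.00, 0.10, 0.00],
                       [-0.06, 0.00, 0.08]])
factor = sdf_scale_factor(neg_points)
scaled = neg_points * factor

# After scaling, the farthest inside point sits at distance ~1 from the origin.
print(factor, np.linalg.norm(scaled, axis=1).max())
```

Because the bound is on the distance to the origin, the scaled inside points lie in a unit sphere, which in turn fits inside the cube spanning -1 to 1 on each axis.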
Hi Zerui,
Thank you very much for your reply. However, I'm still quite confused about how to calculate the fixed scale factor for each dataset.
I have already finished constructing the SDF training data. You mentioned that I should "compute the max distance from any negative points to the origin of your defined coordinate system". So, within your dataloader, after this line of code:
Line 177 in 5dcb6cf
After recovering the SDF scale and transforming the points into root-relative coordinates (suppose we define the wrist point as the origin), I think the SDF samples are already in the "defined coordinate system" you mentioned. I would like to know whether I understand this correctly.
If so, then based on your description the calculation should be: iterate over all negative samples and compute the L2 norm of their recovered (x, y, z) points. Finally, take the inverse of the maximum as the SDFScaleFactor of this dataset.
If the above description is wrong, could you provide your script for calculating the SDFScaleFactor for any of the datasets as a reference?
I hope this won't bother you too much.
Best regards,
Eric
from alignsdf.
Hi Eric,
Thanks for your detailed description. Your understanding is correct! I am also attaching my code for computing this as a reference (it may not be compatible with this codebase). In the code, I think scale_hand is the thing you want.
Best,
Zerui
import os
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
from tqdm import tqdm
import pickle
from fire import Fire
import json


def data_analysis(dataset):
    data_dir = f'data/{dataset}/train/'
    norm_dir = data_dir + 'norm/'
    meta_dir = data_dir + 'meta/'
    hand_dir = data_dir + 'sdf_hand/'
    obj_dir = data_dir + 'sdf_obj/'

    # axis flip applied to the 3D joints for obman/ho3d; identity otherwise
    if 'obman' in dataset or 'ho3d' in dataset:
        cam_extr = np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]])
    else:
        cam_extr = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

    dist_hand_points = []
    dist_obj_points = []
    sample_idx = []
    filenames = os.listdir(norm_dir)
    for idx, filename in tqdm(enumerate(filenames)):
        sample_idx.append(filename.split('.')[0])
        scale = np.load(os.path.join(norm_dir, filename))['scale']
        offset = np.load(os.path.join(norm_dir, filename))['offset']

        hand_data = np.load(os.path.join(hand_dir, filename))
        hand_pos_xyz = hand_data['pos'][:, :3]
        hand_neg_xyz = hand_data['neg'][:, :3]
        obj_data = np.load(os.path.join(obj_dir, filename))
        obj_pos_xyz = obj_data['pos'][:, :3]
        obj_neg_xyz = obj_data['neg'][:, :3]

        # transform all points into camera space
        hand_pos_xyz_cam = hand_pos_xyz / scale - offset
        hand_neg_xyz_cam = hand_neg_xyz / scale - offset
        obj_pos_xyz_cam = obj_pos_xyz / scale - offset
        obj_neg_xyz_cam = obj_neg_xyz / scale - offset

        with open(os.path.join(meta_dir, filename.replace('npz', 'pkl')), 'rb') as f:
            meta_data = pickle.load(f)

        cam_joints = np.dot(cam_extr, meta_data['coords_3d'].transpose(1, 0)).transpose(1, 0)

        # distance from every negative (inside) point to the wrist, cam_joints[0]
        hand_neg_dist_wrist = np.linalg.norm(hand_neg_xyz_cam - cam_joints[0], axis=1)
        obj_neg_dist_wrist = np.linalg.norm(obj_neg_xyz_cam - cam_joints[0], axis=1)
        dist_hand_points.append(np.max(hand_neg_dist_wrist))
        dist_obj_points.append(np.max(obj_neg_dist_wrist))

        # write scale_hand (inverse of the per-sample max distance) next to scale and offset
        np.savez(os.path.join(norm_dir, filename), scale=scale, offset=offset, scale_hand=1 / np.max(hand_neg_dist_wrist))
from alignsdf.
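The script as posted stores a per-sample `scale_hand` and also collects the per-sample maxima in `dist_hand_points`. One possible way to turn those maxima into the single dataset-wide factor discussed earlier in the thread is sketched below. The helper `dataset_scale_factor` and the example values are hypothetical and not part of the author's code.

```python
import numpy as np


def dataset_scale_factor(per_sample_max_dists):
    """Inverse of the largest per-sample max distance across the dataset."""
    max_dists = np.asarray(per_sample_max_dists, dtype=float)
    if max_dists.size == 0:
        raise ValueError("need at least one sample")
    return 1.0 / float(max_dists.max())


# Made-up per-sample maxima standing in for dist_hand_points.
dist_hand_points = [0.16, 0.19, 0.20, 0.17]
sdf_scale = dataset_scale_factor(dist_hand_points)
print(sdf_scale)
```

Taking the maximum over the whole training set means every sample fits within the bound, at the cost of smaller hands occupying less of the volume.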
Hi Zerui,
Thanks for your detailed reply; it's a great help to me. I'll try it and check whether the samples in my personalized dataset are bounded.
Thanks again :)))
Best regards,
Eric
from alignsdf.
You are welcome!
from alignsdf.
Hi Zerui,
Sorry for disturbing you again after a few days. After I finished building the SDF training data, I tried to train the SDF network on my personalized dataset, but it failed to reconstruct the hand shape.
So I set up the Obman dataset and ran your codebase to compare its training process with mine. However, after visualizing the results, I observed some problems, and I hope to get some feedback from you.
I followed the original experimental setup defined in https://github.com/zerchen/AlignSDF/blob/master/experiments/obman/30k_1e2d_mlp5.json. Since I only care about the hand, I simply ignored the object branch; I think this shouldn't have a major influence on the reconstruction quality of the hand. I ran two experiments on Obman, one using only the Hand SDF Decoder and the other also including the MANO Decoder, which correspond to ablation experiments (a) and (b) defined in your paper.
To save time, I ran reconstruction on the test set after 110 epochs of training. All reconstructed samples look very similar to each other, with an implausible hand shape.
I would like to know whether this is caused by the limited number of training epochs, or by the lack of a MANO prior (even in ablation (b), the MANO prior is not embedded into the SDF decoder; it only adds three more loss terms: 3D joints, beta, and theta). I even tried to overfit the training samples to see whether the model could produce a reasonable hand shape, but it still failed, and the shape was similar to the one described above.
During your experiments, have you observed a reasonably satisfactory hand shape with the SDF decoder only? My guess is that the problem is caused either by a wrong experimental setup or by the lack of a MANO prior. I hope to get an empirical answer from you.
Another question concerns the data construction. After multiplying by the SDF_Scale_Factor, the SDF samples are further divided by 2, as follows:
Line 198 in a944dd0
This is a very minor question, but I'm just curious why we do this. Is it to further bound xyz into (-0.5, 0.5), so that the samples form an exact unit cube around the origin?
Thanks a lot for your time :)
from alignsdf.
Hi,
I think the main reason for such a blurry hand is the limited number of training epochs. Using the MANO prior could alleviate this issue and might produce a clearer hand at an early training stage (but 110 epochs may still not be enough). In my experiments, it is also difficult to produce a clear hand without the MANO prior.
Yes, you are right. The reason I divide the SDF samples by 2 is to scale the points into the unit cube (since I want to bring all object negative points into the unit cube). If you want to reconstruct hands solely, I think you don't need to divide hand_samples by 2, and you could give it a try.
Hope it answers your question.
Best,
Zerui
from alignsdf.
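To illustrate the bound being discussed: if the scale factor brings every inside point to within distance 1 of the origin, each coordinate lies in [-1, 1], and halving brings it into [-0.5, 0.5], a cube of side length 1 centred at the origin. The check below uses made-up points and hypothetical variable names, not those of the dataloader.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up points inside a ball of radius 0.2 around the origin.
directions = rng.normal(size=(1000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
radii = 0.2 * rng.uniform(size=(1000, 1))
points = directions * radii

# Scale so the farthest point is at distance ~1, then halve.
scale_factor = 1.0 / np.linalg.norm(points, axis=1).max()
unit_ball = points * scale_factor  # every coordinate within [-1, 1]
half = unit_ball / 2.0             # every coordinate within [-0.5, 0.5]

print(np.abs(unit_ball).max(), np.abs(half).max())
```

If the scale factor is derived from hand points only, object points can lie farther from the wrist than the hand does, which is consistent with the extra headroom the division by 2 provides.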
Hi Zerui,
Thank you very much for your answer. The reason I asked about the training setup is that I currently have limited computational resources. After receiving your feedback, I continued training and obtained promising results. Thanks again for this.
Regarding reconstructing hands only: on my personal dataset, I set the wrist as the root joint and bounded the samples within the unit cube. However, whether or not I divide by 2, the reconstruction simply fails; the result is not even a hand. The augmentation I use just randomly rotates the hand sample around the wrist joint in the x-y plane (z is kept unchanged), so the sample should always stay within the unit cube regardless of the rotation angle.
I'm just curious whether you met similar problems when you first worked on SDF reconstruction (sometimes the reconstruction result is even a flat plane). I've verified that the data should all be correct, so the problem may be that the model fails to learn the shape information.
I'll continue debugging this, but if you have met similar problems before, please kindly let me know about possible mistakes :))
Thank you very much for your valuable time in this conversation; it helps me a lot!
Best regards,
Eric
from alignsdf.
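One property can be checked independently of the model: a rotation about an axis through the origin preserves each point's distance to the origin, so wrist-centred samples bounded by the unit sphere stay bounded after the in-plane augmentation described above. A minimal sketch, assuming the rotation is about the z-axis through the wrist (the function name and data are hypothetical):

```python
import numpy as np


def rotate_about_z(points, angle_rad):
    """Rotate (N, 3) points about the z-axis through the origin."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s, 0.0],
                    [s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T


rng = np.random.default_rng(0)
points = rng.uniform(-0.5, 0.5, size=(500, 3))  # made-up wrist-relative samples
rotated = rotate_about_z(points, np.deg2rad(73.0))

norms_before = np.linalg.norm(points, axis=1)
norms_after = np.linalg.norm(rotated, axis=1)
print(np.allclose(norms_before, norms_after))    # distances to the origin are preserved
print(np.allclose(points[:, 2], rotated[:, 2]))  # z is left unchanged
```

The guarantee holds for the unit sphere. Points that are bounded only by a cube, for example near one of its corners, can leave the cube under such a rotation, so it may be worth checking coordinate ranges after augmentation as well.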