Comments (5)
Hi LvZic,
For learning the camera parameters, I have two suggestions.
- Do not add any direct supervision on the camera parameters (scale, tx, and ty). They should only be supervised via the 2D keypoint loss.
- To mitigate the focal length / scale issue you mentioned, I suggest cropping the input images so that the hand size is roughly the same in every crop.
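A minimal sketch of the first suggestion, assuming the projection form uv = s(xy + t) discussed later in this thread. The function names and the toy values are hypothetical and are not taken from the frankmocap code; the point is only that scale, tx, and ty appear in the loss solely through the projected 2D keypoints.

```python
def project_weak_perspective(joints_3d, scale, tx, ty):
    """Project 3D joints to 2D with uv = scale * (xy + t); depth is ignored."""
    return [(scale * (x + tx), scale * (y + ty)) for x, y, _z in joints_3d]


def keypoint_2d_loss(pred_2d, gt_2d):
    """Mean squared error over all 2D keypoints."""
    total = 0.0
    for (pu, pv), (gu, gv) in zip(pred_2d, gt_2d):
        total += (pu - gu) ** 2 + (pv - gv) ** 2
    return total / len(gt_2d)


# Toy data (hypothetical): three 3D joints and their 2D annotations.
joints_3d = [(0.0, 0.0, 0.5), (0.1, 0.0, 0.5), (0.0, 0.2, 0.6)]
gt_2d = [(0.2, -0.1), (0.4, -0.1), (0.2, 0.3)]

# The camera (scale, tx, ty) is scored only through this 2D loss;
# no term compares it against a "ground-truth" camera.
loss = keypoint_2d_loss(project_weak_perspective(joints_3d, 2.0, 0.1, -0.05), gt_2d)
```

In a real training loop the same idea would be expressed with differentiable tensors, so that the gradient of the 2D loss flows back into the predicted camera.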
from frankmocap.
Thanks for your suggestions.
The crop strategy has now been added to the data augmentation, the camera parameter loss weight is set to 0, and scale_gt now ranges over [0.7, 2.0]. Without direct supervision on the camera parameters, training converges better.
- However, all parameters, including the camera parameters, seem to converge slowly after 30 epochs. Can the camera loss be added back at this point or not? I also wonder why direct supervision on the camera parameters hurts convergence, since some published work does use a direct camera parameter loss.
- The 3D-to-2D projection in your project is 2d = s(3d + t_xy), while in the HMR paper it is s * 3d + t_xy. This is discussed in akanazawa/hmr#60. In my tests, your projection formula converges better than the one from the original HMR paper. Can you explain this further? Thanks.
from frankmocap.
- How do you get scale_gt? To my knowledge, none of the papers in this area apply such a loss.
- I suggest not adding the direct camera loss at any point during training. The model uses a weak-perspective camera model, which only considers scale and translation, whereas the ground-truth camera model is perspective. The two camera models are defined differently, so their parameters are not comparable. Therefore, it makes no sense to supervise the camera parameters directly.
- I believe s(3d + t_xy) and s * 3d + t_xy are the same conceptually. Please feel free to choose whichever form works better on your side.
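The equivalence of the two forms can be checked directly: s(x + t) = s * x + s * t, so the HMR-style form reproduces the other one when its translation is set to s * t. The sketch below uses hypothetical function names and toy values.

```python
def project_scaled_sum(xy, s, t):
    """Form used in this project: uv = s * (xy + t)."""
    return (s * (xy[0] + t[0]), s * (xy[1] + t[1]))


def project_hmr(xy, s, t):
    """Form from the HMR paper: uv = s * xy + t."""
    return (s * xy[0] + t[0], s * xy[1] + t[1])


s = 1.5
t_a = (0.2, -0.4)                # translation for the s * (xy + t) form
t_b = (s * t_a[0], s * t_a[1])   # matching translation for the s * xy + t form
xy = (0.3, 0.7)

uv_a = project_scaled_sum(xy, s, t_a)
uv_b = project_hmr(xy, s, t_b)   # agrees with uv_a up to floating-point rounding
```

Both forms describe the same family of projections, but the translation is multiplied by the scale in one and not in the other, so the gradients with respect to s and t differ. That could plausibly account for a difference in convergence, although this thread does not confirm it.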
from frankmocap.
- End-to-end Hand Mesh Recovery from a Monocular RGB Image (https://arxiv.org/abs/1902.09305) uses a camera loss.
- In my test, scale_gt is obtained as focal_length / global_trans[2], where global_trans is the last 3 MANO params. I checked uv = scale_gt * (XY + t_xy), and the 2D result looks visually normal, with only a little misalignment.
- However, in my latest test, which does not use the camera loss, all the regressed params converge better except the camera params. The result is as follows:
total loss: 453.9190 | 2d loss: 11.2612 | 3d loss: 0.037922 | mask loss: 0.2397 | reg loss: 0.0018 | scale loss: 0.1078 | trans loss: 4372.8198 | rvec loss: 0.0184 | pose loss: 0.0418 | shape loss: 0.2389
The scale loss and trans loss are too large, although scale_gt is admittedly not the true value for the weak-perspective camera model.
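As a rough illustration of why scale_gt = focal_length / global_trans[2] is only an approximation, the sketch below compares a perspective projection with the weak-perspective form uv = scale_gt * (XY + t_xy). It assumes XY are root-relative joint coordinates, t_xy is global_trans[:2], and the principal point is at the origin; the numbers are hypothetical and not taken from any dataset.

```python
def perspective(local_joint, global_trans, focal_length):
    """Pinhole projection: each joint is divided by its own depth."""
    x = local_joint[0] + global_trans[0]
    y = local_joint[1] + global_trans[1]
    z = local_joint[2] + global_trans[2]
    return (focal_length * x / z, focal_length * y / z)


def weak_perspective(local_joint, global_trans, focal_length):
    """uv = scale_gt * (XY + t_xy): every joint shares the root depth."""
    scale_gt = focal_length / global_trans[2]
    return (scale_gt * (local_joint[0] + global_trans[0]),
            scale_gt * (local_joint[1] + global_trans[1]))


focal_length = 500.0                 # hypothetical, in pixels
global_trans = (0.05, -0.02, 0.5)    # hypothetical root translation
root = (0.0, 0.0, 0.0)               # root joint in root-relative coordinates
fingertip = (0.08, 0.03, 0.04)       # a joint 4 cm behind the root

root_gap = abs(perspective(root, global_trans, focal_length)[0]
               - weak_perspective(root, global_trans, focal_length)[0])
tip_gap = abs(perspective(fingertip, global_trans, focal_length)[0]
              - weak_perspective(fingertip, global_trans, focal_length)[0])
```

Under these assumptions the two projections coincide at the root joint and drift apart for joints at a different depth, which is consistent with the small misalignment described above.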
from frankmocap.
@lvZic
I suggest not paying too much attention to "ground-truth" scale/translation, as they don't really exist. Just learn scale/translation using the 2D keypoint losses.
from frankmocap.