Comments (10)
Hello, excellent work!! By the way, may I please ask you about the Dense-SLAM experiment in the paper?
To my understanding, the weights for the "Droid" part come from the off-the-shelf model trained on TartanAir (released here: https://github.com/princeton-vl/DROID-SLAM), and the other hyperparameters follow the demo configuration of that repository. Is this correct?
In addition, I'm curious about the input image: what size of image is fed into the Droid module? Or is it reshaped to a somewhat smaller size, like [192, 640], following one of the cited papers?
Thank you very much for your support!
Thanks for your question! I think @ckLibra should know this. Could you please elaborate a little on this?
from metric3d.
Yeah, DROID's pretrained model on TartanAir is used. The image is rescaled to a smaller size, but not as small as [192, 640]. It depends on your GPU memory.
Sure, the specific configuration will be given by @JUGGHM, since I am not able to access the code currently. BTW, maybe you can also check whether the intrinsics are aligned to the resolution, and visualize the optical flow estimated by DROID to better locate the bug. For a quick check, I think a resolution of (288, 960) should be fine.
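For reference, a minimal sketch of that intrinsics check, assuming a plain pinhole model: when the image is resized, fx and cx must be scaled by the width ratio and fy and cy by the height ratio. The calibration numbers below are only illustrative KITTI-odometry-like values, not the paper's configuration.

```python
def scale_intrinsics(fx, fy, cx, cy, w0, h0, w1, h1):
    """Rescale pinhole intrinsics for an image resized from (h0, w0) to (h1, w1):
    fx, cx scale with the width ratio; fy, cy with the height ratio."""
    sx, sy = w1 / w0, h1 / h0
    return fx * sx, fy * sy, cx * sx, cy * sy

# Illustrative KITTI-odometry-like calibration, resized to (288, 960):
fx, fy, cx, cy = scale_intrinsics(718.856, 718.856, 607.1928, 185.2157,
                                  w0=1241, h0=376, w1=960, h1=288)
```

If the four values passed to DROID do not track the resized resolution like this, the estimated flow and poses will be silently wrong.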
I appreciate all your kind and quick support!
If so, @ckLibra, could you share with us the configuration (especially the resolution) used to get the numbers in Table 5 of the paper (e.g. Droid+Ours shows t_rel=1.63, etc.)?
[192, 640] showed a completely worse result in my experiments, so I believe it is either too small or harmed by a domain shift, but I'm not sure what the right setting should be.
Thank you!
Thank you for your time and great feedback, @ckLibra!
I experimented with resolution [288, 960] just now.
That suggests the resolution makes no large difference in the metrics, so we may need a 'better' configuration to make this work on KITTI.
Anyway, I'd love to wait for @JUGGHM's response for it.
Thank you very much for all the great support.
P.S.
Let me share the experimental result: I'm 90% sure there is no problem with the intrinsics alignment or the t_rel calculation, so I started to think as above.
| Result | resolution | t_rel | r_rel | fx, fy, cx, cy |
|---|---|---|---|---|
| Paper (Droid w/o Metric3D) | ? | 21.7 | 0.23 | ? |
| Replicate A: just RGB | [288, 960] | 79.0 | 32.3 | [553.7, 550.4, 471.3, 142.5] |
| Replicate B: just RGB | [192, 640] | 77.1 | 31.4 | [369.1, 366.9, 314.2, 95.0] |
I checked the resize code; it is:
h1 = sqrt(384 * 512 * h0 / w0)
w1 = sqrt(384 * 512 * w0 / h0)
where h0 and w0 are the original height and width.
So the size should be something close to (240, 824).
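In code, that rule can be sketched as below, including the crop of each side down to a multiple of 8 that DROID-SLAM's demo script applies after the resize; the (376, 1241) input is just the typical KITTI raw resolution used as an example, and the function name is illustrative.

```python
import math

def droid_resize(h0, w0, target=384 * 512):
    """Resize so h1 * w1 is roughly `target` pixels while preserving the
    aspect ratio, then crop each side down to a multiple of 8."""
    scale = math.sqrt(target / (h0 * w0))
    h1, w1 = int(h0 * scale), int(w0 * scale)
    return h1 - h1 % 8, w1 - w1 % 8

print(droid_resize(376, 1241))  # typical KITTI raw size -> (240, 800)
```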
I don't remember whether I registered the estimated trajectory to the GT before calculating the metric, but from your figure it seems that this may be the reason for the poor r_rel.
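For anyone reproducing this, a minimal NumPy sketch of such a registration step: a similarity (Sim(3)) fit via the Umeyama method on corresponding trajectory positions, applied before computing the rel metrics. All names here are illustrative, not the paper's code.

```python
import numpy as np

def umeyama_align(est, gt):
    """Fit scale s, rotation R, translation t minimizing ||gt - (s*R@est + t)||,
    with est and gt given as (N, 3) arrays of corresponding positions."""
    mu_e, mu_g = est.mean(0), gt.mean(0)
    E, G = est - mu_e, gt - mu_g
    U, D, Vt = np.linalg.svd(G.T @ E / len(est))
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # guard against a reflection solution
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / E.var(0).sum()
    t = mu_g - s * R @ mu_e
    return s, R, t

# Sanity check: recover a known similarity transform.
rng = np.random.default_rng(0)
R0 = np.array([[np.cos(0.3), -np.sin(0.3), 0.0],
               [np.sin(0.3),  np.cos(0.3), 0.0],
               [0.0,          0.0,         1.0]])
est = rng.normal(size=(100, 3))
gt = 2.0 * est @ R0.T + np.array([1.0, -2.0, 0.5])
s, R, t = umeyama_align(est, gt)
```

Skipping this alignment (or aligning only scale but not rotation) can inflate r_rel dramatically while leaving t_rel comparatively plausible.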
Hello, sorry for the late response!
h1 = sqrt(384 * 512 * h0 / w0)
w1 = sqrt(384 * 512 * w0 / h0)

Thank you @JUGGHM! That seems similar to the eth3d demo in the official DROID-SLAM implementation.
How about the metrics, such as the scale alignment or the calculation of each rel metric? I've just tried the eval scripts from the following two repositories; both produce very similar results and never get close to (21.7, 0.23):
- https://github.com/Huangying-Zhan/kitti-odom-eval
- https://github.com/TRI-ML/KP3D/blob/master/kp3d/externals/cpp/evaluate_odometry.cpp
Thank you!
Any ideas on that, @ckLibra? How can I provide the information he might need from our codebase?
Currently I am not available to re-implement the DROID-SLAM experiment myself; it was originally done by @ckLibra. After finishing Metric3D v2 I will examine this closely, since many users have raised issues about it.
Related Issues (20)
- Pixel represented focal length or real world scale focal length(mm) HOT 4
- Some problems in Training HOT 3
- Supporting old GPUs? HOT 3
- metric_scale in nyu.py HOT 1
- Speed Up Inference HOT 2
- NYU dataset and json HOT 1
- Inference Speed data
- normals not normal HOT 2
- Unable to adjust scale of depth correctly in the wild-mode HOT 1
- How to convert the DINO2reg-ViT model to an ONNX model HOT 2
- torch.hub.load error HOT 4
- Failed to find function: mono.model.backbones.convnext_large HOT 1
- Fine tune on custom dataset HOT 8
- Sparse GT depth from LiDAR for supervision? HOT 1
- Question regarding losses HOT 1
- Depth scale vs Metric scale HOT 6
- What does the pkl file contain in training with Matterport3D? HOT 1
- generate only a depth matrix without generating a 3D point cloud HOT 2
- Is there any reference code to generate kitti dataset annotation?
- Camera parameters of taskonomy HOT 2