
Comments (21)

yxie20 avatar yxie20 commented on September 7, 2024 1

A note on parsing the COLMAP camera K and Rt: I used the handy script provided by the NeRF research group (https://github.com/Fyusion/LLFF/tree/master/llff/poses). The negation (compared to the NeRF/LLFF convention) of the 2nd and 3rd columns of the camera extrinsics is accounted for.

Ok. We are using OpenCV style coordinates for extrinsics. Do you mean we need to manually change the code to run your Colmap data?

No, not at all! Everything is accounted for and you only need to copy, paste and run!

from nsvf.

yxie20 avatar yxie20 commented on September 7, 2024 1

Thank you for such a prompt reply! I've uploaded the data here: https://github.com/yxie20/lego_nsvf. To use it, simply copy the following: pose, bbox.txt, intrinsics.txt to the root folder. The original NSVF folder structure is followed here.
I'll look into your comments 2 and 3. Thanks!

Can you also share some comparison with NeRF where you observe the clear difference?
Thanks!

Here's the NeRF training progress for 40 minutes or 5k steps, with colmap poses on Lego:


kondela avatar kondela commented on September 7, 2024 1

I have had a similar experience to @yxie20.

NSVF - default configuration, 62.5k steps
image (2)

NSVF - enabled trainable_extrinsics and lowered the learning rate to 0.0001, as suggested by @MultiPath. 25k steps, then OOM
image (1)

nerf (@kwea123 implementation) - original, validation image, and depth map.
image (3)

I used the same images and poses from COLMAP for all experiments.


kondela avatar kondela commented on September 7, 2024 1

Hey @MultiPath, you might be right that poses and images are not aligned properly.

I am also using @yxie20's script (https://github.com/yxie20/lego_nsvf/blob/master/poses/pose_utils.py) to generate the poses, intrinsics and bounding box, and it seems to assume that cameras.bin from COLMAP contains poses sorted by image file name (which it does not). This would explain why both of us get poor results on real datasets.

The original LLFF code uses np.argsort() to account for this (see the variable perm in https://github.com/Fyusion/LLFF/blob/c6e27b1ee59cb18f054ccb0f87a90214dbe70482/llff/poses/pose_utils.py#L33). I tried to fix it and ran training with the default config; however, the results still seem poor.
image
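
For anyone following along, the reordering step can be sketched as below. This is only a minimal illustration, assuming the image names and their poses have already been read from the COLMAP model into two parallel sequences; the function name `sort_poses_by_filename` is hypothetical and is not part of LLFF or NSVF. Only the use of `np.argsort` mirrors the `perm` idea in the linked LLFF script.

```python
import numpy as np

def sort_poses_by_filename(names, poses):
    """Reorder poses so that they follow the string order of the image names.

    names: sequence of image file names, in the order COLMAP stored them.
    poses: array-like of shape (N, ...) with one pose per name, same order.
    (Both inputs are assumptions about how the model was loaded.)
    """
    names = np.array(list(names))
    poses = np.asarray(poses)
    if len(names) != len(poses):
        raise ValueError("names and poses must have the same length")
    perm = np.argsort(names)  # permutation that sorts the file names
    return names[perm].tolist(), poses[perm]
```

After this step, the i-th pose corresponds to the i-th entry of the sorted image list, which is the order a loader that sorts files by name would see.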

You can download the new pose.zip here.

I will take another look at it; there might be another bug that causes improper alignment between images and poses, or something similar.

Meanwhile, if you have time, it would be awesome if you could provide your own script to convert COLMAP's model files (cameras.bin, images.bin, points3D.bin) to the format that NSVF expects, i.e. bbox.txt, intrinsics.txt and the poses folder.


MultiPath avatar MultiPath commented on September 7, 2024

Hi, thanks for posting this! It is interesting that NSVF would be more sensitive to camera errors than the original NeRF. In my view, they should face the same learning issue, as the ray-marching part is the same.
Also, we did not observe the same phenomenon on the real datasets we have tested, compared to the original NeRF.

  1. Can you please share the three Lego datasets for us to take a look?
  2. In our code, we also implement an option to fix camera pose errors (although this function is sometimes not stable)
    by adding --trainable-extrinsics to the training arguments.
  3. Decreasing the learning rate might also be helpful. To speed up learning, we usually used lr=0.001. You can try training with lr=0.0001.


yxie20 avatar yxie20 commented on September 7, 2024

Thank you for such a prompt reply! I've uploaded the data here: https://github.com/yxie20/lego_nsvf. To use it, simply copy the following: pose, bbox.txt, intrinsics.txt to the root folder. The original NSVF folder structure is followed here.

I'll look into your comments 2 and 3. Thanks!


yxie20 avatar yxie20 commented on September 7, 2024

A note on parsing the COLMAP camera K and Rt: I used the handy script provided by the NeRF research group (https://github.com/Fyusion/LLFF/tree/master/llff/poses). The negation (compared to the NeRF/LLFF convention) of the 2nd and 3rd columns of the camera extrinsics is accounted for.
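
The column negation mentioned above can be sketched as follows. This is a minimal illustration rather than the actual conversion script: it assumes 4x4 camera-to-world matrices whose rotation columns are the camera's x (right), y (up) and z (backward) axes, and it flips the y and z columns to obtain an OpenCV-style frame (x right, y down, z forward). The function name `flip_yz_columns` is hypothetical.

```python
import numpy as np

def flip_yz_columns(c2w):
    """Negate the 2nd and 3rd rotation columns of a camera-to-world matrix.

    Assumption: c2w is 4x4 (or 3x4) with columns [x_axis, y_axis, z_axis, origin].
    Flipping y and z turns a right/up/backward camera frame into a
    right/down/forward one; the camera origin (4th column) is left untouched.
    """
    out = np.array(c2w, dtype=np.float64, copy=True)
    out[:3, 1:3] *= -1.0
    return out
```

Applying the function twice returns the original matrix, so the same helper converts in either direction.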


MultiPath avatar MultiPath commented on September 7, 2024

Thank you for such a prompt reply! I've uploaded the data here: https://github.com/yxie20/lego_nsvf. To use it, simply copy the following: pose, bbox.txt, intrinsics.txt to the root folder. The original NSVF folder structure is followed here.

I'll look into your comments 2 and 3. Thanks!

Can you also share some comparison with NeRF where you observe the clear difference?
Thanks!


MultiPath avatar MultiPath commented on September 7, 2024

A note on parsing the COLMAP camera K and Rt: I used the handy script provided by the NeRF research group (https://github.com/Fyusion/LLFF/tree/master/llff/poses). The negation (compared to the NeRF/LLFF convention) of the 2nd and 3rd columns of the camera extrinsics is accounted for.

Ok. We are using OpenCV style coordinates for extrinsics. Do you mean we need to manually change the code to run your Colmap data?


yxie20 avatar yxie20 commented on September 7, 2024

To help us better pinpoint the problem, I added my pose parsing code to https://github.com/yxie20/lego_nsvf. This code takes in COLMAP .bin files and outputs pose/, bbox.txt and intrinsics.txt in the NSVF standard. The initial voxel size (side length) is automatically set to 1/8 of the total volume side length.
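
As a rough illustration of the voxel-size rule, here is a minimal sketch, not the code from the repository. It assumes the bounding box is taken as the axis-aligned extent of a set of 3D points (for example the sparse COLMAP points), that "total volume side length" means the longest side of that box, and that bbox.txt holds a single line of the form `x_min y_min z_min x_max y_max z_max initial_voxel_size`. The function name `bbox_line` is hypothetical.

```python
import numpy as np

def bbox_line(points, voxel_fraction=1.0 / 8.0):
    """Build one text line describing the bounding box and initial voxel size.

    points: array-like of shape (N, 3).
    voxel_fraction: voxel side length as a fraction of the longest box side
                    (1/8 follows the rule described in the comment above).
    """
    pts = np.asarray(points, dtype=np.float64)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    voxel_size = float((hi - lo).max()) * voxel_fraction
    values = np.concatenate([lo, hi, [voxel_size]])
    return " ".join(f"{v:.6f}" for v in values)
```

A box that is too tight or too loose changes the initial voxel grid, so it is worth printing this line and comparing it against the visible extent of the object.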


yxie20 avatar yxie20 commented on September 7, 2024

Here I provide some results with --trainable-extrinsics enabled. The result is still far from that achieved by ground truth poses.

Here is the loss curve comparison (purple is gt poses, blue is colmap+trainable-extrinsics)


kwea123 avatar kwea123 commented on September 7, 2024

Judging from the depth map, it seems that it doesn't learn the correct structure either...
I haven't tried NSVF yet, but I have doubts about your NeRF result, because in my experiments it works pretty well on real-world objects for which the poses are also estimated by COLMAP.
From your very first picture already, I suspect that the reconstructed poses are not correct at all, because the gt poses are distributed rather uniformly on the upper hemisphere, if I remember correctly, like in Fig. 1; your reconstructed poses seem very irregular. I suspect that this is due to the white background, which makes the pose reconstruction difficult. I would suggest that you align the gt poses and the reconstructed poses to see how much they differ, before digging into either NeRF or NSVF.
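
One way to do such a comparison is sketched below, under stated assumptions: the camera centers of both pose sets are available as (N, 3) arrays in matching order, and COLMAP's reconstruction differs from the ground truth by an unknown scale, rotation and translation. The sketch uses Umeyama's closed-form similarity alignment; the function names `align_similarity` and `center_errors` are hypothetical.

```python
import numpy as np

def align_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)               # cross-covariance of the two point sets
    U, S, Vt = np.linalg.svd(cov)
    d = np.ones(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        d[-1] = -1.0                          # avoid returning a reflection
    R = U @ np.diag(d) @ Vt
    var_s = (xs ** 2).sum() / len(src)        # variance of the source points
    s = float((S * d).sum() / var_s)
    t = mu_d - s * R @ mu_s
    return s, R, t

def center_errors(src, dst):
    """Per-camera distance between aligned source centers and target centers."""
    s, R, t = align_similarity(src, dst)
    aligned = s * np.asarray(src, dtype=np.float64) @ R.T + t
    return np.linalg.norm(aligned - np.asarray(dst, dtype=np.float64), axis=1)
```

Large residuals for a handful of cameras point to individual bad registrations, while uniformly large residuals suggest that the whole reconstruction (or the pairing between poses and images) is off.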


yxie20 avatar yxie20 commented on September 7, 2024

Judging from the depth map, it seems that it doesn't learn the correct structure either...
I haven't tried NSVF yet, but I have doubts about your NeRF result, because in my experiments it works pretty well on real-world objects for which the poses are also estimated by COLMAP.
From your very first picture already, I suspect that the reconstructed poses are not correct at all, because the gt poses are distributed rather uniformly on the upper hemisphere, if I remember correctly, like in Fig. 1; your reconstructed poses seem very irregular. I suspect that this is due to the white background, which makes the pose reconstruction difficult. I would suggest that you align the gt poses and the reconstructed poses to see how much they differ, before digging into either NeRF or NSVF.

Thanks for the comment. If you do decide to run NSVF, I look forward to hearing how it goes! Note that the same COLMAP poses were used in NeRF and NSVF, which is the key point of this thread.


MultiPath avatar MultiPath commented on September 7, 2024

Hi, I haven't had time to look into it yet. I will take a look at this issue this weekend.


tau-yihouxiang avatar tau-yihouxiang commented on September 7, 2024

@yxie20 Have you solved this issue? I decided to run COLMAP on a real-world dataset myself.


yxie20 avatar yxie20 commented on September 7, 2024

No... Still waiting on the author @MultiPath to provide some insights.


MultiPath avatar MultiPath commented on September 7, 2024

Sorry about that. I was busy with other papers recently. I will debug this for a bit.


MultiPath avatar MultiPath commented on September 7, 2024

Hi @kondela, can you also share your dataset, if possible?


kondela avatar kondela commented on September 7, 2024

Hey @MultiPath, sure! You can download it from my Google Drive. I used the first 142 images for training and left the last 2 for validation.


MultiPath avatar MultiPath commented on September 7, 2024

Hi @kondela, just to double-check: are you sure the pose files and the images are aligned? In my code, I sorted the poses and images separately by string order.

Also, how do you get your bounding box?

image
image
image
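
A quick way to test the alignment question above is sketched here. It is only an illustration, assuming that each pose file is meant to share its base name with the matching image (for example `0_0001.png` and `0_0001.txt`); that naming rule and the function name `find_misaligned` are assumptions, not something confirmed by the NSVF code. Both lists are sorted separately by string order, as described above, and then compared position by position.

```python
from itertools import zip_longest
from pathlib import Path

def find_misaligned(image_names, pose_names):
    """Return (image, pose) pairs whose base names differ after sorting.

    Each list is sorted independently by plain string order. An empty
    string in a pair means one list is shorter than the other.
    """
    images = sorted(image_names)
    poses = sorted(pose_names)
    return [
        (img, pose)
        for img, pose in zip_longest(images, poses, fillvalue="")
        if Path(img).stem != Path(pose).stem
    ]
```

An empty result means the two sorted lists pair up by name; any returned pair shows where an image would be trained with the wrong pose.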


dedoogong avatar dedoogong commented on September 7, 2024

I also ran into this problem. To reproduce the original results on other datasets, I would like to ask the author to share the code for that (cameras.bin/images.bin to bbox.txt, intrinsics.txt).

Please help us! Thank you!

