
Comments (8)

YoYo000 commented on July 23, 2024

@KevinCain In MVSNet I did not explicitly handle different image sizes, so pay attention to this part when using the provided script.

If you are using the top-left image patch, I think you only need to change the image size; the intrinsics should remain unchanged.

BTW for the top-right image patch, the intrinsic is:

4806.29 0 0
0 4806.29 1500
0 0 1

from mvsnet.

KevinCain commented on July 23, 2024

Thanks, @YoYo000, you're correct in both cases. ;-)

I assume that when using the principal point to shift the ROI, during sparse reconstruction I should not try to estimate the principal point.


KevinCain commented on July 23, 2024

While the principal point in camera space is near image center, the image coordinates (u, v) origin is often at the top left (OpenCV) or bottom left of the image.

What origin does MVSNet assume for the principal point in image space?

Considering 2054x1502 quadrants from 4108x3004 images, for the four quadrants I have the following intrinsics -- which seem to be wrong in assuming a bottom-right origin. However, I can't see another solution that is consistent with the top left and top right values you gave, which seem to be correct:

Top left quadrant:
4714.29 0 2054
0 4714.29 1502
0 0 1

Top right quadrant:
4714.29 0 0
0 4714.29 1502
0 0 1

Bottom left quadrant:
4714.29 0 2054
0 4714.29 0
0 0 1

Bottom right quadrant:
4714.29 0 0
0 4714.29 0
0 0 1

Many thanks for any values or references that might help here!


YoYo000 commented on July 23, 2024

The estimated principal point (u, v) is not necessarily equal to (W/2, H/2), where W and H are the width and height of the full image.

Top left quadrant:
f 0 u
0 f v
0 0 1

Top right quadrant:
f 0 u - W/2
0 f v
0 0 1

Bottom left quadrant:
f 0 u
0 f v - H/2
0 0 1

Bottom right quadrant:
f 0 u - W/2
0 f v - H/2
0 0 1
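
The four matrices above all follow one rule: subtract the crop's top-left pixel offset from the principal point and leave the focal length alone. Below is a minimal sketch of that rule, assuming a top-left image origin and equal-sized quadrants. The helper name `crop_intrinsics` and the `quadrants` table are illustrative and not part of MVSNet; the numeric values are the ones quoted earlier in this thread.

```python
import numpy as np

def crop_intrinsics(K, x0, y0):
    """Intrinsics for a crop whose top-left corner sits at pixel (x0, y0)
    of the full image. Only the principal point moves."""
    K_crop = np.array(K, dtype=float)  # np.array copies, so K is untouched
    K_crop[0, 2] -= x0
    K_crop[1, 2] -= y0
    return K_crop

# Full-image values quoted in this thread.
W, H = 4108, 3004
K_full = np.array([[4714.29, 0.0, 2054.0],
                   [0.0, 4714.29, 1502.0],
                   [0.0, 0.0, 1.0]])

# Top-left corner of each quadrant in full-image pixel coordinates.
quadrants = {
    "top_left": (0, 0),
    "top_right": (W // 2, 0),
    "bottom_left": (0, H // 2),
    "bottom_right": (W // 2, H // 2),
}

K_quadrants = {name: crop_intrinsics(K_full, x0, y0)
               for name, (x0, y0) in quadrants.items()}

for name, K in K_quadrants.items():
    print(name)
    print(K)
```

With u = W/2 and v = H/2 this gives a principal point of (0, 1502) for the top-right quadrant and (0, 0) for the bottom-right one; for any other estimated (u, v) the same subtraction applies.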


KevinCain commented on July 23, 2024

Thanks, @YoYo000, for your quick reply!

Assuming W=4108, H=3004, u=2054, v=1502, your formulas match the values I give above.

Using these intrinsic values with MVSNet yields incorrect results except in the top-left quadrant (whose intrinsics are unchanged, as you pointed out). Note that the results look wrong whether I double the focal length (see quote below), as I noted before, or use the same focal length for the cropped images as for the original images.

I assume each source image should be cropped to the quadrant selected via the principal point, rather than left as the full-sized input for which the camera pose was computed.

Top left - W=4108, H=3004, u=2054, v=1502:
9428.58 0 2054 (u = 2054)
0 9428.58 1502 (v = 1502)
0 0 1

Top right - W=4108, H=3004, u=2054, v=1502:
9428.58 0 0 (u - W/2 = u - 2054 = 2054 - 2054 = 0)
0 9428.58 1502 (v = 1502)
0 0 1

Bottom left - W=4108, H=3004, u=2054, v=1502:
9428.58 0 2054 (u = 2054)
0 9428.58 0 (v - H/2 = 1502 - 3004/2 = 1502 - 1502 = 0)
0 0 1

Bottom right - W=4108, H=3004, u=2054, v=1502:
9428.58 0 0 (u - W/2 = u - 4108/2 = 2054 - 2054 = 0)
0 9428.58 0 (v - H/2 = 1502 - 3004/2 = 1502 - 1502 = 0)
0 0 1
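
On the focal-length question, it may help to separate the two operations. Under the standard pinhole model, a pure crop changes only the principal point, while a resize scales the focal length and principal point together. The sketch below contrasts the two using the 4714.29 / (2054, 1502) values from earlier in the thread. The function names are hypothetical, and the resize ignores half-pixel center conventions; it says nothing about what the MVSNet scripts do internally.

```python
import numpy as np

def crop_only(K, x0, y0):
    """Crop with top-left corner at (x0, y0): focal length is unchanged."""
    K_c = np.array(K, dtype=float)
    K_c[0, 2] -= x0
    K_c[1, 2] -= y0
    return K_c

def resize_only(K, sx, sy):
    """Resize by factors (sx, sy): focal length and principal point both scale.
    Simple convention that ignores half-pixel offsets."""
    K_r = np.array(K, dtype=float)
    K_r[0, :] *= sx  # fx, skew, cx
    K_r[1, :] *= sy  # fy, cy
    return K_r

K_full = np.array([[4714.29, 0.0, 2054.0],
                   [0.0, 4714.29, 1502.0],
                   [0.0, 0.0, 1.0]])

K_cropped = crop_only(K_full, 2054, 0)     # top-right quadrant, same focal length
K_half = resize_only(K_full, 0.5, 0.5)     # half-resolution full image

print(K_cropped)
print(K_half)
```

If the quadrants are cropped and never resized, this model suggests keeping the original focal length rather than doubling it.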


KevinCain commented on July 23, 2024

To illustrate my above confusion, here are MVSNet results for the familiar DTU 'scan9' dataset.

To prevent problems with differently-sized input images, here all images are cropped/sized identically, so neighbors referenced in the pair list will have the same size.

Image (left), vis (center), and prob-vis (right):
[Images: 00000000, 00000000-vis, 00000000_prob-vis]

The camera file is as follows (note no attempt to estimate principal point):

extrinsic
0.910293 0.164153 -0.380028 1.48815
-0.226191 0.966094 -0.124497 -0.872257
0.346706 0.199288 0.916558 1.33805
0.0 0.0 0.0 1.0

intrinsic
2888.84 0 800
0 2888.84 600
0 0 1

4.41871 0.0212743
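
For anyone checking their own camera files, here is a small sketch that reads the layout shown above into arrays so the intrinsics can be inspected or compared across views. The function name `parse_cam_text` is hypothetical. It assumes the file is whitespace-separated in exactly this order, and it treats the trailing numbers as depth parameters (minimum depth and depth interval), which is my reading of the format rather than something stated in this thread.

```python
import numpy as np

def parse_cam_text(text):
    """Parse an 'extrinsic / intrinsic / depth' camera file given as a string."""
    words = text.split()
    if len(words) < 27 or words[0] != "extrinsic" or words[17] != "intrinsic":
        raise ValueError("unexpected camera file layout")
    extrinsic = np.array([float(w) for w in words[1:17]]).reshape(4, 4)
    intrinsic = np.array([float(w) for w in words[18:27]]).reshape(3, 3)
    # Assumption: anything after the intrinsic matrix is a depth parameter.
    depth_params = [float(w) for w in words[27:]]
    return extrinsic, intrinsic, depth_params

cam_text = """
extrinsic
0.910293 0.164153 -0.380028 1.48815
-0.226191 0.966094 -0.124497 -0.872257
0.346706 0.199288 0.916558 1.33805
0.0 0.0 0.0 1.0

intrinsic
2888.84 0 800
0 2888.84 600
0 0 1

4.41871 0.0212743
"""

extrinsic, intrinsic, depth_params = parse_cam_text(cam_text)
print("principal point:", intrinsic[0, 2], intrinsic[1, 2])
print("depth parameters:", depth_params)
```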

Below, my attempt to process the upper two quadrants of the image above. The top left quadrant uses the same intrinsics, as @YoYo000 notes above.

Top left quadrant
Cropped input image (left), initial vis for readability (center) and prob-vis (right):

[Images: 00000000, 00000000_init-vis, 00000000_prob-vis]

intrinsic
2888.84 0 800
0 2888.84 600
0 0 1

Top right quadrant
Cropped input image (left), initial vis for readability (center) and prob-vis (right):
[Images: 00000000, 00000000_init-vis, 00000000_prob-vis]

intrinsic
2888.84 0 0
0 2888.84 600
0 0 1

The full-size and top-left quadrant depths look reasonable to me, but the top-right quadrant depth does not.

Clearly I'm making a fundamental mistake -- if possible, could you try to reproduce this?


YoYo000 commented on July 23, 2024

Do you crop the source images as well?


KevinCain commented on July 23, 2024

To @YoYo000's question, yes, I cropped the source images. The problem was elsewhere -- my results above reflect incorrect intrinsics for some of the neighbors involved in the depth computation.

As I noted earlier, @YoYo000's principal point offsets are correct, as are my proposed values.
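
Since the failure came from neighbors with inconsistent intrinsics, a quick consistency check before running depth inference may save time. The sketch below flags any view whose cropped intrinsics are not the full-image intrinsics shifted by the shared crop offset. The function name, the view ids, and the three-view data set are hypothetical; the 2888.84 / (800, 600) values and the top-right offset of (800, 0) follow the scan9 example in this thread.

```python
import numpy as np

def find_inconsistent_views(full_Ks, cropped_Ks, x0, y0, tol=1e-6):
    """Return ids of views whose cropped K differs from the full K
    shifted by the crop offset (x0, y0)."""
    bad = []
    for view_id, K_full in full_Ks.items():
        expected = np.array(K_full, dtype=float)
        expected[0, 2] -= x0
        expected[1, 2] -= y0
        if not np.allclose(cropped_Ks[view_id], expected, atol=tol):
            bad.append(view_id)
    return bad

K = np.array([[2888.84, 0.0, 800.0],
              [0.0, 2888.84, 600.0],
              [0.0, 0.0, 1.0]])
K_top_right = np.array([[2888.84, 0.0, 0.0],
                        [0.0, 2888.84, 600.0],
                        [0.0, 0.0, 1.0]])

# Hypothetical views: view 1 was cropped but its intrinsics were left unshifted.
full_Ks = {0: K, 1: K, 2: K}
cropped_Ks = {0: K_top_right, 1: K, 2: K_top_right}

bad_views = find_inconsistent_views(full_Ks, cropped_Ks, x0=800, y0=0)
print("views with inconsistent intrinsics:", bad_views)
```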

