Here I have MVSNet camera information for a 4000x3000 image:

ROI via principal point shift and focal length scaling about mvsnet HOT 8 CLOSED

yoyo000 commented on July 23, 2024

ROI via principal point shift and focal length scaling

from mvsnet.

Comments (8)

YoYo000 commented on July 23, 2024

@KevinCain In MVSNet I did not explicitly handle different image sizes, so you may need to pay attention to this part when using the provided script.

If you are using the top-left image patch, I think you only need to change the image size; the intrinsics should remain unchanged.

BTW for the top-right image patch, the intrinsic is:

4806.29 0 0
0 4806.29 1500
0 0 1

KevinCain commented on July 23, 2024

Thanks, @YoYo000, you're correct in both cases. ;-)

I assume that when using the principal point to shift the ROI, during sparse reconstruction I should not try to estimate the principal point.

KevinCain commented on July 23, 2024

While the principal point in camera space is near image center, the image coordinates (u, v) origin is often at the top left (OpenCV) or bottom left of the image.

What assumption does MVSNet make about the principal point origin in image space?

Considering 2054x1502 quadrants from 4108x3004 images, for the four quadrants I have the following intrinsics -- which seem to be wrong in assuming a bottom-right origin. However, I can't see another solution that is consistent with the top left and top right values you gave, which seem to be correct:

Top left quadrant:
4714.29 0 2054
0 4714.29 1502
0 0 1

Top right quadrant:
4714.29 0 0
0 4714.29 1502
0 0 1

Bottom left quadrant:
4714.29 0 2054
0 4714.29 0
0 0 1

Bottom right quadrant:
4714.29 0 0
0 4714.29 0
0 0 1

Many thanks for any values, or reference that might help here!

YoYo000 commented on July 23, 2024

The estimated u, v are not necessarily equal to W/2, H/2.

Top left quadrant:
f 0 u
0 f v
0 0 1

Top right quadrant:
f 0 u - W/2
0 f v
0 0 1

Bottom left quadrant:
f 0 u
0 f v - H/2
0 0 1

Bottom right quadrant:
f 0 u - W/2
0 f v - H/2
0 0 1
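The four formulas above can be sketched as a single helper. This is a minimal illustration, not part of MVSNet: the function name `quadrant_intrinsics` and its arguments are hypothetical, and it assumes a top-left pixel origin, so that cropping a window whose top-left corner is at (x0, y0) subtracts (x0, y0) from the principal point.

```python
def quadrant_intrinsics(f, u, v, width, height, quadrant):
    """Return the 3x3 intrinsic matrix for one quadrant of a width x height image.

    Hypothetical helper. Assumes a top-left pixel origin: the crop offset
    (x0, y0) of the quadrant is subtracted from the principal point (u, v),
    and the focal length f is left unchanged.
    """
    offsets = {
        "top_left": (0.0, 0.0),
        "top_right": (width / 2, 0.0),
        "bottom_left": (0.0, height / 2),
        "bottom_right": (width / 2, height / 2),
    }
    x0, y0 = offsets[quadrant]
    return [
        [f, 0.0, u - x0],
        [0.0, f, v - y0],
        [0.0, 0.0, 1.0],
    ]


# Values taken from the question above: W=4108, H=3004, u=2054, v=1502.
for name in ("top_left", "top_right", "bottom_left", "bottom_right"):
    print(name, quadrant_intrinsics(4714.29, 2054, 1502, 4108, 3004, name))
```

With u = W/2 and v = H/2 this reproduces the principal points listed in the question; when the estimated (u, v) differ from the image center, the shifted values are no longer exactly 0.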

KevinCain commented on July 23, 2024

Thanks, @YoYo000 for your quick reply!

Assuming W=4108, H=3004, u=2054, v=1502, your formulas match the values I give above.

Using these intrinsic values with MVSNet yields incorrect results except in the top left quadrant (same intrinsics, as you pointed out). Note that the results look wrong whether I double the focal length (see quote below), as I noted before, or use the same focal length for the cropped images as for the original images.

I assume the source image should be cropped to the quadrant we select via the principal point, instead of using the full-sized input for which the camera pose was computed.

Top left - W=4108, H=3004, u=2054, v=1502:
9428.58 0 2054 (u = 2054)
0 9428.58 1502 (v = 1502)
0 0 1

Top right - W=4108, H=3004, u=2054, v=1502:
9428.58 0 0 (u - W/2 = u - 2054 = 2054 - 2054 = 0)
0 9428.58 1502 (v = 1502)
0 0 1

Bottom left - W=4108, H=3004, u=2054, v=1502:
9428.58 0 2054 (u = 2054)
0 9428.58 0 (v - H/2 = 1502 - 3004/2 = 1502 - 1502 = 0)
0 0 1

Bottom right - W=4108, H=3004, u=2054, v=1502:
9428.58 0 0 (u - W/2 = u - 4108/2 = 2054 - 2054 = 0)
0 9428.58 0 (v - H/2 = 1502 - 3004/2 = 1502 - 1502 = 0)
0 0 1
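For reference, under the standard pinhole model a pure crop and a resize affect the intrinsics differently. The sketch below is an illustration under stated assumptions, not MVSNet code: `crop_intrinsics`, `scale_intrinsics`, and `project` are hypothetical names, the pixel origin is assumed to be top-left, and the half-pixel-center offset is ignored when scaling. A doubled focal length would only be expected if the cropped quadrant were also upsampled by a factor of 2, in which case the principal point would double as well.

```python
def crop_intrinsics(K, x0, y0):
    """Crop whose top-left corner is (x0, y0): focal length unchanged,
    principal point shifted by the crop offset."""
    return [
        [K[0][0], 0.0, K[0][2] - x0],
        [0.0, K[1][1], K[1][2] - y0],
        [0.0, 0.0, 1.0],
    ]


def scale_intrinsics(K, s):
    """Resize by factor s: focal length and principal point both scale
    (half-pixel-center offset ignored for simplicity)."""
    return [
        [K[0][0] * s, 0.0, K[0][2] * s],
        [0.0, K[1][1] * s, K[1][2] * s],
        [0.0, 0.0, 1.0],
    ]


def project(K, point):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    x, y, z = point
    return (K[0][0] * x / z + K[0][2], K[1][1] * y / z + K[1][2])


K_full = [[4714.29, 0.0, 2054.0], [0.0, 4714.29, 1502.0], [0.0, 0.0, 1.0]]
K_top_right = crop_intrinsics(K_full, 2054.0, 0.0)

# A point seen in the full image lands at the same place in the crop,
# offset by the crop origin, with no change to the focal length.
p = (0.1, -0.05, 2.0)  # arbitrary test point in camera coordinates
u_full, v_full = project(K_full, p)
u_crop, v_crop = project(K_top_right, p)
print(u_full - 2054.0, u_crop)
print(v_full, v_crop)
```

The check at the end is the useful part: if the cropped projection does not equal the full-image projection minus the crop offset, the intrinsics and the crop do not match.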

KevinCain commented on July 23, 2024

To illustrate my above confusion, here are MVSNet results for the familiar DTU 'scan9' dataset.

To prevent problems with differently-sized input images, here all images are cropped/sized identically, so neighbors referenced in the pair list will have the same size.

Image (left), vis (center), and prob-vis (right):

The camera file is as follows (note no attempt to estimate principal point):

extrinsic
0.910293 0.164153 -0.380028 1.48815
-0.226191 0.966094 -0.124497 -0.872257
0.346706 0.199288 0.916558 1.33805
0.0 0.0 0.0 1.0

intrinsic
2888.84 0 800
0 2888.84 600
0 0 1

4.41871 0.0212743
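A camera file laid out like the one above can be read with a few lines of Python. This is a minimal sketch that assumes exactly the layout shown here (the keyword `extrinsic` followed by 16 values, the keyword `intrinsic` followed by 9 values, then trailing numbers); `parse_cam_text` is a hypothetical helper, not a function from the MVSNet repository. The thread does not label the trailing pair; it appears to be the minimum depth and the depth interval, which is treated here as an assumption.

```python
SAMPLE = """
extrinsic
0.910293 0.164153 -0.380028 1.48815
-0.226191 0.966094 -0.124497 -0.872257
0.346706 0.199288 0.916558 1.33805
0.0 0.0 0.0 1.0

intrinsic
2888.84 0 800
0 2888.84 600
0 0 1

4.41871 0.0212743
"""


def parse_cam_text(text):
    """Parse camera text into (extrinsic 4x4, intrinsic 3x3, trailing values).

    Hypothetical helper; assumes the whitespace-separated layout shown in
    SAMPLE. The trailing values are returned unlabeled as a list of floats.
    """
    tokens = text.split()
    e = tokens.index("extrinsic") + 1
    i = tokens.index("intrinsic") + 1
    extrinsic = [[float(t) for t in tokens[e + 4 * r: e + 4 * r + 4]]
                 for r in range(4)]
    intrinsic = [[float(t) for t in tokens[i + 3 * r: i + 3 * r + 3]]
                 for r in range(3)]
    trailing = [float(t) for t in tokens[i + 9:]]
    return extrinsic, intrinsic, trailing


extrinsic, intrinsic, trailing = parse_cam_text(SAMPLE)
print(intrinsic[0][2], intrinsic[1][2])  # principal point read from the file
```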

Below, my attempt to process the upper two quadrants of the image above. The top left quadrant uses the same intrinsics, as @YoYo000 notes above.

Top left quadrant
Cropped input image (left), initial vis for readability (center) and prob-vis (right):

intrinsic
2888.84 0 800
0 2888.84 600
0 0 1

Top right quadrant
Cropped input image (left), initial vis for readability (center) and prob-vis (right):

intrinsic
2888.84 0 0
0 2888.84 600
0 0 1

The full-size and top left quadrant depths look reasonable to me, but the top right quadrant depths do not.

Clearly I'm making a fundamental mistake -- if possible, can you attempt to duplicate?

YoYo000 commented on July 23, 2024

Did you crop the source images as well?

KevinCain commented on July 23, 2024

To @YoYo000's question, yes, I cropped the source images. The problem was elsewhere -- my results above reflect incorrect intrinsics for some of the neighbors involved in the depth computation.

As I noted earlier, @YoYo000's principal point offsets are correct, as are my proposed values.
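One way to avoid this kind of mismatch is to derive the cropped intrinsics for the reference view and all of its neighbors in a single pass. The sketch below assumes every image is cropped with the same window, as in the scan9 experiment above, and a top-left pixel origin. `crop_all_views` is a hypothetical helper, and the intrinsics for view 1 are made-up values for illustration only.

```python
def crop_all_views(intrinsics_by_view, x0, y0):
    """Apply the same crop offset (x0, y0) to every view's 3x3 intrinsics.

    Hypothetical helper: returns a new dict and leaves the input untouched,
    so reference and source views cannot end up with different conventions.
    """
    cropped = {}
    for view_id, K in intrinsics_by_view.items():
        cropped[view_id] = [
            [K[0][0], 0.0, K[0][2] - x0],
            [0.0, K[1][1], K[1][2] - y0],
            [0.0, 0.0, 1.0],
        ]
    return cropped


views = {
    0: [[2888.84, 0.0, 800.0], [0.0, 2888.84, 600.0], [0.0, 0.0, 1.0]],  # from the thread
    1: [[2890.00, 0.0, 805.0], [0.0, 2890.00, 598.0], [0.0, 0.0, 1.0]],  # hypothetical neighbor
}

# Top right quadrant of a 1600x1200 image: crop origin at (800, 0).
top_right = crop_all_views(views, 800.0, 0.0)
for view_id, K in top_right.items():
    print(view_id, K[0][2], K[1][2])
```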
