Comments (8)
Can you use some of ARKit's code directly to generate a rotation matrix from your quaternion, and compare it with the get_rot_from_quaternion implementation? One thing to note: if ARKit's representation is actually a 4-vector axis-angle representation, the normalization that happens here [1] is probably not correct (and the conversion is probably not correct either):
[1] https://github.com/simonfuhrmann/mve/blob/master/libs/mve/bundle_io.cc#L53-L54
from mve.
Nice work. I'd be curious to see a fusion of all the depth maps with fssr.
Hi. I'm not entirely sure what causes this. If you are certain that the issue has to do with the rotation values, then maybe ARKit uses a different notation/convention for the quaternion, and get_rot_from_quaternion cannot be used without modification to match their quaternion conventions.
There are quite a few ways to express rotations, unfortunately. Some notations (axis-angle) use a normalized axis and a rotation value (in degrees or radians) around that axis (4 values in total). Others encode the rotation value around the axis as the length of the axis itself (3 values in total). So a few things you could check: whether the axis is properly normalized; whether you're using the correct angle encoding (degrees vs. radians); whether it's rotating the right way; whether the fourth dimension of the quaternion is at the correct place (some encode the magnitude in position 0 with the axis in 1-3; others encode the axis in 0-2 with the magnitude in 3). Just some ideas...
Thank you for the quick response!
Indeed, I had tried different ways to convert from the quaternion, but none gave satisfactory results.
However, following your idea, I re-checked the ARKit API: the camera extrinsic parameters are actually accessible as a matrix, so I extracted that and used it directly in MVE (I just had to invert some axes to match the expected convention) and it works much better!
So thank you, the position/rotation seems good: I can move around and the points overlap for the most part.
Unfortunately, I'm still getting problems:
- On datasets where the camera did not rotate (only translation), the result is pretty good and the mesh is clean; I just get some kind of spherical distortion. Just to show what I'm talking about, here is a comparison between the mesh produced by MVE and the one produced by ARKit; the edge at the bottom is supposed to be a straight wall. I suspected the distortion parameter, but changing its value (to a fixed average value) in the CameraInfo seems to have no effect. Should I undistort the image as well as changing the parameter? I thought scene2pset only used the images to colorize the point set. Or could it just be a side effect of only having points of view in one direction (the camera direction vectors are quasi-parallel across views)?
- I'm having even more problematic results in scenes with rotation: although the scene is coherent, lots of artifacts appear. Again, just to show what I'm talking about, on the left is the result from the dataset with no rotation (distorted but clean); on the right I moved freely around the couch (much messier).
Since translation alone seems fine, I suspect other camera parameters, but I don't see where they could be wrong (apart from the distortion parameter):
mve::CameraInfo cameraInfo;
// focal is the pixel focal length, normalize it using the largest side of the image
cameraInfo.flen = rgbWidth > rgbHeight ? info.focalX / (float) rgbWidth : info.focalY / (float) rgbHeight;
// principalPointOffset is the offset from the top-left corner of the image frame, in pixels
cameraInfo.ppoint[0] = info.principalPointOffsetX / static_cast<float>(rgbWidth);
cameraInfo.ppoint[1] = info.principalPointOffsetY / static_cast<float>(rgbHeight);
// pixels are square on iPhone and iPad, but compute the aspect ratio properly anyway
cameraInfo.paspect = info.focalX / info.focalY;
// use a fixed value for now, obtained by averaging the lensDistortionCenter values
// in a photo session, since ARKit does not provide proper lens distortion parameters
// lensDistortionCenter is the offset of the distortion center of the camera lens
// from the top-left corner of the image
cameraInfo.dist[0] = 948.0f / static_cast<float>(rgbWidth);
cameraInfo.dist[1] = 720.0f / static_cast<float>(rgbHeight);
I'm sorry for the wall of text! It's just on the off chance you have an idea; it would be a huge help.
Okay, for my first point: I've found the depthmap_convert_conventions method, which solved the distortion problem!
Now for the second point: I think it's actually due to ARKit's positions and rotations not being stable enough throughout the AR session, so the recorded poses are not fully coherent. It just doesn't show up in the "translation only" dataset because SLAM algorithms are more sensitive to rotation.
Sorry for the late response. Yes, there are two ways to represent depth maps, using "depth" or "range" values. I'm glad you figured that one out. Regarding the remaining issue with the alignment, I am not sure if I have enough context to be of help. Let me throw in some ideas.
If you receive depth maps from ARKit, then radial distortion parameters of the camera model cannot be the issue; all depth maps are assumed to be undistorted. If the color image is used for coloring the depth maps, they also have to be undistorted the same way. Otherwise, color won't align with the geometry.
Does ARKit do some sort of depth map alignment that it doesn't roll into the exported rotation? If that's the case, it could well explain the misalignment. And if ARKit is used multiple times (you mentioned "sessions" above), why/how would ARKit guarantee that multiple sessions have consistent geometry?
Hi,
Sorry for not answering earlier myself. As you stated, radial distortion was not the problem, and ARKit does the depth map/RGB alignment for us. So it was actually just a matter of anchoring the camera poses throughout the session and saving them at the end. I've achieved a pretty satisfactory result for now!
I haven't yet tried running multiple sessions, but I think that would only depend on ARKit's ability to stay consistent across persistent sessions; MVE should not see any difference between a single-session scan and a multi-session one.
Anyway, thank you very much for your help and thank you for your amazing work on this library!
It's not easy to show a 3D result in 2D, but here is a GIF of a quick scan (27 seconds, 53 frames):
I'm not sure it's the best example, since I think the best use case is scanning larger surfaces rather than small objects (the LiDAR resolution is rather low); however, it performs pretty well.