andyzeng / tsdf-fusion-python
Python code to fuse multiple RGB-D images into a TSDF voxel volume.
Home Page: http://andyzeng.github.io/
License: BSD 2-Clause "Simplified" License
Has anyone tried this code in a ROS setting? I'm stuck on getting the transformation matrix, and more specifically the rotation matrix. I am simulating the setup in Gazebo.
ROS uses a coordinate system where
X : forward
Y : left
Z : up
I am obtaining the pose matrices of 'world' to 'camera_depth_optical_link', since that is supposed to have the right transformations. For instance, one of the transformations is:
[[-7.10359030e-01 -5.65690465e-01 4.18789143e-01 5.30712575e-02]
[-7.03839505e-01 5.70930343e-01 -4.22668304e-01 2.73008553e-01]
[-3.93540756e-12 -5.95006589e-01 -8.03720821e-01 4.58332924e-01]
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00]]
For better understanding, the rotation matrix in degrees corresponds to :
[-1.43486898e+02 2.25488605e-10 -1.35264135e+02]
I am merely rotating my camera about a vertical pole (so the third element changes constantly). What must I do to get the correct transformation matrices? I have referred to issue #11 and understand that this transformation is what I need to correct; however, despite several attempts, I am unable to get the right pose. Any solution would be appreciated!
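For anyone debugging the same thing, here is a minimal sketch (assumptions: fusion.py wants camera-to-world poses, a tf listener is already set up, and the frame names are placeholders) of pulling a 4x4 pose out of ROS tf:

import numpy as np
import rospy
import tf.transformations as tft

# Sketch only: fusion.py appears to expect cam_pose as camera-to-world,
# so look up the optical frame expressed in the world frame, not the
# other way around (an inverted pose is a common source of bad fusions).
def pose_from_tf(listener, world_frame='world',
                 cam_frame='camera_depth_optical_frame'):
    trans, quat = listener.lookupTransform(world_frame, cam_frame,
                                           rospy.Time(0))
    cam_pose = tft.quaternion_matrix(quat)  # 4x4 with rotation block set
    cam_pose[:3, 3] = trans                 # fill in the translation
    return cam_pose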
Hi there,
I want to make pose.txt files with my own data, but I don't know how this 4x4 matrix is generated. Some people have said that it is the pose of each frame relative to a specific key frame.
https://github.com/intel-isl/Open3D/blob/master/examples/Python/ReconstructionSystem/make_fragments.py
I used the method discussed in previous issues, but I don't know whether the trans / odo_init it computes is what we want.
The result is bad even with the original dataset:
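In case it helps, a minimal sketch of what I believe the pose files contain (an assumption based on how demo.py uses them): one camera-to-world 4x4 matrix per frame, which can be chained from per-pair odometry. relative_transforms below is a hypothetical list of frame-to-frame 4x4 estimates:

import numpy as np

# Sketch under the assumption that each T_rel maps points in frame i+1's
# coordinates to frame i's; if your odometry returns the opposite
# direction, invert each T_rel first. Frame 0 is taken as the world origin.
poses = [np.eye(4)]
for T_rel in relative_transforms:
    poses.append(poses[-1] @ T_rel)

for i, T in enumerate(poses):
    np.savetxt('frame-%06d.pose.txt' % i, T)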
Hi,
I found that if you move the np.meshgrid operation into the __init__() function, the speed in CPU mode goes up to almost 5 FPS.
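A sketch of the suggested change (the variable names echo this repo's fusion.py, so treat the exact attributes as assumptions): compute the voxel coordinates once, e.g. in TSDFVolume.__init__, and reuse them in every integrate() call instead of recreating them per frame.

import numpy as np

def precompute_vox_coords(vol_dim):
    # Build the (N, 3) voxel coordinate list a single time.
    xv, yv, zv = np.meshgrid(
        range(vol_dim[0]), range(vol_dim[1]), range(vol_dim[2]),
        indexing='ij')
    return np.concatenate([
        xv.reshape(1, -1),
        yv.reshape(1, -1),
        zv.reshape(1, -1)], axis=0).astype(int).T  # shape (N, 3)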
Hello there,
Can someone tell me how to obtain the RGB-D frames, depth frames, and camera poses?
Hello,
If I wanted to start with a fresh set of images and use this code to generate a mesh.ply file, what steps do I need to take to get the new images ready? I'm new to this area of study, so any help getting going is appreciated!!
Thanks!
Thank you very much for your nice repo.
How can I increase the resolution?
According to your documentation:
"from the 7-scenes dataset into a 405 x 264 x 289 projective TSDF voxel volume with 2cm resolution at about 30 FPS in GPU mode"
Thanks in advance
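For reference, the resolution is controlled by the voxel_size argument passed to the volume in demo.py; a smaller value gives a finer grid at the cost of memory and runtime:

# Hedged example: 1 cm voxels instead of the 2 cm used in demo.py.
tsdf_vol = fusion.TSDFVolume(vol_bnds, voxel_size=0.01)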
I tried to apply this TSDF fusion to the ROBI dataset, but I am having trouble with the alignment of the frames during integration. The plane alignment seems right, but I still have some offset within the plane.
It looks like the camera pose is not right, but I already used the same dataset/camera pose with TSDF from Open3D which worked flawlessly.
Does anyone have an idea, why it's not working correctly?
Hello,
The script works absolutely fine in CPU mode, but it's very slow. I was trying to debug the code and got stuck on this issue.
I have installed all the required packages and also added the two lines below to my .bashrc file:
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Below is the error log from the console:
Connected to pydev debugger (build 191.7479.30)
Estimating voxel volume bounds...
Initializing voxel volume...
Backend TkAgg is interactive backend. Turning interactive mode on.
Voxel volume size: 405 x 264 x 289
Traceback (most recent call last):
File "/home/mhv7rng/pycharm-community-2019.1.3/helpers/pydev/pydevd.py", line 1758, in
main()
File "/home/mhv7rng/pycharm-community-2019.1.3/helpers/pydev/pydevd.py", line 1752, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/mhv7rng/pycharm-community-2019.1.3/helpers/pydev/pydevd.py", line 1147, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/mhv7rng/PycharmProjects/tsdf_fusion/demo.py", line 30, in
tsdf_vol = fusion.TSDFVolume(vol_bnds,voxel_size=0.02)
File "/home/mhv7rng/PycharmProjects/tsdf_fusion/fusion.py", line 133, in init
}""")
File "/home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/pycuda/compiler.py", line 291, in init
arch, code, cache_dir, include_dirs)
File "/home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/pycuda/compiler.py", line 254, in compile
return compile_plain(source, options, keep, nvcc, cache_dir, target)
File "/home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/pycuda/compiler.py", line 84, in compile_plain
checksum.update(get_nvcc_version(nvcc).encode("utf-8"))
File "</home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/decorator.pyc:decorator-gen-115>", line 2, in get_nvcc_version
File "/home/mhv7rng/.local/lib/python2.7/site-packages/pytools/init.py", line 539, in _deco
result = func(*args)
File "/home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/pycuda/compiler.py", line 16, in get_nvcc_version
result, stdout, stderr = call_capture_output(cmdline)
File "/home/mhv7rng/.local/lib/python2.7/site-packages/pytools/prefork.py", line 227, in call_capture_output
return forker.call_capture_output(cmdline, cwd, error_on_nonzero)
File "/home/mhv7rng/.local/lib/python2.7/site-packages/pytools/prefork.py", line 61, in call_capture_output
% (" ".join(cmdline), e))
pytools.prefork.ExecError: error invoking 'nvcc --version': [Errno 2] No such file or directory
Can you advise on how to solve this issue?
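A hedged note for anyone hitting the same trace: pycuda shells out to nvcc, so the compiler has to be on the PATH of the process that actually runs the script. An IDE debugger launched from a desktop session often does not source ~/.bashrc, so running which nvcc and nvcc --version from a terminal started the same way as the IDE is a quick way to confirm whether that is the problem.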
Hi, is there any live visualization for it, like the one the small movie shows?
When I run my own dataset, I get problems that I don't understand. What should I do?
Initializing voxel volume...
Voxel volume size: 62 x 39 x 54 - # points: 130,572
Fusing frame 1/228
Traceback (most recent call last):
File "/snap/pycharm-professional/306/plugins/python/helpers/pydev/pydevd.py", line 1496, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/snap/pycharm-professional/306/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/wang/tsdf-fusion-python-master/demo.py", line 55, in <module>
tsdf_vol.integrate(color_image, depth_im, cam_intr, cam_pose, obs_weight=1.)
File "/home/wang/tsdf-fusion-python-master/fusion.py", line 218, in integrate
im_h, im_w = depth_im.shape
ValueError: too many values to unpack (expected 2)
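A hedged guess at the cause: integrate() unpacks depth_im.shape into two values, so this error suggests the depth image was loaded with three channels. Loading it unchanged keeps the single 16-bit channel that fusion.py expects (the depth unit below is an assumption):

import cv2

# Preserve the raw single-channel depth instead of decoding to BGR.
depth_im = cv2.imread('frame-000000.depth.png', cv2.IMREAD_UNCHANGED)
depth_im = depth_im.astype(float) / 1000.0  # assuming depth is stored in millimeters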
There is a double-sided effect in the extracted mesh, as well as in the point cloud, in the demo code. The other side is not correct; how can this be fixed? Thank you.
I visualized the depth map and it showed something like below:
Does anybody know why it is different from common depth maps, such as those from a Kinect, with all these waves on top of the view?
Thanks.
From the KinectFusion paper, raw depth images alone are enough, so I don't know what the RGB images here are used for.
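A hedged answer from reading fusion.py: the RGB frames do not influence the geometry at all; they appear to be fused into a parallel color volume only so that the extracted mesh and point cloud can be colored.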
Thanks for the great code!
I was wondering if I could crop out relevant parts of the image and then run TSDF on that. If I do that, would I have to change the camera intrinsics?
I will give an example scenario. Suppose that in the example sequence I wish to reconstruct only the table, and I have masked out only the table from the sequence. I would then crop the image to contain only the table (and, if necessary, resize it back to the original image dimensions). Is this possible? In this case, what changes would I have to make to the camera intrinsics and the pose?
Any link to theory would also be great!
Thanks
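A hedged sketch for the cropping question above, from standard pinhole-camera geometry (the function name is mine): cropping shifts the principal point, resizing scales the pixel-unit entries of K, and the pose is unaffected by both operations.

import numpy as np

def adjust_intrinsics(K, crop_x0, crop_y0, scale_x=1.0, scale_y=1.0):
    K = K.astype(float).copy()
    K[0, 2] -= crop_x0   # cx shifts by the left crop offset
    K[1, 2] -= crop_y0   # cy shifts by the top crop offset
    K[0, :] *= scale_x   # fx and cx scale with a horizontal resize
    K[1, :] *= scale_y   # fy and cy scale with a vertical resize
    return K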
Hi there,
Thanks for putting this work in public.
My question may sound silly, but do I need the camera poses to be able to use this repository? That's my impression from going through your code.
What I have is just a bunch of RGB-D images, and I would like to fuse them together to get the extended map.
Regards,
Jacob
Hello, there is an error in the compressed package: the downloaded archive fails to decompress, and there is no data in it.
Hello! I'm trying to modify the demo to get data from a folder instead of your data, but I'm getting this error:
Initializing voxel volume...
Voxel volume size: 5770 x 4123 x 3752
Traceback (most recent call last):
File "getfusion.py", line 48, in
tsdf_vol = fusion.TSDFVolume(vol_bnds,voxel_size=0.02)
File "/home/anigomez/Documents/DepthWork/tsdf-fusion-python/fusion.py", line 34, in init
self._tsdf_vol_cpu = np.ones(self._vol_dim).astype(np.float32)
File "/home/anigomez/.local/lib/python2.7/site-packages/numpy/core/numeric.py", line 223, in ones
a = empty(shape, dtype, order)
MemoryError
I left the voxel parameters the same as before. Can you please tell me if I need to calculate the voxel bounds and, if possible, how to do it?
here is my modified demo.py:
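Separately, for reference, a sketch of the bounds computation demo.py performs (the loader names are placeholders; get_view_frustum is from this repo's fusion.py): union the view frustums of all depth frames so the volume only covers observed space. A volume as large as 5770 x 4123 x 3752 usually means the depth is in the wrong units (e.g. millimeters instead of meters) or the poses scatter the frames far apart.

import numpy as np
import fusion  # this repo's fusion.py

vol_bnds = np.zeros((3, 2))
for i in range(n_imgs):            # n_imgs, load_depth, load_pose are placeholders
    depth_im = load_depth(i)       # float meters, invalid pixels set to 0
    cam_pose = load_pose(i)        # 4x4 camera-to-world
    view_frust_pts = fusion.get_view_frustum(depth_im, cam_intr, cam_pose)
    vol_bnds[:, 0] = np.minimum(vol_bnds[:, 0], np.amin(view_frust_pts, axis=1))
    vol_bnds[:, 1] = np.maximum(vol_bnds[:, 1], np.amax(view_frust_pts, axis=1))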
How to visualize in real time during the reconstruction process, like the example "/images/fusion-movie-gif"?
Thanks for your good repo.
Would you provide the notation of your 4x4 rigid transformation matrices? It would be easier to understand what they are. Or any other link?
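A hedged answer based on how fusion.py uses cam_pose: each matrix appears to be a camera-to-world rigid transform, laid out as [R t; 0 0 0 1], so a camera-frame point p_c maps to world coordinates as p_w = R p_c + t; fusion.py seems to invert it internally when projecting voxels into each image.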
Hi fellas, I can run demo.py with the data provided here, but I run into trouble when trying it with custom data. I use a RealSense L515 with ROS to capture RGB and depth images and ORB_SLAM to get poses. If I use demo.py without any change, it throws a MemoryError at TSDFVolume initialization. But if I use bigger values here, it generates a messy, nonsensical mesh which is pretty much useless.
So I would really be interested to hear how you use this repo (or C++ version) with your own dataset.
By the way, I can use my data with BundleFusion without issue, even though it may run into an error after processing a few hundred images.
Hi @andyzeng, I am studying the code and noticed a comment that was incomplete. In fusion.py, the function integrate(self, color_im, depth_im, cam_intr, cam_pose, obs_weight=1.) has a description for obs_weight that says:
obs_weight (float): The weight to assign for the current observation. A higher value
I was wondering what was the complete description? Thanks in advance!
python demo.py
Traceback (most recent call last):
File "demo.py", line 9, in
import fusion
File "tsdf-fusion-python/fusion.py", line 346
xyz_t_h = (transform @ xyz_h.T).T
^
SyntaxError: invalid syntax
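A hedged note on this one: the @ matrix-multiplication operator was added in Python 3.5 (PEP 465), so this SyntaxError is the usual symptom of running the script with Python 2. Launching it as python3 demo.py, with the dependencies installed for Python 3, should get past it.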
Hello Andy,
thank you for your wonderful contribution.
I tried a different dataset and modified the voxel_size to 0.1 to avoid memory problems,
but I get this error:
ValueError: Surface level must be within volume data range.
from the function
measure.marching_cubes_lewiner(tsdf_vol, level=0)
at
tsdf_vol.get_point_cloud()
thank you for your time
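A hedged pointer for this error: marching cubes requires the level (0) to lie inside the volume's value range, so it typically means no voxel was ever updated (bad poses, wrong depth units, or a volume that misses the scene entirely). A quick check, using get_volume() from this repo's fusion.py:

# tsdf_vol here is the TSDFVolume object from the demo script.
tsdf_vol_arr, color_vol = tsdf_vol.get_volume()
print(tsdf_vol_arr.min(), tsdf_vol_arr.max())  # 0 must lie between these two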
Hi there,
shouldn't the tsdf_vol be initialized with -1, such that voxels outside the view frustum or behind the camera are regarded as 'unobserved'?
self._tsdf_vol_cpu = -1 * np.ones(self._vol_dim).astype(np.float32)
Depending on the context, a zero depth_value might still be best regarded as observed:
if (depth_value == 0)
{
tsdf_vol[voxel_idx] = 1;
return;
}
Hey @andyzeng,
First off thanks for putting this code up here, it's been a really handy reference for getting my head around TSDFs.
I was hoping to use this code in a project I'm working on, but I noticed that you haven't specified a license for this repo. If you didn't intend for this to be open source then no problem, that's your prerogative, but if you did, any chance you could add a Licence.md? I'm sure you've been through this kind of thing before, but just in case, I found this to be a handy reference when I was choosing a license for a few of my repos: https://choosealicense.com/
Anyway, thanks again!
Why are my numbers so large? The dataset is not large, so what causes this and makes the program crash?
Is it possible to use non-RGB images, like black-and-white, or Alpha8 images coming from a UVC camera?
I used Blender to obtain RGB-D images and the camera's intrinsic and extrinsic parameters, but the reconstruction results were incorrect: the reconstructions of the individual images are separated and do not overlap well. The following figure shows the reconstruction results for three groups of data.
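A hedged guess at the usual Blender pitfall: Blender cameras look down their local -Z axis with +Y up, while this repo (like OpenCV) assumes +Z forward and +Y down, so the extrinsics need a 180-degree flip about the camera's X axis. blender_cam_matrix below stands for the camera's matrix_world as a 4x4 numpy array:

import numpy as np

flip = np.diag([1.0, -1.0, -1.0, 1.0])  # negate the camera's Y and Z axes
cam_pose = blender_cam_matrix @ flip    # camera-to-world in OpenCV convention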
I'm currently using the Realsense D415 camera to gather RGBD data of an object, specifically a bottle. This camera is attached to the end-effector of a 6DOF robot. I use the robot to gather real-world pose coordinates of the camera. I use this kinematics data as the pose of the camera for both translation and rotation. This program works if I perform translation across one axis. However, it doesn't work when I perform a rotation.
Reading the papers linked in the README, I can't seem to figure out if their pose data is relative to the world's origin or if they're absolute physical locations, such as ones received from a robot's kinematics data. Do you know if I should perform any additional transformations on my pose data, similar to how the C++ programs calculate the inverse of the base frame and multiply that with each subsequent frame?
The first image is with the camera just performing a translation across one axis. The second image is the camera performing a translation across the same axis, but also performing a 0 to 360 degree rotation along the way.
EDIT: By the way, I have double-checked the camera intrinsics, and have used two different realsense depth cameras and I'm encountering similar issues in both. Also, I have verified the positions of the end-effector (which is carrying the camera) are correct.
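For reference, a sketch of the base-frame transformation mentioned above (the same thing the C++ version appears to do): express every pose relative to the first frame so that frame 0 becomes the world origin. poses is a hypothetical list of 4x4 camera-to-world matrices from the robot's kinematics:

import numpy as np

base_inv = np.linalg.inv(poses[0])
rel_poses = [base_inv @ T for T in poses]  # frame 0 becomes the identity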
Hi, thanks for the implement.
It seems the point cloud generated has double layers for every surface. There is some kind of thickness related to _trunc_margin; setting self._trunc_margin = self._voxel_size leads to many holes.
How can I get a single-layer mesh without thickness?
The marching cubes algorithm generates fine but irregular faces.
To get smooth and regular faces, some more post-processing work may be needed. I will try Poisson surface reconstruction later.
Hi,
first of all, thanks for this comprehensive demonstration of TSDF!
I am struggling a bit to understand the pose.txt files. Could you explain the transformation matrices from the *.pose.txt files in greater detail?
My assumption is that these 4x4 matrices correspond to the viewpoint transformation, so they represent the transformation from world coordinates to camera view coordinates for each frame?
If this is the case, I would have another question:
I have a set of frames and corresponding depth maps as png files. For each frame I estimated the camera pose using COLMAP, which uses SfM to calculate the correspondence points and estimate the camera position.
The output of the COLMAP reconstruction is the following:
The reconstructed pose of an image is specified as the projection from world to the camera coordinate system of an image using a quaternion (QW, QX, QY, QZ) and a translation vector (TX, TY, TZ). The quaternion is defined using the Hamilton convention, which is, for example, also used by the Eigen library. The coordinates of the projection/camera center are given by -R^t * T, where R^t is the inverse/transpose of the 3x3 rotation matrix composed from the quaternion and T is the translation vector. The local camera coordinate system of an image is defined in a way that the X axis points to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image. (source)
So by extracting the 3x3 rotation matrix from the quaternion and concatenating it with the translation vector (TX, TY, TZ) I should get the desired 4x4 matrix, correct?
Maybe I am misinterpreting something, because unfortunately my reconstructed results do not look reasonable.
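A hedged sketch of the conversion described above. Note that fusion.py appears to consume camera-to-world poses, while COLMAP's (R, T) is world-to-camera, so simply concatenating R and T gives the inverse of what is needed. Also, SciPy expects quaternions ordered (x, y, z, w) whereas COLMAP stores (w, x, y, z):

import numpy as np
from scipy.spatial.transform import Rotation

def colmap_to_cam_pose(qw, qx, qy, qz, tx, ty, tz):
    world2cam = np.eye(4)
    world2cam[:3, :3] = Rotation.from_quat([qx, qy, qz, qw]).as_matrix()
    world2cam[:3, 3] = [tx, ty, tz]
    return np.linalg.inv(world2cam)  # camera-to-world, as the pose files use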
Hi,
I got the following error while running demo.py:
Traceback (most recent call last):
File "demo.py", line 52, in <module>
verts,faces,norms,colors = tsdf_vol.get_mesh()
File "/home/anindya/tsdf-fusion-python/fusion.py", line 262, in get_mesh
verts,faces,norms,vals = measure.marching_cubes_lewiner(tsdf_vol,level=0)
AttributeError: 'module' object has no attribute 'marching_cubes_lewiner'
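A hedged fix: marching_cubes_lewiner was deprecated and then removed in scikit-image 0.19, where the Lewiner algorithm became the default of measure.marching_cubes, so on recent versions the call in fusion.py can be changed to:

from skimage import measure

# Same algorithm, current scikit-image API.
verts, faces, norms, vals = measure.marching_cubes(tsdf_vol, level=0)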