andyzeng / tsdf-fusion-python
Python code to fuse multiple RGB-D images into a TSDF voxel volume.
Home Page: http://andyzeng.github.io/
License: BSD 2-Clause "Simplified" License
Has anyone tried this code in a ROS setting? I'm stuck on getting the transformation matrix, and more specifically the rotation matrix. I am simulating the setup in Gazebo.
ROS uses a coordinate system where
X : forward
Y : left
Z : up
I am obtaining the pose matrices of 'world' to 'camera_depth_optical_link', since that is supposed to have the right transformations. For instance, one of the transformations is:
[[-7.10359030e-01 -5.65690465e-01 4.18789143e-01 5.30712575e-02]
[-7.03839505e-01 5.70930343e-01 -4.22668304e-01 2.73008553e-01]
[-3.93540756e-12 -5.95006589e-01 -8.03720821e-01 4.58332924e-01]
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00]]
For better understanding, the rotation matrix in degrees corresponds to :
[-1.43486898e+02 2.25488605e-10 -1.35264135e+02]
I am merely rotating my camera about a vertical pole (so the third element changes constantly). What must I do to get the correct transformation matrices? I have referred to issue #11 and understand that this transformation is what I need to correct; however, despite several attempts, I am unable to get the right pose. Any solution would be appreciated!
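For anyone debugging the same thing, here is a minimal sketch (assumptions: fusion.py wants camera-to-world poses, a tf listener is already set up, and the frame names are placeholders) of pulling a 4x4 pose out of ROS tf:

import numpy as np
import rospy
import tf.transformations as tft

# Sketch only: fusion.py appears to expect cam_pose as camera-to-world,
# so look up the optical frame expressed in the world frame, not the
# other way around (an inverted pose is a common source of bad fusions).
def pose_from_tf(listener, world_frame='world',
                 cam_frame='camera_depth_optical_frame'):
    trans, quat = listener.lookupTransform(world_frame, cam_frame,
                                           rospy.Time(0))
    cam_pose = tft.quaternion_matrix(quat)  # 4x4 with rotation block set
    cam_pose[:3, 3] = trans                 # fill in the translation
    return cam_pose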
Hi there,
I want to make pose.txt files with my own data, but I don't know how this 4x4 matrix is generated. Some people have said that it is the pose of each frame relative to a specific key frame.
https://github.com/intel-isl/Open3D/blob/master/examples/Python/ReconstructionSystem/make_fragments.py
I used the method discussed in previous issues, but I don't know whether the trans / odo_init it computes is what we want.
The result is bad even with the original dataset:
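In case it helps, a minimal sketch of what I believe the pose files contain (an assumption based on how demo.py uses them): one camera-to-world 4x4 matrix per frame, which can be chained from per-pair odometry. relative_transforms below is a hypothetical list of frame-to-frame 4x4 estimates:

import numpy as np

# Sketch under the assumption that each T_rel maps points in frame i+1's
# coordinates to frame i's; if your odometry returns the opposite
# direction, invert each T_rel first. Frame 0 is taken as the world origin.
poses = [np.eye(4)]
for T_rel in relative_transforms:
    poses.append(poses[-1] @ T_rel)

for i, T in enumerate(poses):
    np.savetxt('frame-%06d.pose.txt' % i, T)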
Hi,
I found that if you move the np.meshgrid operation into the __init__() function, the speed in CPU mode goes up to almost 5 FPS.
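A sketch of the suggested change (the variable names echo this repo's fusion.py, so treat the exact attributes as assumptions): compute the voxel coordinates once, e.g. in TSDFVolume.__init__, and reuse them in every integrate() call instead of recreating them per frame.

import numpy as np

def precompute_vox_coords(vol_dim):
    # Build the (N, 3) voxel coordinate list a single time.
    xv, yv, zv = np.meshgrid(
        range(vol_dim[0]), range(vol_dim[1]), range(vol_dim[2]),
        indexing='ij')
    return np.concatenate([
        xv.reshape(1, -1),
        yv.reshape(1, -1),
        zv.reshape(1, -1)], axis=0).astype(int).T  # shape (N, 3)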
Hello there,
Can someone tell me how to obtain the RGB-D frames, depth frames, and camera poses?
Hello,
If I wanted to start with a fresh set of images and use this code to generate a mesh.ply file, what steps do I need to take to get the new images ready? I'm new to this area of study, so any help getting going is appreciated!!
Thanks!
Thank you very much for your nice repo.
How can I increase the resolution?
According to your documentation:
"from the 7-scenes dataset into a 405 x 264 x 289 projective TSDF voxel volume with 2cm resolution at about 30 FPS in GPU mode"
Thanks in advance
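For reference, the resolution is controlled by the voxel_size argument passed to the volume in demo.py; a smaller value gives a finer grid at the cost of memory and runtime:

# Hedged example: 1 cm voxels instead of the 2 cm used in demo.py.
tsdf_vol = fusion.TSDFVolume(vol_bnds, voxel_size=0.01)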
I tried to apply this TSDF fusion to the ROBI dataset, but I am having trouble with the alignment of the frames during integration. The plane alignment seems right, but I still have some offset within the plane.
It looks like the camera pose is not right, but I already used the same dataset/camera pose with TSDF from Open3D which worked flawlessly.
Does anyone have an idea, why it's not working correctly?
Hello,
The script works absolutely fine in CPU mode, but it's very slow. I was trying to debug the code and got stuck on this issue.
I have installed all the required packages and also added the two lines below to my .bashrc file:
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Below is the error log from the console:
Connected to pydev debugger (build 191.7479.30)
Estimating voxel volume bounds...
Initializing voxel volume...
Backend TkAgg is interactive backend. Turning interactive mode on.
Voxel volume size: 405 x 264 x 289
Traceback (most recent call last):
File "/home/mhv7rng/pycharm-community-2019.1.3/helpers/pydev/pydevd.py", line 1758, in
main()
File "/home/mhv7rng/pycharm-community-2019.1.3/helpers/pydev/pydevd.py", line 1752, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/mhv7rng/pycharm-community-2019.1.3/helpers/pydev/pydevd.py", line 1147, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/mhv7rng/PycharmProjects/tsdf_fusion/demo.py", line 30, in
tsdf_vol = fusion.TSDFVolume(vol_bnds,voxel_size=0.02)
File "/home/mhv7rng/PycharmProjects/tsdf_fusion/fusion.py", line 133, in init
}""")
File "/home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/pycuda/compiler.py", line 291, in init
arch, code, cache_dir, include_dirs)
File "/home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/pycuda/compiler.py", line 254, in compile
return compile_plain(source, options, keep, nvcc, cache_dir, target)
File "/home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/pycuda/compiler.py", line 84, in compile_plain
checksum.update(get_nvcc_version(nvcc).encode("utf-8"))
File "</home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/decorator.pyc:decorator-gen-115>", line 2, in get_nvcc_version
File "/home/mhv7rng/.local/lib/python2.7/site-packages/pytools/init.py", line 539, in _deco
result = func(*args)
File "/home/mhv7rng/anaconda2/envs/don/lib/python2.7/site-packages/pycuda/compiler.py", line 16, in get_nvcc_version
result, stdout, stderr = call_capture_output(cmdline)
File "/home/mhv7rng/.local/lib/python2.7/site-packages/pytools/prefork.py", line 227, in call_capture_output
return forker.call_capture_output(cmdline, cwd, error_on_nonzero)
File "/home/mhv7rng/.local/lib/python2.7/site-packages/pytools/prefork.py", line 61, in call_capture_output
% (" ".join(cmdline), e))
pytools.prefork.ExecError: error invoking 'nvcc --version': [Errno 2] No such file or directory
Can you advise on how to solve this issue?
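A hedged note for anyone hitting the same trace: pycuda shells out to nvcc, so the compiler has to be on the PATH of the process that actually runs the script. An IDE debugger launched from a desktop session often does not source ~/.bashrc, so running which nvcc and nvcc --version from a terminal started the same way as the IDE is a quick way to confirm whether that is the problem.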
Hi, is there any live visualization for it, like the one the small movie shows?
When I run my own dataset, I get problems that I don't understand. What should I do?
Initializing voxel volume...
Voxel volume size: 62 x 39 x 54 - # points: 130,572
Fusing frame 1/228
Traceback (most recent call last):
File "/snap/pycharm-professional/306/plugins/python/helpers/pydev/pydevd.py", line 1496, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/snap/pycharm-professional/306/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/wang/tsdf-fusion-python-master/demo.py", line 55, in <module>
tsdf_vol.integrate(color_image, depth_im, cam_intr, cam_pose, obs_weight=1.)
File "/home/wang/tsdf-fusion-python-master/fusion.py", line 218, in integrate
im_h, im_w = depth_im.shape
ValueError: too many values to unpack (expected 2)
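A hedged guess at the cause: integrate() unpacks depth_im.shape into two values, so this error suggests the depth image was loaded with three channels. Loading it unchanged keeps the single 16-bit channel that fusion.py expects (the depth unit below is an assumption):

import cv2

# Preserve the raw single-channel depth instead of decoding to BGR.
depth_im = cv2.imread('frame-000000.depth.png', cv2.IMREAD_UNCHANGED)
depth_im = depth_im.astype(float) / 1000.0  # assuming depth is stored in millimeters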
There is a double-sided effect in the extracted mesh, as well as in the point cloud, in the demo code. The other side is not correct; how can this be fixed? Thank you.
I visualized the depth map and it showed something like below:
Does anybody know why it is different from common depth maps, such as those from a Kinect, with all these waves on top of the view?
Thanks.
From the KinectFusion paper, raw depth images alone are enough, so I don't know what the RGB images here are used for.
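A hedged answer from reading fusion.py: the RGB frames do not influence the geometry at all; they appear to be fused into a parallel color volume only so that the extracted mesh and point cloud can be colored.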
Thanks for the great code!
I was wondering if I could crop out relevant parts of the image and then run TSDF on that. If I do that, would I have to change the camera intrinsics?
I will give an example scenario. Suppose that in the example sequence I wish to reconstruct only the table, and I have masked out only the table from the sequence. I would then crop the image to contain only the table (and, if necessary, resize it back to the original image dimensions). Is this possible? In this case, what changes would I have to make to the camera intrinsics and the pose?
Any link to theory would also be great!
Thanks
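A hedged sketch for the cropping question above, from standard pinhole-camera geometry (the function name is mine): cropping shifts the principal point, resizing scales the pixel-unit entries of K, and the pose is unaffected by both operations.

import numpy as np

def adjust_intrinsics(K, crop_x0, crop_y0, scale_x=1.0, scale_y=1.0):
    K = K.astype(float).copy()
    K[0, 2] -= crop_x0   # cx shifts by the left crop offset
    K[1, 2] -= crop_y0   # cy shifts by the top crop offset
    K[0, :] *= scale_x   # fx and cx scale with a horizontal resize
    K[1, :] *= scale_y   # fy and cy scale with a vertical resize
    return K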
Hi there,
Thanks for putting this work in public.
My question may sound silly, but do I need the camera poses to be able to use this repository? That's my impression from going through your code.
What I have is just a bunch of RGB-D images, and I would like to fuse them together to get the extended map.
Regards,
Jacob
Hello, there is an error in the compressed package: the downloaded archive fails to decompress, and there is no data in it.
Hello! I'm trying to modify the demo to get data from a folder instead of your data, but I'm getting this error:
Initializing voxel volume...
Voxel volume size: 5770 x 4123 x 3752
Traceback (most recent call last):
File "getfusion.py", line 48, in
tsdf_vol = fusion.TSDFVolume(vol_bnds,voxel_size=0.02)
File "/home/anigomez/Documents/DepthWork/tsdf-fusion-python/fusion.py", line 34, in init
self._tsdf_vol_cpu = np.ones(self._vol_dim).astype(np.float32)
File "/home/anigomez/.local/lib/python2.7/site-packages/numpy/core/numeric.py", line 223, in ones
a = empty(shape, dtype, order)
MemoryError
I left the voxel parameters the same as before. Can you please tell me if I need to calculate the voxel bounds and, if possible, how to do it?
here is my modified demo.py:
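Separately, for reference, a sketch of the bounds computation demo.py performs (the loader names are placeholders; get_view_frustum is from this repo's fusion.py): union the view frustums of all depth frames so the volume only covers observed space. A volume as large as 5770 x 4123 x 3752 usually means the depth is in the wrong units (e.g. millimeters instead of meters) or the poses scatter the frames far apart.

import numpy as np
import fusion  # this repo's fusion.py

vol_bnds = np.zeros((3, 2))
for i in range(n_imgs):            # n_imgs, load_depth, load_pose are placeholders
    depth_im = load_depth(i)       # float meters, invalid pixels set to 0
    cam_pose = load_pose(i)        # 4x4 camera-to-world
    view_frust_pts = fusion.get_view_frustum(depth_im, cam_intr, cam_pose)
    vol_bnds[:, 0] = np.minimum(vol_bnds[:, 0], np.amin(view_frust_pts, axis=1))
    vol_bnds[:, 1] = np.maximum(vol_bnds[:, 1], np.amax(view_frust_pts, axis=1))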
How to visualize in real time during the reconstruction process, like the example "/images/fusion-movie-gif"?
Thanks for your good repo.
Would you provide the notation of your 4x4 rigid transformation matrices? It would be easier to understand what they are. Or any other link?
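A hedged answer based on how fusion.py uses cam_pose: each matrix appears to be a camera-to-world rigid transform, laid out as [R t; 0 0 0 1], so a camera-frame point p_c maps to world coordinates as p_w = R p_c + t; fusion.py seems to invert it internally when projecting voxels into each image.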
Hi fellas, I can run demo.py with the data provided here, but I run into trouble when trying it with custom data. I use a RealSense L515 with ROS to capture RGB and depth images and ORB_SLAM to get poses. If I use demo.py without any change, it throws a MemoryError at TSDFVolume initialization. But if I use bigger values here, it generates a messy, nonsensical mesh which is pretty much useless.
So I would really be interested to hear how you use this repo (or C++ version) with your own dataset.
By the way, I can use my data with BundleFusion without issue, even though it may run into an error after processing a few hundred images.
Hi @andyzeng, I am studying the code and noticed a comment that was incomplete. In fusion.py, the function integrate(self, color_im, depth_im, cam_intr, cam_pose, obs_weight=1.) has a description for obs_weight that says:
obs_weight (float): The weight to assign for the current observation. A higher value
I was wondering what was the complete description? Thanks in advance!
python demo.py
Traceback (most recent call last):
File "demo.py", line 9, in
import fusion
File "tsdf-fusion-python/fusion.py", line 346
xyz_t_h = (transform @ xyz_h.T).T
^
SyntaxError: invalid syntax
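A hedged note on this one: the @ matrix-multiplication operator was added in Python 3.5 (PEP 465), so this SyntaxError is the usual symptom of running the script with Python 2. Launching it as python3 demo.py, with the dependencies installed for Python 3, should get past it.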
Hello Andy,
thank you for your wonderful contribution.
I tried a different dataset and modified the voxel_size to 0.1 to avoid memory problems,
but I get this error:
ValueError: Surface level must be within volume data range.
from the function
measure.marching_cubes_lewiner(tsdf_vol, level=0)
at
tsdf_vol.get_point_cloud()
thank you for your time
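A hedged pointer for this error: marching cubes requires the level (0) to lie inside the volume's value range, so it typically means no voxel was ever updated (bad poses, wrong depth units, or a volume that misses the scene entirely). A quick check, using get_volume() from this repo's fusion.py:

# tsdf_vol here is the TSDFVolume object from the demo script.
tsdf_vol_arr, color_vol = tsdf_vol.get_volume()
print(tsdf_vol_arr.min(), tsdf_vol_arr.max())  # 0 must lie between these two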
Hi there,
shouldn't the tsdf_vol be initialized with -1, such that voxels outside the view frustum or behind the camera are regarded as 'unobserved'?
self._tsdf_vol_cpu = -1 * np.ones(self._vol_dim).astype(np.float32)
Depending on the context, a zero depth_value might still be best regarded as observed:
if (depth_value == 0)
{
tsdf_vol[voxel_idx] = 1;
return;
}
Hey @andyzeng,
First off thanks for putting this code up here, it's been a really handy reference for getting my head around TSDFs.
I was hoping to use this code in a project I'm working on, but I noticed that you haven't specified a license for this repo. If you didn't intend for this to be open source then no problem, that's your prerogative, but if you did, any chance you could add a Licence.md? I'm sure you've been through this kind of thing before, but just in case, I found this to be a handy reference when I was choosing a license for a few of my repos: https://choosealicense.com/
Anyway, thanks again!
Why are my numbers so large? The dataset is not large, so what causes this and makes the program crash?
Is it possible to use non-RGB images, like black-and-white, or Alpha8 images coming from a UVC camera?
I used Blender to obtain RGB-D images and the camera's intrinsic and extrinsic parameters, but the reconstruction results were incorrect: the reconstructions of the individual images are separated and do not overlap well. The following figure shows the reconstruction results for three groups of data.
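A hedged guess at the usual Blender pitfall: Blender cameras look down their local -Z axis with +Y up, while this repo (like OpenCV) assumes +Z forward and +Y down, so the extrinsics need a 180-degree flip about the camera's X axis. blender_cam_matrix below stands for the camera's matrix_world as a 4x4 numpy array:

import numpy as np

flip = np.diag([1.0, -1.0, -1.0, 1.0])  # negate the camera's Y and Z axes
cam_pose = blender_cam_matrix @ flip    # camera-to-world in OpenCV convention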
I'm currently using the Realsense D415 camera to gather RGBD data of an object, specifically a bottle. This camera is attached to the end-effector of a 6DOF robot. I use the robot to gather real-world pose coordinates of the camera. I use this kinematics data as the pose of the camera for both translation and rotation. This program works if I perform translation across one axis. However, it doesn't work when I perform a rotation.
Reading the papers linked in the README, I can't seem to figure out if their pose data is relative to the world's origin or if they're absolute physical locations, such as ones received from a robot's kinematics data. Do you know if I should perform any additional transformations on my pose data, similar to how the C++ programs calculate the inverse of the base frame and multiply that with each subsequent frame?
The first image is with the camera just performing a translation across one axis. The second image is the camera performing a translation across the same axis, but also performing a 0 to 360 degree rotation along the way.
EDIT: By the way, I have double-checked the camera intrinsics, and have used two different realsense depth cameras and I'm encountering similar issues in both. Also, I have verified the positions of the end-effector (which is carrying the camera) are correct.
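For reference, a sketch of the base-frame transformation mentioned above (the same thing the C++ version appears to do): express every pose relative to the first frame so that frame 0 becomes the world origin. poses is a hypothetical list of 4x4 camera-to-world matrices from the robot's kinematics:

import numpy as np

base_inv = np.linalg.inv(poses[0])
rel_poses = [base_inv @ T for T in poses]  # frame 0 becomes the identity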
Hi, thanks for the implement.
It seems the point cloud generated has double layers for every surface. There is some kind of thickness related to _trunc_margin; setting self._trunc_margin = self._voxel_size leads to many holes.
How can I get a single-layer mesh without thickness?
The marching cubes algorithm generates fine but irregular faces.
To get smooth and regular faces, some more post-processing work may be needed. I will try Poisson surface reconstruction later.
Hi,
first of all, thanks for this comprehensive demonstration of TSDF!
I am struggling a bit to understand the pose.txt files. Could you explain the transformation matrices from the *.pose.txt files in greater detail?
My assumption is that these 4x4 matrices correspond to the viewpoint transformation, so they represent the transformation from world coordinates to camera view coordinates for each frame?
If this is the case, I would have another question:
I have a set of frames and corresponding depth maps as png files. For each frame I estimated the camera pose using COLMAP, which uses SfM to calculate the correspondence points and estimate the camera position.
The output of the COLMAP reconstruction is the following:
The reconstructed pose of an image is specified as the projection from world to the camera coordinate system of an image using a quaternion (QW, QX, QY, QZ) and a translation vector (TX, TY, TZ). The quaternion is defined using the Hamilton convention, which is, for example, also used by the Eigen library. The coordinates of the projection/camera center are given by -R^t * T, where R^t is the inverse/transpose of the 3x3 rotation matrix composed from the quaternion and T is the translation vector. The local camera coordinate system of an image is defined in a way that the X axis points to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image. (source)
So by extracting the 3x3 rotation matrix from the quaternion and concatenating it with the translation vector (TX, TY, TZ) I should get the desired 4x4 matrix, correct?
Maybe I am misinterpreting something, because unfortunately my reconstructed results do not look reasonable.
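A hedged sketch of the conversion described above. Note that fusion.py appears to consume camera-to-world poses, while COLMAP's (R, T) is world-to-camera, so simply concatenating R and T gives the inverse of what is needed. Also, SciPy expects quaternions ordered (x, y, z, w) whereas COLMAP stores (w, x, y, z):

import numpy as np
from scipy.spatial.transform import Rotation

def colmap_to_cam_pose(qw, qx, qy, qz, tx, ty, tz):
    world2cam = np.eye(4)
    world2cam[:3, :3] = Rotation.from_quat([qx, qy, qz, qw]).as_matrix()
    world2cam[:3, 3] = [tx, ty, tz]
    return np.linalg.inv(world2cam)  # camera-to-world, as the pose files use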
Hi,
I got the following error while running demo.py:
Traceback (most recent call last):
File "demo.py", line 52, in <module>
verts,faces,norms,colors = tsdf_vol.get_mesh()
File "/home/anindya/tsdf-fusion-python/fusion.py", line 262, in get_mesh
verts,faces,norms,vals = measure.marching_cubes_lewiner(tsdf_vol,level=0)
AttributeError: 'module' object has no attribute 'marching_cubes_lewiner'
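A hedged fix: marching_cubes_lewiner was deprecated and then removed in scikit-image 0.19, where the Lewiner algorithm became the default of measure.marching_cubes, so on recent versions the call in fusion.py can be changed to:

from skimage import measure

# Same algorithm, current scikit-image API.
verts, faces, norms, vals = measure.marching_cubes(tsdf_vol, level=0)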