
voxnet's Introduction

voxnet

3D/Volumetric Convolutional Neural Networks with Theano+Lasagne.

example rendering

Installation

voxnet is based on Theano and Lasagne.

You will also need path.py and scikit-learn. Scikit-learn is used purely for evaluation of accuracy and is an easily removable dependency.

You can do something like

git clone git@github.com:dimatura/voxnet.git
cd voxnet
pip install --editable .

ModelNet10 Example

Get data

In this example we will use the ModelNet10 dataset, from the excellent 3D ShapeNets project.

To make our life easier we will use the voxelized version, which is included in the source code distribution. Unfortunately, it comes in evil .mat files, so we will convert them to a more python-friendly data format first.

scripts/download_shapenet10.sh will try to download and convert the data for you. This may take a while.

# scripts/download_shapenet10.sh
wget http://3dshapenets.cs.princeton.edu/3DShapeNetsCode.zip 
unzip 3DShapeNetsCode
python convert_shapenet10.py 3DShapeNets

If you're curious, the data format is simply a tar file consisting of zlib-compressed .npy files. Simple and effective!
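If you want to poke at the converted data yourself, the tar can be walked with the standard library. Below is a minimal sketch, assuming each member of the archive is a zlib-compressed .npy array; the member naming scheme is not spelled out here, so just inspect member.name.

import io
import tarfile
import zlib

import numpy as np

with tarfile.open('shapenet10_train.tar') as tf:
    for member in tf.getmembers():
        if not member.isfile():
            continue
        raw = tf.extractfile(member).read()
        # Each member is assumed to be a zlib-compressed .npy payload.
        arr = np.load(io.BytesIO(zlib.decompress(raw)))
        print(member.name, arr.shape, arr.dtype)
        break  # just peek at the first entry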

Train model

We will be messy and do everything in the scripts/ directory.

cd scripts/
python train.py config/shapenet10.py shapenet10_train.tar

config/shapenet10.py stores the model architecture and training-related hyperparameters as Python code. train.py loads this code dynamically, compiles the Theano model, and begins training with the data from shapenet10_train.tar. Note that compiling the Theano model may take around a minute on the first execution. As soon as training begins, metrics are printed to stdout and the learned weights are periodically saved to weights.npz.
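Under the hood, train.py loads the config file as an ordinary Python module using the standard imp machinery (you can see the imp.load_source call in the tracebacks further down this page). A minimal sketch of the same pattern follows; the cfg and get_model attribute names are assumptions for illustration, so check the actual config file for what it defines.

import imp

# Execute config/shapenet10.py and expose everything defined at module
# level as attributes of the returned module object.
config_module = imp.load_source("config", "config/shapenet10.py")

# Hypothetical attribute names -- the real config may use different ones.
cfg = config_module.cfg
model = config_module.get_model(cfg)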

During training (which will take a few hours) you can monitor progress visually by running scripts/train_reports.py. Note that this script has a few extra dependencies, including seaborn and pandas. The script uses the training metrics stored in a file called metrics.jsonl, a simple format with one JSON record per line (inspired by JSON Lines).
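If you just want a quick look at the metrics without the report script, a .jsonl file is trivial to parse. A minimal sketch, with the caveat that the actual field names inside metrics.jsonl are not listed here, so inspect the columns yourself:

import json

import pandas as pd

# Each line of metrics.jsonl is an independent JSON record.
with open('metrics.jsonl') as f:
    records = [json.loads(line) for line in f]

df = pd.DataFrame(records)
print(df.columns)  # inspect the real field names
print(df.tail())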

Test model

python test.py config/shapenet10.py shapenet10_test.tar --out-fname out.npz

test.py uses the same model as train.py, but only for classifying instances from the test set. It performs a simple evaluation and optionally saves the predictions to an .npz file.

If you don't want to train your own, you can use the --weights option with shapenet10_weights.npz, an example result of running the training script. This file was committed with Git LFS, so you can fetch it with Git LFS or simply download the raw version from GitHub.
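The saved predictions can be inspected with NumPy. A minimal sketch, assuming out.npz contains arrays of true and predicted labels; the key names below are assumptions, so print arrays.files to see what test.py actually saved:

import numpy as np
from sklearn.metrics import accuracy_score

arrays = np.load('out.npz')
print(arrays.files)  # list the arrays test.py actually saved

# Hypothetical key names, for illustration only.
y_true = arrays['y_true']
y_pred = arrays['y_pred']
print('accuracy: %.3f' % accuracy_score(y_true, y_pred))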

Visualize

If you have CPU cycles to burn, try

python output_viz.py out.npz shapenet10_test.tar out.html

This will randomly select 10 instances from the test set, render them with a very very inefficient renderer, and create a small page called out.html with the renders, the ground truth and the predicted label (see example above). Requires gizeh.

Reference

@inproceedings{maturana_iros_2015,
    author = "Maturana, D. and Scherer, S.",
    title = "{VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition}",
    booktitle = "{IROS}",
    year = "2015",
    pdf = "/extra/voxnet_maturana_scherer_iros15.pdf",
}

TODO

  • Cleaning up
  • Testing
  • More options
  • Better visualization
  • Use new cudnn 3D capabilities

voxnet's People

Contributors

dimatura


voxnet's Issues

ValueError: Could not infer context from inputs

Unable to train; I get this error:

python train.py config/shapenet10.py shapenet10_train.tar
2018-01-29 13:34:37,459 INFO| Metrics will be saved to metrics.jsonl
2018-01-29 13:34:37,459 INFO| Compiling theano functions...
Traceback (most recent call last):
  File "train.py", line 180, in <module>
    main(args)
  File "train.py", line 133, in main
    tfuncs, tvars = make_training_functions(cfg, model)
  File "train.py", line 29, in make_training_functions
    out = lasagne.layers.get_output(l_out, X)
  File "/home/saurabh/anaconda2/lib/python2.7/site-packages/lasagne/layers/helper.py", line 197, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/home/saurabh/voxnet/voxnet/layers.py", line 218, in get_output_for
    contiguous_filters = gpu_contiguous(filters)
  File "/home/saurabh/anaconda2/lib/python2.7/site-packages/theano/gof/op.py", line 615, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/home/saurabh/anaconda2/lib/python2.7/site-packages/theano/gpuarray/basic_ops.py", line 1067, in make_node
    context_name=infer_context_name(input))
  File "/home/saurabh/anaconda2/lib/python2.7/site-packages/theano/gpuarray/basic_ops.py", line 127, in infer_context_name
    raise ValueError("Could not infer context from inputs")
ValueError: Could not infer context from inputs

Can't train voxnet - config file loading fails

Hi there,

whenever I try to train the network with the provided config file, I receive the following error:


Traceback (most recent call last):
  File "train.py", line 184, in <module>
    main(args)
  File "train.py", line 126, in main
    config_module = imp.load_source("config", args.config_path)
  File "C:\WinPython-32bit-3.4.4.5Qt5\python-3.4.4\lib\imp.py", line 171, in loa
d_source
    module = methods.load()
  File "<frozen importlib._bootstrap>", line 1220, in load
  File "<frozen importlib._bootstrap>", line 1200, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1129, in _exec
  File "<frozen importlib._bootstrap>", line 1467, in exec_module
  File "<frozen importlib._bootstrap>", line 1577, in get_code
  File "<frozen importlib._bootstrap>", line 672, in _code_to_bytecode
ValueError: unmarshallable object

Strange behaviour on the train and test accuracies

I'm running the test example you propose using:

python train.py config/shapenet10.py shapenet10_train.tar

I see that the training accuracy goes up to 100% as early as the 3rd iteration and then starts to oscillate between 0% and 100%, which is quite strange for a neural network (the loss seems coherent, but it oscillates a bit too). Then if I run the test (I have set 'checkpoint_every_nth' : 1) using:

python test.py config/shapenet10.py shapenet10_test.tar

I get a very low accuracy of less than 10% (at least for the first 4 epochs). Is this normal? I have not modified the code.

Thanks

Running Voxnet on CPU

Is it possible to run the code on CPU only? I have gone through the issues and have been able to fix most of the problems I was facing. The following is the command I am using to train the ShapeNet model:

THEANO_FLAGS='device=cpu,force_device=True,floatX=float32' python train.py config/shapenet10.py ../shapenet10_train.tar

I have CUDA installed in /usr/local/cuda, and the PATH and LD_LIBRARY_PATH variables have been updated to have /usr/local/cuda/bin and /usr/local/cuda/lib64 respectively.

When I run the command though, it keeps failing with:
2017-08-26 15:28:06,904 INFO| Metrics will be saved to metrics.jsonl
2017-08-26 15:28:06,904 INFO| Compiling theano functions...
Traceback (most recent call last):
  File "train.py", line 180, in <module>
    main(args)
  File "train.py", line 132, in main
    tfuncs, tvars = make_training_functions(cfg, model)
  File "train.py", line 28, in make_training_functions
    out = lasagne.layers.get_output(l_out, X)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/lasagne/layers/helper.py", line 185, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/home/ubuntu/sandboxes/voxnet/voxnet/layers.py", line 225, in get_output_for
    activation = conved + self.b.dimshuffle('x', 0, 'x', 'x', 'x')
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/tensor/var.py", line 128, in __add__
    return theano.tensor.basic.add(self, other)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/gof/op.py", line 507, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/tensor/elemwise.py", line 527, in make_node
    inputs = map(as_tensor_variable, inputs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/tensor/basic.py", line 145, in as_tensor_variable
    return x._as_TensorVariable()  # TODO: pass name and ndim arguments
  File "/home/ubuntu/.local/lib/python2.7/site-packages/theano/sandbox/cuda/var.py", line 30, in _as_TensorVariable
    return HostFromGpu()(self)
NameError: global name 'HostFromGpu' is not defined

In the above, the device is CPU and force_device is True. If I set force_device to false, then it fails to find any GPUs - probably because I did not follow the instructions to enable GPU usage at all. I am first trying to get it working on the CPU, since that seems like the simpler thing to do.

Can someone help?

No module named voxnet

When I try to run (from voxnet-master):
python ./scripts/convert_shapenet10.py 3DShapeNets, I get this error:
Traceback (most recent call last):
  File "./scripts/convert_shapenet10.py", line 9, in <module>
    import voxnet
ImportError: No module named voxnet

my directory structure:
voxnet-master:
├── 3DShapeNets
├── path.py
├── scripts
├── voxnet
├── doc

I installed sklearn and python-path from Synaptic. I have not used the pip install --editable method to install sklearn and python-path.

?

EnvironmentError: ('The following error happened while compiling the node', GpuFromHost(X), '\n', 'You forced the use of gpu device gpu, but CUDA initialization failed with error:\nUnable to get the number of gpus available: CUDA driver version is insufficient for CUDA runtime version')

Docs: Python 3 not supported

Might want to throw in a mention in the README that voxnet isn't Python3-ready. Thanks for creating and releasing voxnet, BTW.

Voxnet for object detection and segmentation

Hi, I'm a bit of a noob in the topic of voxel grid classification, so sorry in advance if the question is a bit dumb.

Would it be possible to use VoxNet as it is to segment a point cloud (or occupancy grid for that matter) so I can distinguish which voxels belong to the object I'm looking for? Is there any alternative project to achieve this goal?

Thanks in advance!

Training error 'module' object has no attribute 'downsample'

When I try to run train.py for training, I get the error 'module' object has no attribute 'downsample'.
I read that downsample from theano.tensor.signal has been moved to the pool module.
But which function in pool is equivalent to DownsampleFactorMax in downsample?

Please help ASAP.

Traceback (most recent call last):
  File "train.py", line 182, in <module>
    main(args)
  File "train.py", line 135, in main
    tfuncs, tvars = make_training_functions(cfg, model)
  File "train.py", line 31, in make_training_functions
    out = lasagne.layers.get_output(l_out, X)
  File "/usr/local/lib/python2.7/dist-packages/lasagne/layers/helper.py", line 185, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/home/nvidia/virginia/voxnet/voxnet/layers.py", line 244, in get_output_for
    out = max_pool_3d(input, self.pool_shape)
  File "/home/nvidia/virginia/voxnet/voxnet/max_pool_3d.py", line 72, in max_pool_3d
    op = T.signal.downsample.DownsampleFactorMax((ds[1], ds[2]), ignore_border)
AttributeError: 'module' object has no attribute 'downsample'
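For anyone hitting this: newer Theano releases replaced theano.tensor.signal.downsample with theano.tensor.signal.pool. Below is a minimal sketch of one possible patch around the failing line in voxnet/max_pool_3d.py, assuming a Theano version that provides pool_2d; the input_4d variable name is illustrative, and the surrounding reshape logic of max_pool_3d is left as-is.

# Old code (fails on newer Theano):
#   op = T.signal.downsample.DownsampleFactorMax((ds[1], ds[2]), ignore_border)
#   output = op(input_4d)

# Possible replacement: pool_2d max-pools over the last two dimensions of a
# 4D tensor, like the old DownsampleFactorMax op did.
from theano.tensor.signal.pool import pool_2d

output = pool_2d(input_4d, (ds[1], ds[2]), ignore_border=ignore_border, mode='max')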

TypeError: CudaNdarrayType only supports dtype float32 for now.

Hi

I'm just trying to run the example given in the readme file.

Everything works until I get to the point where ./scripts/train.py is called; I get this error when I try to train on the sample data.

$ python train.py config/shapenet10.py shapenet10_train.tar
...
TypeError: CudaNdarrayType only supports dtype float32 for now. Tried using dtype float64 for variable None

I'm working on a Mac with Python 2.7.11. I have theano and lasagne installed.

Am I missing something here?

Thanks in advance

density grid in your paper 'VoxNet'

Hello, Daniel Maturana:
I read your paper 'VoxNet', which is great! However, I do not understand how to create a density grid from the RGBD data obtained through the camera. Could you offer any help? Thank you very much!

Speeding up voxnet training

First of all, thank you for the code!

I tried increasing the voxnet batch size to 256, but that only increases the GPU memory footprint from 175 MB to around 400 MB... Having a GPU with 16 GB of RAM, this seems somewhat of a pity. Any ideas on how to make better use of my GPU?

NameError: global name 'btc01' is not defined

Using the voxnet.layers.Conv3dLayer I am getting this error:

line 120, in get_output_for
    out_btc01 = conv3d2d.conv3d(signals=btc01, filters=self.W, ...
NameError: global name 'btc01' is not defined

I use the normal implementation:

in_layer = layers.InputLayer(shape=(None, 100, None, None, None), input_var=foo)
num_filters = 100
conv3d_voxnet = voxnet.layers.Conv3dLayer(in_layer, num_filters=num_filters, filter_size=(1, 1, 1))

Indeed, the variable btc01 is not defined; a few lines up you find input_btc01. I tried renaming the former to the latter, but now I am getting new errors.

PS: I haven't been able to use a single layer from this repo; all of them throw errors and force me to change Theano and Lasagne versions all the time. I am a little disappointed.

IOError: Failed to interpret file Path(u'weights.npz') as a pickle

Hi,
I've run into a problem when I run "python test.py config/shapenet10.py shapenet10_test.tar --out-fname out.npz".
The details of the problem are listed below:

"downsample module has been moved to the theano.tensor.signal.pool module."
2016-07-28 18:25:41,842 INFO| Loading weights from weights.npz
Traceback (most recent call last):
  File "test.py", line 108, in <module>
    main(args)
  File "test.py", line 75, in main
    voxnet.checkpoints.load_weights(args.weights_fname, model['l_out'])
  File "/root/voxnet-master/voxnet/checkpoints.py", line 39, in load_weights
    param_dict = np.load(fname)
  File "/usr/lib64/python2.7/site-packages/numpy/lib/npyio.py", line 384, in load
    "Failed to interpret file %s as a pickle" % repr(file))
IOError: Failed to interpret file Path(u'weights.npz') as a pickle

I have no idea about it.

"convert_shapenet10.py" script fails

When I execute the script to convert the data required to perform the training, I obtain the following message:

unittest.case.SkipTest: You are importing theano.sandbox.cuda. This is the old GPU back-end and is removed from Theano. Use Theano 0.9 to use it. Even better, transition to the new GPU back-end! See https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29

I changed the GPU back-end following the suggestion in the message, but the error remains.
Is there any repository from which I can download the already-converted data?

What is the function of "chunk_size"?

Hi Dimatura,
Your paper and work are really inspiring.

I want to use Keras to reproduce your work from the given code.

I notice that in your code you augment (jitter) and load the 3D voxel data in blocks of size "chunk_size * batch_size".

How about just taking a mini-batch of data (e.g., a batch size of 32, as in your code) each time before we jitter or load the data?

What is the purpose of processing "chunk_size * batch_size" instances as a whole instead of just a mini-batch of data, without chunk_size?
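Not the repository's actual loader, just a sketch of the chunked pattern the question is about, with hypothetical argument names; one plausible rationale is that jittering and decompressing a whole chunk at once amortizes the augmentation and I/O cost over many mini-batches instead of paying it per batch.

def chunked_batches(data, labels, chunk_size, batch_size, jitter_fn):
    # Illustrative only -- not voxnet's data_loader.
    # Augment chunk_size * batch_size instances in one vectorized call,
    # then slice the augmented block into mini-batches for the GPU.
    n = chunk_size * batch_size
    for start in range(0, len(data), n):
        chunk = jitter_fn(data[start:start + n])
        chunk_labels = labels[start:start + n]
        for b in range(0, len(chunk), batch_size):
            yield chunk[b:b + batch_size], chunk_labels[b:b + batch_size]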

Voxelization of point cloud is slow

Hi, thanks for your excellent work!

May I know how you did the voxelization of the raw point cloud? I wrote my own voxelization scripts in both MATLAB and Cython (compiled to C), and found that voxelizing a large point cloud (say a frame from a Kinect2, containing part of a room) is very slow, around 1 minute, which is not real-time.

May I know more details about the voxelization part of your work (of the raw point cloud, not the meshed files)?

Thanks a lot!
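Not the author's code, but for reference: a dense occupancy grid can be computed from a point cloud with vectorized NumPy in well under a second for Kinect-sized clouds, so a per-point loop is usually the bottleneck. A minimal sketch, assuming an axis-aligned grid with a fixed voxel size:

import numpy as np

def voxelize(points, origin, voxel_size, grid_shape):
    # points: (N, 3) array of xyz coordinates; returns a binary occupancy grid.
    idx = np.floor((points - origin) / voxel_size).astype(np.int64)
    # Keep only the points that fall inside the grid.
    inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    idx = idx[inside]
    grid = np.zeros(grid_shape, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid

# Example: a 32^3 grid covering a 3.2 m cube with 0.1 m voxels.
# grid = voxelize(points, origin=np.zeros(3), voxel_size=0.1, grid_shape=(32, 32, 32))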

TypeError: unicode argument expected, got 'str'

Hi, I'm quite new to GitHub and have been trying to implement voxnet these days, but I ran into some problems.
Not sure if it's too late to ask questions.

I've set up the whole environment in anaconda3 as below:
python 2.7.18
numpy 1.16.6
lasagne 0.2 (I upgraded to 0.2 since there was an error in the 0.1 version)
theano 0.9.0
scipy 1.2.1
scikit-learn 0.20.3
path 15.0.1
pip 19.3.1

I downloaded 3DShapeNets and unzipped it under the scripts folder already.
So far I'm trying python convert_shapenet10.py 3DShapeNets.
But it shows TypeError: unicode argument expected, got 'str' in the format.py file from the virtual environment:

File "C:\Users\ruti\anaconda3\envs\VoxNet_2\lib\site-packages\numpy\lib\format.py", line 383, in _write_array_header
fp.write(header_prefix)
TypeError: unicode argument expected, got 'str

Do you know anything about this problem?
It would be great if I could get some advice. Thanks!

No module named 'activations'

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-9-86b7106a2640> in <module>()
     14 from lasagne.layers import SliceLayer as SL
     15 
---> 16 import voxnet
     17 import CAcheckpoints
     18 import GANcheckpoints

/usr/local/lib/python3.6/dist-packages/voxnet/__init__.py in <module>()
      1 from .version import __version__
      2 
----> 3 import activations
      4 import checkpoints
      5 import init

ModuleNotFoundError: No module named 'activations'

Why do we load the data once prior to the loop over epochs in the main function of the training phase?

Hi, Dimatura.

In the main function of 'train.py', you first load the data once, and then inside the loop you load the data again for every epoch:

loader = (data_loader(cfg, args.training_fname))

for epoch in xrange(cfg['max_epochs']):

    loader = (data_loader(cfg, args.training_fname))

I wonder why we call the data_loader function once before the loop (the first line in the snippet above)?

Is it unnecessary, or am I missing your point?
That is to say, how about just writing:

for epoch in xrange(cfg['max_epochs']):

    loader = (data_loader(cfg, args.training_fname))

Could you please help me with that?
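Not an answer from the author, just an illustration of why the loader typically has to be re-created inside the loop: if data_loader returns a generator, it can only be iterated once, so each epoch needs a fresh one, and a call made before the loop is either redundant or only used to peek at the data. A minimal, self-contained sketch of the single-pass behaviour (the data_loader below is a stand-in, not voxnet's):

def data_loader():
    # Stand-in for voxnet's data_loader: lazily yields mini-batches.
    for batch in range(3):
        yield batch

loader = data_loader()
print(list(loader))  # [0, 1, 2]
print(list(loader))  # [] -- exhausted, so every epoch needs a new loader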

No module named path

Hello,

I installed your code. I want to use only the CPU because my machine doesn't have an NVIDIA GPU. Is a CPU-only version available?

Pickle error on the distribution file?

Running the test without training, using the distributed shapenet10_weights.npz, gives pickle errors. Is the file corrupted?

$python test.py --weights shapenet10_weights.npz config/shapenet10.py shapenet10_test.tar --out-fname out.npz
WARNING (theano.sandbox.cuda.blas): do not use pad for BaseGpuCorr3dMM; please set padding in border_mode parameter, see the docstring for more details
2016-11-25 13:39:56,851 WARNING| do not use pad for BaseGpuCorr3dMM; please set padding in border_mode parameter, see the docstring for more details
WARNING (theano.sandbox.cuda.blas): do not use pad for BaseGpuCorr3dMM; please set padding in border_mode parameter, see the docstring for more details
2016-11-25 13:39:56,853 WARNING| do not use pad for BaseGpuCorr3dMM; please set padding in border_mode parameter, see the docstring for more details
2016-11-25 13:39:56,906 INFO| Loading weights from shapenet10_weights.npz
Traceback (most recent call last):
  File "test.py", line 108, in <module>
    main(args)
  File "test.py", line 75, in main
    voxnet.checkpoints.load_weights(args.weights_fname, model['l_out'])
  File "/home/prasad/voxnet/voxnet-master/voxnet/checkpoints.py", line 39, in load_weights
    param_dict = np.load(fname)
  File "/home/prasad/.virtualenvs/caffe24/local/lib/python2.7/site-packages/numpy/lib/npyio.py", line 416, in load
    "Failed to interpret file %s as a pickle" % repr(file))
IOError: Failed to interpret file Path(u'shapenet10_weights.npz') as a pickle
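One thing worth checking (an assumption, not a confirmed diagnosis): the README notes that shapenet10_weights.npz was committed with Git LFS, so a clone made without Git LFS leaves a small text pointer file in place of the real weights, and np.load then fails exactly like this. A quick way to tell the two apart:

# A real .npz is a zip archive whose first bytes are 'PK'; a Git LFS
# pointer is a short text file starting with 'version https://git-lfs...'.
with open('shapenet10_weights.npz', 'rb') as f:
    print(f.read(64))
# If you see the LFS pointer text, run 'git lfs pull' or download the raw
# file from GitHub instead.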
