farrajota / fastrcnn-example-torch
Example code on how to use the fastrcnn package for torch7
License: MIT License
There seem to be some problems with the multi-GPU support.
I trained AlexNet with the -nGPU parameter set to 1, and training went well.
When I changed the -nGPU parameter to 2 or 4, the following error occurred:
**********************************************
*** Starting Train epoch 1/40, LR=1e-03
**********************************************
cudnnConvolutionForward failed: 7 convDesc=[mode : CUDNN_CROSS_CORRELATION datatype : CUDNN_DATA_FLOAT] hash=-dimA1,3,600,1000 -filtA64,3,3,3
1,64,600,1000 -padA1,1 -convStrideA1,1 CUDNN_DATA_FLOAT
/home/wangty/torch/install/bin/luajit: /home/wangty/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
In 1 module of nn.Sequential:
In 1 module of nn.ParallelTable:
In 1 module of nn.Sequential:
In 1 module of nn.Sequential:
/home/wangty/torch/install/share/lua/5.1/cudnn/find.lua:94: Error in CuDNN: CUDNN_STATUS_MAPPING_ERROR (cudnnConvolutionForward)
stack traceback:
[C]: in function 'error'
/home/wangty/torch/install/share/lua/5.1/cudnn/find.lua:94: in function 'checkedCall'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:194: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186>
[C]: in function 'xpcall'
/home/wangty/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'updateOutput'
...ch/install/share/lua/5.1/fastrcnn/modules/NoBackprop.lua:21: in function <...ch/install/share/lua/5.1/fastrcnn/modules/NoBackprop.lua:20>
[C]: in function 'xpcall'
/home/wangty/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'closure'
...
/home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function </home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
/home/wangty/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
...ch/install/share/lua/5.1/torchnet/engine/optimengine.lua:102: in function 'train'
/home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:285: in function 'train'
train.lua:66: in main chunk
[C]: in function 'dofile'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
Hello farrajota,
I am studying Fast R-CNN and am very interested in your fastrcnn-example-torch repo.
I followed the README, but I get this error:
$ th train.lua
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Setup model:
==> (5/5) Train Fast-RCNN model
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/dbcollection/manager.py", line 313, in config_cache
cache_manager = CacheManager(is_test)
File "/usr/local/lib/python2.7/dist-packages/dbcollection/utils/cache.py", line 26, in __init__
self.setup_paths(is_test)
File "/usr/local/lib/python2.7/dist-packages/dbcollection/utils/cache.py", line 60, in setup_paths
home_dir = os.path.expanduser("~")
AttributeError: 'module' object has no attribute 'path'
/home/wangty/torch/install/bin/luajit: /home/wangty/torch/install/share/lua/5.1/json/init.lua:40: attempt to index local 'f' (a nil value)
stack traceback:
/home/wangty/torch/install/share/lua/5.1/json/init.lua:40: in function 'load'
...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:129: in function 'load'
/mnt/geekvc/fastrcnn-example-torch/data.lua:13: in function 'get_db_loader'
/mnt/geekvc/fastrcnn-example-torch/data.lua:34: in function 'fetch_loader_dataset'
/mnt/geekvc/fastrcnn-example-torch/data.lua:298: in function 'data_gen'
/home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:47: in function 'train'
train.lua:66: in main chunk
[C]: in function 'dofile'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620
Is the dbcollection installation causing the error?
Thank you in advance!
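The Python traceback above fails on `os.path` inside dbcollection's `cache.py`, so the name `os` is bound there to a module that is not the standard-library one. One possible cause under Python 2.7 is an implicit relative import: a plain `import os` inside a package first picks up a sibling module named `os`. Whether the installed dbcollection contains such a module is an assumption, not something this thread confirms. A minimal sketch for checking it (the helper name `find_shadowing_modules` and the list of names are hypothetical):

```python
import os

# Standard-library names to check for collisions; extend as needed.
STDLIB_NAMES = ("os", "sys", "json", "time", "string")


def find_shadowing_modules(package_dir, names=STDLIB_NAMES):
    """Return paths under package_dir whose module name equals a stdlib name."""
    hits = []
    for root, dirs, files in os.walk(package_dir):
        # Plain modules such as utils/os.py
        for filename in files:
            stem, ext = os.path.splitext(filename)
            if ext == ".py" and stem in names:
                hits.append(os.path.join(root, filename))
        # Sub-packages such as utils/os/__init__.py
        for dirname in dirs:
            init_file = os.path.join(root, dirname, "__init__.py")
            if dirname in names and os.path.isfile(init_file):
                hits.append(os.path.join(root, dirname))
    return sorted(hits)


if __name__ == "__main__":
    # Install location taken from the traceback above; adjust for your system.
    package = "/usr/local/lib/python2.7/dist-packages/dbcollection"
    for path in find_shadowing_modules(package):
        print("possible stdlib shadowing:", path)
```

If this prints nothing, shadowing inside the package is unlikely and the installation itself should be checked instead. The Lua error `attempt to index local 'f' (a nil value)` looks like a downstream effect: the Python step failed, so the file the Lua side tries to open was probably never written.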
I trained and tested with the default AlexNet network and the VOC2007 dataset, and everything went well.
I changed options.lua to use the COCO dataset and set netType to vgg19, and some errors occurred; the dbcollection package may be the cause.
$ th train.lua
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
Processing COCO train RoI proposals...
[======================================== 82783/82783 ================================>] Tot: 14m37s | Step: 10ms
Save COCO train RoI proposals to cache: /home/wangty/geekvc/fastrcnn-example-torch/data/cache/coco_proposals_train.t7
Processing COCO val RoI proposals...
[======================================== 40504/40504 ================================>] Tot: 7m12s | Step: 10ms
Save COCO val RoI proposals to cache: /home/wangty/geekvc/fastrcnn-example-torch/data/cache/coco_proposals_val.t7
==> (4/5) Setup model:
==> (5/5) Train Fast-RCNN model
==> Download coco data to disk...
Traceback (most recent call last):
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 32, in fetch_dataset_constructor
return datasets[name]
KeyError: 'coco'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/manager.py", line 69, in download
keywords = dataset.download(name, data_dir_, cache_save_path, extract_data, verbose)
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 124, in download
dataset_loader = setup_dataset_constructor(name, data_dir, cache_dir, extract_data, verbose)
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 69, in setup_dataset_constructor
constructor = fetch_dataset_constructor(name)
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 34, in fetch_dataset_constructor
raise KeyError('Undefined dataset name: {}'.format(name))
KeyError: 'Undefined dataset name: coco'
/home/wangty/torch/install/bin/luajit: ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:60: attempt to index a nil value
stack traceback:
...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:60: in function 'exists_task'
...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:147: in function 'load'
/mnt/geekvc/fastrcnn-example-torch/data.lua:18: in function 'get_db_loader'
/mnt/geekvc/fastrcnn-example-torch/data.lua:164: in function 'fetch_loader_dataset'
/mnt/geekvc/fastrcnn-example-torch/data.lua:309: in function 'data_gen'
/home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:47: in function 'train'
train.lua:66: in main chunk
[C]: in function 'dofile'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620
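The traceback above shows the failure is a plain dictionary lookup (`datasets[name]`): the installed dbcollection has no dataset registered under the key `coco`. The sketch below mirrors that lookup pattern and adds a hint about similarly named entries. It is an illustration only, not dbcollection's code: `resolve_dataset`, the registry contents, and the name `coco_hypothetical` are made up for the example.

```python
def resolve_dataset(name, registry):
    """Look up a dataset constructor by name, with a hint on failure."""
    try:
        return registry[name]
    except KeyError:
        # Suggest registered names that contain the query or are contained in it.
        similar = sorted(n for n in registry if name in n or n in name)
        hint = ", ".join(similar) if similar else "none"
        raise KeyError(
            "Undefined dataset name: {} (similar registered names: {})".format(
                name, hint
            )
        )


if __name__ == "__main__":
    # Hypothetical registry; the real names depend on the installed version.
    registry = {"pascal_voc_2007": object, "coco_hypothetical": object}
    try:
        resolve_dataset("coco", registry)
    except KeyError as err:
        print(err)
```

In practice this suggests comparing the `-dataset` value against the dataset names that the installed dbcollection version actually provides. The later Lua error (`attempt to index a nil value` in `exists_task`) appears to follow from the failed download step rather than being a separate bug.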
I also ran the script scripts/train_test_vgg16_coco.lua, and a similar error occurred.
$ th scripts/train_test_vgg16_coco.lua
Input options: -frcnn_hflip 0.5 -snapshot 10 -frcnn_rois_per_img 128 -nThreads 4 -optMethod sgd -netType vgg16 -trainIters 5000 -nGPU 1 -frcnn_test_max_size 1000 -frcnn_test_nms_thresh 0.3 -frcnn_test_scales 600 -frcnn_scales 600 -frcnn_roi_augment_offset 0.3 -frcnn_bg_thresh_lo 0.1 -frcnn_test_mode coco -dataset coco -frcnn_max_size 1000 -schedule {{40,1e-3,5e-4},{10,1e-4,5e-4}} -frcnn_imgs_per_batch 2 -frcnn_bg_thresh_hi 0.5 -expID frcnn_vgg16_coco -clear_buffers true -frcnn_fg_fraction 0.25 -frcnn_fg_thresh 0.5 -frcnn_bg_fraction 1 -testInter false
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Setup model:
==> (5/5) Train Fast-RCNN model
==> Download coco data to disk...
Traceback (most recent call last):
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 32, in fetch_dataset_constructor
return datasets[name]
KeyError: 'coco'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/manager.py", line 69, in download
keywords = dataset.download(name, data_dir_, cache_save_path, extract_data, verbose)
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 124, in download
dataset_loader = setup_dataset_constructor(name, data_dir, cache_dir, extract_data, verbose)
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 69, in setup_dataset_constructor
constructor = fetch_dataset_constructor(name)
File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 34, in fetch_dataset_constructor
raise KeyError('Undefined dataset name: {}'.format(name))
KeyError: 'Undefined dataset name: coco'
/home/wangty/torch/install/bin/luajit: ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:60: attempt to index a nil value
stack traceback:
...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:60: in function 'exists_task'
...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:147: in function 'load'
/mnt/geekvc/fastrcnn-example-torch/data.lua:18: in function 'get_db_loader'
/mnt/geekvc/fastrcnn-example-torch/data.lua:164: in function 'fetch_loader_dataset'
/mnt/geekvc/fastrcnn-example-torch/data.lua:309: in function 'data_gen'
/home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:47: in function 'train'
train.lua:66: in main chunk
[C]: in function 'dofile'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Load model: /home/wangty/geekvc/fastrcnn-example-torch/data/exp/coco/vgg16_coco/model_final.t7
/home/wangty/torch/install/bin/luajit: cannot open </home/wangty/geekvc/fastrcnn-example-torch/data/exp/coco/vgg16_coco/model_final.t7> in mode r at /home/wangty/torch/pkg/torch/lib/TH/THDiskFile.c:670
stack traceback:
[C]: at 0x7fe6fb4ad330
[C]: in function 'DiskFile'
/home/wangty/torch/install/share/lua/5.1/torch/File.lua:405: in function 'load'
test.lua:49: in main chunk
[C]: in function 'dofile'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620
Thank you in advance.
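Two things in the logs above may explain the missing `model_final.t7`: training aborted before any model could be saved, and the test step looks in `data/exp/coco/vgg16_coco/` while the script passed `-expID frcnn_vgg16_coco`. The sketch below lists which experiment directories actually contain a saved model, assuming the `data/exp/<dataset>/<expID>/` layout that the paths in this thread suggest. The helper name `find_saved_models` is hypothetical.

```python
import os


def find_saved_models(exp_root, filename="model_final.t7"):
    """Return the directories under exp_root that contain `filename`."""
    found = []
    for root, _dirs, files in os.walk(exp_root):
        if filename in files:
            found.append(root)
    return sorted(found)


if __name__ == "__main__":
    # Experiment root taken from the log above; adjust for your checkout.
    exp_root = "/home/wangty/geekvc/fastrcnn-example-torch/data/exp"
    models = find_saved_models(exp_root)
    if not models:
        print("no saved model found under", exp_root)
    for directory in models:
        print("saved model in:", directory)
```

An empty result means training never reached the point of writing the final model, so the test step cannot succeed until the training error is resolved.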
Hi, I'm trying to download the AlexNet model by running:
th download/download_alexnet.lua
But I get the following error:
> ==> Downloading Alexnet model...
> --2017-12-05 16:17:21-- http://www.umiacs.umd.edu/~najibi/data/imgnet_models.tar.gz
> Resolution of www.umiacs.umd.edu (www.umiacs.umd.edu)... 128.8.120.33
> Connection to www.umiacs.umd.edu (www.umiacs.umd.edu)|128.8.120.33|:80... connect.
> HTTP request sent, waiting... 403 Forbidden
> 2017-12-05 16:17:22 ERROR 403: Forbidden.
>
> tar: It doesn't seem tar
>
> gzip: stdin: unexpected end of file
> tar: Child returned status 1
> tar: Error is not recoverable: exiting now
Then I tried to open the link (http://www.umiacs.umd.edu/~najibi/data/imgnet_models.tar.gz) in a browser, but it doesn't work either (ERROR 403: Forbidden).
Is there another link available to download the AlexNet model?
Thanks
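The `tar` and `gzip` messages in the log above are a consequence of the 403 response: nothing usable was downloaded, so there is no archive to extract. A minimal sketch for checking a downloaded file before extracting it, using the standard-library `tarfile` module (the helper name `is_usable_archive` and the local file name are hypothetical):

```python
import os
import tarfile


def is_usable_archive(path):
    """Return True if `path` exists and tarfile can read it as an archive."""
    if not os.path.isfile(path):
        return False
    # is_tarfile also recognises gzip-compressed tar archives.
    return tarfile.is_tarfile(path)


if __name__ == "__main__":
    # Hypothetical local name for the file the download script fetches.
    archive = "imgnet_models.tar.gz"
    if is_usable_archive(archive):
        print("archive looks valid:", archive)
    else:
        print("download failed or file is not a tar archive:", archive)
```

This only detects the broken download; it does not replace the unavailable URL.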
Hi farrajota,
I trained and tested fastrcnn with the VOC dataset, and everything went well.
I trained fastrcnn on the COCO dataset with no errors, but when I tested the accuracy of the trained model, this error occurred:
$ th test.lua
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Load model: /home/wangty/geekvc/fastrcnn-example-torch/data/exp/coco/frcnn_vgg16_coco/model_final.t7
==> (5/5) Test Fast-RCNN model
666666
444444
cococo
111111
/home/wangty/torch/install/bin/luajit: invalid arguments: IntTensor FloatTensor IntTensor
expected arguments: [*IntTensor*] IntTensor [int] IntTensor IntTensor
stack traceback:
[C]: at 0x7f8ea94f0e10
[C]: in function 'addcmul'
...angty/torch/install/share/lua/5.1/fastrcnn/utils/box.lua:141: in function 'convertFrom'
...y/torch/install/share/lua/5.1/fastrcnn/ImageDetector.lua:85: in function 'detect'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:130: in function 'testOne'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:212: in function 'test'
/home/wangty/torch/install/share/lua/5.1/fastrcnn/test.lua:34: in function 'test'
test.lua:73: in main chunk
[C]: in function 'dofile'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620
I used -netType vgg16.
I followed the fastrcnn package installation and dataset setup. When I try to train the network, the following error occurs:
$ th train.lua
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Setup model:
==> (5/5) Train Fast-RCNN model
/home/wangty/torch/install/bin/luajit: /mnt/geekvc/fastrcnn-example-torch/data.lua:13: attempt to index local 'dbc' (a boolean value)
stack traceback:
/mnt/geekvc/fastrcnn-example-torch/data.lua:13: in function 'get_db_loader'
/mnt/geekvc/fastrcnn-example-torch/data.lua:34: in function 'fetch_loader_dataset'
/mnt/geekvc/fastrcnn-example-torch/data.lua:298: in function 'data_gen'
/home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:47: in function 'train'
train.lua:66: in main chunk
[C]: in function 'dofile'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620
I tested these commands in the Torch REPL (trepl):
th> dbc = require 'dbcollection.manager'
[0.0000s]
th> dbc
true
th> dbc = require 'dbcollection'
th> dbc.load{name='pascal_voc_2007',task='detection_d'}
...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:141: attempt to index global 'manager' (a nil value)
stack traceback:
...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:141: in function 'load'
[string "_RESULT={dbc.load{name='pascal_voc_2007',task..."]:1: in main chunk
[C]: in function 'xpcall'
/home/wangty/torch/install/share/lua/5.1/trepl/init.lua:661: in function 'repl'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:204: in main chunk
[C]: at 0x00406620