fastrcnn-example-torch's People

Contributors

farrajota

Forkers

geekvc

fastrcnn-example-torch's Issues

Train with multi GPU

There seems to be a problem with the multi-GPU support.
I trained AlexNet with -nGPU 1 and the training procedure went well.
When I changed -nGPU to 2 or 4, the following error occurred:

**********************************************
*** Starting Train epoch 1/40, LR=1e-03
**********************************************

cudnnConvolutionForward failed:         7        convDesc=[mode : CUDNN_CROSS_CORRELATION datatype : CUDNN_DATA_FLOAT] hash=-dimA1,3,600,1000 -filtA64,3,3,3
 1,64,600,1000 -padA1,1 -convStrideA1,1 CUDNN_DATA_FLOAT
/home/wangty/torch/install/bin/luajit: /home/wangty/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
In 1 module of nn.Sequential:
In 1 module of nn.ParallelTable:
In 1 module of nn.Sequential:
In 1 module of nn.Sequential:
/home/wangty/torch/install/share/lua/5.1/cudnn/find.lua:94: Error in CuDNN: CUDNN_STATUS_MAPPING_ERROR (cudnnConvolutionForward)
stack traceback:
        [C]: in function 'error'
        /home/wangty/torch/install/share/lua/5.1/cudnn/find.lua:94: in function 'checkedCall'
        ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:194: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186>
        [C]: in function 'xpcall'
        /home/wangty/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
        /home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'updateOutput'
        ...ch/install/share/lua/5.1/fastrcnn/modules/NoBackprop.lua:21: in function <...ch/install/share/lua/5.1/fastrcnn/modules/NoBackprop.lua:20>
        [C]: in function 'xpcall'
        /home/wangty/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
        /home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'closure'
        ...
        /home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function </home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:41>
        [C]: in function 'xpcall'
        /home/wangty/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
        /home/wangty/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
        ...ch/install/share/lua/5.1/torchnet/engine/optimengine.lua:102: in function 'train'
        /home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:285: in function 'train'
        train.lua:66: in main chunk
        [C]: in function 'dofile'
        ...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk

AttributeError: 'module' object has no attribute 'path'

Hello farrajota,
I am studying Fast R-CNN and am very interested in your fastrcnn-example-torch repo.
I followed the README but got this error:

$ th train.lua
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Setup model:
==> (5/5) Train Fast-RCNN model
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/dbcollection/manager.py", line 313, in config_cache
    cache_manager = CacheManager(is_test)
  File "/usr/local/lib/python2.7/dist-packages/dbcollection/utils/cache.py", line 26, in __init__
    self.setup_paths(is_test)
  File "/usr/local/lib/python2.7/dist-packages/dbcollection/utils/cache.py", line 60, in setup_paths
    home_dir = os.path.expanduser("~")
AttributeError: 'module' object has no attribute 'path'
/home/wangty/torch/install/bin/luajit: /home/wangty/torch/install/share/lua/5.1/json/init.lua:40: attempt to index local 'f' (a nil value)
stack traceback:
        /home/wangty/torch/install/share/lua/5.1/json/init.lua:40: in function 'load'
        ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:129: in function 'load'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:13: in function 'get_db_loader'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:34: in function 'fetch_loader_dataset'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:298: in function 'data_gen'
        /home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:47: in function 'train'
        train.lua:66: in main chunk
        [C]: in function 'dofile'
        ...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00406620

Is the dbcollection installation causing the error?
Thank you in advance!
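
One way to narrow this down, assuming the failure comes from Python picking up something other than the standard library `os` module (an assumption; the traceback alone does not confirm it), is to print where `os` was loaded from and whether it exposes `path`. The helper name `describe_module` is hypothetical and only serves this sketch. It should be run with the same interpreter that dbcollection uses.

```python
import importlib


def describe_module(name, attr):
    """Report where a module was loaded from and whether it exposes attr."""
    module = importlib.import_module(name)
    return {
        "file": getattr(module, "__file__", None),  # None for built-in modules
        "has_attr": hasattr(module, attr),
    }


# Assumption: a healthy install reports a standard library path here and
# has_attr set to True; anything else points at a shadowed or broken module.
info = describe_module("os", "path")
print(info)
```

If the reported file sits outside the interpreter's standard library directory, a stray `os.py` on the import path would be worth ruling out.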

Train with coco dataset, KeyError: 'coco'

I trained and tested with the default AlexNet network and the voc2007 dataset, and everything went well.
After I changed options.lua to use the coco dataset and set netType to vgg19, some errors occurred; the dbcollection package may be the cause.

$ th train.lua
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
Processing COCO train RoI proposals...
 [======================================== 82783/82783 ================================>]  Tot: 14m37s | Step: 10ms
Save COCO train RoI proposals to cache: /home/wangty/geekvc/fastrcnn-example-torch/data/cache/coco_proposals_train.t7
Processing COCO val RoI proposals...
 [======================================== 40504/40504 ================================>]  Tot: 7m12s | Step: 10ms
Save COCO val RoI proposals to cache: /home/wangty/geekvc/fastrcnn-example-torch/data/cache/coco_proposals_val.t7
==> (4/5) Setup model:
==> (5/5) Train Fast-RCNN model

==> Download coco data to disk...
Traceback (most recent call last):
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 32, in fetch_dataset_constructor
    return datasets[name]
KeyError: 'coco'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/manager.py", line 69, in download
    keywords = dataset.download(name, data_dir_, cache_save_path, extract_data, verbose)
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 124, in download
    dataset_loader = setup_dataset_constructor(name, data_dir, cache_dir, extract_data, verbose)
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 69, in setup_dataset_constructor
    constructor = fetch_dataset_constructor(name)
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 34, in fetch_dataset_constructor
    raise KeyError('Undefined dataset name: {}'.format(name))
KeyError: 'Undefined dataset name: coco'
/home/wangty/torch/install/bin/luajit: ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:60: attempt to index a nil value
stack traceback:
        ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:60: in function 'exists_task'
        ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:147: in function 'load'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:18: in function 'get_db_loader'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:164: in function 'fetch_loader_dataset'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:309: in function 'data_gen'
        /home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:47: in function 'train'
        train.lua:66: in main chunk
        [C]: in function 'dofile'
        ...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00406620

I also ran the script scripts/train_test_vgg16_coco.lua, and a similar error occurred.

$ th scripts/train_test_vgg16_coco.lua
Input options: -frcnn_hflip 0.5 -snapshot 10 -frcnn_rois_per_img 128 -nThreads 4 -optMethod sgd -netType vgg16 -trainIters 5000 -nGPU 1 -frcnn_test_max_size 1000 -frcnn_test_nms_thresh 0.3 -frcnn_test_scales 600 -frcnn_scales 600 -frcnn_roi_augment_offset 0.3 -frcnn_bg_thresh_lo 0.1 -frcnn_test_mode coco -dataset coco -frcnn_max_size 1000 -schedule {{40,1e-3,5e-4},{10,1e-4,5e-4}} -frcnn_imgs_per_batch 2 -frcnn_bg_thresh_hi 0.5 -expID frcnn_vgg16_coco -clear_buffers true -frcnn_fg_fraction 0.25 -frcnn_fg_thresh 0.5 -frcnn_bg_fraction 1 -testInter false
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Setup model:
==> (5/5) Train Fast-RCNN model

==> Download coco data to disk...
Traceback (most recent call last):
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 32, in fetch_dataset_constructor
    return datasets[name]
KeyError: 'coco'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/manager.py", line 69, in download
    keywords = dataset.download(name, data_dir_, cache_save_path, extract_data, verbose)
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 124, in download
    dataset_loader = setup_dataset_constructor(name, data_dir, cache_dir, extract_data, verbose)
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 69, in setup_dataset_constructor
    constructor = fetch_dataset_constructor(name)
  File "/home/wangty/.pyenv/versions/anaconda3-4.1.0/lib/python3.5/site-packages/dbcollection/datasets/funs.py", line 34, in fetch_dataset_constructor
    raise KeyError('Undefined dataset name: {}'.format(name))
KeyError: 'Undefined dataset name: coco'
/home/wangty/torch/install/bin/luajit: ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:60: attempt to index a nil value
stack traceback:
        ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:60: in function 'exists_task'
        ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:147: in function 'load'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:18: in function 'get_db_loader'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:164: in function 'fetch_loader_dataset'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:309: in function 'data_gen'
        /home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:47: in function 'train'
        train.lua:66: in main chunk
        [C]: in function 'dofile'
        ...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00406620
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Load model: /home/wangty/geekvc/fastrcnn-example-torch/data/exp/coco/vgg16_coco/model_final.t7
/home/wangty/torch/install/bin/luajit: cannot open </home/wangty/geekvc/fastrcnn-example-torch/data/exp/coco/vgg16_coco/model_final.t7> in mode r  at /home/wangty/torch/pkg/torch/lib/TH/THDiskFile.c:670
stack traceback:
        [C]: at 0x7fe6fb4ad330
        [C]: in function 'DiskFile'
        /home/wangty/torch/install/share/lua/5.1/torch/File.lua:405: in function 'load'
        test.lua:49: in main chunk
        [C]: in function 'dofile'
        ...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00406620

Thank you in advance.
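
The two chained KeyErrors in the tracebacks above follow a common pattern: a dictionary lookup by dataset name fails, and the handler raises a second, more descriptive KeyError. The sketch below reproduces that pattern with a hypothetical registry to show what the message means (the name is simply not registered in the installed package). It is not dbcollection's code; the function signature and the registry contents are assumptions made for illustration.

```python
def fetch_dataset_constructor(registry, name):
    """Look up a dataset constructor, naming the registered datasets on failure."""
    try:
        return registry[name]
    except KeyError:
        known = ", ".join(sorted(registry)) or "<none>"
        raise KeyError(
            "Undefined dataset name: {} (registered: {})".format(name, known)
        )


# Hypothetical registry contents, for illustration only.
registry = {"pascal_voc_2007": object}

try:
    fetch_dataset_constructor(registry, "coco")
except KeyError as err:
    message = err.args[0]
    print(message)
```

Listing the registered names in the error makes it easier to tell a misspelled name apart from a dataset that the installed version does not provide.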

Alexnet Model

Hi, I'm trying to download the AlexNet model by running:

th download/download_alexnet.lua
But I get the following error:


> ==> Downloading Alexnet model... 	
> --2017-12-05 16:17:21--  http://www.umiacs.umd.edu/~najibi/data/imgnet_models.tar.gz
> Resolution of www.umiacs.umd.edu (www.umiacs.umd.edu)... 128.8.120.33
> Connection to www.umiacs.umd.edu (www.umiacs.umd.edu)|128.8.120.33|:80... connect.
> HTTP request sent, waiting... 403 Forbidden
> 2017-12-05 16:17:22 ERROR 403: Forbidden.
> 
> tar: It doesn't seem tar
> 
> gzip: stdin: unexpected end of file
> tar: Child returned status 1
> tar: Error is not recoverable: exiting now

Then I tried to open the link (http://www.umiacs.umd.edu/~najibi/data/imgnet_models.tar.gz) in a browser, but it doesn't work there either (ERROR 403: Forbidden).
Is there another link for downloading the AlexNet model?
Thanks
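
The tar and gzip errors in the log are a consequence of the failed download: the archive was never fetched, so there was nothing valid to unpack. A minimal sketch of guarding against this, assuming a download step followed by an extraction step (the helper names `download` and `extract_if_tar` are hypothetical and are not part of the repository's download script):

```python
import os
import shutil
import tarfile
import urllib.error
import urllib.request


def download(url, dest_path):
    """Fetch url to dest_path; return the HTTP error code on failure, else None."""
    try:
        # The destination file is only created once the request has succeeded.
        with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 when the server refuses the request
    return None


def extract_if_tar(archive_path, dest_dir):
    """Extract only when the file really is a tar archive; return True on success."""
    if not os.path.isfile(archive_path) or not tarfile.is_tarfile(archive_path):
        return False
    # Only extract archives from sources you trust.
    with tarfile.open(archive_path) as archive:
        archive.extractall(dest_dir)
    return True
```

With these checks, a 403 response is reported as an HTTP error instead of surfacing later as a confusing tar failure.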

Test accuracy with coco dataset

Hi farrajota,
I trained and tested Fast R-CNN on the VOC dataset, and everything went well.
I trained Fast R-CNN on the COCO dataset without errors, but when I tested the accuracy of the trained model, this error occurred:

$ th test.lua
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Load model: /home/wangty/geekvc/fastrcnn-example-torch/data/exp/coco/frcnn_vgg16_coco/model_final.t7
==> (5/5) Test Fast-RCNN model
666666
444444
cococo
111111
/home/wangty/torch/install/bin/luajit: invalid arguments: IntTensor FloatTensor IntTensor
expected arguments: [*IntTensor*] IntTensor [int] IntTensor IntTensor
stack traceback:
        [C]: at 0x7f8ea94f0e10
        [C]: in function 'addcmul'
        ...angty/torch/install/share/lua/5.1/fastrcnn/utils/box.lua:141: in function 'convertFrom'
        ...y/torch/install/share/lua/5.1/fastrcnn/ImageDetector.lua:85: in function 'detect'
        ...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:130: in function 'testOne'
        ...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:212: in function 'test'
        /home/wangty/torch/install/share/lua/5.1/fastrcnn/test.lua:34: in function 'test'
        test.lua:73: in main chunk
        [C]: in function 'dofile'
        ...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00406620

The model was trained with -netType vgg16.

The fastrcnn package train

I followed the fastrcnn package installation and dataset setup. When I try to train the network, the following error occurs:

$ th train.lua
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Setup model:
==> (5/5) Train Fast-RCNN model
/home/wangty/torch/install/bin/luajit: /mnt/geekvc/fastrcnn-example-torch/data.lua:13: attempt to index local 'dbc' (a boolean value)
stack traceback:
        /mnt/geekvc/fastrcnn-example-torch/data.lua:13: in function 'get_db_loader'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:34: in function 'fetch_loader_dataset'
        /mnt/geekvc/fastrcnn-example-torch/data.lua:298: in function 'data_gen'
        /home/wangty/torch/install/share/lua/5.1/fastrcnn/train.lua:47: in function 'train'
        train.lua:66: in main chunk
        [C]: in function 'dofile'
        ...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00406620

I tested these commands in the Torch REPL (TREPL):

th> dbc = require 'dbcollection.manager'
                                                                      [0.0000s]
th> dbc
true
th> dbc = require 'dbcollection'
th> dbc.load{name='pascal_voc_2007',task='detection_d'}
...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:141: attempt to index global 'manager' (a nil value)
stack traceback:
        ...gty/torch/install/share/lua/5.1/dbcollection/manager.lua:141: in function 'load'
        [string "_RESULT={dbc.load{name='pascal_voc_2007',task..."]:1: in main chunk
        [C]: in function 'xpcall'
        /home/wangty/torch/install/share/lua/5.1/trepl/init.lua:661: in function 'repl'
        ...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:204: in main chunk
        [C]: at 0x00406620
