Comments (7)

btgraham commented on July 26, 2024

I have switched arch=compute_20,code=sm_20 to arch=compute_30,code=sm_30 in the setup file.
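
For anyone making the same change by hand, here is a minimal sketch of what it amounts to. The variable names and the nvcc invocation below are illustrative, not the actual contents of the SparseConvNet build script; the only point is the -gencode target, where the deprecated Fermi flags are swapped for ones matching the installed GPU:

import subprocess

# Illustrative sketch only: the real build script may assemble the nvcc command
# differently. The deprecated Fermi target (compute_20/sm_20, deprecated since
# CUDA 8) is replaced with a Kepler target (compute_30/sm_30).
# old_arch_flags = ['-gencode', 'arch=compute_20,code=sm_20']
arch_flags = ['-gencode', 'arch=compute_30,code=sm_30']

cmd = ['nvcc', '-c', 'init.cu', '-o', 'init.cu.o',
       '--compiler-options', '-fPIC'] + arch_flags
subprocess.check_call(cmd)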

gnedster commented on July 26, 2024

The hello-world.py example works now. Thanks for the quick fix!

oztc commented on July 26, 2024

I have the same issue

btgraham commented on July 26, 2024

Hello. To help me debug, can you please show the output from:
cd SparseConvNet/PyTorch
python setup.py develop
ls sparseconvnet/SCN/
(Also, what OS? What Python version? Conda or not?)

oztc commented on July 26, 2024

Hi btgraham,

The following log is my output when I run "python setup.py develop" in SparseConvNet/PyTorch:

ozzie@debian:~/working/work/ML/SparseConvNet/PyTorch$ python setup.py develop

Building SCN module
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
generating /tmp/tmpS1UlkY/_SCN.c
running build_ext
building '_SCN' extension
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ozzie/anaconda2/include/python2.7 -c _SCN.c -o ./_SCN.o
gcc -pthread -shared -L/home/ozzie/anaconda2/lib -Wl,-rpath=/home/ozzie/anaconda2/lib,--no-as-needed ./_SCN.o /media/New_bt/ML/SparseConvNet/PyTorch/sparseconvnet/SCN/init.cu.o -L/home/ozzie/anaconda2/lib -lpython2.7 -o ./_SCN.so
running develop
running egg_info
creating sparseconvnet.egg-info
writing sparseconvnet.egg-info/PKG-INFO
writing top-level names to sparseconvnet.egg-info/top_level.txt
writing dependency_links to sparseconvnet.egg-info/dependency_links.txt
writing manifest file 'sparseconvnet.egg-info/SOURCES.txt'
reading manifest file 'sparseconvnet.egg-info/SOURCES.txt'
writing manifest file 'sparseconvnet.egg-info/SOURCES.txt'
running build_ext
Creating /home/ozzie/anaconda2/lib/python2.7/site-packages/sparseconvnet.egg-link (link to .)
Adding sparseconvnet 0.1 to easy-install.pth file

Installed /media/New_bt/ML/SparseConvNet/PyTorch
Processing dependencies for sparseconvnet==0.1
Finished processing dependencies for sparseconvnet==0.1

ozzie@debian:~/working/work/ML/SparseConvNet/examples/Assamese_handwriting$ python VGGplus.py
Downloading and preprocessing data ...
--2017-07-18 18:06:00-- https://archive.ics.uci.edu/ml/machine-learning-databases/00208/Online%20Handwritten%20Assamese%20Characters%20Dataset.rar
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.249
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.249|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8067448 (7.7M) [text/plain]
Saving to: ‘Online Handwritten Assamese Characters Dataset.rar’

Online Handwritten 100%[=====================>] 7.69M 1.22MB/s in 12s

2017-07-18 18:06:13 (671 KB/s) - ‘Online Handwritten Assamese Characters Dataset.rar’ saved [8067448/8067448]

UNRAR 5.30 beta 2 freeware Copyright (c) 1993-2015 Alexander Roshal

Extracting from Online Handwritten Assamese Characters Dataset.rar

Extracting data_table.pdf OK
Extracting 1.1.txt OK
Extracting 10.1.txt OK
Extracting 100.1.txt OK
Extracting 101.1.txt OK
Extracting 102.1.txt OK
Extracting 103.1.txt OK
Extracting 104.1.txt OK
Extracting 105.1.txt OK
Extracting 106.1.txt OK
Extracting 107.1.txt OK
Extracting 108.1.txt OK
Extracting 109.1.txt OK
................ (the middle "Extracting xxx.txt OK" lines were removed by Ozzie Zhang because the list is too long)
Extracting 53.9.txt OK
Extracting 54.9.txt OK
Extracting 55.9.txt OK
Extracting 56.9.txt OK
Extracting 57.9.txt OK
Extracting 58.9.txt OK
Extracting 59.9.txt OK
Extracting 6.9.txt OK
Extracting 60.9.txt OK
Extracting 61.9.txt OK
Extracting 62.9.txt OK
Extracting 63.9.txt OK
Extracting 64.9.txt OK
Extracting 65.9.txt OK
Extracting 66.9.txt OK
Extracting 67.9.txt OK
Extracting 68.9.txt OK
Extracting 69.9.txt OK
Extracting 7.9.txt OK
Extracting 70.9.txt OK
Extracting 71.9.txt OK
Extracting 72.9.txt OK
Extracting 73.9.txt OK
Extracting 74.9.txt OK
Extracting 75.9.txt OK
Extracting 76.9.txt OK
Extracting 77.9.txt OK
Extracting 78.9.txt OK
Extracting 79.9.txt OK
Extracting 8.9.txt OK
Extracting 80.9.txt OK
Extracting 81.9.txt OK
Extracting 82.9.txt OK
Extracting 83.9.txt OK
Extracting 84.9.txt OK
Extracting 85.9.txt OK
Extracting 86.9.txt OK
Extracting 87.9.txt OK
Extracting 88.9.txt OK
Extracting 89.9.txt OK
Extracting 9.9.txt OK
Extracting 90.9.txt OK
Extracting 91.9.txt OK
Extracting 92.9.txt OK
Extracting 93.9.txt OK
Extracting 94.9.txt OK
Extracting 95.9.txt OK
Extracting 96.9.txt OK
Extracting 97.9.txt OK
Extracting 98.9.txt OK
Extracting 99.9.txt OK
All OK
(6588, 1647)

nn.Sequential {
[input -> (0) -> (1) -> output]
(0): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> output]
(0): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> output]
(0): ValidConvolution 3->8 C3
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): MaxPooling3/2
(5): ValidConvolution 8->16 C3
(6): BatchNormReLU(16,eps=0.0001,momentum=0.9,affine=True)
(7): ValidConvolution 16->16 C3
(8): BatchNormReLU(16,eps=0.0001,momentum=0.9,affine=True)
(9): MaxPooling3/2
(10): sparseconvnet.legacy.concatTable.ConcatTable {
input
|-> (0): ValidConvolution 16->16 C3 |-> (1): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> output]
(0): Convolution 16->8 C3/2
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): Deconvolution 8->8 C3/2
}
+. -> output
}
(11): JoinTable: 16 + 8 -> 24
(12): BatchNormReLU(24,eps=0.0001,momentum=0.9,affine=True)
(13): sparseconvnet.legacy.concatTable.ConcatTable {
input
|-> (0): ValidConvolution 24->16 C3 |-> (1): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> output]
(0): Convolution 24->8 C3/2
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): Deconvolution 8->8 C3/2
}
+. -> output
}
(14): JoinTable: 16 + 8 -> 24
(15): BatchNormReLU(24,eps=0.0001,momentum=0.9,affine=True)
(16): MaxPooling3/2
(17): sparseconvnet.legacy.concatTable.ConcatTable {
input
|-> (0): ValidConvolution 24->24 C3 |-> (1): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> output]
(0): Convolution 24->8 C3/2
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): Deconvolution 8->8 C3/2
}
+. -> output
}
(18): JoinTable: 24 + 8 -> 32
(19): BatchNormReLU(32,eps=0.0001,momentum=0.9,affine=True)
(20): sparseconvnet.legacy.concatTable.ConcatTable {
input
|-> (0): ValidConvolution 32->24 C3 |-> (1): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> output]
(0): Convolution 32->8 C3/2
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): Deconvolution 8->8 C3/2
}
+. -> output
}
(21): JoinTable: 24 + 8 -> 32
(22): BatchNormReLU(32,eps=0.0001,momentum=0.9,affine=True)
(23): MaxPooling3/2
}
(1): Convolution 32->64 C5/1
(2): BatchNormReLU(64,eps=0.0001,momentum=0.9,affine=True)
(3): SparseToDense(2)
}
(1): nn.Sequential {
[input -> (0) -> (1) -> output]
(0): nn.View(-1, 64)
(1): nn.Linear(64 -> 183)
}
}
('input spatial size',
95
95
[torch.LongTensor of size 2]
)
Replicating training set 10 times (1 epoch = 10 iterations through the training set = 10x6588 training samples)
{'weightDecay': 0.0001, 'initial_LR': 0.1, 'checkPoint': False, 'nEpochs': 100, 'LR_decay': 0.05, 'momentum': 0.9}
('#parameters', 97295)
THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu line=35 error=8 : invalid device function
Traceback (most recent call last):
File "VGGplus.py", line 38, in
{'nEpochs': 100, 'initial_LR': 0.1, 'LR_decay': 0.05, 'weightDecay': 1e-4})
File "/media/New_bt/ML/SparseConvNet/PyTorch/sparseconvnet/legacy/classificationTrainValidate.py", line 73, in ClassificationTrainValidate
model.forward(batch['input'])
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/legacy/nn/Module.py", line 33, in forward
return self.updateOutput(input)
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/legacy/nn/Sequential.py", line 36, in updateOutput
currentOutput = module.updateOutput(currentOutput)
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/legacy/nn/Sequential.py", line 36, in updateOutput
currentOutput = module.updateOutput(currentOutput)
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/legacy/nn/Sequential.py", line 36, in updateOutput
currentOutput = module.updateOutput(currentOutput)
File "/media/New_bt/ML/SparseConvNet/PyTorch/sparseconvnet/legacy/validConvolution.py", line 46, in updateOutput
torch.cuda.IntTensor() if input.features.is_cuda else nullptr)
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/utils/ffi/init.py", line 177, in safe_call
result = torch._C._safe_call(*args, **kwargs)
torch.FatalError: cuda runtime error (8) : invalid device function at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu:35

I guess this is a CUDA issue related to my GPU's compute capability.

I should use arch=compute_30,code=sm_30 because my GPU is an NVIDIA Quadro K4000 (compute capability 3.0).
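
As a side note, recent PyTorch versions can report the compute capability directly, so the right -gencode target can be read off rather than guessed (these helpers may be missing in very old PyTorch builds; nvidia-smi or the CUDA deviceQuery sample give the same information):

import torch

# A Quadro K4000 should report capability (3, 0), i.e. arch=compute_30,code=sm_30.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print('%s: compute capability %d.%d'
          % (torch.cuda.get_device_name(0), major, minor))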

oztc commented on July 26, 2024

My OS is:
uname -a
Linux debian 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux

python
Python 2.7.12 |Anaconda custom (64-bit)| (default, Jul 2 2016, 17:42:40)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org

oztc commented on July 26, 2024

My bug seems to be related to Torch, not SparseConvNet.
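
A minimal way to double-check that, assuming a working CUDA install, is to run a plain CUDA tensor operation with SparseConvNet out of the picture; if it fails with the same "invalid device function" error, the installed PyTorch binary simply was not built with kernels for this GPU's architecture:

import torch

# If the PyTorch wheel lacks kernels for this GPU's architecture, even a trivial
# CUDA op raises "invalid device function", independent of SparseConvNet.
x = torch.ones(4).cuda()
print(x + 1)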
