Comments (7)

btgraham commented on July 26, 2024

I have switched arch=compute_20,code=sm_20 to arch=compute_30,code=sm_30 in the setup file.
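
For anyone making the same change by hand, here is a minimal sketch of what it amounts to. The variable names and the nvcc invocation below are illustrative, not the actual contents of the SparseConvNet build script; the only point is the -gencode target, where the deprecated Fermi flags are swapped for ones matching the installed GPU:

import subprocess

# Illustrative sketch only: the real build script may assemble the nvcc command
# differently. The deprecated Fermi target (compute_20/sm_20, deprecated since
# CUDA 8) is replaced with a Kepler target (compute_30/sm_30).
# old_arch_flags = ['-gencode', 'arch=compute_20,code=sm_20']
arch_flags = ['-gencode', 'arch=compute_30,code=sm_30']

cmd = ['nvcc', '-c', 'init.cu', '-o', 'init.cu.o',
       '--compiler-options', '-fPIC'] + arch_flags
subprocess.check_call(cmd)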

gnedster commented on July 26, 2024

The hello-world.py example works now. Thanks for the quick fix!

oztc commented on July 26, 2024

I have the same issue

btgraham commented on July 26, 2024

Hello. To help me debug, can you please show the output from:
cd SparseConvNet/PyTorch
python setup.py develop
ls sparseconvnet/SCN/
(Also, what OS? What Python version? Conda or not?)

oztc commented on July 26, 2024

Hi btgraham,

The following log is my output when I run "python setup.py develop" in SparseConvNet/PyTorch:

ozzie@debian:~/working/work/ML/SparseConvNet/PyTorch$ python setup.py develop

Building SCN module
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
generating /tmp/tmpS1UlkY/_SCN.c
running build_ext
building '_SCN' extension
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include -I/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/ozzie/anaconda2/include/python2.7 -c _SCN.c -o ./_SCN.o
gcc -pthread -shared -L/home/ozzie/anaconda2/lib -Wl,-rpath=/home/ozzie/anaconda2/lib,--no-as-needed ./_SCN.o /media/New_bt/ML/SparseConvNet/PyTorch/sparseconvnet/SCN/init.cu.o -L/home/ozzie/anaconda2/lib -lpython2.7 -o ./_SCN.so
running develop
running egg_info
creating sparseconvnet.egg-info
writing sparseconvnet.egg-info/PKG-INFO
writing top-level names to sparseconvnet.egg-info/top_level.txt
writing dependency_links to sparseconvnet.egg-info/dependency_links.txt
writing manifest file 'sparseconvnet.egg-info/SOURCES.txt'
reading manifest file 'sparseconvnet.egg-info/SOURCES.txt'
writing manifest file 'sparseconvnet.egg-info/SOURCES.txt'
running build_ext
Creating /home/ozzie/anaconda2/lib/python2.7/site-packages/sparseconvnet.egg-link (link to .)
Adding sparseconvnet 0.1 to easy-install.pth file

Installed /media/New_bt/ML/SparseConvNet/PyTorch
Processing dependencies for sparseconvnet==0.1
Finished processing dependencies for sparseconvnet==0.1

ozzie@debian:~/working/work/ML/SparseConvNet/examples/Assamese_handwriting$ python VGGplus.py
Downloading and preprocessing data ...
--2017-07-18 18:06:00-- https://archive.ics.uci.edu/ml/machine-learning-databases/00208/Online%20Handwritten%20Assamese%20Characters%20Dataset.rar
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.249
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.249|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8067448 (7.7M) [text/plain]
Saving to: ‘Online Handwritten Assamese Characters Dataset.rar’

Online Handwritten 100%[=====================>] 7.69M 1.22MB/s in 12s

2017-07-18 18:06:13 (671 KB/s) - ‘Online Handwritten Assamese Characters Dataset.rar’ saved [8067448/8067448]

UNRAR 5.30 beta 2 freeware Copyright (c) 1993-2015 Alexander Roshal

Extracting from Online Handwritten Assamese Characters Dataset.rar

Extracting data_table.pdf OK
Extracting 1.1.txt OK
Extracting 10.1.txt OK
Extracting 100.1.txt OK
Extracting 101.1.txt OK
Extracting 102.1.txt OK
Extracting 103.1.txt OK
Extracting 104.1.txt OK
Extracting 105.1.txt OK
Extracting 106.1.txt OK
Extracting 107.1.txt OK
Extracting 108.1.txt OK
Extracting 109.1.txt OK
................ (the middle "Extracting xxx.txt OK" lines were removed by Ozzie Zhang because the list is too long)
Extracting 53.9.txt OK
Extracting 54.9.txt OK
Extracting 55.9.txt OK
Extracting 56.9.txt OK
Extracting 57.9.txt OK
Extracting 58.9.txt OK
Extracting 59.9.txt OK
Extracting 6.9.txt OK
Extracting 60.9.txt OK
Extracting 61.9.txt OK
Extracting 62.9.txt OK
Extracting 63.9.txt OK
Extracting 64.9.txt OK
Extracting 65.9.txt OK
Extracting 66.9.txt OK
Extracting 67.9.txt OK
Extracting 68.9.txt OK
Extracting 69.9.txt OK
Extracting 7.9.txt OK
Extracting 70.9.txt OK
Extracting 71.9.txt OK
Extracting 72.9.txt OK
Extracting 73.9.txt OK
Extracting 74.9.txt OK
Extracting 75.9.txt OK
Extracting 76.9.txt OK
Extracting 77.9.txt OK
Extracting 78.9.txt OK
Extracting 79.9.txt OK
Extracting 8.9.txt OK
Extracting 80.9.txt OK
Extracting 81.9.txt OK
Extracting 82.9.txt OK
Extracting 83.9.txt OK
Extracting 84.9.txt OK
Extracting 85.9.txt OK
Extracting 86.9.txt OK
Extracting 87.9.txt OK
Extracting 88.9.txt OK
Extracting 89.9.txt OK
Extracting 9.9.txt OK
Extracting 90.9.txt OK
Extracting 91.9.txt OK
Extracting 92.9.txt OK
Extracting 93.9.txt OK
Extracting 94.9.txt OK
Extracting 95.9.txt OK
Extracting 96.9.txt OK
Extracting 97.9.txt OK
Extracting 98.9.txt OK
Extracting 99.9.txt OK
All OK
(6588, 1647)

nn.Sequential {
[input -> (0) -> (1) -> output]
(0): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> output]
(0): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> output]
(0): ValidConvolution 3->8 C3
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): MaxPooling3/2
(5): ValidConvolution 8->16 C3
(6): BatchNormReLU(16,eps=0.0001,momentum=0.9,affine=True)
(7): ValidConvolution 16->16 C3
(8): BatchNormReLU(16,eps=0.0001,momentum=0.9,affine=True)
(9): MaxPooling3/2
(10): sparseconvnet.legacy.concatTable.ConcatTable {
input
|-> (0): ValidConvolution 16->16 C3 |-> (1): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> output]
(0): Convolution 16->8 C3/2
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): Deconvolution 8->8 C3/2
}
+. -> output
}
(11): JoinTable: 16 + 8 -> 24
(12): BatchNormReLU(24,eps=0.0001,momentum=0.9,affine=True)
(13): sparseconvnet.legacy.concatTable.ConcatTable {
input
|-> (0): ValidConvolution 24->16 C3 |-> (1): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> output]
(0): Convolution 24->8 C3/2
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): Deconvolution 8->8 C3/2
}
+. -> output
}
(14): JoinTable: 16 + 8 -> 24
(15): BatchNormReLU(24,eps=0.0001,momentum=0.9,affine=True)
(16): MaxPooling3/2
(17): sparseconvnet.legacy.concatTable.ConcatTable {
input
|-> (0): ValidConvolution 24->24 C3 |-> (1): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> output]
(0): Convolution 24->8 C3/2
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): Deconvolution 8->8 C3/2
}
+. -> output
}
(18): JoinTable: 24 + 8 -> 32
(19): BatchNormReLU(32,eps=0.0001,momentum=0.9,affine=True)
(20): sparseconvnet.legacy.concatTable.ConcatTable {
input
|-> (0): ValidConvolution 32->24 C3 |-> (1): nn.Sequential {
[input -> (0) -> (1) -> (2) -> (3) -> (4) -> output]
(0): Convolution 32->8 C3/2
(1): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(2): ValidConvolution 8->8 C3
(3): BatchNormReLU(8,eps=0.0001,momentum=0.9,affine=True)
(4): Deconvolution 8->8 C3/2
}
+. -> output
}
(21): JoinTable: 24 + 8 -> 32
(22): BatchNormReLU(32,eps=0.0001,momentum=0.9,affine=True)
(23): MaxPooling3/2
}
(1): Convolution 32->64 C5/1
(2): BatchNormReLU(64,eps=0.0001,momentum=0.9,affine=True)
(3): SparseToDense(2)
}
(1): nn.Sequential {
[input -> (0) -> (1) -> output]
(0): nn.View(-1, 64)
(1): nn.Linear(64 -> 183)
}
}
('input spatial size',
95
95
[torch.LongTensor of size 2]
)
Replicating training set 10 times (1 epoch = 10 iterations through the training set = 10x6588 training samples)
{'weightDecay': 0.0001, 'initial_LR': 0.1, 'checkPoint': False, 'nEpochs': 100, 'LR_decay': 0.05, 'momentum': 0.9}
('#parameters', 97295)
THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu line=35 error=8 : invalid device function
Traceback (most recent call last):
File "VGGplus.py", line 38, in
{'nEpochs': 100, 'initial_LR': 0.1, 'LR_decay': 0.05, 'weightDecay': 1e-4})
File "/media/New_bt/ML/SparseConvNet/PyTorch/sparseconvnet/legacy/classificationTrainValidate.py", line 73, in ClassificationTrainValidate
model.forward(batch['input'])
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/legacy/nn/Module.py", line 33, in forward
return self.updateOutput(input)
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/legacy/nn/Sequential.py", line 36, in updateOutput
currentOutput = module.updateOutput(currentOutput)
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/legacy/nn/Sequential.py", line 36, in updateOutput
currentOutput = module.updateOutput(currentOutput)
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/legacy/nn/Sequential.py", line 36, in updateOutput
currentOutput = module.updateOutput(currentOutput)
File "/media/New_bt/ML/SparseConvNet/PyTorch/sparseconvnet/legacy/validConvolution.py", line 46, in updateOutput
torch.cuda.IntTensor() if input.features.is_cuda else nullptr)
File "/home/ozzie/anaconda2/lib/python2.7/site-packages/torch/utils/ffi/init.py", line 177, in safe_call
result = torch._C._safe_call(*args, **kwargs)
torch.FatalError: cuda runtime error (8) : invalid device function at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMath.cu:35

I guess this is a CUDA issue related to my GPU's compute capability.

I should use arch=compute_30,code=sm_30 because my GPU is an NVIDIA Quadro K4000 (compute capability 3.0).
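
As a side note, recent PyTorch versions can report the compute capability directly, so the right -gencode target can be read off rather than guessed (these helpers may be missing in very old PyTorch builds; nvidia-smi or the CUDA deviceQuery sample give the same information):

import torch

# A Quadro K4000 should report capability (3, 0), i.e. arch=compute_30,code=sm_30.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print('%s: compute capability %d.%d'
          % (torch.cuda.get_device_name(0), major, minor))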

oztc commented on July 26, 2024

My OS is:
uname -a
Linux debian 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux

python
Python 2.7.12 |Anaconda custom (64-bit)| (default, Jul 2 2016, 17:42:40)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org

oztc commented on July 26, 2024

My bug seems to be related to Torch, not SparseConvNet.
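
A minimal way to double-check that, assuming a working CUDA install, is to run a plain CUDA tensor operation with SparseConvNet out of the picture; if it fails with the same "invalid device function" error, the installed PyTorch binary simply was not built with kernels for this GPU's architecture:

import torch

# If the PyTorch wheel lacks kernels for this GPU's architecture, even a trivial
# CUDA op raises "invalid device function", independent of SparseConvNet.
x = torch.ones(4).cuda()
print(x + 1)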
