soumith / cuda-convnet2.torch
Torch7 bindings for cuda-convnet2 kernels!
License: Apache License 2.0
Hi,
I'm trying to run train.lua in the test folder, but with ccn2.SpatialConvolutionLocal (originally ccn2.SpatialConvolution). It keeps giving me this error:
/usr/local/bin/luajit: /usr/local/share/lua/5.1/ccn2/SpatialConvolutionLocal.lua:23: attempt to perform arithmetic on field 'kH' (a nil value)
Is there anything else I should know to use locally connected layers?
Thank you in advance :)
Hyungwon
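For what it's worth, a nil 'kH' in the constructor usually means the layer never received a kernel-size argument. A sketch of a construction that spells out every size argument — the argument order (nInputPlane, nOutputPlane, inputSize, kernelSize) is an assumption inferred from other snippets in this tracker, and running it requires a CUDA device:

```lua
require 'ccn2'

-- Assumed argument order: (nInputPlane, nOutputPlane, inputSize, kernelSize).
-- Unlike ccn2.SpatialConvolution, the local layer also needs the input spatial
-- size, because its per-location weights depend on the output size.
local conv = ccn2.SpatialConvolutionLocal(64, 128, 24, 9):cuda()

-- ccn2 layout: nInputPlane x height x width x batch
local input = torch.randn(64, 24, 24, 32):cuda()
local output = conv:forward(input)
```

If you ported a call like ccn2.SpatialConvolution(64, 128, 9) verbatim, the third argument would be read as the input size and the kernel size would be missing, which matches the nil-'kH' error.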
We discovered today that with 'blocked = true', the normalisation output differs from the ImageNet Caffe network's. Should it be set to false by default and passed as an option?
"Considerable speedup (1.5x under the VGG model with a miniBatch of 32, 1.1x under AlexNet with a miniBatch of 128); the optimizations focus on fully employing GPU-related functions." - @bestimage-tencent
(edit: I just missed something while checking the code, sorry.)
tmp/luarocks_ccn2-scm-1-5152/cuda-convnet2.torch/cudaconv3/src/filter_acts.cu(2086) : getLastCudaError() CUDA error : filterActs: kernel execution failed : (8) invalid device function.
I read that a few others got the same problem because of a different GPU version. My card is a "GeForce 820M".
conv_util.cu has multiple references to THCudaTensor_isSameSizeAs. When I Google this, I currently only find a link to this repository. I think I can work around this temporarily, since all uses are in assertions and therefore should have no side effects.
From the code, the weight matrix of a spatial convolution local layer is a 2D matrix:
self.weight = torch.Tensor(outputSize*nInputPlane*filterSize, nOutputPlane)
Is the 1st dimension's combination order exactly like the multiplications above? Or is there a way to decompose this 2D weight matrix into the form used by nn.SpatialConvolutionLocal,
such that it is a 6D tensor like the following:
self.weight = torch.Tensor(self.oH, self.oW, nOutputPlane, nInputPlane, kH, kW)
thanks.
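If the flat first dimension really is ordered as (oH, oW, nInputPlane, kH, kW) — an assumption about the layout, not something verified against the kernels — unpacking into the 6D nn.SpatialConvolutionLocal form would be a view plus a permute. The helper name is hypothetical:

```lua
-- Hypothetical helper: unpack a ccn2-style 2D local-conv weight of shape
-- (oH*oW*nInputPlane*kH*kW, nOutputPlane) into the 6D layout
-- (oH, oW, nOutputPlane, nInputPlane, kH, kW) used by nn.SpatialConvolutionLocal.
-- The assumed ordering of the flat first dimension is (oH, oW, nInputPlane, kH, kW);
-- verify against the kernels before relying on this.
local function unpackLocalWeight(w2, oH, oW, nInputPlane, nOutputPlane, kH, kW)
   return w2:view(oH, oW, nInputPlane, kH, kW, nOutputPlane)
            :permute(1, 2, 6, 3, 4, 5)  -- move nOutputPlane to dimension 3
            :contiguous()
end
```

If the ordering assumption is wrong, a round-trip check (copy weights into an nn.SpatialConvolutionLocal and compare forward outputs on a fixed input) would expose it quickly.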
Can I use cuda-convnet2 with a GTX 650? The following snippet (extracted from benchmark.lua) fails with the error message
/tmp/luarocks_ccn2-scm-1-7061/cuda-convnet2.torch/cudaconv3/src/filter_acts.cu(2085) : getLastCudaError() CUDA error : filterActs: kernel execution failed : (8) invalid device function .
require 'ccn2'
-- ccn2 layout: nInputPlane x height x width x batch
n = ccn2.SpatialConvolution(64, 128, 9, 1):cuda()
i = torch.randn(64, 64, 64, 128):cuda()
n:forward(i)
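An "invalid device function" error generally means the kernels were not compiled for the GPU's compute capability (the GTX 650 is Kepler, sm_30; the GeForce 820M mentioned above is Fermi, sm_21). A sketch of what to check — the exact file holding the NVCC flags in this repo may differ:

```shell
# Find where the CUDA arch flags are set in the build files
grep -rn "gencode\|arch=" CMakeLists.txt cudaconv3/

# Make sure a -gencode entry matching your card exists, e.g. for sm_30:
#   -gencode arch=compute_30,code=sm_30
# then rebuild and reinstall the rock (assumes a rockspec in the repo root)
luarocks make
```

Note that the cuda-convnet2 kernels were written for Kepler-class GPUs, so even with matching flags, older architectures may remain unsupported.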
The following assert exists in cuda-convnet2:
https://github.com/soumith/cuda-convnet2.torch/blob/master/cudaconv3/src/img_acts.cu#L1208
This causes failure in some cases that cuda-convnet2 should support, for example:
model = nn.Sequential()
model:add(ccn2.SpatialConvolutionLocal(16, 16, 63, 9))
model:add(nn.ReLU())
model:cuda()
input = torch.rand(16, 63, 63, 128):cuda()
-- backward() takes (input, gradOutput); a forward pass yields a gradOutput of the right shape
model:backward(input, model:forward(input))
will fail, because the number of filters is 16, which doesn't pass the assert check. However, the documentation here says that this number should be a multiple of 16, which 16 is.
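Until the assert is relaxed, one possible workaround is to round the filter count up so it passes the stricter check. This sketch assumes the assert actually requires a multiple of 32; the padding wastes compute, and the extra output maps must be ignored downstream:

```lua
require 'ccn2'

-- Hypothetical workaround: pad nOutputPlane up to the next multiple of 32 so
-- the img_acts assert passes; the extra maps are wasted computation.
local function padFilters(n, multiple)
   return math.ceil(n / multiple) * multiple
end

local nOut = padFilters(16, 32)  -- 32
local conv = ccn2.SpatialConvolutionLocal(16, nOut, 63, 9)
```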
I removed all parts of NVMatrix within cudaconv3, so all layers in cuda-convnet2 are now exposed as C functions that take THCudaTensor* arguments.
This weekend I will write lua/ffi wrappers around the now exposed C functions.
Feel free to contribute!
Cheers,
S
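The wrapper pattern described above would look roughly like this. The cdef signature is an illustrative assumption; the real exposed function names and argument lists must be copied from the cudaconv3 headers:

```lua
local ffi = require 'ffi'

-- Illustrative pattern only: the actual signatures live in the cudaconv3
-- headers and must be copied verbatim into the cdef.
ffi.cdef[[
typedef struct THCudaTensor THCudaTensor;
void convFilterActs(THCudaTensor* images, THCudaTensor* filters,
                    THCudaTensor* targets);
]]

local C = ffi.load('cudaconv3')

-- Thin Lua wrapper: unwrap the tensors and call straight into the C function.
local function filterActs(images, filters, targets)
   C.convFilterActs(images:cdata(), filters:cdata(), targets:cdata())
end
```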
Nvidia cards don't allow textures bigger than 512MB. Because this code uses texture memory, this imposes a limit on the sizes of various buffers. For example if your layer has too many filters (such that its output size exceeds 512MB), the code will crash.
TODO: add non-texture-using routines to bypass this.
Already tracked by Alex, this issue here will help me track this repo's progress on it.
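A back-of-the-envelope check for whether a layer's output buffer would hit the limit (float32 elements, the 512MB figure from above; the layer shape in the example is arbitrary):

```lua
-- Estimate an output buffer's size and compare against the 512MB texture limit.
local function outputBytes(nOutputPlane, oH, oW, batch)
   return nOutputPlane * oH * oW * batch * 4  -- 4 bytes per float32
end

local limit = 512 * 1024 * 1024
local bytes = outputBytes(256, 55, 55, 128)  -- 396,492,800 bytes: under the limit
print(bytes, bytes < limit)
```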
https://github.com/soumith/cuda-convnet2.torch/blob/master/SpatialConvolutionLocal.lua#L29
The bias vector is set to length (self.oH x self.oH x nOutputPlane), but it actually only ever holds nOutputPlane distinct values (one bias for each output map).
I just noticed that cuda-convnet2 has this partialSum thing, which is (a) undocumented and (b) much faster for accGradParameters. I'm only noticing it now, fml!
Have to add it.