tbennun / cudnn-training
A minimal CUDNN deep learning training code sample using LeNet.
I find that there is no dropout function in cuDNN, and from your code I think it would be hard for me to build a dropout function with CUDA myself. Do you have any experience building a dropout layer with CUDA?
When I try to make the project, I get:
$ make
[ 50%] Building NVCC (Device) object CMakeFiles/trainlenet.dir//./trainlenet_generated_lenet.cu.o
/home/moose/GitHub/cudnn-training/lenet.cu(176): warning: result of call is not used
/home/moose/GitHub/cudnn-training/lenet.cu(186): warning: result of call is not used
/home/moose/GitHub/cudnn-training/lenet.cu(224): warning: result of call is not used
/home/moose/GitHub/cudnn-training/lenet.cu(234): warning: result of call is not used
/home/moose/GitHub/cudnn-training/lenet.cu(346): error: argument of type "int" is incompatible with parameter of type "cudnnNanPropagation_t"
/home/moose/GitHub/cudnn-training/lenet.cu(346): error: too few arguments in function call
/home/moose/GitHub/cudnn-training/lenet.cu(346): error: argument of type "int" is incompatible with parameter of type "cudnnNanPropagation_t"
/home/moose/GitHub/cudnn-training/lenet.cu(346): error: too few arguments in function call
/home/moose/GitHub/cudnn-training/lenet.cu(417): error: argument of type "int" is incompatible with parameter of type "cudnnTensorFormat_t"
/home/moose/GitHub/cudnn-training/lenet.cu(417): error: too few arguments in function call
/home/moose/GitHub/cudnn-training/lenet.cu(417): error: argument of type "int" is incompatible with parameter of type "cudnnTensorFormat_t"
/home/moose/GitHub/cudnn-training/lenet.cu(417): error: too few arguments in function call
/home/moose/GitHub/cudnn-training/lenet.cu(475): error: identifier "CUDNN_ADD_SAME_C" is undefined
/home/moose/GitHub/cudnn-training/lenet.cu(475): error: identifier "cudnnAddTensor_v2" is undefined
/home/moose/GitHub/cudnn-training/lenet.cu(487): error: identifier "CUDNN_ADD_SAME_C" is undefined
/home/moose/GitHub/cudnn-training/lenet.cu(487): error: identifier "cudnnAddTensor_v2" is undefined
/home/moose/GitHub/cudnn-training/lenet.cu(513): error: argument of type "cudnnActivationMode_t" is incompatible with parameter of type "cudnnActivationDescriptor_t"
/home/moose/GitHub/cudnn-training/lenet.cu(513): error: argument of type "cudnnActivationMode_t" is incompatible with parameter of type "cudnnActivationDescriptor_t"
/home/moose/GitHub/cudnn-training/lenet.cu(579): error: argument of type "cudnnActivationMode_t" is incompatible with parameter of type "cudnnActivationDescriptor_t"
/home/moose/GitHub/cudnn-training/lenet.cu(579): error: argument of type "cudnnActivationMode_t" is incompatible with parameter of type "cudnnActivationDescriptor_t"
/home/moose/GitHub/cudnn-training/lenet.cu(604): error: identifier "cudnnConvolutionBackwardFilter_v2" is undefined
/home/moose/GitHub/cudnn-training/lenet.cu(608): error: identifier "cudnnConvolutionBackwardData_v2" is undefined
/home/moose/GitHub/cudnn-training/lenet.cu(621): error: identifier "cudnnConvolutionBackwardFilter_v2" is undefined
19 errors detected in the compilation of "/tmp/tmpxft_00002a85_00000000-13_lenet.compute_50.cpp1.ii".
CMake Error at trainlenet_generated_lenet.cu.o.cmake:260 (message):
Error generating file
/home/moose/GitHub/cudnn-training/build/CMakeFiles/trainlenet.dir//./trainlenet_generated_lenet.cu.o
make[2]: *** [CMakeFiles/trainlenet.dir/./trainlenet_generated_lenet.cu.o] Error 1
make[1]: *** [CMakeFiles/trainlenet.dir/all] Error 2
make: *** [all] Error 2
Do you have an idea what the problem is?
The memory of d_onevec was not released.
Hey, I am testing it on an embedded system.
cudnn-training$ ./trainlenet
Reading input data
Done. Training dataset size: 60000, Test dataset size: 10000
Batch size: 64, iterations: 1000
Preparing dataset
Training...
Iteration time: 12.190692 ms
Classification result: 8.43% error (used 10000 images)
Am I supposed to give an image as input? How is the result displayed?
The project looks interesting. What's the license type of this project? Thanks.
In the lenet.cu file at line 123 (lines 124-129):
// Filenames
DEFINE_bool(pretrained, false, "Use the pretrained CUDNN model as input");
DEFINE_string(train_images, "train-images-idx3-ubyte", "Training images filename");
DEFINE_string(train_labels, "train-labels-idx1-ubyte", "Training labels filename");
DEFINE_string(test_images, "t10k-images-idx3-ubyte", "Test images filename");
DEFINE_string(test_labels, "t10k-labels-idx1-ubyte", "Test labels filename");
The unpacked files on my machine use dots instead:
DEFINE_string(train_images, "train-images.idx3-ubyte", "Training images filename");
DEFINE_string(train_labels, "train-labels.idx1-ubyte", "Training labels filename");
DEFINE_string(test_images, "t10k-images.idx3-ubyte", "Test images filename");
DEFINE_string(test_labels, "t10k-labels.idx1-ubyte", "Test labels filename");
The code can only reach ~8% error on the MNIST dataset. Why is that? The original LeNet achieves around 0.21% error, so there is a huge performance gap.
I find that VS can't compile these two statements:
// Disable copying
TrainingContext& operator=(const TrainingContext&) = delete;
TrainingContext(const TrainingContext&) = delete;
And I chose to comment them out; after that I could compile successfully. I don't know whether these two statements are necessary.
However, when I debug it, there is an error in the test-images function, located at line 486 in lenet.cu: CUDNN_STATUS_EXECUTION_FAILED at cudnnConvolutionForward(). I looked it up in the documentation and it means the function failed to launch on the GPU. I don't know why; is it connected with commenting out those two statements? I really hope you can reply, thank you very much!
This is the output:
Hi Tal Ben-Nun,
This is not an issue, it's just for information: using cuda-on-cl, https://github.com/hughperkins/cuda-on-cl, I think it is now more or less possible to build and run your cudnn-training code, on an OpenCL device, without needing CUDA Toolkit etc. Simply needs some small changes to the CMakeLists.txt master...hughperkins:opencl
(I was/am using your code as a test bed for writing the cuda-on-cl compiler :-) )
make
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/ext2int2adm/minst-minist/build
[ 33%] Building NVCC (Device) object CMakeFiles/trainlenet.dir/trainlenet_generated_lenet.cu.o
/Users/ext2int2adm/minst-minist/lenet.cu(438): error: too few arguments in function call
/Users/ext2int2adm/minst-minist/lenet.cu(438): error: too few arguments in function call
2 errors detected in the compilation of "/var/folders/_p/rprk3ny52js8t0frzdn3strc0000gn/T//tmpxft_00014fe8_00000000-11_lenet.compute_50.cpp1.ii".
CMake Error at trainlenet_generated_lenet.cu.o.cmake:278 (message):
Error generating file
/Users/ext2int2adm/minst-minist/build/CMakeFiles/trainlenet.dir//./trainlenet_generated_lenet.cu.o
make[2]: *** [CMakeFiles/trainlenet.dir/trainlenet_generated_lenet.cu.o] Error 1
make[1]: *** [CMakeFiles/trainlenet.dir/all] Error 2
make: *** [all] Error 2
"You can also use the pre-trained weights published along with CUDNN, using the "pretrained" flag."
Hi,
First of all, thanks for writing this code using cuDNN and cuBLAS. It will be helpful for my current project. In your code, you initialised a flag named 'pretrained' for using a pre-trained model. I generated a model with Theano in Python and stored it as a .pkl file. Now I want to import it into this code as the pretrained model. I would like to know in what structure you are storing the network in .bin format and importing it (I couldn't find any .bin files in your repository). Any suggestions related to this issue are appreciated!
Hi
I have some problems compiling in VS2013. I got this error:
Error 1 error : argument of type "cudnnAddMode_t" is incompatible with parameter of type "const void *" D:\cudnn-training\lenet.cu cudnn-training
The program generates an error while executing the make command. It says:
cudnn-training/lenet.cu(438): error: too few arguments in function call
I am using cudnn 6 with CUDA 8 on Ubuntu 16.04.
After digging through some Google search results, I figured out that the parameters of the cudnnSetConvolution2dDescriptor function changed in version 6 of cuDNN, from 8 arguments to 9 arguments.
Source for the above conclusion.
train-images-idx3-ubyte.gz: training set images (9912422 bytes)
train-labels-idx1-ubyte.gz: training set labels (28881 bytes)
t10k-images-idx3-ubyte.gz: test set images (1648877 bytes)
t10k-labels-idx1-ubyte.gz: test set labels (4542 bytes)
Do we have to unpack them?
Hi tbennun,
first of all, thank you for the LeNet sample; it is a great tutorial for beginners like me.
Since you wrote the FromFile function, I wrote a ToFile one for reusing trained nets (just fwrite instead of fread!). Given that the to/from pair works correctly (I know that by simply comparing values), why am I not able to reach the same score as a freshly trained net? I mean, if I run your sample without modifications, the test after training gives, for example, 2% error. But if I save the weights after training and reload them (the very same weights) for the test, then I get 90%. I'm sure I'm missing something. There are many parameters in the ForwardPropagation function; do I have to save/restore them too? Am I missing some initializations?
Thank you in advance,
M.
I have the same runtime error as in issue #3. There the error is not discussed in detail and the proposed fix is to set the batch size in the test stage to 64.
In my opinion the correct batch size in the test stage should be 1 because we are classifying just one image.
Can you give me some advice? I am not sure whether I am doing something wrong or if there is a bug in the library.
Here are some more details about the problem.
Environment Description
I am using Visual Studio 2013 community edition on Win7 64. I have graphical card GeForce GTX 960.
Problem Details
I can compile the code successfully.
However when I run the example there is a run time error CUDNN_STATUS_EXECUTION_FAILED during the call test_context.ForwardPropagation()
at line #982.
After examination of the code I found the following behavior.
It has to do with the "batch size" parameter of the constructor calls at lines:
#783 // Initialize CUDNN/CUBLAS training context
#784 TrainingContext context(...)
and
#968// Initialize a TrainingContext structure for testing (different batch size)
#969 TrainingContext test_context(...).
The values of the "batch size" parameter are set to 64 and 1 during training and testing respectively.
I examined the code further and found that when the batch size is 64, the m_workspaceSize member of the TrainingContext structure is 3464; when the batch size is 1, m_workspaceSize is 19508.
I experimented with two other settings of the batch sizes. In both cases the program runs without error:
Thank you in advance.
Sorry to bother you if I'm wrong, but it seems there is no activation between the convolution layers...
I'm getting the following error during compilation. I am new to CMake, so I would appreciate guidance if I'm doing something wrong.
I ran cmake CMakeLists.txt
and get the following output.
-- The C compiler identification is AppleClang 7.0.2.7000181
-- The CXX compiler identification is AppleClang 7.0.2.7000181
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "7.5", minimum required is "6.5")
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/mainuser/projects/cudnn-training
Now I ran make and got the following error.
[ 33%] Building NVCC (Device) object CMakeFiles/trainlenet.dir/trainlenet_generated_lenet.cu.o
nvcc fatal : redefinition of argument 'std'
CMake Error at trainlenet_generated_lenet.cu.o.cmake:207 (message):
Error generating
/Users/adamlevy/projects/cudnn-training/CMakeFiles/trainlenet.dir//./trainlenet_generated_lenet.cu.o
make[3]: *** [CMakeFiles/trainlenet.dir/trainlenet_generated_lenet.cu.o] Error 1
make[2]: *** [CMakeFiles/trainlenet.dir/all] Error 2
make[1]: *** [CMakeFiles/trainlenet.dir/rule] Error 2
make: *** [trainlenet] Error 2
I'm running up-to-date OS X Yosemite on a 2012 MacBook Pro Retina.
Thank you,
Adam
Hi,
When I run your code on my computer, I get the following error:
CUDNN failure: CUDNN_STATUS_BAD_PARAM
/home/admin/cudnn-training-master/lenet.cu:658
Aborting...
It corresponds to a problem with the cudnnConvolutionBackwardFilter() function.
After checking the parameters against the cuDNN documentation, I was not able to find the problem.
Addresses sent as parameters:
Cudnn handle : 0x627eb60
Alpha : 0x7ffe073b5ce0
pool1Tensor : 0x122d6790
Pool1 : 0xb05c20000
conv2Tensor : 0x122d67f0
dpool2 : 0xb06540000
conv2Desc : 0x122d6ab0
conv2bwalgo : 2
workspace : 0xb06840000
m_workspaceSize : 11141120
Beta : 0x7ffe073b5cf0
conv2filterDesc : 0x122d6a10
gconv2 : 0xb058fa000
so, no NULL values.
And it seems that none of the other issues mentioned in the documentation apply to your program.
Do you have an idea how to solve this problem ?
Thanks,
Nicolas
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
src/conv_char.hpp(341): error: argument of type "int" is incompatible with parameter of type "cudnnTensorFormat_t"
detected during:
instantiation of "void Input_To_Hidden_Layer::init_Input_To_Hidden_Layer_GPU(int, int, int, int, __nv_bool, dType, __nv_bool, dType, neuralMT_model *, int, __nv_bool, dType *, __nv_bool, global_params &, __nv_bool) [with dType=precision]"
src/Input_To_Hidden_Layer.hpp(193): here
instantiation of "void Input_To_Hidden_Layer::init_Input_To_Hidden_Layer(int, int, int, int, __nv_bool, dType, __nv_bool, dType, neuralMT_model *, int, __nv_bool, dType, __nv_bool, __nv_bool, dType *, __nv_bool, global_params &, __nv_bool) [with dType=precision]"
src/model.hpp(109): here
instantiation of "void neuralMT_model::initModel(int, int, int, int, int, __nv_bool, dType, __nv_bool, dType, std::string, std::string, __nv_bool, __nv_bool, __nv_bool, int, int, __nv_bool, int, std::vector<int, std::allocator>, __nv_bool, dType, attention_params, global_params &) [with dType=precision]"
src/main.cu(1297): here
Can anyone help me out? T_T Maybe the cuDNN versions don't match?