
warp-ctc's Introduction


In Chinese 中文版

warp-ctc

A fast parallel implementation of CTC, on both CPU and GPU.

Introduction

Connectionist Temporal Classification is a loss function useful for performing supervised learning on sequence data, without needing an alignment between input data and labels. For example, CTC can be used to train end-to-end systems for speech recognition, which is how we have been using it at Baidu's Silicon Valley AI Lab.

[Figure: CTC alignment illustration]

The illustration above shows CTC computing the probability of an output sequence "THE CAT ", as a sum over all possible alignments of input sequences that could map to "THE CAT ", taking into account that labels may be duplicated because they may stretch over several time steps of the input data (represented by the spectrogram at the bottom of the image). Computing the sum of all such probabilities explicitly would be prohibitively costly due to the combinatorics involved, but CTC uses dynamic programming to dramatically reduce the complexity of the computation. Because CTC is a differentiable function, it can be used during standard SGD training of deep neural networks.
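
For reference, the dynamic program in question is the standard CTC forward recurrence (Graves et al., 2006); the sketch below is illustrative notation, not code taken from this repository. With the label sequence augmented by blanks between every pair of labels and at both ends (so |l'| = 2|l| + 1), and with p_t(k) denoting the network's output probability for symbol k at time t, the forward variables satisfy

$$
\alpha_t(s) = \Big(\alpha_{t-1}(s) + \alpha_{t-1}(s-1) + \lambda_s\,\alpha_{t-1}(s-2)\Big)\, p_t(l'_s),
\qquad
\lambda_s =
\begin{cases}
0 & \text{if } l'_s \text{ is blank or } l'_s = l'_{s-2},\\
1 & \text{otherwise,}
\end{cases}
$$

and the total probability of the output sequence is $\alpha_T(|l'|) + \alpha_T(|l'| - 1)$, so the cost is $O(T\,|l'|)$ rather than exponential in $T$.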

In our lab, we focus on scaling up recurrent neural networks, and CTC loss is an important component. To make our system efficient, we parallelized the CTC algorithm, as described in this paper. This project contains our high performance CPU and CUDA versions of the CTC loss, along with bindings for Torch. The library provides a simple C interface, so that it is easy to integrate into deep learning frameworks.

Beyond the speedup from a faster parallel CTC implementation, this implementation has also improved training scalability: for GPU-focused training pipelines, the ability to keep all data local to GPU memory lets us spend interconnect bandwidth on increased data parallelism.

Performance

Our CTC implementation is efficient compared with many of the other publicly available implementations. It is also written to be as numerically stable as possible. The algorithm is numerically sensitive, and we have observed catastrophic underflow even in double precision with the standard calculation: the result of dividing two numbers on the order of 1e-324, which should have been approximately one, instead became infinity when the denominator underflowed to 0. Performing the calculation in log space instead makes it numerically stable even in single-precision floating point, at the cost of significantly more expensive operations: instead of one machine instruction, each addition requires evaluating multiple transcendental functions. Because of this, the speed of CTC implementations can only be fairly compared if they perform the calculation the same way.
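
To illustrate the cost of log-space arithmetic (a generic sketch, not warp-ctc's internal code): adding two probabilities that are stored as logarithms requires exponentials and a logarithm rather than a single hardware add.

```c
#include <math.h>

/* Generic log-space addition: returns log(exp(a) + exp(b)).
 * Not taken from warp-ctc's source; shown only to illustrate why
 * log-space addition costs several transcendental evaluations.
 * Factoring out the max keeps the argument of expf() <= 0, so the
 * intermediate never overflows and precision survives even in
 * single-precision floats. */
static float log_add(float a, float b) {
    if (isinf(a) && a < 0.0f) return b;  /* a represents log(0) */
    if (isinf(b) && b < 0.0f) return a;  /* b represents log(0) */
    float m = fmaxf(a, b);
    return m + log1pf(expf(fminf(a, b) - m));
}
```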

We compare our performance with Eesen, with a CTC implementation built on Theano, and with Stanford-CTC, a Cython CPU-only implementation. We benchmark the Theano implementation operating on 32-bit floating-point numbers and doing the calculation in log space, in order to match the other implementations we compare against. Stanford-CTC was modified to perform the calculation in log space, as it did not support that natively. It also does not support minibatches larger than 1, so it would require an awkward memory layout to use in a real training pipeline; we assume a linear increase in cost with minibatch size.

We show results on two problem sizes relevant to our English and Mandarin end-to-end models, respectively, where T represents the number of timesteps in the input to CTC, L represents the length of the labels for each example, A represents the alphabet size, and N (in the tables below) represents the minibatch size.

On the GPU, our performance at a minibatch of 64 examples ranges from 7x to 155x faster than Eesen (e.g., at N=64 in the A=5000 table, 2475 ms / 16 ms ≈ 155x), and from 46x to 68x faster than the Theano implementation.

GPU Performance

Benchmarked on a single NVIDIA Titan X GPU.

| T=150, L=40, A=28 | warp-ctc | Eesen  | Theano |
|-------------------|----------|--------|--------|
| N=1               | 3.1 ms   | .5 ms  | 67 ms  |
| N=16              | 3.2 ms   | 6 ms   | 94 ms  |
| N=32              | 3.2 ms   | 12 ms  | 119 ms |
| N=64              | 3.3 ms   | 24 ms  | 153 ms |
| N=128             | 3.5 ms   | 49 ms  | 231 ms |

| T=150, L=20, A=5000 | warp-ctc | Eesen   | Theano  |
|---------------------|----------|---------|---------|
| N=1                 | 7 ms     | 40 ms   | 120 ms  |
| N=16                | 9 ms     | 619 ms  | 385 ms  |
| N=32                | 11 ms    | 1238 ms | 665 ms  |
| N=64                | 16 ms    | 2475 ms | 1100 ms |
| N=128               | 23 ms    | 4950 ms | 2100 ms |

CPU Performance

Benchmarked on a dual-socket machine with two Intel E5-2660 v3 processors - warp-ctc used 40 threads to maximally take advantage of the CPU resources. Eesen doesn't provide a CPU implementation. We noticed that the Theano implementation was not parallelizing computation across multiple threads. Stanford-CTC provides no mechanism for parallelization across threads.

| T=150, L=40, A=28 | warp-ctc | Stanford-CTC | Theano  |
|-------------------|----------|--------------|---------|
| N=1               | 2.6 ms   | 13 ms        | 15 ms   |
| N=16              | 3.4 ms   | 208 ms       | 180 ms  |
| N=32              | 3.9 ms   | 416 ms       | 375 ms  |
| N=64              | 6.6 ms   | 832 ms       | 700 ms  |
| N=128             | 12.2 ms  | 1684 ms      | 1340 ms |

| T=150, L=20, A=5000 | warp-ctc | Stanford-CTC | Theano   |
|---------------------|----------|--------------|----------|
| N=1                 | 21 ms    | 31 ms        | 850 ms   |
| N=16                | 37 ms    | 496 ms       | 10800 ms |
| N=32                | 54 ms    | 992 ms       | 22000 ms |
| N=64                | 101 ms   | 1984 ms      | 42000 ms |
| N=128               | 184 ms   | 3968 ms      | 86000 ms |

Interface

The interface is in include/ctc.h. It supports CPU or GPU execution, and you can specify OpenMP parallelism when running on the CPU, or the CUDA stream when running on the GPU. We took care to ensure that the library does not perform memory allocation internally, to avoid the synchronizations and overheads that allocation would cause.
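
Below is a minimal sketch of driving the CPU path through this interface. The function and field names follow include/ctc.h as we understand it, but treat the exact signatures as an assumption and check them against the header in your checkout.

```c
/* Sketch of a single CPU call into warp-ctc via include/ctc.h.
 * Signatures and option fields are assumptions based on the public
 * header; verify against ctc.h before relying on them. */
#include <stdio.h>
#include <stdlib.h>
#include "ctc.h"

int main(void) {
    /* One example: 1 timestep, alphabet of 5 (label 0 is the blank). */
    const int alphabet_size = 5, minibatch = 1;
    float activations[5] = {0.1f, 0.6f, 0.1f, 0.1f, 0.1f}; /* pre-softmax */
    float gradients[5];
    int flat_labels[1]   = {1};  /* labels for the minibatch, flattened */
    int label_lengths[1] = {1};
    int input_lengths[1] = {1};  /* timesteps per example */
    float costs[1];

    ctcOptions options = {0};
    options.loc = CTC_CPU;
    options.num_threads = 1;     /* OpenMP threads for the CPU path */

    /* The library never allocates internally: ask it how much scratch
       space it needs, allocate that ourselves, and pass it in. */
    size_t workspace_bytes;
    get_workspace_size(label_lengths, input_lengths, alphabet_size,
                       minibatch, options, &workspace_bytes);
    void* workspace = malloc(workspace_bytes);

    compute_ctc_loss(activations, gradients, flat_labels, label_lengths,
                     input_lengths, alphabet_size, minibatch, costs,
                     workspace, options);

    printf("CTC loss: %f\n", costs[0]);
    free(workspace);
    return 0;
}
```

For the GPU path the same pattern should apply, with options.loc set to CTC_GPU, a CUDA stream supplied, and the workspace allocated with cudaMalloc, since no allocation happens inside the library.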

Compilation

warp-ctc has been tested on Ubuntu 14.04 and OSX 10.10. Windows is not supported at this time.

First get the code:

git clone https://github.com/baidu-research/warp-ctc.git
cd warp-ctc

create a build directory:

mkdir build
cd build

If you have a non-standard CUDA install, export CUDA_BIN_PATH=/path_to_cuda so that CMake detects CUDA. To ensure Torch is detected, make sure th is in $PATH.

run cmake and build:

cmake ../
make

The C library and torch shared libraries should now be built along with test executables. If CUDA was detected, then test_gpu will be built; test_cpu will always be built.

Tests

To run the tests, make sure the CUDA libraries are in LD_LIBRARY_PATH (DYLD_LIBRARY_PATH for OSX).
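
For a typical Linux CUDA install this might look like the following (the exact path varies by system):

export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH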

The Torch tests must be run from the torch_binding/tests/ directory.

Torch Installation

luarocks make torch_binding/rocks/warp-ctc-scm-1.rockspec

You can also install without cloning the repository using

luarocks install http://raw.githubusercontent.com/baidu-research/warp-ctc/master/torch_binding/rocks/warp-ctc-scm-1.rockspec

There is a Torch CTC tutorial.

Contributing

We welcome improvements from the community; please feel free to submit pull requests.

Known Issues / Limitations

The CUDA implementation requires a device of at least compute capability 3.0.

The CUDA implementation supports a maximum label length of 639 (timesteps are unlimited).

warp-ctc's People

Contributors

arkizh, bryancatanzaro, bshillingford, dzhwinter, ekelsen, est31, gangliao, hawkaaron, iassael, jaredcasper, lunararcanus, luoyetx, pmixer, shubho, umhan35, wangchaochaohu, windstamp, xreki, zhwesky2010, zlsh80826


warp-ctc's Issues

Installation on Ubuntu 16.04 fails

[ 10%] Building NVCC (Device) object CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o
/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00001829_00000000-16_reduce.compute_52.cpp1.ii".
CMake Error at warpctc_generated_reduce.cu.o.cmake:266 (message):
  Error generating file
  /home/sarvex/warp-ctc/build/CMakeFiles/warpctc.dir/src/./warpctc_generated_reduce.cu.o


CMakeFiles/warpctc.dir/build.make:70: recipe for target 'CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o' failed
make[2]: *** [CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o] Error 1
CMakeFiles/Makefile2:141: recipe for target 'CMakeFiles/warpctc.dir/all' failed
    make[1]: *** [CMakeFiles/warpctc.dir/all] Error 2
    Makefile:127: recipe for target 'all' failed
    make: *** [all] Error 2

Number of epochs required

Hello,
Can I kindly ask for advice on the recommended number of epochs to run to achieve non-blank predictions when using the CTC code? My network always tends to predict the blank symbol, and I think this is an effect of the number of epochs run (30 for now).
Can someone give advice on the issue? Do you expect such behavior to be an effect of the number of epochs, or of the size of the training data?

Any help would be very much appreciated. Thank you !

Please look at the error when I run "test_cpu"

Following the Compilation instructions above, I built the "test_cpu" executable.
When I try to run the program, it errors like this:
$ ./test_cpu
./test_cpu: error while loading shared libraries: libwarpctc.so: cannot open shared object file: No such file or directory
"libwarpctc.so" exists. What is the problem?

Torch installation without cloning fails

When I try to install warp-ctc for Torch, it simply fails with the error as shown below.

$> luarocks install http://raw.githubusercontent.com/baidu-research/warp-ctc/master/torch_binding/rocks/warp-ctc-scm-1.rockspec

Using http://raw.githubusercontent.com/baidu-research/warp-ctc/master/torch_binding/rocks/warp-ctc-scm-1.rockspec... switching to 'build' mode

Error: Error fetching file: Failed downloading http://raw.githubusercontent.com/baidu-research/warp-ctc/master/torch_binding/rocks/warp-ctc-scm-1.rockspec - warp-ctc-scm-1.rockspec

However, manually cloning the repo works.

installing without CUDA

I'm trying to install warp-ctc on a google compute instance that does not have CUDA.

Below is the output from both the cmake and make step:

CMakeLists.txt	doc  examples  include	LICENSE  python  README.md  src  tests
root@rnn-permanent-kaldi:/srv/deepspeech/src/transforms/warp-ctc# mkdir build
root@rnn-permanent-kaldi:/srv/deepspeech/src/transforms/warp-ctc# cd build/
root@rnn-permanent-kaldi:/srv/deepspeech/src/transforms/warp-ctc/build# cmake ..
-- The C compiler identification is GNU 4.9.2
-- The CXX compiler identification is GNU 4.9.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
CUDA_TOOLKIT_ROOT_DIR not found or specified
-- Could NOT find CUDA (missing:  CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY) (Required is at least version "6.5")
-- cuda found FALSE
-- Building shared library with no GPU support
-- Configuring done
-- Generating done
-- Build files have been written to: /srv/deepspeech/src/transforms/warp-ctc/build
root@rnn-permanent-kaldi:/srv/deepspeech/src/transforms/warp-ctc/build# make
Scanning dependencies of target warpctc
[ 50%] Building CXX object CMakeFiles/warpctc.dir/src/ctc_entrypoint.cpp.o
/srv/deepspeech/src/transforms/warp-ctc/src/ctc_entrypoint.cpp:49:30: error: ‘cudaStream_t’ has not been declared
                              cudaStream_t stream,
                              ^
/srv/deepspeech/src/transforms/warp-ctc/src/ctc_entrypoint.cpp: In function ‘int compute_ctc_gpu(const float*, float*, const int*, const int*, const int*, int, int, float*, int, char*)’:
/srv/deepspeech/src/transforms/warp-ctc/src/ctc_entrypoint.cpp:50:53: error: conflicting declaration of C function ‘int compute_ctc_gpu(const float*, float*, const int*, const int*, const int*, int, int, float*, int, char*)’
                              char *ctc_gpu_workspace){
                                                     ^
In file included from /srv/deepspeech/src/transforms/warp-ctc/src/ctc_entrypoint.cpp:5:0:
/srv/deepspeech/src/transforms/warp-ctc/include/ctc.h:99:5: note: previous declaration ‘int compute_ctc_gpu(const float*, float*, const int*, const int*, const int*, int, int, float*, CUstream, char*)’
 int compute_ctc_gpu(const float* const activations,
     ^
CMakeFiles/warpctc.dir/build.make:54: recipe for target 'CMakeFiles/warpctc.dir/src/ctc_entrypoint.cpp.o' failed
make[2]: *** [CMakeFiles/warpctc.dir/src/ctc_entrypoint.cpp.o] Error 1
CMakeFiles/Makefile2:95: recipe for target 'CMakeFiles/warpctc.dir/all' failed
make[1]: *** [CMakeFiles/warpctc.dir/all] Error 2
Makefile:117: recipe for target 'all' failed
make: *** [all] Error 2

I'm not familiar enough with C compilation to know what this error is or how to fix it, but I'm assuming it has something to do with the makefile not finding CUDA.

Any help would be greatly appreciated.

test_gpu Fails!

I am unable to run ./test_gpu; I've copied the result of ldd test_gpu below:
linux-vdso.so.1 => (0x00007ffe271e6000)
libcudart.so.7.5 => /usr/local/cuda/lib64/libcudart.so.7.5 (0x00007f96c0807000)
libwarpctc.so => /home/sarunac4/baidu/warp-ctc/build/libwarpctc.so (0x00007f96c0450000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f96c014c000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f96bfe46000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f96bfc30000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f96bf86b000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f96bf667000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f96bf449000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f96bf241000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f96bf032000)
/lib64/ld-linux-x86-64.so.2 (0x00007f96c0a65000)
But when I try running ./test_gpu I get the error
Running GPU tests
terminate called after throwing an instance of 'thrust::system::system_error'
what(): cudaSetDevice: unknown error
Aborted (core dumped)

I am running on Ubuntu 14.04, with a TitanX GPU and have the NVIDIA driver version 352.63 installed, any help would be appreciated.

Regards,
Deepak Kadetotad
PhD Student

Loss is inf

When I train a model using warp-ctc, the CTC criterion returns a loss that is inf. Is there anything wrong? How can I solve this problem?

The installation of warp-ctc

Hello,
May I kindly ask for help with the installation of warp-ctc? When I install warp-ctc by running "luarocks make torch_binding/rocks/warp-ctc-scm-1.rockspec" at the top-level directory, an error occurs as follows:
Missing dependencies for warp-ctc:
torch >= 7.0

Error: Could not satisfy dependency: torch >= 7.0
I don't have any idea about this.
Has anyone encountered this problem? Can you give me some advice? Any help would be appreciated. Thanks.

torch tutorial

In the first paragraph, the tutorial says it uses the four characters 'abcd', but why does it produce 'daceba' at the end?

Attribute error: Module object has no attribute "warpCTC"

Traceback (most recent call last):
File "lstm_ocr.py", line 186, in
symbol = sym_gen(SEQ_LENGTH)
File "lstm_ocr.py", line 177, in sym_gen
num_label = num_label)
File "/home/pratikgoyal/Desktop/lstm.py", line 76, in lstm_unroll
sm = mx.sym.WarpCTC(data=pred, label=label, label_length = num_label, input_length = seq_len)

I'm getting the attribute error mentioned above while running lstm_ocr.py.

Using the torch bindings to experiment with CTC

Hello,
May I kindly ask for some help? While using the torch bindings to experiment with CTC, I want to use the code to do some calculation via require 'cutorch'; however, an error occurs as follows:
[screenshot]

And I have compiled with GPU support and a standard CUDA install.
When I do the experiment on the CPU, the situation is as follows:
[screenshot]

So I have no idea why there is a difference between them.
Has anyone encountered this problem? Can you give me some advice? Any help would be appreciated. Thanks.

Building error

Have you encountered the following error?

Scanning dependencies of target warpctc
Linking CXX shared library libwarpctc.so
/usr/bin/ld: cannot find -lTHC
collect2: error: ld returned 1 exit status
make[2]: *** [libwarpctc.so] Error 1
make[1]: *** [CMakeFiles/warpctc.dir/all] Error 2
make: *** [all] Error 2

How to overcome it?

dlopen: cannot load any more object with static TLS

warpctc_tensorflow sometimes raises this error if it is imported after tensorflow imports another library that uses a *.so file, like in the following case:

>>> import tensorflow.contrib.ffmpeg
>>> import warpctc_tensorflow

tensorflow.python.framework.errors_impl.NotFoundError: dlopen: cannot load any more object with static TLS

A temporary fix is to just import warpctc_tensorflow beforehand, which seems to not trigger the error.

>>> import warpctc_tensorflow
>>> import tensorflow.contrib.ffmpeg

This fix, however, is quite ugly, and often means importing warpctc_tensorflow in the central main.py or train.py, as well as all associated test runners, etc. The actual ctc function is often only called in a small library function.

The gpu_ctc function returns 0

Hello, when I use the gpu_ctc function it always returns 0, but with cpu_ctc it returns the correct value. Can anyone give me some advice on this? Any help would be appreciated, many thanks.

[screenshot]

[screenshot]

Does not work on GTX 1080

error test_gpu:

Running GPU tests
terminate called after throwing an instance of 'std::runtime_error'
  what():  Error: compute_ctc_loss in small_test, stat = execution failed
Aborted (core dumped)

Building error, OpenBlas

make[2]: *** No rule to make target `/opt/OpenBLAS/lib/libopenblas.so', needed by `libwarpctc.so'.  Stop.
make[1]: *** [CMakeFiles/warpctc.dir/all] Error 2
make: *** [all] Error 2

I have OpenBLAS installed not in /opt/OpenBLAS/lib, but in another local folder (I do not have root access). Is it possible to specify where the library should look for OpenBLAS?

Gradient output from batch using torch binding

Sorry if this is answered elsewhere or blatantly obvious, but I'm not entirely sure of the formatting of the gradients after carrying out a batch like below:

th>acts = torch.Tensor({{0,0,0,0,0},{1,2,3,4,5},{-5,-4,-3,-2,-1},
                        {0,0,0,0,0},{6,7,8,9,10},{-10,-9,-8,-7,-6},
                        {0,0,0,0,0},{11,12,13,14,15},{-15,-14,-13,-12,-11}}):cuda()
th>labels = {{1}, {3,3}, {2,3}}
th>sizes = {1,3,3}
th>grads = torch.Tensor(acts:size())
th>gpu_ctc(acts, grads, labels, sizes)

{
  1 : 1.6094379425049
  2 : 7.355742931366
  3 : 4.938850402832
}

Should we expect the gradients to also be in column-major format, matching how we laid out the multiple input sequences (i.e., would we need to reverse, for the gradients, the batching steps we applied to the activations)? Thanks!

Decoding?

Just making sure: this repo only contains the code used for training, not for decoding. Is that correct? (The Deep Speech 2 paper mentions using beam search to find the optimal transcription, but I don't see this in the code.)

Support for Lua5.2?

Is warp-ctc not supported for Lua 5.2? I had previously installed this package successfully on my Torch distribution built with Lua 5.1; however, I had to upgrade my Torch to Lua 5.2, and suddenly the warp-ctc installation fails.

mohit.jain@node10:~/torch/warp-ctc$ luarocks make torch_binding/rocks/warp-ctc-scm-1.rockspec
Warning: unmatched variable LUALIB
cmake -E make_directory build && cd build && cmake .. -DLUALIB= -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/users/mohit.jain/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/users/mohit.jain/torch/install/lib/luarocks/rocks/warp-ctc/scm-1" && make -j$(getconf _NPROCESSORS_ONLN) && make install

-- cuda found TRUE
-- Found Torch7 in /users/mohit.jain/torch/install
-- Torch found /users/mohit.jain/torch/install/share/cmake/torch
-- Building shared library with GPU support
-- Building Torch Bindings with GPU support
-- Configuring done
-- Generating done
-- Build files have been written to: /users/mohit.jain/torch/warp-ctc/build
Linking CXX shared library libwarpctc.so
/usr/bin/ld: cannot find -lluajit
collect2: error: ld returned 1 exit status
make[2]: *** [libwarpctc.so] Error 1
make[1]: *** [CMakeFiles/warpctc.dir/all] Error 2
make: *** [all] Error 2

Error: Build error: Failed building.

mohit.jain@node10:~/torch/warp-ctc/build$ cmake ../
-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is GNU 4.8.4
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found CUDA: /usr/local/cuda (found suitable version "7.5", minimum required is "6.5") 
-- cuda found TRUE
-- Found Torch7 in /users/mohit.jain/torch/install
-- Torch found /users/mohit.jain/torch/install/share/cmake/torch
-- Building shared library with GPU support
-- Building Torch Bindings with GPU support
-- Configuring done
-- Generating done
-- Build files have been written to: /users/mohit.jain/torch/warp-ctc/build
mohit.jain@node10:~/torch/warp-ctc/build$ make
[ 16%] Building NVCC (Device) object CMakeFiles/warpctc.dir/src/./warpctc_generated_reduce.cu.o
[ 33%] Building NVCC (Device) object CMakeFiles/warpctc.dir/src/./warpctc_generated_ctc_entrypoint.cu.o
Scanning dependencies of target warpctc
Linking CXX shared library libwarpctc.so
/usr/bin/ld: cannot find -lluajit
collect2: error: ld returned 1 exit status
make[2]: *** [libwarpctc.so] Error 1
make[1]: *** [CMakeFiles/warpctc.dir/all] Error 2
make: *** [all] Error 2

Input size much larger than label length

I find that if the input sequence length (e.g., 10) is much larger than the label length (e.g., 4), the probability of blank becomes much larger than that of the other labels.

How can I solve this problem?

Build warp-ctc shared/static libs

Sometimes, we need to build a static warp-ctc library.

Also, we don't want to bind with Torch.

So, the PR is here: #65. It does not change the current behavior of warp-ctc.

Meaning of N

Hello,

Could you please explain what N symbol means in README.md benchmark tables?
L, A, T dimensions are described well but I cannot find a description for N.

Sorry if I am missing something obvious.
Anyway, thank you for your work,
Pavel

warp-ctc for cmake ExternalProject_Add

I planned to integrate warp-ctc into PaddlePaddle via ExternalProject_Add.

But there is a problem with the current warp-ctc CMake build system, which automatically checks whether the host system supports CUDA.

We want to control CPU/GPU from our own system, not from warp-ctc.

ExternalProject_Add(
    warpctc
    GIT_REPOSITORY "https://github.com/baidu-research/warp-ctc.git"
    GIT_TAG "v1.0"
    PREFIX ${WARPCTC_SOURCES_DIR}
    CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${WARPCTC_INSTALL_DIR}
    CMAKE_ARGS -DWITH_GPU=ON   # we want to add this flag in warp-ctc
    LOG_DOWNLOAD=ON
    UPDATE_COMMAND ""
)

I want to give a PR for warp-ctc's CMakeLists.txt.

option(WITH_GPU "compile warp-ctc with gpu" ${CUDA_FOUND})

"gpu_ctc" problem ?

When I try the example, I encounter a problem as follow.
"
th> gpu_ctc -h
[string "_RESULT={gpu_ctc -h}"]:1: attempt to perform arithmetic on global 'gpu_ctc' (a nil value)
stack traceback:
[string "_RESULT={gpu_ctc -h}"]:1: in main cheunk
[C]: in function 'xpcall'
/root/torch/install/share/lua/5.1/trepl/init.lua:651: in function 'repl'
/root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
[C]: at 0x00406670
"

Installation problems (TF bindings)

I'm trying to install the tensorflow binding and have followed these steps:

Cloned the tf repo (without building it; tf is installed via pip).
Successfully installed the package.
It fails on the tests:

test_ctc_loss_op (unittest.loader._FailedTest) ... ERROR
test_warpctc_op (unittest.loader._FailedTest) ... ERROR

======================================================================
ERROR: test_ctc_loss_op (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: test_ctc_loss_op
Traceback (most recent call last):
  File "/usr/lib/python3.5/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/lib/python3.5/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/tmp/warp-ctc/tensorflow_binding/tests/test_ctc_loss_op.py", line 23, in <module>
    import warpctc_tensorflow
  File "/tmp/warp-ctc/tensorflow_binding/warpctc_tensorflow/__init__.py", line 7, in <module>
    _warpctc = tf.load_op_library(lib_file)
  File "/usr/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
    None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: /tmp/warp-ctc/tensorflow_binding/warpctc_tensorflow/kernels.cpython-35m-x86_64-linux-gnu.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE


======================================================================
ERROR: test_warpctc_op (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: test_warpctc_op
Traceback (most recent call last):
  File "/usr/lib/python3.5/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/lib/python3.5/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/tmp/warp-ctc/tensorflow_binding/tests/test_warpctc_op.py", line 3, in <module>
    from warpctc_tensorflow import ctc
  File "/tmp/warp-ctc/tensorflow_binding/warpctc_tensorflow/__init__.py", line 7, in <module>
    _warpctc = tf.load_op_library(lib_file)
  File "/usr/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
    None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: /tmp/warp-ctc/tensorflow_binding/warpctc_tensorflow/kernels.cpython-35m-x86_64-linux-gnu.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE


----------------------------------------------------------------------
Ran 2 tests in 0.000s

FAILED (errors=2)
/t/w/tensorflow_binding »            

Ls of directory in question:

/t/w/tensorflow_binding » ls /tmp/warp-ctc/tensorflow_binding/warpctc_tensorflow      master ✔
__pycache__  __init__.py  kernels.cpython-35m-x86_64-linux-gnu.so

Installation question: "/home/geff/kaldi-ctc-master/tools/warp-ctc/src/ctc_entrypoint.cu(1): error: this declaration has no storage class or type specifier"

Hello, when I install warp-ctc, I get this error:
[ 14%] Building NVCC (Device) object CMakeFiles/warpctc.dir/src/warpctc_generated_ctc_entrypoint.cu.o
/home/geff/kaldi-ctc-master/tools/warp-ctc/src/ctc_entrypoint.cu(1): error: this declaration has no storage class or type specifier

/home/geff/kaldi-ctc-master/tools/warp-ctc/src/ctc_entrypoint.cu(1): error: expected a ";"

2 errors detected in the compilation of "/tmp/tmpxft_00001c98_00000000-16_ctc_entrypoint.compute_52.cpp1.ii".
CMake Error at warpctc_generated_ctc_entrypoint.cu.o.cmake:266 (message):
Error generating file
/home/geff/kaldi-ctc-master/tools/warp-ctc/build/CMakeFiles/warpctc.dir/src/./warpctc_generated_ctc_entrypoint.cu.o

CMakeFiles/warpctc.dir/build.make:187: recipe for target 'CMakeFiles/warpctc.dir/src/warpctc_generated_ctc_entrypoint.cu.o' failed
make[2]: *** [CMakeFiles/warpctc.dir/src/warpctc_generated_ctc_entrypoint.cu.o] Error 1
CMakeFiles/Makefile2:141: recipe for target 'CMakeFiles/warpctc.dir/all' failed
make[1]: *** [CMakeFiles/warpctc.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

Has anyone encountered this problem? Can you give me some advice? Any help would be appreciated. Thanks.

make error on mac osx 10.12

Hi, when I make warp-ctc on mac osx 10.12, I get this error:

nvcc fatal   : The version ('80000') of the host compiler ('Apple clang') is not supported

My Xcode version is 8.2.1. Besides downgrading my Xcode version, how can I solve this problem?

Thanks for your reply.

Errors: test_gpu doesn't work

Hi all
I have these errors when I try to run 'test_gpu':

Running GPU tests
terminate called after throwing an instance of 'std::runtime_error'
what(): Error: compute_ctc_loss in small_test, stat = execution failed
Aborted (core dumped)

Note that: I can run 'test_cpu' without any problems (Running CPU tests, Tests pass)

I was wondering whether I might be missing some libraries or something else.
Every kind of suggestion would be appreciated!

Thank you so much.

Warp-CTC error on GPU : "cuda memcpy or memset failed"

I compiled mxnet with the warp-ctc plugin.
My env is: Ubuntu 14.04 + CUDA 8.0 + cuDNN 5.1 + Torch 7.0, GTX960.

When I compile warp-ctc, everything is normal, and "warp-ctc/build/test_gpu" passes.

I rebuilt mxnet successfully, except that it shows the warning: "nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning)."

Whether I execute "example/image-classification/train_mnist.py" on mx.context.cpu(0) or mx.context.gpu(0), it works normally.

But when I execute "example/warpctc/toy_ctc.py" on mx.context.gpu(0), an error occurs:

terminate called after throwing an instance of 'std::runtime_error'
  what():  Error: compute_ctc_loss, stat = cuda memcpy or memset failed"

And if I use the CPU, it is OK.

How can I solve this problem? Thanks.

'THC.h' file not found error

I've tried to install the torch binding, but always get this error:

/tmp/luarocks_warp-ctc-scm-1-3075/warp-ctc/torch_binding/binding.cpp:16:14: fatal error:
  'THC.h' file not found
#include "THC.h"

I'm using Mac OSX 11.
Could you please tell me what kind of error this is?

test_gpu fails on GTX 1060

The execution of test_gpu fails on a GTX 1060 with a freshly installed Ubuntu 16.04 and CUDA 8.0.44. This happens even after following the solutions given in #40 and #46. More specifically, I cloned the master branch and added the following to CMakeLists.txt:

set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_53,code=sm_53")
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_60,code=sm_60")
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_61,code=sm_61")
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_62,code=sm_62")

then compiled, and when executing test_gpu I get

Running GPU tests
terminate called after throwing an instance of 'std::runtime_error'
  what():  Error: compute_ctc_loss in small_test, stat = execution failed
Aborted (core dumped)

Why the software history was not kept?

Hi there,

I'm a researcher studying software evolution. As part of my current research, I'm studying the implications of open-sourcing proprietary software, for instance, whether the project succeeds in attracting newcomers. However, I observed that some projects, like warp-ctc, deleted the software history during the transition to open source.

Knowing that software history is indispensable for developers (e.g., developers need to refer to history several times a day), I would like to ask warp-ctc developers the following four brief questions:

  1. Why did you decide not to keep the software history?
  2. Did the core developers face any kind of problems when trying to refer to the old history? If so, how did they solve them?
  3. Did newcomers face any kind of problems when trying to refer to the old history? If so, how did they solve them?
  4. How has the lack of history impacted software evolution? Has it placed any burden on understanding and evolving the software?

Thanks in advance for your collaboration,

Gustavo Pinto, PhD
http://www.gustavopinto.org

infinite CTC costs

Apologies if I misunderstood something, but running the following code seems to return infinite CTC costs, though the gradients are fine.

th> require 'warp_ctc'
th> acts = torch.Tensor({{0,-150,0,0,0}}):float()
th> grads = torch.zeros(acts:size()):float()
th> labels = {{1}}
th> sizes = {1}
th> cpu_ctc(acts, grads, labels, sizes)
{
  1 : inf
}
th> print(grads)
 0.2500  0.0000  0.2500  0.2500  0.2500
[torch.FloatTensor of size 1x5]

Is this simply something that we have to guard against in our own Softmax code?

tensorflow binding error: "Could not find file or directory /root/tensorflow/_python_build/tensorflow/include."

I want to use warpctc for tensorflow, but I encounter some problems.

I installed tensorflow from source and set all the necessary environment variables mentioned in the installation instructions. However, when I go to the tensorflow_binding dir and run the command "python setup.py install", the terminal shows "Could not find file or directory /root/tensorflow/_python_build/tensorflow/include." Indeed, I cannot find that directory in tensorflow.

dir /root/tensorflow/_python_build/tensorflow/
---------------------------
contrib  core  examples  __init__.py  __init__.pyc  models  python  stream_executor  tensorboard  tools

So I edited "setup.py", changed "tf_includes=[tf_include, tf_src_dir]" to "tf_includes=[tf_src_dir]", and then ran "python setup.py install", which installed warpctc_tensorflow successfully.

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so.8.0 locally
running install
running bdist_egg
running egg_info
writing warpctc_tensorflow.egg-info/PKG-INFO
writing top-level names to warpctc_tensorflow.egg-info/top_level.txt
writing dependency_links to warpctc_tensorflow.egg-info/dependency_links.txt
reading manifest file 'warpctc_tensorflow.egg-info/SOURCES.txt'
writing manifest file 'warpctc_tensorflow.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/warpctc_tensorflow
copying build/lib.linux-x86_64-2.7/warpctc_tensorflow/__init__.py -> build/bdist.linux-x86_64/egg/warpctc_tensorflow
copying build/lib.linux-x86_64-2.7/warpctc_tensorflow/kernels.so -> build/bdist.linux-x86_64/egg/warpctc_tensorflow
byte-compiling build/bdist.linux-x86_64/egg/warpctc_tensorflow/__init__.py to __init__.pyc
creating stub loader for warpctc_tensorflow/kernels.so
byte-compiling build/bdist.linux-x86_64/egg/warpctc_tensorflow/kernels.py to kernels.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying warpctc_tensorflow.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying warpctc_tensorflow.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying warpctc_tensorflow.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying warpctc_tensorflow.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
warpctc_tensorflow.__init__: module references __path__
creating 'dist/warpctc_tensorflow-0.1-py2.7-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing warpctc_tensorflow-0.1-py2.7-linux-x86_64.egg
creating /usr/local/lib/python2.7/dist-packages/warpctc_tensorflow-0.1-py2.7-linux-x86_64.egg
Extracting warpctc_tensorflow-0.1-py2.7-linux-x86_64.egg to /usr/local/lib/python2.7/dist-packages
Adding warpctc-tensorflow 0.1 to easy-install.pth file

Installed /usr/local/lib/python2.7/dist-packages/warpctc_tensorflow-0.1-py2.7-linux-x86_64.egg
Processing dependencies for warpctc-tensorflow==0.1
Finished processing dependencies for warpctc-tensorflow==0.1

However, when I run "import warpctc_tensorflow", an error occurs:

>>> import warpctc_tensorflow
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so.8.0 locally
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/warpctc_tensorflow/__init__.py", line 7, in <module>
    _warpctc = tf.load_op_library(lib_file)
  File "/root/tensorflow/_python_build/tensorflow/python/framework/load_library.py", line 64, in load_op_library
    None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: /usr/local/lib/python2.7/dist-packages/warpctc_tensorflow/kernels.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE

In addition, the version of my tensorflow is '0.11.head'.

tensorflow.__version__
'0.11.head'

Could anyone give any suggestions? Thank you.

Failing GPU tests on CUDA 8

I'm running on a GTX 1070. This is compiled with CUDA 8.0 release candidate.

The ./test_gpu script fails with the following error:

Running GPU tests
terminate called after throwing an instance of 'std::runtime_error'
  what():  Error: compute_ctc_loss in small_test, stat = execution failed
Aborted (core dumped)

Attaching a debugger, I see:

(gdb) run
Starting program: /warp-ctc/build/test_gpu 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Running GPU tests
[New Thread 0x7fffef84b700 (LWP 10325)]
[New Thread 0x7fffef04a700 (LWP 10326)]
terminate called after throwing an instance of 'std::runtime_error'
  what():  Error: compute_ctc_loss in small_test, stat = execution failed

Program received signal SIGABRT, Aborted.
0x00007ffff6d55c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff6d55c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff6d59028 in __GI_abort () at abort.c:89
#2  0x00007ffff7660535 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff765e6d6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff765e703 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff765e922 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x0000000000403e33 in throw_on_error (message=0x4097f8 "Error: compute_ctc_loss in small_test", status=<optimized out>) at /storage/deep_learning/warp-ctc/tests/test.h:11
#7  small_test () at /storage/deep_learning/warp-ctc/tests/test_gpu.cu:63
#8  0x000000000040360f in main () at /storage/deep_learning/warp-ctc/tests/test_gpu.cu:333

TF Binding NaN loss issue

When I switched from tf.nn.ctc_loss to warpctc_tensorflow.ctc, I got NaN loss early in training, even though tf.nn.ctc_loss could learn normally.

[screenshot]

I don't know how to make this issue reproducible in a minimal configuration.

I had experienced the NaN loss issue frequently (though not always) with the torch binding, and at the time I thought the NaNs came from the training set. But now that I can switch between the tf implementation of CTC loss and warp-ctc, I suspect this NaN loss issue originates in the warp-ctc core.

In my experience with the torch binding of warp-ctc, it would often return NaN, inf, or -inf loss.

For blank sequence of labels

If the target output for an input sequence is entirely blank, how should the table of target labels be set for that sequence? Is it right to set it to an empty table {} or to a table of {0}?
Thank you.

C tutorial

I am attempting to integrate warp-ctc into an existing C++ project, but can't quite work out the initialisation. Is it possible to give a small-scale example in C of how to set up and call compute_ctc_loss?

Error installing Tensorflow binding

I tried the following commands and got errors; any ideas? Please help.

cd warp-ctc
mkdir build
cd build
make
then
cd ../tensorflow_binding

sudo TENSORFLOW_SRC_PATH=../tensorflow python setup.py test

...
building 'warpctc_tensorflow.kernels' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/local/lib/python2.7/dist-packages/tensorflow/include -I../tensorflow -I/home/siva/git/warp-ctc/tensorflow_binding/../include -I/usr/include/python2.7 -c src/ctc_op_kernel.cc -o build/temp.linux-x86_64-2.7/src/ctc_op_kernel.o -std=c++11 -fPIC -Wno-return-type
In file included from src/ctc_op_kernel.cc:7:0:
../tensorflow/../tensorflow/core/framework/op_kernel.h:516:5: error: ‘ScopedStepContainer’ does not name a type
ScopedStepContainer* step_container = nullptr;
^
../tensorflow/../tensorflow/core/framework/op_kernel.h:943:3: error: ‘ScopedStepContainer’ does not name a type
ScopedStepContainer* step_container() const {
^
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Tensorflow Test Error

Hi,

I've installed the tensorflow binding for warp_ctc, the installation went without a hitch but after running the commandpython setup.py test I end up getting an error, I have pasted the output to the command below

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
running test
running egg_info
writing warpctc_tensorflow.egg-info/PKG-INFO
writing top-level names to warpctc_tensorflow.egg-info/top_level.txt
writing dependency_links to warpctc_tensorflow.egg-info/dependency_links.txt
reading manifest file 'warpctc_tensorflow.egg-info/SOURCES.txt'
writing manifest file 'warpctc_tensorflow.egg-info/SOURCES.txt'
running build_ext
copying build/lib.linux-x86_64-2.7/warpctc_tensorflow/kernels.so -> warpctc_tensorflow
running test
running egg_info
writing warpctc_tensorflow.egg-info/PKG-INFO
writing top-level names to warpctc_tensorflow.egg-info/top_level.txt
writing dependency_links to warpctc_tensorflow.egg-info/dependency_links.txt
reading manifest file 'warpctc_tensorflow.egg-info/SOURCES.txt'
writing manifest file 'warpctc_tensorflow.egg-info/SOURCES.txt'
running build_ext
copying build/lib.linux-x86_64-2.7/warpctc_tensorflow/kernels.so -> warpctc_tensorflow
testBasicCPU (test_ctc_loss_op.CTCLossTest) ... I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.2155
pciBusID 0000:03:00.0
Total memory: 11.92GiB
Free memory: 11.77GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
ok
testBasicGPU (test_ctc_loss_op.CTCLossTest) ... I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
ERROR
test_session (test_ctc_loss_op.CTCLossTest)
Returns a TensorFlow Session for use in executing tests. ... ok
test_basic_cpu (test_warpctc_op.WarpCTCTest) ... I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
ok
test_basic_gpu (test_warpctc_op.WarpCTCTest) ... I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
ok
test_multiple_batches_cpu (test_warpctc_op.WarpCTCTest) ... I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
ok
test_multiple_batches_gpu (test_warpctc_op.WarpCTCTest) ... I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
ok
test_session (test_warpctc_op.WarpCTCTest)
Returns a TensorFlow Session for use in executing tests. ... ok

======================================================================
ERROR: testBasicGPU (test_ctc_loss_op.CTCLossTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/sarunac4/RNN/warp-ctc/tensorflow_binding/tests/test_ctc_loss_op.py", line 227, in testBasicGPU
    self._testBasic(use_gpu=True)
  File "/home/sarunac4/RNN/warp-ctc/tensorflow_binding/tests/test_ctc_loss_op.py", line 220, in _testBasic
    self._testCTCLoss(inputs, seq_lens, labels, loss_truth, grad_truth, use_gpu=use_gpu)
  File "/home/sarunac4/RNN/warp-ctc/tensorflow_binding/tests/test_ctc_loss_op.py", line 83, in _testCTCLoss
    (tf_loss, tf_grad) = sess.run([loss, grad])
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 717, in run
    run_metadata_ptr)
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 915, in _run
    feed_dict_string, options, run_metadata)
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 965, in _do_run
    target_list, options, run_metadata)
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 985, in _do_call
    raise type(e)(node_def, op, message)
InvalidArgumentError: Cannot assign a device to node 'CTCLoss': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
         [[Node: CTCLoss = CTCLoss[_kernel="WarpCTC", ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/device:GPU:0"](Const_3, Const, Const_1, CTCLoss/sequence_length)]]

Caused by op u'CTCLoss', defined at:
  File "setup.py", line 126, in <module>
    test_suite = 'setup.discover_test_suite',
  File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/setuptools/command/test.py", line 210, in run
    self.run_tests()
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/setuptools/command/test.py", line 231, in run_tests
    testRunner=self._resolve_as_ep(self.test_runner),
  File "/usr/lib/python2.7/unittest/main.py", line 94, in __init__
    self.parseArgs(argv)
  File "/usr/lib/python2.7/unittest/main.py", line 149, in parseArgs
    self.createTests()
  File "/usr/lib/python2.7/unittest/main.py", line 158, in createTests
    self.module)
  File "/usr/lib/python2.7/unittest/loader.py", line 130, in loadTestsFromNames
    suites = [self.loadTestsFromName(name, module) for name in names]
  File "/usr/lib/python2.7/unittest/loader.py", line 91, in loadTestsFromName
    module = __import__('.'.join(parts_copy))
  File "/home/sarunac4/RNN/warp-ctc/tensorflow_binding/setup.py", line 126, in <module>
    test_suite = 'setup.discover_test_suite',
  File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/setuptools/command/test.py", line 210, in run
    self.run_tests()
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/setuptools/command/test.py", line 231, in run_tests
    testRunner=self._resolve_as_ep(self.test_runner),
  File "/usr/lib/python2.7/unittest/main.py", line 95, in __init__
    self.runTests()
  File "/usr/lib/python2.7/unittest/main.py", line 232, in runTests
    self.result = testRunner.run(self.test)
  File "/usr/lib/python2.7/unittest/runner.py", line 151, in run
    test(result)
  File "/usr/lib/python2.7/unittest/suite.py", line 70, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/suite.py", line 108, in run
    test(result)
  File "/usr/lib/python2.7/unittest/suite.py", line 70, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/suite.py", line 108, in run
    test(result)
  File "/usr/lib/python2.7/unittest/suite.py", line 70, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/suite.py", line 108, in run
    test(result)
  File "/usr/lib/python2.7/unittest/suite.py", line 70, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/suite.py", line 108, in run
    test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
    testMethod()
  File "/home/sarunac4/RNN/warp-ctc/tensorflow_binding/tests/test_ctc_loss_op.py", line 227, in testBasicGPU
    self._testBasic(use_gpu=True)
  File "/home/sarunac4/RNN/warp-ctc/tensorflow_binding/tests/test_ctc_loss_op.py", line 220, in _testBasic
    self._testCTCLoss(inputs, seq_lens, labels, loss_truth, grad_truth, use_gpu=use_gpu)
  File "/home/sarunac4/RNN/warp-ctc/tensorflow_binding/tests/test_ctc_loss_op.py", line 76, in _testCTCLoss
    sequence_length=seq_lens)
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/ctc_ops.py", line 144, in ctc_loss
    ctc_merge_repeated=ctc_merge_repeated)
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 162, in _ctc_loss
    name=name)
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
    op_def=op_def)
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/sarunac4/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1298, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'CTCLoss': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
         [[Node: CTCLoss = CTCLoss[_kernel="WarpCTC", ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/device:GPU:0"](Const_3, Const, Const_1, CTCLoss/sequence_length)]]


----------------------------------------------------------------------
Ran 8 tests in 0.616s

FAILED (errors=1)

Please let me know how to fix this.

Regards,
Deepak

TF binding runtime error

I tried to run the tests but get the following error. I tried using -D_GLIBCXX_USE_CXX11_ABI=0 in setup.py and still get errors. Please help.
_warpctc = tf.load_op_library(lib_file)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
None, None, error_msg, error_code)
/warp-ctc/tensorflow_binding/warpctc_tensorflow/kernels.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE
