
Facebook AI Research's Automatic Speech Recognition Toolkit

Home Page: https://github.com/facebookresearch/wav2letter/wiki

License: Other


wav2letter's Introduction


Flashlight is a fast, flexible machine learning library written entirely in C++ from Facebook AI Research and the creators of Torch, TensorFlow, Eigen, and Deep Speech. Its core features include:

  • Total internal modifiability including internal APIs for tensor computation.
  • A small footprint, with the core clocking in at under 10 MB and 20k lines of C++.
  • High-performance defaults featuring just-in-time kernel compilation with modern C++ via the ArrayFire tensor library.
  • An emphasis on efficiency and scale.

Native support in C++ and simple extensibility make Flashlight a powerful research framework that enables fast iteration on new experimental setups and algorithms, imposing few opinions and sacrificing no performance. In a single repository, Flashlight provides apps for research across multiple domains, including speech recognition, language modeling, and image classification.

Project Layout

Flashlight is broken down into a few parts:

  • flashlight/lib contains kernels and standalone utilities for audio processing and more.
  • flashlight/fl is the core tensor interface and neural network library using the ArrayFire tensor library by default.
  • flashlight/pkg are domain packages for speech, vision, and text built on the core.
  • flashlight/app are applications of the core library to machine learning across domains.

Quickstart

First, build and install Flashlight and link it to your own project.

Sequential forms a sequence of Flashlight Modules for chaining computation.

Implementing a simple convnet is easy.
#include <flashlight/fl/flashlight.h>

Sequential model;

model.add(View(fl::Shape({IM_DIM, IM_DIM, 1, -1})));
model.add(Conv2D(
    1 /* input channels */,
    32 /* output channels */,
    5 /* kernel width */,
    5 /* kernel height */,
    1 /* stride x */,
    1 /* stride y */,
    PaddingMode::SAME /* padding x */,
    PaddingMode::SAME /* padding y */));
model.add(ReLU());
model.add(Pool2D(
    2 /* kernel width */,
    2 /* kernel height */,
    2 /* stride x */,
    2 /* stride y */));
model.add(Conv2D(32, 64, 5, 5, 1, 1, PaddingMode::SAME, PaddingMode::SAME));
model.add(ReLU());
model.add(Pool2D(2, 2, 2, 2));
model.add(View(fl::Shape({7 * 7 * 64, -1})));
model.add(Linear(7 * 7 * 64, 1024));
model.add(ReLU());
model.add(Dropout(0.5));
model.add(Linear(1024, 10));
model.add(LogSoftmax());

Performing forward and backward computation is straightforward:

auto output = model.forward(input);
auto loss = categoricalCrossEntropy(output, target);
loss.backward();

See the MNIST example for a full tutorial including a training loop and dataset abstractions.

Variable is a tape-based abstraction that wraps Flashlight tensors. Tape-based automatic differentiation in Flashlight is simple and works as you'd expect.

Autograd Example
auto A = Variable(fl::rand({1000, 1000}), true /* calcGrad */);
auto B = 2.0 * A;
auto C = 1.0 + B;
auto D = log(C);
D.backward(); // populates A.grad() along with gradients for B, C, and D.

Building and Installing

Install with vcpkg | With Docker | From Source | From Source with vcpkg | Build Your Project with Flashlight

Requirements

At minimum, compilation requires:

  • A C++ compiler with good C++17 support (e.g. gcc/g++ >= 7)
  • CMake — version 3.10 or later, and make
  • A Linux-based operating system.

See the full dependency list for more details if building from source.

Instructions for building/installing Python bindings can be found here.

Flashlight Build Setups

Flashlight can be broken down into several components as described above. Each component can be incrementally built by specifying the correct build options.

There are two ways to work with Flashlight:

  1. As an installed library that you link to with your own project. This is best for building standalone applications dependent on Flashlight.
  2. With in-source development where the Flashlight project source is changed and rebuilt. This is best if customizing/hacking the core framework or the Flashlight-provided app binaries.

Flashlight can be built in one of two ways:

  1. With vcpkg, a C++ package manager.
  2. From source by installing dependencies as needed.

Installing Flashlight with vcpkg

Library Installation with vcpkg

Flashlight is most easily built and installed with vcpkg. Both the CUDA and CPU backends are supported with vcpkg. For either backend, first install Intel MKL. For the CUDA backend, install CUDA >= 9.2, cuDNN, and NCCL. Then, after installing vcpkg, install the libraries and core with:

./vcpkg/vcpkg install flashlight-cuda # CUDA backend, OR
./vcpkg/vcpkg install flashlight-cpu  # CPU backend

To install Flashlight apps, check the features available for installation by running ./vcpkg search flashlight-cuda or ./vcpkg search flashlight-cpu. Each app is a "feature": for example, ./vcpkg install flashlight-cuda[asr] installs the ASR app with the CUDA backend.

Below is the currently-supported list of features (for each of flashlight-cuda and flashlight-cpu):

flashlight-{cuda/cpu}[lib]      # Flashlight libraries
flashlight-{cuda/cpu}[nn]       # Flashlight neural net library
flashlight-{cuda/cpu}[asr]      # Flashlight speech recognition app
flashlight-{cuda/cpu}[lm]       # Flashlight language modeling app
flashlight-{cuda/cpu}[imgclass] # Flashlight image classification app

Flashlight app binaries are also built for the selected features and are installed into the vcpkg install tree's tools directory.

Integrating Flashlight into your own project is simple using vcpkg's CMake toolchain integration.

From-Source Build with vcpkg

First, install the dependencies for your backend of choice using vcpkg (see the backend-specific instructions below):

Installing CUDA Backend Dependencies with vcpkg

To build the Flashlight CUDA backend from source using dependencies installed with vcpkg, install CUDA >= 9.2, cuDNN, NCCL, and Intel MKL, then build the rest of the dependencies for the CUDA backend based on which Flashlight features you'd like to build:

./vcpkg install \
    cuda intel-mkl fftw3 cub kenlm                \ # if building flashlight libraries
    arrayfire[cuda] cudnn nccl openmpi cereal stb \ # if building the flashlight neural net library
    gflags glog                                   \ # if building any flashlight apps
    libsndfile                                    \ # if building the flashlight asr app
    gtest                                           # optional, if building tests
Installing CPU Backend Dependencies with vcpkg

To build the Flashlight CPU backend from source using dependencies installed with vcpkg, install Intel MKL, then build the rest of the dependencies for the CPU backend based on which Flashlight features you'd like to build:

./vcpkg install \
    intel-mkl fftw3 kenlm                              \ # for flashlight libraries
    arrayfire[cpu] gloo[mpi] openmpi onednn cereal stb \ # for the flashlight neural net library
    gflags glog                                        \ # for the flashlight runtime pkg (any flashlight apps using it)
    libsndfile                                         \ # for the flashlight speech pkg
    gtest                                                # optional, for tests
Build Using the vcpkg Toolchain File

To build Flashlight from source with these dependencies, clone the repository:

git clone https://github.com/flashlight/flashlight.git && cd flashlight
mkdir -p build && cd build

Then, build from source using vcpkg's CMake toolchain:

cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DFL_BUILD_ARRAYFIRE=ON \
    -DCMAKE_TOOLCHAIN_FILE=[path to your vcpkg clone]/scripts/buildsystems/vcpkg.cmake
make -j$(nproc)
make install -j$(nproc) # only if you want to install Flashlight for external use

To build a subset of Flashlight's features, see the build options below.

Building from Source

To build from source, first install the below dependencies. Most are available with your system's local package manager.

Some dependencies marked below are downloaded and installed automatically if not found on the local system. FL_BUILD_STANDALONE determines this behavior — if disabled, dependencies won't be downloaded and built when building Flashlight.

Once all dependencies are installed, clone the repository:

git clone https://github.com/flashlight/flashlight.git && cd flashlight
mkdir -p build && cd build

Then build all Flashlight components with:

cmake .. -DCMAKE_BUILD_TYPE=Release -DFL_BUILD_ARRAYFIRE=ON [...build options]
make -j$(nproc)
make install

Setting the MKLROOT environment variable (export MKLROOT=/opt/intel/oneapi/mkl/latest or export MKLROOT=/opt/intel/mkl on most Linux-based systems) can help CMake find Intel MKL if not initially found.

To build a smaller subset of Flashlight features/apps, see the build options below for a complete list of options.
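For instance, a build that keeps the ASR app but skips the other apps, tests, and examples might look like this (an illustrative combination of the flags from the build options table; adjust to taste):

```shell
cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DFL_BUILD_ARRAYFIRE=ON \
    -DFL_BUILD_APP_ASR=ON \
    -DFL_BUILD_APP_IMGCLASS=OFF \
    -DFL_BUILD_APP_LM=OFF \
    -DFL_BUILD_TESTS=OFF \
    -DFL_BUILD_EXAMPLES=OFF
make -j$(nproc)
```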

To install Flashlight in a custom directory, use CMake's CMAKE_INSTALL_PREFIX argument. Flashlight libraries can be built as shared libraries using CMake's BUILD_SHARED_LIBS argument.

Flashlight uses modern CMake and IMPORTED targets for most dependencies. If a dependency isn't found, passing -D<package>_DIR to your cmake command or exporting <package>_DIR as an environment variable equal to the path to <package>Config.cmake can help locate dependencies on your system. See the documentation for more details. If CMake is failing to locate a package, check to see if a corresponding issue has already been created before creating your own.

Minimal setup on macOS

On macOS, ArrayFire can be installed with Homebrew and the Flashlight core can be built as follows:

brew install arrayfire
cmake .. \
      -DFL_ARRAYFIRE_USE_OPENCL=ON \
      -DFL_USE_ONEDNN=OFF \
      -DFL_BUILD_TESTS=OFF \
      -DFL_BUILD_EXAMPLES=OFF \
      -DFL_BUILD_SCRIPTS=OFF \
      -DFL_BUILD_DISTRIBUTED=OFF
make -j$(sysctl -n hw.ncpu)

Dependencies

Dependencies marked with * are automatically downloaded and built from source if not found on the system. Setting FL_BUILD_STANDALONE to OFF disables this behavior.

Dependencies marked with ^ are required if building with distributed training enabled (FL_BUILD_DISTRIBUTED — see the build options below). Distributed training is required for all apps.

Dependencies marked with † are installable via vcpkg. See the vcpkg instructions above for installing those dependencies when doing a from-source Flashlight build.

Component      Backend  Dependencies
libraries      CUDA     CUDA >= 9.2, CUB*† (if CUDA < 11)
               CPU      a BLAS library (Intel MKL >= 2018, OpenBLAS†, etc.)
core           Any      ArrayFire >= 3.7.3†, an MPI library^ (OpenMPI†, etc.), cereal*† >= 1.3.0, stb*†
               CUDA     CUDA >= 9.2, NCCL^, cuDNN
               CPU      oneDNN† >= 2.5.2, gloo (with MPI)*^†
app: all       Any      Google glog†, gflags†
app: asr       Any      libsndfile*† >= 1.0.28, a BLAS library (Intel MKL >= 2018, OpenBLAS†, etc.), flashlight/text*
app: imgclass  Any      -
app: lm        Any      flashlight/text*
tests          Any      Google Test (gtest, with gmock)*† >= 1.10.0

Build Options

The Flashlight CMake build accepts the following build options (prefixed with -D when running CMake from the command line):

Name                    Options      Default Value  Description
FL_BUILD_ARRAYFIRE      ON, OFF      ON             Build Flashlight with the ArrayFire backend.
FL_BUILD_STANDALONE     ON, OFF      ON             Downloads/builds some dependencies if not found.
FL_BUILD_LIBRARIES      ON, OFF      ON             Build the Flashlight libraries.
FL_BUILD_CORE           ON, OFF      ON             Build the Flashlight neural net library.
FL_BUILD_DISTRIBUTED    ON, OFF      ON             Build with distributed training; required for apps.
FL_BUILD_CONTRIB        ON, OFF      ON             Build contrib APIs subject to breaking changes.
FL_BUILD_APPS           ON, OFF      ON             Build applications (see below).
FL_BUILD_APP_ASR        ON, OFF      ON             Build the automatic speech recognition application.
FL_BUILD_APP_IMGCLASS   ON, OFF      ON             Build the image classification application.
FL_BUILD_APP_LM         ON, OFF      ON             Build the language modeling application.
FL_BUILD_APP_ASR_TOOLS  ON, OFF      ON             Build automatic speech recognition app tools.
FL_BUILD_TESTS          ON, OFF      ON             Build tests.
FL_BUILD_EXAMPLES       ON, OFF      ON             Build examples.
FL_BUILD_EXPERIMENTAL   ON, OFF      OFF            Build experimental components.
CMAKE_BUILD_TYPE        See docs.    Debug          See the CMake documentation.
CMAKE_INSTALL_PREFIX    [Directory]  See docs.      See the CMake documentation.

Building Your Own Project with Flashlight

Flashlight is most easily linked against using CMake. Flashlight exports the following CMake targets when installed:

  • flashlight::flashlight — contains flashlight libraries as well as the flashlight core autograd and neural network library.
  • flashlight::fl_pkg_runtime — contains flashlight core as well as common utilities for training (logging / flags / distributed utils).
  • flashlight::fl_pkg_vision — contains flashlight core as well as common utilities for vision pipelines.
  • flashlight::fl_pkg_text — contains flashlight core as well as common utilities for dealing with text data.
  • flashlight::fl_pkg_speech — contains flashlight core as well as common utilities for dealing with speech data.
  • flashlight::fl_pkg_halide — contains flashlight core and extensions for easily interfacing with Halide.

Given a simple project.cpp file that includes and links to Flashlight:

#include <iostream>

#include <flashlight/fl/flashlight.h>

int main() {
  fl::init();
  fl::Variable v(fl::full({1}, 1.), true);
  auto result = v + 10;
  std::cout << "Tensor value is " << result.tensor() << std::endl; // 11.000
  return 0;
}

The following CMake configuration links Flashlight and sets include directories:

cmake_minimum_required(VERSION 3.10)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(myProject project.cpp)

find_package(flashlight CONFIG REQUIRED)
target_link_libraries(myProject PRIVATE flashlight::flashlight)

With a vcpkg Flashlight Installation

If you installed Flashlight with vcpkg, the above CMake configuration for myProject can be built by running:

cd project && mkdir build && cd build
cmake .. \
  -DCMAKE_TOOLCHAIN_FILE=[path to vcpkg clone]/scripts/buildsystems/vcpkg.cmake \
  -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

With a From-Source Flashlight Installation

If using a from-source installation of Flashlight, Flashlight will be found automatically by CMake:

cd project && mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

If Flashlight is installed in a custom location using a CMAKE_INSTALL_PREFIX, passing -Dflashlight_DIR=[install prefix]/share/flashlight/cmake as an argument to your cmake command can help CMake find Flashlight.

Building and Running Flashlight with Docker

Flashlight and its dependencies can also be built with the provided Dockerfiles; see the accompanying Docker documentation for more information.

Contributing and Contact

Contact: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Flashlight is being very actively developed. See CONTRIBUTING for more on how to help out.

Acknowledgments

Some of Flashlight's code is derived from arrayfire-ml.

Citing

You can cite Flashlight using:

@misc{kahn2022flashlight,
      title={Flashlight: Enabling Innovation in Tools for Machine Learning},
      author={Jacob Kahn and Vineel Pratap and Tatiana Likhomanenko and Qiantong Xu and Awni Hannun and Jeff Cai and Paden Tomasello and Ann Lee and Edouard Grave and Gilad Avidov and Benoit Steiner and Vitaliy Liptchinsky and Gabriel Synnaeve and Ronan Collobert},
      year={2022},
      eprint={2201.12465},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

License

Flashlight is under an MIT license. See LICENSE for more information.

wav2letter's People

Contributors

0xjc, akhti, alphadl, an918tw, andresy, avidov, dependabot[bot], entn-at, galv, hanbyul-kim, iahs, jacobkahn, lorenlugosch, lunixbochs, maigoakisame, padentomasello, r-barnes, realdoug, sanjaykasturia, shahzadlone, splend1d, stanislavglebik, syhw, vineelpratap, webbp, xuqiantong, yeya, yfeldblum, zertosh, zpao


wav2letter's Issues

luajit: /torch/install/share/lua/5.1/wav2letter/runtime/data.lua:146: bad argument #1 to 'lines' (/data/local/packages/ai-group.speechdata/latest/speech/letters.lst: No such file or directory)

Hi @VitaliyLi,
I am getting the following error while using the pretrained model to transcribe an audio file. Can someone please help me solve this?

luajit test.lua librispeech-glu-highdropout-cpu.bin -progress -show -test dev-clean2 -save -datadir audio-proc/ -datadir audio-proc/ -gfsai
luajit: /torch/install/share/lua/5.1/wav2letter/runtime/data.lua:146: bad argument #1 to 'lines' (/data/local/packages/ai-group.speechdata/latest/speech/letters.lst: No such file or directory)
stack traceback:
[C]: in function 'lines'
/torch/install/share/lua/5.1/wav2letter/runtime/data.lua:146: in function 'newdict'
test.lua:72: in main chunk
[C]: at 0x00405d50

Running test with a CPU-trained model fails because of cunn

Why do we require it right away if it is only required conditionally later: https://github.com/facebookresearch/wav2letter/blob/cf2d486040c9c99ff50e861012d91480f326c403/test.lua#L10 ?

I am getting this:

$ luajit ~/wav2letter/test.lua ~/librispeech-glu-highdropout-cpu.bin -progress -show -test dev-clean -save -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gfsai
luajit: /home/sasha/wav2letter/test.lua:10: module 'cunn' not found:
	no field package.preload['cunn']
	no file './cunn.lua'
	no file '/home/sasha/usr/share/luajit-2.0.4/cunn.lua'
	no file '/usr/local/share/lua/5.1/cunn.lua'
	no file '/usr/local/share/lua/5.1/cunn/init.lua'
	no file '/home/sasha/usr/share/lua/5.1/cunn.lua'
	no file '/home/sasha/usr/share/lua/5.1/cunn/init.lua'
	no file './cunn.so'
	no file '/usr/local/lib/lua/5.1/cunn.so'
	no file '/home/sasha/usr/lib/lua/5.1/cunn.so'
	no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
	[C]: in function 'require'
	/home/sasha/wav2letter/test.lua:10: in main chunk

Read 0 blocks instead of 1 at /tmp/luarocks_torch-scm-1-1308/torch7/lib/TH/THDiskFile.c:352

I'm trying to run this command:

luajit test-one-file.lua ~/librispeech-glu-highdropout-cpu.bin my-audio-file.wav

And, I'm getting this error message:

| best valid (test?) error was: table: 0x406e7940
luajit: /home/vraj/usr/share/lua/5.1/torch/File.lua:259: read error: read 0 blocks instead of 1 at /tmp/luarocks_torch-scm-1-1308/torch7/lib/TH/THDiskFile.c:352
stack traceback:
[C]: in function 'readInt'
/home/vraj/usr/share/lua/5.1/torch/File.lua:259: in function 'readObject'
test-one-file.lua:37: in main chunk
[C]: at 0x00404ab0

I've also tried re-downloading the model.

Could not find libsndfile

When running cd wav2letter && luarocks make rocks/wav2letter-scm-1.rockspec && cd .. I get the following error:
CMake Error at cmake/FindSndFile.cmake:60 (message):
Could not find libsndfile

Is anybody else facing a similar issue? I am running Ubuntu 16.04.

Has anybody met the problem below on Linux?

(root) fbBaseline@gpucommtest1[tt]:~> luajit ~/wav2letter/data/librispeech/create.lua ~/LibriSpeech ~/librispeech-proc
| parsing SPEAKERS.TXT for gender...
| analyzing /data1/LibriSpeech/train-clean-100...
| writing /data1/librispeech-proc/train-clean-100...
luajit: could not open file </data1/LibriSpeech/train-clean-100/103/1240/103-1240-0000.flac> (File contains data in an unimplemented format.)
stack traceback:
[C]: at 0x7fd80a9b4720
[C]: in function 'SndFile'
/data1/wav2letter/data/librispeech/create.lua:16: in function 'copytoflac'
/data1/wav2letter/data/librispeech/create.lua:95: in function 'createidx'
/data1/wav2letter/data/librispeech/create.lua:152: in main chunk
[C]: at 0x00406020

All steps before this one completed normally. libsndfile was installed from source, and its include/lib paths were added to the environment.

Change data/librispeech/create.lua

Hi all,

Thanks for putting this out there. I'm still building, but ran into an issue.

This is a bit of a small change for a pull request, so I just created an issue and hope you'll update/test. I had issues with running the first "Data pre-processing" step, specifically:

luajit ~/wav2letter/data/librispeech/create.lua ~/LibriSpeech ~/librispeech-proc

When run, I get the following output:

| parsing SPEAKERS.TXT for gender...
| analyzing /LibriSpeech/train-clean-100...
luajit: /wav2letter/data/librispeech/create.lua:54: cannot open /LibriSpeech/train-clean-100: No such file or directory
stack traceback:
	[C]: in function 'dir'
	/wav2letter/data/librispeech/create.lua:54: in function 'fileapply'
	/wav2letter/data/librispeech/create.lua:72: in function 'createidx'
	/wav2letter/data/librispeech/create.lua:152: in main chunk
	[C]: at 0x00405430

When I edit line 149 of create.lua from:

   local src = string.format("%s/%s", src, subpath)

to

   local src = string.format("%s", src)

it seems to work fine.

There aren't corresponding directories in LibriSpeech as there are in librispeech-proc, so the subpath didn't make sense to me. I'm not a lua expert, nor do I know this codebase well, so I'm not sure this is the right solution here. I would very much appreciate someone else checking it out.

error in decoding

I am having an issue. I am using this command:
"luajit decode.lua librispeech-proc/ dev-other -show -letters librispeech-proc/letters-rep.lst -words dict.lst -lm 3-gram.pruned.3e-7.arpa -lmweight 3.1639 -beamsize 25000 -beamscore 40 -nthread 10 -smearing max -show"
but I am getting the following error:

/home/jhkang/usr/bin/luajit: bad argument #2 to '?' (out of bounds at /home/jhkang/torch/pkg/torch/lib/TH/generic/THStorage.c:202)
stack traceback:
[C]: at 0x7fb529f28b30
[C]: in function '__index'
...nstall/share/lua/5.1/torchnet/dataset/indexeddataset.lua:413: in function '__init'
/home/jhkang/torch/install/share/lua/5.1/torch/init.lua:91: in function </home/jhkang/torch/install/share/lua/5.1/torch/init.lua:87>
[C]: in function 'IndexedDatasetReader'
/home/jhkang/git/wav2letter/decode.lua:83: in function 'test'
/home/jhkang/git/wav2letter/decode.lua:187: in main chunk
[C]: at 0x00406020

ps: I am also using .idx file

Assertion failed error

Hi,
When I try to run "luajit data/librispeech/create.lua Audios audio-proc", I get the following error:
| parsing SPEAKERS.TXT for gender...
| analyzing Audios/dev-clean2...
| writing audio-proc/dev-clean2...
luajit: data/librispeech/create.lua:91: assertion failed!
stack traceback:
[C]: in function 'assert'
data/librispeech/create.lua:91: in function 'createidx'
data/librispeech/create.lua:150: in main chunk
[C]: at 0x00405d50

Can anyone explain what this error means?

convert own audio files into wav2letter audio format

I am trying to use the pre-trained model to transcribe my own audio files, but when I do I get this error:
"inconsistent tensor size, expected tensor [24 x 24] and src [30 x 30] to have the same number of elements, but got 576 and 900 elements respectively at /tmp/luarocks_torch-scm-1-5171/torch7/lib/TH/generic/THTensorCopy.c:86"

Did the Facebook team provide any info about how to convert our audio files into their format?

PS: I am using .flac files.

Pre-trained model in CPU

Can you please briefly describe the process for running the pre-trained model on CPU? Does running the model on CPU require CUDA?

wav2letter compile error

Hello, I downloaded the code from git:

git clone https://github.com/facebookresearch/wav2letter.git
cd wav2letter
cd gtn && luarocks make rocks/gtn-scm-1.rockspec && cd .. -> OK
cd speech && luarocks make rocks/speech-scm-1.rockspec && cd .. --> OK
cd torchnet-optim && luarocks make rocks/torchnet-optim-scm-1.rockspec && cd .. --> OK

cd wav2letter && luarocks make rocks/wav2letter-scm-1.rockspec && cd .. --> error

The error message is as follows:

wav2letter git:(master) ✗ cd wav2letter && luarocks make rocks/wav2letter-scm-1.rockspec && cd ..
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Try OpenMP C flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Try OpenMP CXX flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Found OpenMP: -fopenmp
-- Compiling with OpenMP support
-- Found Torch7 in /home/nuri/torch/install
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nuri/git/wav2letter/wav2letter/build.luarocks
Scanning dependencies of target waveval
[ 12%] Building C object CMakeFiles/waveval.dir/libwaveval/waveval.c.o
make[2]: *** No rule to make target '/home/nuri/torch/install/lib/libluajit.so', needed by 'libwaveval.so'. Stop.
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/waveval.dir/all' failed
make[1]: *** [CMakeFiles/waveval.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

Using pretrained model

Hi
I want to transcribe a sample audio file using the CPU pretrained model. What changes should I make to the data pre-processing steps and to running the decoder?

PS: I need this urgently, thanks.

bad argument #1 to 'lines' (/data/local/packages/ai-group.speechdata/latest/speech/letters.lst: No such file or directory)

Hi,
I am using the pretrained CPU model to transcribe a simple audio file but I am getting the following error. Can someone help me rectify it?

command : "luajit test.lua librispeech-glu-highdropout-cpu.bin -progress -show -test audio_6134_24_2812.8_2825.73.wav -save"

luajit: /torch/install/share/lua/5.1/wav2letter/runtime/data.lua:146: bad argument #1 to 'lines' (/data/local/packages/ai-group.speechdata/latest/speech/letters.lst: No such file or directory)
stack traceback:
[C]: in function 'lines'
/torch/install/share/lua/5.1/wav2letter/runtime/data.lua:146: in function 'newdict'
test.lua:72: in main chunk
[C]: at 0x00405d50

transcription error

When I transcribed my audio, somehow the repeated letters in a word are getting replaced by '1'.
example:
original:
two is a whole lot of our analytics around what customers are looking for based on we mine a lot of data from our website on sales searches on what gets requested in branches so we gotten smarter and smarter about the skus that we add meaning that we are adding things that we are pretty certain have some built in demand

wav2letter:
who a whole lot of our own alet around ane part the mo wil1 for be dano when we mind a want a ban ong that night on bel1 dirty on on one any weped it an brachet to wen got omalen about that we had ne1ner1ing thing that would plas certain have dome built into man

unknown issue on installation

Please ignore this if it's not a good place to raise a question.


bash-3.2$ cd speech/
bash-3.2$ luarocks make rocks/speech-scm-1.rockspec

-- Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE)
CMake Error at /Applications/CMake.app/Contents/share/cmake-3.10/Modules/FindPackageHandleStandardArgs.cmake:137 (message):
Could NOT find FFTW (missing: FFTW_INCLUDES FFTW_LIBRARIES)
Call Stack (most recent call first):
/Applications/CMake.app/Contents/share/cmake-3.10/Modules/FindPackageHandleStandardArgs.cmake:378 (_FPHSA_FAILURE_MESSAGE)
cmake/FindFFTW.cmake:94 (find_package_handle_standard_args)
CMakeLists.txt:8 (find_package)

Missing dict.lst

The dict.lst file required for rescoring the librispeech output with a language model seems to be missing from the repository.

I tried to recreate one with the following command
zcat 3-gram.pruned.3e-7.arpa.gz | perl -ne 'chomp;$_=lc;@a=split /\t/;if(/^\\1-grams:/.../^$/){$w=$a[1]; $w=~s/(.)(\1+)/$1.length($2)/e; print "$a[1] $w\n"}' | grep -v "<\|^ *$\|[3-9]" > dict.lst

But I get a WER of 6.73 on dev-clean after rescoring. I would have expected something in the 4-5% as reported in the paper.

luajit ./wav2letter/decode.lua ./models/ dev-clean -show -letters ./data/librispeech-proc/letters-rep.lst  -words ./dict.lst -lm ./models/3-gram.pruned.3e-7.bin -lmweight 3.1639 -beamsize 25000 -beamscore 40 -nthread 10 -smearing max -show
...
[Memory usage: 411.62 Mb]
[Decoded 2703 sequences in 3258.00 s (actual: 29223.70 s)]
[WER on dev-clean = 6.73%, LER = 2.42%]

Training model error

Why is this?
luajit test.lua ~/librispeech-glu-highdropout.bin -progress -show -test dev-clean -save -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gfsai

luajit: cannot open </root/librispeech-glu-highdropout.bin> in mode r at /tmp/luarocks_torch-scm-1-1319/torch7/lib/TH/THDiskFile.c:673
stack traceback:
[C]: at 0x7feba26f6720
[C]: in function 'DiskFile'
...orch/install/share/lua/5.1/wav2letter/runtime/serial.lua:54: in function 'loadmodel'
test.lua:54: in main chunk
[C]: at 0x00406670

bad argument #1 to 'readShort' (the number of frames must be positive)

Issue

I keep getting an error when attempting to run the create.lua command. I have already commented out the data that I am not using ('train-other-500').

Is there anything that I can do?

I am running :
luajit ~/wav2letter/data/librispeech/create.lua ~/LibriSpeech/ ~/librispeech-proc

Here is the error that I am getting:

wav2letter/data/librispeech/create.lua:35: bad argument #1 to 'readShort' (the number of frames must be positive)
stack traceback:
[C]: in function 'readShort'
/wav2letter/data/librispeech/create.lua:35: in function 'copytoflac'
/wav2letter/data/librispeech/create.lua:95: in function 'createidx'
/wav2letter/data/librispeech/create.lua:152: in main chunk
[C]: at 0x00405e90

using language model in pre-trained model

In the command
"luajit ~/wav2letter/test.lua ~/librispeech-glu-highdropout.bin -progress -show -test dev-clean -save -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gfsai"

will the KenLM language model be included automatically, or do we need to add a parameter to the above command?

Crash when trying README example CPU-only on macOS 10.12.6

I just tried the CPU-only model that @VitaliyLi provided to fix #7. Getting this far also required my PR #4. Here is the README example I ran (mind the -gpu 0):

luajit ~/wav2letter/test.lua ~/librispeech-glu-highdropout-cpu.bin -progress -show -test dev-clean -save -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gpu 0 -gfsai

I got this crash:

| dataset </Users/jan/librispeech-proc/dev-clean>: | dataset </Users/jan/librispeech-proc/dev-clean>: | dataset </Users/jan/librispeech-proc/dev-clean>: | dataset </Users/jan/librispeech-proc/dev-clean>: | dataset </Users/jan/librispeech-proc/dev-clean>: | dataset </Users/jan/librispeech-proc/dev-clean>: 2703 files found
| dataset </Users/jan/librispeech-proc/dev-clean>: 2703 files found
| dataset </Users/jan/librispeech-proc/dev-clean>: 2703 files found
| dataset </Users/jan/librispeech-proc/dev-clean>: 2703 files found
| dataset </Users/jan/librispeech-proc/dev-clean>: 2703 files found
| dataset </Users/jan/librispeech-proc/dev-clean>: 2703 files found
| dataset </Users/jan/librispeech-proc/dev-clean>: 0 files found
0 files found
0 files found
0 files found
0 files found
0 files found
luajit: ...Projects/torch/install/share/lua/5.1/threads/threads.lua:183: [thread 3 callback] ...nstall/share/lua/5.1/wav2letter/numberedfilesdataset.lua:71: no file found in </Users/jan/librispeech-proc/dev-clean/?????????.flacsz> nor in </Users/jan/librispeech-proc/dev-clean/00000/?????????.flacsz>
stack traceback:
	[C]: in function 'assert'
	...nstall/share/lua/5.1/wav2letter/numberedfilesdataset.lua:71: in function '__init'
	...ment/Projects/torch/install/share/lua/5.1/torch/init.lua:91: in function <...ment/Projects/torch/install/share/lua/5.1/torch/init.lua:87>
	[C]: in function 'closure'
	.../torch/install/share/lua/5.1/wav2letter/runtime/data.lua:319: in function 'mapconcat'
	.../torch/install/share/lua/5.1/wav2letter/runtime/data.lua:417: in function 'datasetwithfeatures'
	.../torch/install/share/lua/5.1/wav2letter/runtime/data.lua:476: in function 'closure'
	...are/lua/5.1/torchnet/dataset/paralleldatasetiterator.lua:79: in function <...are/lua/5.1/torchnet/dataset/paralleldatasetiterator.lua:78>
	[C]: in function 'xpcall'
	...Projects/torch/install/share/lua/5.1/threads/threads.lua:234: in function 'callback'
	...t/Projects/torch/install/share/lua/5.1/threads/queue.lua:65: in function <...t/Projects/torch/install/share/lua/5.1/threads/queue.lua:41>
	[C]: in function 'pcall'
	...t/Projects/torch/install/share/lua/5.1/threads/queue.lua:40: in function 'dojob'
	[string "  local Queue = require 'threads.queue'..."]:13: in main chunk
stack traceback:
	[C]: in function 'error'
	...Projects/torch/install/share/lua/5.1/threads/threads.lua:183: in function 'dojob'
	...Projects/torch/install/share/lua/5.1/threads/threads.lua:264: in function 'synchronize'
	...Projects/torch/install/share/lua/5.1/threads/threads.lua:142: in function 'specific'
	...Projects/torch/install/share/lua/5.1/threads/threads.lua:125: in function 'Threads'
	...are/lua/5.1/torchnet/dataset/paralleldatasetiterator.lua:85: in function '__init'
	...ment/Projects/torch/install/share/lua/5.1/torch/init.lua:91: in function <...ment/Projects/torch/install/share/lua/5.1/torch/init.lua:87>
	[C]: in function 'newiterator'
	...do/Development/Projects/wav2letter/test.lua:154: in main chunk
	[C]: at 0x0109b07e60

Maybe there is still a piece missing from my setup puzzle, but I don’t know what it might be.

inconsistent tensor size, expected tensor [31 x 31] and src [30 x 30] to have the same number of elements, but got 961 and 900 elements respectively at /tmp/luarocks_torch-scm-1-4200/torch7/lib/TH/generic/THTensorCopy.c:86

I am getting this error when using the pretrained model to transcribe my own .flac files. Can someone help me solve this?

"inconsistent tensor size, expected tensor [31 x 31] and src [30 x 30] to have the same number of elements, but got 961 and 900 elements respectively at /tmp/luarocks_torch-scm-1-4200/torch7/lib/TH/generic/THTensorCopy.c:86"

Time taken to transcribe using the pretrained CPU model

How much time does it take to transcribe using the pretrained model?
When I try to transcribe speech with the command "luajit ~/wav2letter/test.lua ~/librispeech-glu-highdropout-cpu.bin -progress -show -test dev-clean -save -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gfsai", it shows:

[Sentence WER: 000.00%, dataset WER: 003.70%]
[.................... 2/2703 ..................] ETA: 6D13h | Step: 3m29s

will it really take 6 days to complete ?
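The ETA is plausible given the step time in the progress bar: decoding on CPU at roughly 3.5 minutes per utterance over 2703 dev-clean utterances works out to about six and a half days. A quick sanity check:

```python
utterances = 2703           # dev-clean size, as shown in the progress bar
step_seconds = 3 * 60 + 29  # "Step: 3m29s" from the log

total_days = utterances * step_seconds / 86400
print(f"estimated total: {total_days:.1f} days")
```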

No rule to make target `/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/System/Library/Frameworks/Accelerate.framework', needed by `libsndfile.so'

While executing the step "cd wav2letter && luarocks make rocks/wav2letter-scm-1.rockspec && cd .." in the wav2letter directory's sub-directory wav2letter, I got the error below. Has anyone met the same problem? Thanks.

I had already installed libsndfile via "brew install libsndfile" before running this step.

"
No rule to make target /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/System/Library/Frameworks/Accelerate.framework', needed by libsndfile.so'
"

CUDA driver version is insufficient for CUDA runtime version

Hi,
when I run the command below:
'luajit /wav2letter/test.lua /experiments/hello_librispeech/001_model_dev-clean.bin -progress -show -test dev-clean -save'

I get the following error: 'CUDA driver version is insufficient for CUDA runtime version'. Can someone help me solve this?

PS: I am trying to run the pretrained model on a gcloud GPU instance.

No rule to make target '/root/usr/lib/libkenlm.a', needed by 'libbeamer.so'.

I am new to Lua, and I ran into an error I have not found any information on.

luarocks make rocks/beamer-scm-1.rockspec
-- Using kenlm library /root/usr/lib/libkenlm.a
-- Using kenlm include path /kenlm
-- Found Torch7 in /root/usr
-- Configuring done
-- Generating done
-- Build files have been written to: /home/abetz/github/wav2letter/beamer/build.luarocks
make[2]: *** No rule to make target '/root/usr/lib/libkenlm.a', needed by 'libbeamer.so'. Stop.
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/beamer.dir/all' failed
make[1]: *** [CMakeFiles/beamer.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

Error: Build error: Failed building.

problem happen when decode

Hi there,
I downloaded the pretrained model and the dev set (OpenSLR 12), tested dev-clean, and tried to decode, but it errors out:

~/usr/bin/luajit ~/git/wav2letter/decode.lua ~/pretrain dev-clean -show -letters ~/librispeech-proc/letters-rep.lst  -words ~/lm_model/dict.lst -lm ~/lm_model/3-gram.pruned.3e-7.arpa -lmweight 3.1639 -beamsize 25000 -beamscore 40 -nthread 0 -smearing max -show
PARAMETERS: -lmweight 3.163900 -wordscore 0.000000 -unkscore -inf -beamsize 25000.000000 -beamscore 40.000000 -silweight 0.000000

[loading ~/lm_model/dict.lst]
[200000 tokens found]
[loading ~/librispeech-proc/letters-rep.lst]
[30 letters found]
Loading the LM will be faster if you build a binary file.
Reading /srv/data/jhkang/wav2letter_dataset/lm_model/3-gram.pruned.3e-7.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
[Lexicon Trie memory usage: 129.24 Mb]
[torch.LongStorage of size 0]   <-- this is my added debug line at indexeddataset.lua line 412

/home/jhkang/usr/bin/luajit: bad argument #2 to '?' (out of bounds at /home/jhkang/torch/pkg/torch/lib/TH/generic/THStorage.c:202)
stack traceback:
       [C]: at 0x7fb529f28b30
       [C]: in function '__index'
       ...nstall/share/lua/5.1/torchnet/dataset/indexeddataset.lua:413: in function '__init'
       /home/jhkang/torch/install/share/lua/5.1/torch/init.lua:91: in function </home/jhkang/torch/install/share/lua/5.1/torch/init.lua:87>
       [C]: in function 'IndexedDatasetReader'
       /home/jhkang/git/wav2letter/decode.lua:83: in function 'test'
       /home/jhkang/git/wav2letter/decode.lua:187: in main chunk
       [C]: at 0x00406020

I followed the error message and found this code:

local idx = torch.LongStorage(indexfilename)            <-- idx has size 0

How can I solve it?

Using Pre-trained model!

Hi,

I am just trying to use the pre-trained model. I am getting the following error: "read error: read 217450 blocks instead of 43500464". Snapshot below:

$ ~/usr/bin/luajit ~/wav2letter/test.lua ~/librispeech-glu-highdropout-cpu.bin -progress -show -test dev-clean -save -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gfsai
| number of classes (network) = 30
| reloading model </home/sanchit/librispeech-glu-highdropout-cpu.bin>
/home/sanchit/usr/bin/luajit: /home/sanchit/usr/share/lua/5.1/torch/File.lua:351: read error: read 217450 blocks instead of 43500464 at /tmp/luarocks_torch-scm-1-1789/torch7/lib/TH/THDiskFile.c:356
stack traceback:
[C]: in function 'read'
/home/sanchit/usr/share/lua/5.1/torch/File.lua:351: in function </home/sanchit/usr/share/lua/5.1/torch/File.lua:245>
[C]: in function 'read'
/home/sanchit/usr/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/home/sanchit/usr/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/home/sanchit/usr/share/lua/5.1/nn/Module.lua:192: in function 'read'
/home/sanchit/usr/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/home/sanchit/usr/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/home/sanchit/usr/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/home/sanchit/usr/share/lua/5.1/nn/Module.lua:192: in function 'read'
/home/sanchit/usr/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/home/sanchit/usr/share/lua/5.1/torch/File.lua:369: in function 'readObject'
.../sanchit/usr/share/lua/5.1/wav2letter/runtime/serial.lua:57: in function 'loadmodel'
/home/sanchit/wav2letter/test.lua:102: in main chunk
[C]: at 0x00404ab0

issue in data pre-processing

Hi @VitaliyLi,
I am trying to transcribe sample audio files using the pre-trained model. I have stored my audio files in the dev-clean2 folder. When I run this command:

luajit data/librispeech/create.lua Audios audio-proc

I get the following output:
| parsing SPEAKERS.TXT for gender...
| analyzing Audios/dev-clean2...
| writing audio-proc/dev-clean2...
luajit: data/librispeech/create.lua:31: bad argument #1 to 'writeShort' (dimension 2 size must be equal to the number of channels)
stack traceback:
[C]: in function 'writeShort'
data/librispeech/create.lua:31: in function 'copytoflac'
data/librispeech/create.lua:95: in function 'createidx'
data/librispeech/create.lua:150: in main chunk
[C]: at 0x00405d50
Can someone help me, please?
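The "dimension 2 size must be equal to the number of channels" message suggests the input audio has more channels than the single-channel layout the LibriSpeech preprocessing expects. One hedged workaround is downmixing stereo to mono before re-running create.lua; a standard-library sketch over (left, right) sample pairs, with illustrative names (real code would read/write the audio with a sound library or a tool like sox):

```python
def downmix_to_mono(frames):
    """Average left/right pairs of a stereo signal into one mono channel.

    `frames` is a list of (left, right) integer PCM sample pairs.
    Integer averaging keeps 16-bit samples within range.
    """
    return [(left + right) // 2 for left, right in frames]

stereo = [(100, 300), (-200, 0), (32767, 32767)]
print(downmix_to_mono(stereo))  # [200, -100, 32767]
```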

Error: Out of Memory

I'm using the pre-trained "librispeech-glu-highdropout" model with a GeForce GTX 960M GPU (2 GB memory), and I'm getting this error:

| number of classes (network) = 30
| reloading model </home/vraj/librispeech-glu-highdropout.bin>
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-6610/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
luajit: /home/vraj/usr/share/lua/5.1/nn/WeightNorm.lua:201: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-6610/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
[C]: in function 'new'
/home/vraj/usr/share/lua/5.1/nn/WeightNorm.lua:201: in function 'read'
/home/vraj/usr/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/home/vraj/usr/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/home/vraj/usr/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/home/vraj/usr/share/lua/5.1/nn/Module.lua:192: in function 'read'
/home/vraj/usr/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/home/vraj/usr/share/lua/5.1/torch/File.lua:369: in function 'readObject'
/home/vraj/usr/share/lua/5.1/wav2letter/runtime/serial.lua:57: in function 'loadmodel'
/home/vraj/wav2letter/test.lua:104: in main chunk
[C]: at 0x004057a0

Is there a way around this issue?
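A rough capacity check makes the OOM unsurprising: the training log elsewhere on this page reports 208,863,942 parameters for this architecture, so the weights alone approach 0.8 GiB in fp32 before any activations or CUDA workspace, leaving little headroom on a 2 GB card. Back-of-envelope:

```python
params = 208_863_942       # parameter count reported by train.lua for this arch
bytes_fp32 = params * 4    # 4 bytes per float32 weight

gib = bytes_fp32 / 2**30
print(f"weights alone: {gib:.2f} GiB")
```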

While training, the error "attempt to call field 'WeightNorm' (a nil value)" occurs

While executing the training step, I get the error "attempt to call field 'WeightNorm' (a nil value)".
Has anybody met the same error? What could the problem be? There were no errors during the preceding cudnn, cunn, and torch installation steps.

luajit ~/wav2letter/train.lua --train -rundir ~/experiments -runname hello_librispeech -arch ~/wav2letter/arch/librispeech-glu-highdropout -lr 0.1 -lrcrit 0.0005 -gpu 1 -linseg 1 -linlr 0 -linlrcrit 0.005 -onorm target -nthread 6 -dictdir ~/librispeech-proc -datadir ~/librispeech-proc -train train-clean-100+train-clean-360+train-other-500 -valid dev-clean+dev-other -test test-clean+test-other -gpu 1 -sqnorm -mfsc -melfloor 1 -surround "|" -replabel 2 -progress -wnorm -normclamp 0.2 -momentum 0.9 -weightdecay 1e-05


| experiment path: /data1/experiments/hello_librispeech
| experiment runidx: 1
| number of classes (network) = 30
[Network spec]
C NCHANNEL 400 13 1
GLU
DO 0.2
C 200 440 14 1
GLU
DO 0.214
C 220 484 15 1
GLU
DO 0.22898
C 242 532 16 1
GLU
DO 0.2450086
C 266 584 17 1
GLU
DO 0.262159202
C 292 642 18 1
GLU
DO 0.28051034614
C 321 706 19 1
GLU
DO 0.30014607037
C 353 776 20 1
GLU
DO 0.321156295296
C 388 852 21 1
GLU
DO 0.343637235966
C 426 936 22 1
GLU
DO 0.367691842484
C 468 1028 23 1
GLU
DO 0.393430271458
C 514 1130 24 1
GLU
DO 0.42097039046
C 565 1242 25 1
GLU
DO 0.450438317792
C 621 1366 26 1
GLU
DO 0.481969000038
C 683 1502 27 1
GLU
DO 0.51570683004
C 751 1652 28 1
GLU
DO 0.551806308143
C 826 1816 29 1
GLU
DO 0.590432749713
C 908 1816 1 1
GLU
DO 0.590432749713
C 908 NLABEL 1 1

luajit: .../usr/share/lua/5.1/wav2letter/runtime/netutils.lua:25: attempt to call field 'WeightNorm' (a nil value)
stack traceback:
.../usr/share/lua/5.1/wav2letter/runtime/netutils.lua:25: in function 'TemporalConvolution'
.../usr/share/lua/5.1/wav2letter/runtime/netutils.lua:209: in function 'create'
/data1/wav2letter/train.lua:368: in main chunk
[C]: at 0x00406020

experiments folder not found

luajit test.lua ~/experiments/hello_librispeech/001_model_dev-clean.bin -progress -show -test dev-clean -save
luajit: cannot open </home/saisrinivas_chetti/experiments/hello_librispeech/001_model_dev-clean.bin> in mode r at /tmp/luarocks_torch-scm-1-5171/torch7/lib/TH/THDiskFile.c:673
stack traceback:
[C]: at 0x7f09a425e450
[C]: in function 'DiskFile'
/torch/install/share/lua/5.1/wav2letter/runtime/serial.lua:54: in function 'loadmodel'
test.lua:53: in main chunk
[C]: at 0x00405d50

Can someone help me resolve this error?

ai-group.speechdata reference?

While running the following:

~/Documents/librispeech-glu-highdropout-cpu.bin -progress -show -test ~Documents/librispeech-proc/dev-clean/84/121123 -save

It looks for a random file path and then crashes:
/Users/dpeskov/usr/bin/luajit: ./wav2letter/runtime/data.lua:146: bad argument #1 to 'lines' (/data/local/packages/ai-group.speechdata/latest/speech/letters.lst: No such file or directory)

I didn't find this path anywhere in runtime/data.lua, and even creating that exact file path did not resolve the issue.

Disclaimer: I did not run the create/create.sz scripts due to FLAC issues, but I don't think that should affect this.

Pre-trained models, CPU-only

After a few smaller patches (forthcoming) I got through the whole process described in the README for testing the pre-trained models, up to the point where I can finally run test.lua. I’m pretty confident that this would run, but I can’t get test.lua to run CPU-only. I don’t have an NVIDIA card handy, so I didn’t install any CUDA-related libraries.

So I’m running test.lua with -gpu 0 (which requires another forthcoming patch) and this is where I’m stuck:

| number of classes (network) = 30
| reloading model </Volumes/Rest/Temp/prem 201X-XX-XX/LibriSpeech-11/librispeech-glu-highdropout.bin>
luajit: ...ment/Projects/torch/install/share/lua/5.1/torch/File.lua:343: unknown Torch class <torch.CudaTensor>
stack traceback:
	[C]: in function 'error'
	...ment/Projects/torch/install/share/lua/5.1/torch/File.lua:343: in function 'readObject'
	...ment/Projects/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	...pment/Projects/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
	...ment/Projects/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
	...ment/Projects/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	...ment/Projects/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	...pment/Projects/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
	...ment/Projects/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
	...ment/Projects/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	...orch/install/share/lua/5.1/wav2letter/runtime/serial.lua:57: in function 'loadmodel'
	...do/Development/Projects/wav2letter-Facebook-STT/test.lua:108: in main chunk
	[C]: at 0x010319be60

The issue appears to be similar to soumith/dcgan.torch#28, but I don’t know enough of the architecture to make quick progress here.

It looks like the network was saved in GPU-only mode and requires some kind of conversion to run CPU-only. Is this correct?

In that case, can you supply a CPU-only or hybrid model?

Thanks for all the work you put into this project!

Why is the beam size so big

I am looking for the motivation behind such a large beam size. Doesn't that make decoding too slow to use in production? What happens if the beam size is much smaller?
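As a rough intuition for the trade-off: the work per decoded frame grows roughly with the number of hypotheses kept, so shrinking the beam from 25,000 to a few hundred cuts per-frame work by orders of magnitude, at the cost of search errors (the eventually-best path may get pruned early). A toy comparison of the hypothesis counts (pure arithmetic, not the actual decoder):

```python
large_beam = 25_000  # the beam size used in the decode commands on this page
small_beam = 250     # a hypothetical production-sized beam

# per-frame work scales roughly linearly with how many hypotheses survive
speedup = large_beam / small_beam
print(f"~{speedup:.0f}x fewer hypotheses per frame")
```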

Sanity check fails (bad argument #2 to 'v' )

This is a long output, but I am new to Torch and Lua, so I don't want to skip anything.

I am using only dev-clean as the train, dev, and test set, just to check that the code compiles and my installation is set up correctly. This is the output and error I get:

luajit ~/wav2letter/train.lua --train -rundir ~/experiments -runname hello_librispeech -arch ~/wav2letter/arch/librispeech-glu-highdropout -lr 0.1 -lrcrit 0.0005 -gpu 1 -linseg 1 -linlr 0 -linlrcrit 0.005 -onorm target -nthread 6 -dictdir ~/librispeech-proc -datadir ~/librispeech-proc -train dev-clean -valid dev-clean -test dev-clean -gpu 0 -sqnorm -mfsc -melfloor 1 -surround "|" -replabel 2 -progress -wnorm -normclamp 0.2 -momentum 0.9 -weightdecay 1e-05
| experiment path: /home/user/experiments/hello_librispeech
| experiment runidx: 1
| number of classes (network) = 30
[Network spec]
C NCHANNEL 400 13 1
GLU
DO 0.2
C 200 440 14 1
GLU
DO 0.214
C 220 484 15 1
GLU
DO 0.22898
C 242 532 16 1
GLU
DO 0.2450086
C 266 584 17 1
GLU
DO 0.262159202
C 292 642 18 1
GLU
DO 0.28051034614
C 321 706 19 1
GLU
DO 0.30014607037
C 353 776 20 1
GLU
DO 0.321156295296
C 388 852 21 1
GLU
DO 0.343637235966
C 426 936 22 1
GLU
DO 0.367691842484
C 468 1028 23 1
GLU
DO 0.393430271458
C 514 1130 24 1
GLU
DO 0.42097039046
C 565 1242 25 1
GLU
DO 0.450438317792
C 621 1366 26 1
GLU
DO 0.481969000038
C 683 1502 27 1
GLU
DO 0.51570683004
C 751 1652 28 1
GLU
DO 0.551806308143
C 826 1816 29 1
GLU
DO 0.590432749713
C 908 1816 1 1
GLU
DO 0.590432749713
C 908 NLABEL 1 1

[network]
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> (39) -> (40) -> (41) -> (42) -> (43) -> (44) -> (45) -> (46) -> (47) -> (48) -> (49) -> (50) -> (51) -> (52) -> (53) -> (54) -> (55) -> output]
(1): nn.WeightNorm @ nn.TemporalConvolution
(2): nn.GatedLinearUnit
(3): nn.Dropout(0.200000)
(4): nn.WeightNorm @ nn.TemporalConvolution
(5): nn.GatedLinearUnit
(6): nn.Dropout(0.214000)
(7): nn.WeightNorm @ nn.TemporalConvolution
(8): nn.GatedLinearUnit
(9): nn.Dropout(0.228980)
(10): nn.WeightNorm @ nn.TemporalConvolution
(11): nn.GatedLinearUnit
(12): nn.Dropout(0.245009)
(13): nn.WeightNorm @ nn.TemporalConvolution
(14): nn.GatedLinearUnit
(15): nn.Dropout(0.262159)
(16): nn.WeightNorm @ nn.TemporalConvolution
(17): nn.GatedLinearUnit
(18): nn.Dropout(0.280510)
(19): nn.WeightNorm @ nn.TemporalConvolution
(20): nn.GatedLinearUnit
(21): nn.Dropout(0.300146)
(22): nn.WeightNorm @ nn.TemporalConvolution
(23): nn.GatedLinearUnit
(24): nn.Dropout(0.321156)
(25): nn.WeightNorm @ nn.TemporalConvolution
(26): nn.GatedLinearUnit
(27): nn.Dropout(0.343637)
(28): nn.WeightNorm @ nn.TemporalConvolution
(29): nn.GatedLinearUnit
(30): nn.Dropout(0.367692)
(31): nn.WeightNorm @ nn.TemporalConvolution
(32): nn.GatedLinearUnit
(33): nn.Dropout(0.393430)
(34): nn.WeightNorm @ nn.TemporalConvolution
(35): nn.GatedLinearUnit
(36): nn.Dropout(0.420970)
(37): nn.WeightNorm @ nn.TemporalConvolution
(38): nn.GatedLinearUnit
(39): nn.Dropout(0.450438)
(40): nn.WeightNorm @ nn.TemporalConvolution
(41): nn.GatedLinearUnit
(42): nn.Dropout(0.481969)
(43): nn.WeightNorm @ nn.TemporalConvolution
(44): nn.GatedLinearUnit
(45): nn.Dropout(0.515707)
(46): nn.WeightNorm @ nn.TemporalConvolution
(47): nn.GatedLinearUnit
(48): nn.Dropout(0.551806)
(49): nn.WeightNorm @ nn.TemporalConvolution
(50): nn.GatedLinearUnit
(51): nn.Dropout(0.590433)
(52): nn.WeightNorm @ nn.TemporalConvolution
(53): nn.GatedLinearUnit
(54): nn.Dropout(0.590433)
(55): nn.WeightNorm @ nn.TemporalConvolution
}
[network kw=341 dw=1]
| neural network number of parameters: 208863942
| LinearSegCriterion uses FullConnectCriterion (C based)
| AutoSegCriterion uses ForceAlignCriterion (C based)
| AutoSegCriterion uses FullConnectCriterion (C based)
| Using momentum: 0.9
| Using L2 regularization: 1e-05
| dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: 2703 files found
2703 files found
2703 files found
2703 files found
2703 files found
2703 files found
| dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: 2703 files found
2703 files found
2703 files found
2703 files found
2703 files found
2703 files found
| 2703/2703 filtered samples
| 2703/2703 filtered samples
| 2703/2703 filtered samples| 2703/2703 filtered samples
| 2703/2703 filtered samples| batchresolution:
| 2703/2703 filtered samples
| batchresolution: 363

| batchresolution: 363
363
| batchresolution:| batchresolution: 363
363
| batchresolution: 363
| dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: 2703 files found
2703 files found
2703 files found
2703 files found
| dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: 2703 files found
2703 files found
| dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: 2703 files found
2703 files found
2703 files found
2703 files found
2703 files found
2703 files found
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: 2703 files found
| dataset </home/user/librispeech-proc/dev-clean>: 2703 files found
2703 files found
| dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: 2703 files found
2703 files found
2703 files found
| dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: | dataset </home/user/librispeech-proc/dev-clean>: 2703 files found
2703 files found
2703 files found
2703 files found
2703 files found
2703 files found
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| 2703/2703 filtered samples
| batchresolution: 363
| resampling: size=2703
luajit: /usr/local/share/lua/5.1/nn/Container.lua:67: ..................................] ETA: 0ms | Step: 0ms
In 4 module of nn.Sequential:
/usr/local/share/lua/5.1/nn/THNN.lua:110: bad argument #2 to 'v' (invalid input frame size. Got: 400, Expected: 200 at /tmp/luarocks_nn-scm-1-7380/nn/lib/THNN/generic/TemporalConvolution.c:30)
stack traceback:
[C]: in function 'v'
/usr/local/share/lua/5.1/nn/THNN.lua:110: in function 'TemporalConvolution_updateOutput'
/usr/local/share/lua/5.1/nn/TemporalConvolution.lua:41: in function 'updateOutput'
/usr/local/share/lua/5.1/nn/WeightNorm.lua:114: in function </usr/local/share/lua/5.1/nn/WeightNorm.lua:110>
[C]: in function 'xpcall'
/usr/local/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/usr/local/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
/usr/local/share/lua/5.1/torchnet/engine/sgdengine.lua:122: in function 'train'
/home/user/wav2letter/train.lua:907: in function 'train'
/home/user/wav2letter/train.lua:928: in main chunk
[C]: at 0x004057a0

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
/usr/local/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
/usr/local/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
/usr/local/share/lua/5.1/torchnet/engine/sgdengine.lua:122: in function 'train'
/home/user/wav2letter/train.lua:907: in function 'train'
/home/user/wav2letter/train.lua:928: in main chunk
[C]: at 0x004057a0

I would appreciate any help.

FFTW on RHEL

Hi all,
I am working on RHEL release 7.3 (Maipo). I tried installing FFTW3 (fftw-3.3.7) manually (from source), but the default install builds only the libfftw3 libraries; libfftw3f and libfftw3l are missing. The FindFFTW file in the wav2letter repo gives me the impression that the missing fftw3 libraries are mandatory. Am I mistaken? Or is there a way to build the missing libraries? I tried the --enable-sme2 option when building fftw3, but that did not help either.
Thanks

Viable for Macbook?

In retrospect this would have been better to ask beforehand.

How much RAM do you need to transcribe with a pre-trained model? I imagine most of the CPU-only folks are trying to do this.

After setting everything up, I ended up getting this during decoding, with 4 GB of free RAM:
PANIC: unprotected error in call to Lua API (not enough memory)

Bringing the beam size down to 25 (from 25,000) was still too cumbersome. Are there any other hyperparameters that would reduce the memory usage?

Has anybody successfully got this working on macOS? I debugged lots of small snags but ultimately couldn't get it working (FLAC files not being seen, as commented on a closed issue about FLAC; Lua runs out of space). It might make sense to only install this on Linux with a GPU for the time being.

Share Low Dropout configuration arch

I'd like to ask if it is possible to share the Low Dropout configuration arch in the same way as the High Dropout one (the librispeech-glu-highdropout file), so that the entire paper can be reproduced.

Facing "attempt to index a nil value" while creating LibriSpeech training data

I am creating the LibriSpeech training dataset.

I have already downloaded the required datasets, such as "dev-clean". When I executed the command "luajit data/librispeech/create.lua LibriSpeech/ librispeech-proc", I encountered the following error.

| parsing SPEAKERS.TXT for gender...
luajit: data/librispeech/create.lua:51: attempt to index a nil value
stack traceback:
data/librispeech/create.lua:51: in main chunk
[C]: at 0x00405d50

Are there any ideas about this error? Thanks.

Issues in Pre-trained model for CPU

Hi @VitaliyLi

I am trying to use the pre-trained CPU model to transcribe LibriSpeech. I have a few questions regarding this. I hope you will reply soon.

  1. Does this require a CUDA installation?
  2. Do we need to run the decoder? For the command "luajit ~/wav2letter/test.lua ~/experiments/hello_librispeech/001_model_dev-clean.bin -progress -show -test dev-clean -save" we need the experiments folder, which is created during training, but what is the point of training when we want to use a pre-trained model?
  3. How much time does it take to transcribe using the pretrained model? When I try to transcribe speech with the command "luajit ~/wav2letter/test.lua ~/librispeech-glu-highdropout-cpu.bin -progress -show -test dev-clean -save -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gfsai", it shows:

[Sentence WER: 000.00%, dataset WER: 003.70%]
[.................... 2/2703 ..................] ETA: 6D13h | Step: 3m29s

will it really take 6 days to complete ?

Thanks & Regards

test.lua > Running Pre-trained Model on CPU only

Error: unknown Torch class <torch.CudaTensor>

Issue Description

I keep getting an error when attempting to run test.lua on the CPU while using the pre-trained model. I've attempted to modify the test script to suit my system and remove dependencies on GPU/CUDA.

Is there something additional I need to do to be able to avoid any CUDA dependencies at this point?

System Specs

Model: iMac Late 2015 Retina 5K
CPU: 3.2 GHz Intel Core i5
RAM: 32 GB 1867 MHz DDR3
Software: Mac OS X 10.13.2 running Intel MKL
Fresh Torch install
AMD GPU with no CUDA/cuDNN support

Fixes Attempted

I've changed all references to torch.CudaTensor in wav2letter to torch.FloatTensor, which should be the CPU equivalent, but I am still getting the error.

I also commented out the require 'cunn' and require 'cudnn' dependencies and set the gpu option to a default of zero:
cmd:option('-gpu', 0, 'gpu device')
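One workaround sometimes used for this (a sketch, not an official wav2letter recipe; the script name and file paths are mine, and it has to be run once on a machine that does have CUDA so the checkpoint can be deserialized at all) is to convert the model with Torch's cudnn.convert and save a CPU copy:

```shell
# Write a small Torch script (hypothetical name convert_cpu.lua) that
# loads a CUDA checkpoint and saves a CPU (FloatTensor) copy.
cat > convert_cpu.lua <<'EOF'
require 'nn'
require 'cunn'   -- needed so torch.load can deserialize CUDA classes
require 'cudnn'
local model = torch.load(arg[1])
cudnn.convert(model, nn)  -- swap cudnn.* modules for nn.* equivalents
model = model:float()     -- move all parameters to torch.FloatTensor
torch.save(arg[2], model)
EOF
# On the CUDA machine, something like:
#   luajit convert_cpu.lua librispeech-glu-highdropout.bin model-cpu.bin
```

Note that a CPU checkpoint (librispeech-glu-highdropout-cpu.bin) is mentioned in other threads here, which would avoid the conversion entirely.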

Code Output

luajit ./test.lua ./librispeech-glu-highdropout.bin -progress -show -test dev-clean -save -datadir ./librispeech-proc/ -dictdir ./librispeech-proc/
| number of classes (network) = 30
| reloading model <./librispeech-glu-highdropout.bin>
luajit: ...rs/macbookpro/torch/install/share/lua/5.1/torch/File.lua:343: unknown Torch class <torch.CudaTensor>
stack traceback:
	[C]: in function 'error'
	...rs/macbookpro/torch/install/share/lua/5.1/torch/File.lua:343: in function 'readObject'
	...rs/macbookpro/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	/Users/macbookpro/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
	...rs/macbookpro/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
	...rs/macbookpro/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	...rs/macbookpro/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	/Users/macbookpro/torch/install/share/lua/5.1/nn/Module.lua:192: in function 'read'
	...rs/macbookpro/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
	...rs/macbookpro/torch/install/share/lua/5.1/torch/File.lua:369: in function 'readObject'
	...orch/install/share/lua/5.1/wav2letter/runtime/serial.lua:57: in function 'loadmodel'
	./test.lua:97: in main chunk
	[C]: at 0x010b5bbea0

Problem with using wav2letter for inference

I downloaded the pre-trained model and tried to run the decoder
luajit ~/wav2letter/test.lua ~/librispeech-glu-highdropout.bin -progress -show -test dev-clean -save -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gfsai
I ended up getting this:

Found Environment variable CUDNN_PATH = /home/sid/cuda/lib64/libcudnn.so.5
| number of classes (network) = 30
| reloading model </home/sid/librispeech-glu-highdropout.bin>
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> (39) -> (40) -> (41) -> (42) -> (43) -> (44) -> (45) -> (46) -> (47) -> (48) -> (49) -> (50) -> (51) -> (52) -> (53) -> (54) -> (55) -> (56) -> (57) -> (58) -> (59) -> (60) -> (61) -> output]
  (1): nn.Copy
  (2): nn.Transpose
  (3): nn.View(40, 1, -1)
  (4): nn.WeightNorm @ cudnn.SpatialConvolution(40 -> 400, 13x1)
  (5): nn.GatedLinearUnit
  (6): nn.Dropout(0.200000)
  (7): nn.WeightNorm @ cudnn.SpatialConvolution(200 -> 440, 14x1)
  (8): nn.GatedLinearUnit
  (9): nn.Dropout(0.214000)
  (10): nn.WeightNorm @ cudnn.SpatialConvolution(220 -> 484, 15x1)
  (11): nn.GatedLinearUnit
  (12): nn.Dropout(0.228980)
  (13): nn.WeightNorm @ cudnn.SpatialConvolution(242 -> 532, 16x1)
  (14): nn.GatedLinearUnit
  (15): nn.Dropout(0.245009)
  (16): nn.WeightNorm @ cudnn.SpatialConvolution(266 -> 584, 17x1)
  (17): nn.GatedLinearUnit
  (18): nn.Dropout(0.262159)
  (19): nn.WeightNorm @ cudnn.SpatialConvolution(292 -> 642, 18x1)
  (20): nn.GatedLinearUnit
  (21): nn.Dropout(0.280510)
  (22): nn.WeightNorm @ cudnn.SpatialConvolution(321 -> 706, 19x1)
  (23): nn.GatedLinearUnit
  (24): nn.Dropout(0.300146)
  (25): nn.WeightNorm @ cudnn.SpatialConvolution(353 -> 776, 20x1)
  (26): nn.GatedLinearUnit
  (27): nn.Dropout(0.321156)
  (28): nn.WeightNorm @ cudnn.SpatialConvolution(388 -> 852, 21x1)
  (29): nn.GatedLinearUnit
  (30): nn.Dropout(0.343637)
  (31): nn.WeightNorm @ cudnn.SpatialConvolution(426 -> 936, 22x1)
  (32): nn.GatedLinearUnit
  (33): nn.Dropout(0.367692)
  (34): nn.WeightNorm @ cudnn.SpatialConvolution(468 -> 1028, 23x1)
  (35): nn.GatedLinearUnit
  (36): nn.Dropout(0.393430)
  (37): nn.WeightNorm @ cudnn.SpatialConvolution(514 -> 1130, 24x1)
  (38): nn.GatedLinearUnit
  (39): nn.Dropout(0.420970)
  (40): nn.WeightNorm @ cudnn.SpatialConvolution(565 -> 1242, 25x1)
  (41): nn.GatedLinearUnit
  (42): nn.Dropout(0.450438)
  (43): nn.WeightNorm @ cudnn.SpatialConvolution(621 -> 1366, 26x1)
  (44): nn.GatedLinearUnit
  (45): nn.Dropout(0.481969)
  (46): nn.WeightNorm @ cudnn.SpatialConvolution(683 -> 1502, 27x1)
  (47): nn.GatedLinearUnit
  (48): nn.Dropout(0.515707)
  (49): nn.WeightNorm @ cudnn.SpatialConvolution(751 -> 1652, 28x1)
  (50): nn.GatedLinearUnit
  (51): nn.Dropout(0.551806)
  (52): nn.WeightNorm @ cudnn.SpatialConvolution(826 -> 1816, 29x1)
  (53): nn.GatedLinearUnit
  (54): nn.Dropout(0.590433)
  (55): nn.WeightNorm @ cudnn.SpatialConvolution(908 -> 1816, 1x1)
  (56): nn.GatedLinearUnit
  (57): nn.Dropout(0.590433)
  (58): nn.WeightNorm @ cudnn.SpatialConvolution(908 -> 30, 1x1)
  (59): nn.View(30, -1)
  (60): nn.Transpose
  (61): nn.Copy
}
| dataset </home/sid/librispeech-proc/dev-clean>: 0 files found
luajit: /home/sid/torch/install/share/lua/5.1/threads/threads.lua:183: [thread 3 callback] ...nstall/share/lua/5.1/wav2letter/numberedfilesdataset.lua:71: no file found in </home/sid/librispeech-proc/dev-clean/?????????.flac> nor in </home/sid/librispeech-proc/dev-clean/00000/?????????.flac>
stack traceback:
	[C]: in function 'assert'
	...nstall/share/lua/5.1/wav2letter/numberedfilesdataset.lua:71: in function '__init'
	/home/sid/torch/install/share/lua/5.1/torch/init.lua:91: in function </home/sid/torch/install/share/lua/5.1/torch/init.lua:87>
	[C]: in function 'closure'
	.../torch/install/share/lua/5.1/wav2letter/runtime/data.lua:319: in function 'mapconcat'
	.../torch/install/share/lua/5.1/wav2letter/runtime/data.lua:417: in function 'datasetwithfeatures'
	.../torch/install/share/lua/5.1/wav2letter/runtime/data.lua:468: in function 'closure'
	...are/lua/5.1/torchnet/dataset/paralleldatasetiterator.lua:79: in function <...are/lua/5.1/torchnet/dataset/paralleldatasetiterator.lua:78>
	[C]: in function 'xpcall'
	/home/sid/torch/install/share/lua/5.1/threads/threads.lua:234: in function 'callback'
	/home/sid/torch/install/share/lua/5.1/threads/queue.lua:65: in function </home/sid/torch/install/share/lua/5.1/threads/queue.lua:41>
	[C]: in function 'pcall'
	/home/sid/torch/install/share/lua/5.1/threads/queue.lua:40: in function 'dojob'
	[string "  local Queue = require 'threads.queue'..."]:13: in main chunk
stack traceback:
	[C]: in function 'error'
	/home/sid/torch/install/share/lua/5.1/threads/threads.lua:183: in function 'dojob'
	/home/sid/torch/install/share/lua/5.1/threads/threads.lua:264: in function 'synchronize'
	/home/sid/torch/install/share/lua/5.1/threads/threads.lua:142: in function 'specific'
	/home/sid/torch/install/share/lua/5.1/threads/threads.lua:125: in function 'Threads'
	...are/lua/5.1/torchnet/dataset/paralleldatasetiterator.lua:85: in function '__init'
	/home/sid/torch/install/share/lua/5.1/torch/init.lua:91: in function </home/sid/torch/install/share/lua/5.1/torch/init.lua:87>
	[C]: in function 'newiterator'
	/home/sid/wav2letter/test.lua:154: in main chunk
	[C]: at 0x00405d50
| dataset </home/sid/librispeech-proc/dev-clean>: 0 files found
| dataset </home/sid/librispeech-proc/dev-clean>: 0 files found
| dataset </home/sid/librispeech-proc/dev-clean>: 0 files found
| dataset </home/sid/librispeech-proc/dev-clean>: 0 files found
| dataset </home/sid/librispeech-proc/dev-clean>: 0 files found

I've been trying to fix this by renaming some of the folders. To try to find out what's wrong, I made a folder 00000 and copied a file 000000000.flac into that folder. I ran the command to test it. It said 1 file was found but the predictions were nowhere to be found. Is there any fix?
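The error text itself gives the two layouts numberedfilesdataset accepts: 9-digit zero-padded file names, either flat in the dataset directory or bucketed into 00000/, 00001/, ... subdirectories. As a sketch of renaming audio into the flat layout (the directory names below are placeholders, and the matching transcript/target files that test.lua also expects are normally produced by create.lua):

```shell
# Copy arbitrary .flac files into the flat layout the dataset reader
# expects: <datadir>/<test>/000000000.flac, 000000001.flac, ...
# dev-clean and dev-clean-proc are placeholder directory names.
mkdir -p dev-clean dev-clean-proc
i=0
for f in dev-clean/*.flac; do
  [ -e "$f" ] || continue                     # glob matched nothing
  cp "$f" "dev-clean-proc/$(printf '%09d' "$i").flac"
  i=$((i + 1))
done
```

If only 1 file is then found but no predictions appear, the missing piece is likely the companion target files for each utterance, which this sketch does not create.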

docker container

It would be really useful to at least provide a Dockerfile or a docker-compose.yaml so that CPU inference can be evaluated without a long install (especially for non-Lua users).
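In the meantime, here is a rough sketch of what such an image could look like. Everything in it is a guess, not an official Dockerfile from this repo: the base image, package list, and the assumption that a CPU-only Torch install plus the wav2letter checkout is enough for inference. The remaining Lua rocks from the README (torchnet, etc.) are omitted for brevity.

```shell
# Sketch only: writes a hypothetical CPU-only Dockerfile.
cat > Dockerfile <<'EOF'
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y \
    git cmake build-essential libsndfile1-dev libboost-all-dev
# CPU-only Torch install (-b skips the interactive prompts)
RUN git clone https://github.com/torch/distro.git /root/torch --recursive \
    && cd /root/torch && bash install-deps && ./install.sh -b
ENV PATH=/root/torch/install/bin:$PATH
RUN git clone https://github.com/facebookresearch/wav2letter.git /root/wav2letter
WORKDIR /root/wav2letter
CMD ["bash"]
EOF
```

Once built, the test.lua commands from the other threads could be run inside the container with a mounted data directory.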

Flac Issue in create.lua

Background/Use Case
At the moment, I'm simply interested in using it for pre-trained decoding. (I'd ideally extract the lattice/confidences as well if possible with your flags).

System: Mac El Capitan, CPU

Assumptions Made in Installation:
Skipped MKL install since not training
Boost installed through brew install
Skipped MPI installs since not planning on training again
Skipped CUDNN and CUNN installs since CPU (test.lua currently requires these, but I commented the relevant parts out).

I'm currently stuck at the "Training wav2letter models" step.

luajit issue with flac data
I thoroughly followed the installation instructions per the README. To ultimately run the decoder, I need a librispeech-proc/letters.lst file, which I assume is generated by:

luajit ~/wav2letter/data/librispeech/create.lua...

However, after downloading and unpacking everything as required, this command claims that my .flac files are in an "unimplemented format".

As far as I understood, downloading all of the LibriSpeech data is needed to decode, since you need to create letters.lst.
Perhaps you could upload a letters.lst file for LibriSpeech, and then none of this will be necessary?

Thank you!
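Until such a file is uploaded, the token inventory for LibriSpeech is small enough to write by hand. A guess at its shape, assuming one token per line with lowercase letters plus the apostrophe; the real create.lua output may contain additional symbols (e.g. repetition tokens), so treat this strictly as a placeholder:

```shell
# Placeholder letters.lst: one token per line, a-z plus apostrophe.
# The real file generated by create.lua may contain extra tokens.
for c in a b c d e f g h i j k l m n o p q r s t u v w x y z "'"; do
  echo "$c"
done > letters.lst
```

This would at least let the decoder start; any mismatch with the model's real 30-class output layer would still need the generated file.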

files found but then not

When I try to run test.lua to transcribe some flac files, it finds them but then says 0 files found and dies. This issue seems to be related to threading. Any ideas?

luajit ~/wav2letter/test.lua ~/models/libri/librispeech-glu-highdropout-cpu.bin -progress -show -test LibriSpeech/dev-clean -save -dictdir librispeech-proc -datadir .

LibriSpeech/dev-clean
| dataset </Users/jdp/Projects/pinch/ml/LibriSpeech/dev-clean>: 10000 files found
| dataset </Users/jdp/Projects/pinch/ml/LibriSpeech/dev-clean>: 0 files found
luajit: /Users/jdp/usr/share/lua/5.1/threads/threads.lua:183: [thread 1 callback] ...dp/usr/share/lua/5.1/wav2letter/numberedfilesdataset.lua:71: no file found in </Users/jdp/Projects/pinch/ml/LibriSpeech/dev-clean/?????????.flacsz> nor in </Users/jdp/Projects/pinch/ml/LibriSpeech/dev-clean/00000/?????????.flacsz>
stack traceback:
[C]: in function 'assert'
...dp/usr/share/lua/5.1/wav2letter/numberedfilesdataset.lua:71: in function '__init'
/Users/jdp/usr/share/lua/5.1/torch/init.lua:91: in function </Users/jdp/usr/share/lua/5.1/torch/init.lua:87>
[C]: in function 'closure'
/Users/jdp/usr/share/lua/5.1/wav2letter/runtime/data.lua:319: in function 'mapconcat'
/Users/jdp/usr/share/lua/5.1/wav2letter/runtime/data.lua:417: in function 'datasetwithfeatures'
/Users/jdp/usr/share/lua/5.1/wav2letter/runtime/data.lua:476: in function 'closure'
...are/lua/5.1/torchnet/dataset/paralleldatasetiterator.lua:79: in function <...are/lua/5.1/torchnet/dataset/paralleldatasetiterator.lua:78>
[C]: in function 'xpcall'
/Users/jdp/usr/share/lua/5.1/threads/threads.lua:234: in function 'callback'
/Users/jdp/usr/share/lua/5.1/threads/queue.lua:65: in function </Users/jdp/usr/share/lua/5.1/threads/queue.lua:41>
[C]: in function 'pcall'
/Users/jdp/usr/share/lua/5.1/threads/queue.lua:40: in function 'dojob'
[string " local Queue = require 'threads.queue'..."]:13: in main chunk
stack traceback:
[C]: in function 'error'
/Users/jdp/usr/share/lua/5.1/threads/threads.lua:183: in function 'dojob'
/Users/jdp/usr/share/lua/5.1/threads/threads.lua:264: in function 'synchronize'
/Users/jdp/usr/share/lua/5.1/threads/threads.lua:142: in function 'specific'
/Users/jdp/usr/share/lua/5.1/threads/threads.lua:125: in function 'Threads'
...are/lua/5.1/torchnet/dataset/paralleldatasetiterator.lua:85: in function '__init'
/Users/jdp/usr/share/lua/5.1/torch/init.lua:91: in function </Users/jdp/usr/share/lua/5.1/torch/init.lua:87>
[C]: in function 'newiterator'
/Users/jdp/wav2letter/test.lua:150: in main chunk
[C]: at 0x010c2bf350

could not find librispeech-proc folder

Hi, in the running decoder section there is a librispeech-proc folder, but I could not find any such folder on my system. Can anyone please help?

PS: I am trying to run the pre-trained model on CPU.
