
ctcdecode's People

Contributors

annagrr, brennv, bstriner, ctogle, joemathai, joshemorris, karimtarabishy, nikhilnagaraj, rbracco, reuben, ryanleary, seannaren, stas6626, stefanocortinovis, unixnme

ctcdecode's Issues

_ctc_decode import issue

Hello,

I am still experiencing the _ctc_decode import issue, with the latest pytorch-ctc checkout from the Git repository.

The relevant error lines are:

python

Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
>>> import pytorch_ctc
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pytorch_ctc/__init__.py", line 4, in <module>
    from ._ctc_decode import lib as _lib, ffi as _ffi
SystemError: dynamic module not initialized properly

Incorrect maintaining of states for words?

One of our colleagues got in touch with @willfrey and he mentioned an issue with the pytorch-ctc implementation:

He pointed out that the scorer in pytorch-ctc wasn't maintaining state between words properly, rendering it a "spell checker" (only scoring unigrams basically)

I'll do some investigation into this claim and report back!

Why is there a constant score for OOV?

This line gives a score of -1000 (which is declared here), to any n-gram which contains an OOV. Is this the right way to approach it? Isn't it possible to get the score for <unk> tokens from the LM and use that instead of using a hardcoded score?
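
One way to picture the alternative raised above is a scorer that falls back to the LM's own `<unk>` log-probability rather than a fixed penalty. The sketch below uses a toy unigram table; the table, `score_word`, and the numbers are illustrative assumptions, not the scorer used by this library.

```python
# Toy unigram log10 probabilities; a real LM would supply these values.
# The "<unk>" entry stands in for the model's own unknown-word estimate.
TOY_LM = {"hello": -1.2, "world": -1.5, "<unk>": -4.0}

OOV_SCORE = -1000.0  # fixed penalty, as described in the issue


def score_word(word, lm, use_unk=True):
    """Return a log-probability for `word`, handling OOV words."""
    if word in lm:
        return lm[word]
    if use_unk and "<unk>" in lm:
        return lm["<unk>"]
    return OOV_SCORE


sentence = ["hello", "wrld"]  # "wrld" is out of vocabulary
fixed = sum(score_word(w, TOY_LM, use_unk=False) for w in sentence)
soft = sum(score_word(w, TOY_LM, use_unk=True) for w in sentence)
print(fixed, soft)
```

With the fixed penalty, a single OOV word dominates the hypothesis score; with the `<unk>` estimate, the hypothesis stays comparable to its competitors.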

Make KenLM optional

The dependency adds a fair amount of compilation time (on the order of seconds) and is not needed by every user.

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-req-build-j5vlyP/

When I git clone and type 'pip install .', I get the following error:

How can I resolve this? Thanks!

Complete output from command python setup.py egg_info:
zip_safe flag not set; analyzing archive contents...

Installed /tmp/pip-req-build-j5vlyP/.eggs/wget-3.2-py2.7.egg
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-req-build-j5vlyP/setup.py", line 55, in <module>
    os.path.join(this_file, "build.py:ffi")
  File "/home/byuns9334/anaconda2/lib/python2.7/site-packages/setuptools/__init__.py", line 129, in setup
    return distutils.core.setup(**attrs)
  File "/home/byuns9334/anaconda2/lib/python2.7/distutils/core.py", line 111, in setup
    _setup_distribution = dist = klass(attrs)
  File "/home/byuns9334/anaconda2/lib/python2.7/site-packages/setuptools/dist.py", line 372, in __init__
    _Distribution.__init__(self, attrs)
  File "/home/byuns9334/anaconda2/lib/python2.7/distutils/dist.py", line 287, in __init__
    self.finalize_options()
  File "/home/byuns9334/anaconda2/lib/python2.7/site-packages/setuptools/dist.py", line 528, in finalize_options
    ep.load()(self, ep.name, value)
  File "/home/byuns9334/anaconda2/lib/python2.7/site-packages/cffi/setuptools_ext.py", line 204, in cffi_modules
    add_cffi_module(dist, cffi_module)
  File "/home/byuns9334/anaconda2/lib/python2.7/site-packages/cffi/setuptools_ext.py", line 49, in add_cffi_module
    execfile(build_file_name, mod_vars)
  File "/home/byuns9334/anaconda2/lib/python2.7/site-packages/cffi/setuptools_ext.py", line 25, in execfile
    exec(code, glob, glob)
  File "/tmp/pip-req-build-j5vlyP/build.py", line 22, in <module>
    'third_party/openfst-1.6.3.tar.gz')
  File "/tmp/pip-req-build-j5vlyP/build.py", line 15, in download_extract
    tar = tarfile.open(dl_path)
  File "/home/byuns9334/anaconda2/lib/python2.7/tarfile.py", line 1680, in open
    raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-req-build-j5vlyP/
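
`tarfile.ReadError` at this point usually means the file at `dl_path` is not a readable archive, for example a truncated download or an HTML error page saved under the archive name. A minimal check, assuming only the standard library (the helper name `is_valid_archive` is hypothetical and not part of `build.py`):

```python
import os
import tarfile
import tempfile


def is_valid_archive(path):
    """Return True if `path` exists and is a tar archive that tarfile can read."""
    return os.path.isfile(path) and tarfile.is_tarfile(path)


# Demonstration with temporary files instead of the real third_party archives.
tmp_dir = tempfile.mkdtemp()

bad_path = os.path.join(tmp_dir, "bad.tar.gz")
with open(bad_path, "w") as f:
    f.write("<html>404 Not Found</html>")  # what a failed download may look like

payload = os.path.join(tmp_dir, "payload.txt")
with open(payload, "w") as f:
    f.write("data")
good_path = os.path.join(tmp_dir, "good.tar.gz")
with tarfile.open(good_path, "w:gz") as tar:
    tar.add(payload, arcname="payload.txt")

print(is_valid_archive(bad_path), is_valid_archive(good_path))
```

If the check fails for a file under `third_party/`, deleting it before retrying the install may avoid reusing a corrupt download.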

Remove `merge_repeated` option

The merge_repeated behavior is incorrect when True (it does two passes of merge_repeated causing incorrect results). The way that the decoder works implicitly merges any repeated characters. Therefore, this option should be removed.
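
To see why a second merge pass is harmful, recall the standard CTC collapse rule: merge consecutive repeats first, then drop blanks. Merging repeats again after the blanks are gone removes legitimate double letters. The sketch below illustrates the effect with toy helper functions; it is not the library's code path.

```python
def merge_repeated(seq):
    """Collapse consecutive duplicates: [a, a, b] -> [a, b]."""
    out = []
    for s in seq:
        if not out or out[-1] != s:
            out.append(s)
    return out


def ctc_collapse(path, blank="_"):
    """Standard CTC collapse: merge repeats, then drop blanks."""
    return [s for s in merge_repeated(path) if s != blank]


path = list("hel_lo")         # frame-level best path; "_" is the blank
once = ctc_collapse(path)     # the blank kept the two l's apart
twice = merge_repeated(once)  # a second pass wrongly merges them
print("".join(once), "".join(twice))
```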

IOError: [Errno socket error] [Errno 101] Network is unreachable

When I git clone and type 'pip install .', I get the following IOError: Network is unreachable.
Any solution will be appreciated.

root@hxh:/home/hxh/common_use/ctcdecode# python setup.py install
zip_safe flag not set; analyzing archive contents...

Installed /home/hxh/common_use/ctcdecode/.eggs/wget-3.2-py2.7.egg
Traceback (most recent call last):
  File "setup.py", line 55, in <module>
    os.path.join(this_file, "build.py:ffi")
  File "/home/hxh/anaconda2/lib/python2.7/site-packages/setuptools/__init__.py", line 129, in setup
    return distutils.core.setup(**attrs)
  File "/home/hxh/anaconda2/lib/python2.7/distutils/core.py", line 111, in setup
    _setup_distribution = dist = klass(attrs)
  File "/home/hxh/anaconda2/lib/python2.7/site-packages/setuptools/dist.py", line 333, in __init__
    _Distribution.__init__(self, attrs)
  File "/home/hxh/anaconda2/lib/python2.7/distutils/dist.py", line 287, in __init__
    self.finalize_options()
  File "/home/hxh/anaconda2/lib/python2.7/site-packages/setuptools/dist.py", line 476, in finalize_options
    ep.load()(self, ep.name, value)
  File "/home/hxh/anaconda2/lib/python2.7/site-packages/cffi/setuptools_ext.py", line 193, in cffi_modules
    add_cffi_module(dist, cffi_module)
  File "/home/hxh/anaconda2/lib/python2.7/site-packages/cffi/setuptools_ext.py", line 49, in add_cffi_module
    execfile(build_file_name, mod_vars)
  File "/home/hxh/anaconda2/lib/python2.7/site-packages/cffi/setuptools_ext.py", line 25, in execfile
    exec(code, glob, glob)
  File "build.py", line 22, in <module>
    'third_party/openfst-1.6.7.tar.gz')
  File "build.py", line 14, in download_extract
    out=dl_path)
  File "build/bdist.linux-x86_64/egg/wget.py", line 526, in download
  File "/home/hxh/anaconda2/lib/python2.7/urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "/home/hxh/anaconda2/lib/python2.7/urllib.py", line 245, in retrieve
    fp = self.open(url, data)
  File "/home/hxh/anaconda2/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/home/hxh/anaconda2/lib/python2.7/urllib.py", line 443, in open_https
    h.endheaders(data)
  File "/home/hxh/anaconda2/lib/python2.7/httplib.py", line 1038, in endheaders
    self._send_output(message_body)
  File "/home/hxh/anaconda2/lib/python2.7/httplib.py", line 882, in _send_output
    self.send(msg)
  File "/home/hxh/anaconda2/lib/python2.7/httplib.py", line 844, in send
    self.connect()
  File "/home/hxh/anaconda2/lib/python2.7/httplib.py", line 1255, in connect
    HTTPConnection.connect(self)
  File "/home/hxh/anaconda2/lib/python2.7/httplib.py", line 821, in connect
    self.timeout, self.source_address)
  File "/home/hxh/anaconda2/lib/python2.7/socket.py", line 575, in create_connection
    raise err
IOError: [Errno socket error] [Errno 101] Network is unreachable

Improve error handling

Throw an exception when files do not exist rather than exiting the Python interpreter.
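
Until the native code raises proper exceptions, one option is to validate paths on the Python side before they reach it. A minimal sketch, assuming a hypothetical helper `require_file` that is not part of the current API:

```python
import os


def require_file(path, description="file"):
    """Raise a Python exception if `path` is missing, before native code sees it."""
    expanded = os.path.expanduser(path)
    if not os.path.isfile(expanded):
        raise IOError("{} not found: {}".format(description, expanded))
    return expanded


missing_raised = False
try:
    require_file("/nonexistent/lm.binary", "language model")
except IOError as err:
    missing_raised = True
    print("caught:", err)
```

The caller can then handle the error like any other Python exception instead of losing the interpreter.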

certificate verify failed during pip install .

When I type 'pip install .', I get the error below.
I am using python 3.6, ubuntu 14.04.
Any idea how to resolve this?

kenkim@node10:/data3/kenkim/deepspeech.pytorch/ctcdecode$ pip install .
Processing /data3/kenkim/deepspeech.pytorch/ctcdecode
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "/home/kenkim/anaconda3/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/home/kenkim/anaconda3/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/kenkim/anaconda3/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/kenkim/anaconda3/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/kenkim/anaconda3/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/home/kenkim/anaconda3/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/home/kenkim/anaconda3/lib/python3.6/http/client.py", line 1400, in connect
    server_hostname=server_hostname)
  File "/home/kenkim/anaconda3/lib/python3.6/ssl.py", line 401, in wrap_socket
    _context=self, _session=session)
  File "/home/kenkim/anaconda3/lib/python3.6/ssl.py", line 808, in __init__
    self.do_handshake()
  File "/home/kenkim/anaconda3/lib/python3.6/ssl.py", line 1061, in do_handshake
    self._sslobj.do_handshake()
  File "/home/kenkim/anaconda3/lib/python3.6/ssl.py", line 683, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-w0kw5634-build/setup.py", line 55, in <module>
    os.path.join(this_file, "build.py:ffi")
  File "/home/kenkim/anaconda3/lib/python3.6/distutils/core.py", line 108, in setup
    _setup_distribution = dist = klass(attrs)
  File "/home/kenkim/anaconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/dist.py", line 318, in __init__
  File "/home/kenkim/anaconda3/lib/python3.6/distutils/dist.py", line 281, in __init__
    self.finalize_options()
  File "/home/kenkim/anaconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/setuptools/dist.py", line 375, in finalize_options
  File "/home/kenkim/anaconda3/lib/python3.6/site-packages/cffi/setuptools_ext.py", line 187, in cffi_modules
    add_cffi_module(dist, cffi_module)
  File "/home/kenkim/anaconda3/lib/python3.6/site-packages/cffi/setuptools_ext.py", line 49, in add_cffi_module
    execfile(build_file_name, mod_vars)
  File "/home/kenkim/anaconda3/lib/python3.6/site-packages/cffi/setuptools_ext.py", line 25, in execfile
    exec(code, glob, glob)
  File "/tmp/pip-w0kw5634-build/build.py", line 24, in <module>
    'third_party/boost_1_63_0.tar.gz')
  File "/tmp/pip-w0kw5634-build/build.py", line 14, in download_extract
    out=dl_path)
  File "/home/kenkim/anaconda3/lib/python3.6/site-packages/wget-3.2-py3.6.egg/wget.py", line 526, in download
  File "/home/kenkim/anaconda3/lib/python3.6/urllib/request.py", line 248, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/home/kenkim/anaconda3/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/home/kenkim/anaconda3/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/home/kenkim/anaconda3/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/home/kenkim/anaconda3/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/kenkim/anaconda3/lib/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/home/kenkim/anaconda3/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-w0kw5634-build/
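
A first diagnostic step for `CERTIFICATE_VERIFY_FAILED` is to see where this Python build looks for CA certificates. If neither location exists, HTTPS downloads made through `urllib` are likely to fail even after system-wide certificates have been updated. The sketch uses only the standard `ssl` module; how the result applies to this particular environment is an assumption.

```python
import os
import ssl

paths = ssl.get_default_verify_paths()

# Locations this Python's OpenSSL consults, plus the environment override name.
print("cafile:", paths.cafile)
print("capath:", paths.capath)
print("env variable that overrides cafile:", paths.openssl_cafile_env)

cafile_ok = bool(paths.cafile) and os.path.isfile(paths.cafile)
capath_ok = bool(paths.capath) and os.path.isdir(paths.capath)
print("usable CA bundle found:", cafile_ok or capath_ok)
```

If no bundle is found, pointing the environment variable printed above at a valid CA bundle before running `pip install .` is one thing to try.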

Potential memory leak

Check bindings to ensure that all dynamically allocated memory is properly freed when no longer needed.

what is timestep?

I'm wondering why some beam search solutions share the same timesteps.

For example, in the unit test TextDecoders.test_beam_search_decoder_1() in tests/test.py,
we can get the top 20 solutions from beam search,
but the top 8 of them have the same timesteps.
Some other solutions also share timesteps.
Is this expected?

Support for explicit word separators and explicit dictionary for use in handwriting recognition

Hi,
I have a question about how to use the decoder when there is an explicit word separator besides the blank symbol. Some background: I'm trying to use the decoder for neural handwriting recognition. As you may be aware, this application is similar to neural speech recognition, and the same technology is suitable to a large extent. However, there is one issue. In neural handwriting recognition, the common practice is to keep the word separator symbols that are in the training material, and let the model reproduce them in addition to the "normal" symbols.
(See for example https://arxiv.org/abs/1312.4569, section IV C.)

When not using the language model, this is fine, and you can get output like this (using "|" as the special word separator symbol):

Without language model:

evaluate_mdrnn - output: ""|BeTle|asd|Robbe|Mamnygard|.|"|"|Whati's|he|ben"
reference: ""|Better|ask|Robbie|Munyard|.|"|"|What|'s|he|been" --- wrong
evaluate_mdrnn - output: "Comuon|rerlet|,|wse|should|nok|be|elle|to"
reference: "Common|Market|,|we|should|not|be|able|to" --- wrong

However, when using the language model, it is not clear how to integrate the special word separator symbol (not the same as the CTC blank symbol!). When training the language model on "normal" text, such as the LOB [http://ota.ox.ac.uk/desc/0167] or Brown corpus, the word separator symbol won't be present obviously, and hence the decoder won't produce it.

With language model:

evaluate_mdrnn - output: "" Bethea Robbie Munyard . " what she ben"
reference: ""|Better|ask|Robbie|Munyard|.|"|"|What|'s|he|been" --- wrong
evaluate_mdrnn - output: "Common relative should not be elle to"
reference: "Common|Market|,|we|should|not|be|able|to" --- wrong

This is likely to harm performance, since the "|" symbol is still produced by the model, and needs to be "consumed" by the decoder somehow.

One hack I attempted is to train the language model with semi-artificial data, in which I add a separator between every word, for example:

gold-hunting | Kennedy | shocks | Dr | A | .
Germany | must | pay | .
offer | of | +357 | m | is | too | small | .
President | Kennedy | is | ready | to | get | tough | over | West | Germany's | cash | offer | to | help | America's | balance | of | payments | position | .

However, this also has undesired side effects, such as causing problems with Kneser-Ney discounting during language model training.

I think in decoders that use finite state transducers the finite state model is sometimes tailored with special states or transitions to deal with this problem. Perhaps this issue never occurs in speech, but I think actually it might occur if you explicitly mark long pauses for example (similar to explicit separators between words).

Do you have any suggestion how I might deal with this while using ctcdecode?
Neither using the language model trained on the original data, which cannot produce the word separator symbols, nor hacking the language model training data has proven very effective so far.

Another important and somewhat related issue is that the decoder seems to use no explicit vocabulary, only the language model. If one would like to restrict the vocabulary to, say, the 50K most frequent words, would the (only) way be to change the language model training data, replacing all words outside the 50K most frequent with an INFREQUENT_WORD symbol or something similar? (This could work, but again seems like quite an ugly hack which I would rather avoid if there is a way to provide an explicit vocabulary to the decoder.)
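
The vocabulary restriction described above (mapping every word outside the N most frequent to a single placeholder before LM training) can be sketched as follows. The function name, the `<unk>` placeholder, and the tiny corpus are illustrative assumptions; this only preprocesses training text and does not touch the decoder.

```python
from collections import Counter

UNK = "<unk>"  # placeholder symbol; the name is an assumption


def restrict_vocabulary(sentences, max_words):
    """Replace words outside the `max_words` most frequent with UNK."""
    counts = Counter(w for s in sentences for w in s.split())
    keep = {w for w, _ in counts.most_common(max_words)}
    return [" ".join(w if w in keep else UNK for w in s.split()) for s in sentences]


corpus = ["Germany must pay .", "Kennedy must pay .", "Germany is ready ."]
restricted = restrict_vocabulary(corpus, max_words=4)
print(restricted)
```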

Thanks in advance for your help!

Gideon

import pytorch_ctc error

I have installed pytorch_ctc, but I get an import error.

>>> from pytorch_ctc import CTCBeamDecoder as CTCBD
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bliu/anaconda3/lib/python3.5/site-packages/pytorch_ctc/__init__.py", line 4, in <module>
    from ._ctc_decode import lib as _lib, ffi as _ffi
ImportError: /home/bliu/anaconda3/lib/python3.5/site-packages/pytorch_ctc/_ctc_decode.cpython-35m-x86_64-linux-gnu.so: undefined symbol: _ZTVNSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEEE

Rename to ctcdecode

This library may gain bindings to other learning systems, so rename it to reflect that it is, first and foremost, a C++ CTC decoding implementation.

Making the vocabulary trie

@ryanleary I was trying to work with your fork at https://github.com/ryanleary/ctcdecode.
The vanilla decoder works like a charm, but I can't figure out how the trie is built using the function you mentioned in the README.

import pytorch_ctc
 
lexicon = '~/language_modelling/Jaderberg_90k_lexicon.txt'
output_path = '~/tries/4gram_JaderbergLexicon/'
kenlm_path = '~/language_modelling/lm_4gram_on_lob_and_brown.klm'
labels = '_0123456789abcdefghijklmnopqrstuvwxyz '

pytorch_ctc.generate_lm_trie(lexicon, kenlm_path, output_path, labels, 0, 37)

Above is my script to generate the trie. The script runs without any errors, but nothing is created at the specified output path.

Could you please tell me if I am doing it right?
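
One thing worth checking in the script above: Python does not expand `~` in plain strings, and native code that receives such a path will typically treat `~` as a literal directory name. Whether that is the cause here is an assumption. Note also that `output_path` ends in `/`, which suggests a directory rather than a file. A sketch of normalizing the paths first (the trie file name `trie.bin` is hypothetical):

```python
import os

lexicon = os.path.expanduser('~/language_modelling/Jaderberg_90k_lexicon.txt')
kenlm_path = os.path.expanduser('~/language_modelling/lm_4gram_on_lob_and_brown.klm')

# Point at a file inside the output directory rather than at the directory itself.
output_dir = os.path.expanduser('~/tries/4gram_JaderbergLexicon')
output_path = os.path.join(output_dir, 'trie.bin')  # hypothetical file name

for p in (lexicon, kenlm_path, output_path):
    print(p, "exists" if os.path.exists(p) else "missing")
```

The expanded values can then be passed to the trie generation call in place of the original strings.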

Support gzip for models/tries

The binary LMs and ASCII tries are very large. Loading will be faster and use less space on disk if gzip'd. Gzip support can/should be optional based on libraries installed on system building the plugin.
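
The optional-gzip idea can be illustrated on the Python side by sniffing the two-byte gzip magic number and choosing the opener accordingly. This is only a sketch of the detection logic: the files here are loaded by C++ code, so real support would need an equivalent there, and `open_maybe_gzip` is a hypothetical helper.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"


def open_maybe_gzip(path):
    """Open `path` for binary reading, transparently handling gzip files."""
    with open(path, "rb") as f:
        is_gzip = f.read(2) == GZIP_MAGIC
    return gzip.open(path, "rb") if is_gzip else open(path, "rb")


# Round-trip demonstration with a small stand-in for an ASCII trie file.
tmp_dir = tempfile.mkdtemp()
plain = os.path.join(tmp_dir, "trie.txt")
packed = os.path.join(tmp_dir, "trie.txt.gz")
content = b"-1\n0\n-1\n" * 1000
with open(plain, "wb") as f:
    f.write(content)
with gzip.open(packed, "wb") as f:
    f.write(content)

with open_maybe_gzip(plain) as a, open_maybe_gzip(packed) as b:
    same = a.read() == b.read()
print(same, os.path.getsize(plain), os.path.getsize(packed))
```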

Multiple characters in a label cause a segfault

When I try to use a multiple-character string as a label, I get a segfault.
For example:

import torch
import ctcdecode
labels = ["_", "SIL", "A"]
decoder = ctcdecode.CTCBeamDecoder(labels, blank_id=0)
decoder.decode(torch.randn(3,3,3))

triggers a segfault, whereas

import torch
import ctcdecode
labels = ["_", "A", "B"]
decoder = ctcdecode.CTCBeamDecoder(labels, blank_id=0)
decoder.decode(torch.randn(3,3,3))

does not. Is it possible to do this somehow?

ImportError in mac

Python 2.7.15 |Anaconda, Inc.| (default, May 1 2018, 18:37:05)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> import ctcdecode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/marxia/anaconda3/envs/py2/lib/python2.7/site-packages/ctcdecode/__init__.py", line 1, in <module>
    from ._ext import ctc_decode
  File "/Users/marxia/anaconda3/envs/py2/lib/python2.7/site-packages/ctcdecode/_ext/ctc_decode/__init__.py", line 3, in <module>
    from ._ctc_decode import lib as _lib, ffi as _ffi
ImportError: dlopen(/Users/marxia/anaconda3/envs/py2/lib/python2.7/site-packages/ctcdecode/_ext/ctc_decode/_ctc_decode.so, 2): Symbol not found: __ZNSt12future_errorD1Ev
Referenced from: /Users/marxia/anaconda3/envs/py2/lib/python2.7/site-packages/ctcdecode/_ext/ctc_decode/_ctc_decode.so
Expected in: flat namespace
in /Users/marxia/anaconda3/envs/py2/lib/python2.7/site-packages/ctcdecode/_ext/ctc_decode/_ctc_decode.so

Could you provide some examples/tutorials about "how to decode with a pre-trained language model"?

Hi, I recently tried to reproduce the handwriting recognition results from some papers, and your CTC decoding implementation has been very helpful. However, I cannot find any examples or tutorials on decoding with a pre-trained language model using it. I am unsure about the following:

  • How should a language model be pre-trained if the model is purely character-based?
  • How do I decode with a pre-trained language model using your implementation (ctcdecode)?

I would be grateful for any suggestions or examples.
Thanks.

Num_time_steps calculation for batch inputs is wrong

When using ctcdecode with sequential data of variable output lengths, the shorter outputs are generally padded with zeros up to the length of the longest sample. So, logically, when ctc_beam_search_decoder loops through the timesteps of probs_seq at Link for code, it should stop at the timestep corresponding to the actual length of that sample's output rather than the length of probs_seq, since probs_seq includes the extra padding in batch mode. As a result, ctcdecode appends garbage characters to the end of the actual output.

Examples of such outputs are:

Example#1:
Prediction: didn't do before a ooooooh i o o t h o e l l e e e e e e e e e e e e e e o a o o ghx xxx xxx eee e
Reference: didn't god before
Example#2:
Prediction: and it may be a lot of things that are kind of true ornette e e l e e e e e e e e e e e a u n ghx xxx eee et
Reference: and it may be a lot of things that are kind of truer

I am using ctcdecode with the outputs from deepspeech.pytorch

I can think of two possible solutions for this:

  1. Pass the number of time steps to ctc_beam_search_decoder as an argument,
    i.e. instead of
    size_t num_time_steps = probs_seq.size(); at line, it should be
    size_t num_time_steps = size; // where size is passed as an argument

  2. Add a check for an impossible probability value, such as -1, and break out of the loop whenever it is found.
    I am currently using this hack in our system, and it seems to work! You can find it here
    For this to work, the outputs of the DeepSpeech model are changed a bit. The extra timestep values are intentionally set to -1. The changes are here

The transcripts for same examples, after using the second hacky method are:

Example#1:
Prediction: didn't do before
Reference: didn't god before
Example#2:
Prediction: and it may be a lot of things that are kind of true or
Reference: and it may be a lot of things that are kind of truer
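
The first option above (telling the decoder each sample's true length) can also be approximated on the caller's side by trimming every batch item to its real number of timesteps before decoding. The sketch below uses plain nested lists so it runs without PyTorch; `trim_batch` and the toy values are assumptions, not part of ctcdecode.

```python
def trim_batch(probs_batch, lengths):
    """Drop padded timesteps so each item keeps only its real frames.

    probs_batch: list of [time][num_labels] lists, padded to a common length.
    lengths: true number of timesteps per item.
    """
    return [seq[:n] for seq, n in zip(probs_batch, lengths)]


PAD = [0.0, 0.0, 0.0]
batch = [
    [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], PAD, PAD],  # 2 real frames, 2 padded
    [[0.2, 0.2, 0.6], [0.3, 0.3, 0.4], [0.9, 0.05, 0.05], [0.1, 0.1, 0.8]],
]
trimmed = trim_batch(batch, lengths=[2, 4])
print([len(seq) for seq in trimmed])
```

Each trimmed item would then be decoded on its own, so the decoder never sees the padded frames.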

CPU RAM memory leak when using the beam search decoder

Hi,

I used the CTC beam decoder from https://github.com/joshemorris/pytorch-ctc. However, I found that after decoding one utterance, the decoder does not release RAM. After decoding more and more sentences, the RAM fills up. This is especially noticeable with a large beam width such as 100, in which case RAM usage quickly blows up.

My code looks like this:

import pytorch_ctc
from pytorch_ctc import Scorer

decoder = pytorch_ctc.CTCBeamDecoder(Scorer(), labels, top_paths = 1, beam_width = 100, blank_index = 0, space_index = -1, merge_repeated=False)

for i in range(total_num_utterances):
    decoded, _, out_seq_len = decoder.decode(prob_tensor_i, seq_len_i)

Does anyone have any ideas on how to fix this issue?

Thank you very much.

Support for PyTorch 0.4

I'm trying to decode using a KenLM language model with pytorch 0.4 and I'm getting a seg fault (core dumped), probably because of the new tensor syntax.

What are the plans for pytorch 0.4 support?

Best,
Miguel

Add dictionary-only scorer

Support decodes based on a dictionary lexicon. Should be able to leverage the trie data structure and eliminate the LM.
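
As a rough picture of what a dictionary-only scorer needs from the trie, the sketch below builds a minimal character trie and exposes the two queries a beam search would ask: "is this partial word still a prefix of some lexicon entry?" and "is it a complete word?". The class and its method names are illustrative assumptions, not the library's trie.

```python
_END = object()  # sentinel key marking the end of a word


class Trie:
    """Minimal character trie for prefix and whole-word checks."""

    def __init__(self, words=()):
        self.root = {}
        for w in words:
            self.insert(w)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = True

    def _walk(self, s):
        node = self.root
        for ch in s:
            node = node.get(ch)
            if node is None:
                return None
        return node

    def is_prefix(self, s):
        return self._walk(s) is not None

    def is_word(self, s):
        node = self._walk(s)
        return node is not None and _END in node


lexicon = Trie(["cat", "car", "dog"])
print(lexicon.is_prefix("ca"), lexicon.is_word("ca"), lexicon.is_prefix("cb"))
```

A beam whose current partial word fails `is_prefix` could be pruned, which constrains the output to the lexicon without consulting an LM.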

ModuleNotFoundError: No module named 'ctcdecode.ctcdecode._ext.ctc_decode._ctc_decode'

Hi,
after installation, I get a ModuleNotFoundError when I call
from ctcdecode.ctcdecode import CTCBeamDecoder

File "/home-nfs/xx/speech/model_wsj_3layers.py", line 9, in <module>
    from ctcdecode.ctcdecode import CTCBeamDecoder
  File "/home-nfs/xx/speech/ctcdecode/ctcdecode/__init__.py", line 1, in <module>
    from ._ext import ctc_decode
  File "/home-nfs/xx/speech/ctcdecode/ctcdecode/_ext/ctc_decode/__init__.py", line 3, in <module>
    from ._ctc_decode import lib as _lib, ffi as _ffi
ModuleNotFoundError: No module named 'ctcdecode.ctcdecode._ext.ctc_decode._ctc_decode'

I don't see ._ctc_decode in the directory. How should I solve this problem? Thanks

Confusion about trie files

I'm unclear as to what to expect from the generated trie files. I've run generate_lm_trie.py from deepspeech.pytorch, as well as directly within a python shell, by importing pytorch_ctc. With the former, the process takes around one second (tested with: 17GB/50k-vocab/5-gram, 2.5GB/50k-vocab/3-gram, and 6.2GB/100K-vocab/5-gram KenLM binaries), producing <3kB trie files with exactly 869 lines of mostly -1s (and 0s on every third line up to line 74), with no error messages. With the latter, the process takes some 20-30 minutes, producing ~10MB files, with 3.45M lines (still of mostly -1s) for the two aforementioned 50k-vocab binaries.

What are the expected formats and sizes of the trie files, and is there some reference against which I might compare mine?

Secondly, using the parameters quoted in the table here (beam_width 100, lm_alpha 4.0, lm_beta1 0.0, lm_beta2 5.0) increases WER/CER on a test set I'm using from 26.86/10.35 with greedy or argmax decoders to 100.00/29.63. I have not gridsearched, but I haven't found any configurations that improve WER or CER.

Is there something obvious I'm missing?

Thanks!

Import Error

Hi !

I can't import ctcdecode :

tbelos2@asus:~/socr$ python3
Python 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctcdecode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tbelos2/anaconda3/lib/python3.6/site-packages/ctcdecode/__init__.py", line 1, in <module>
    from ._ext import ctc_decode
  File "/home/tbelos2/anaconda3/lib/python3.6/site-packages/ctcdecode/_ext/ctc_decode/__init__.py", line 3, in <module>
    from ._ctc_decode import lib as _lib, ffi as _ffi
ImportError: /home/tbelos2/anaconda3/lib/python3.6/site-packages/ctcdecode/_ext/ctc_decode/_ctc_decode.abi3.so: undefined symbol: _Z17paddle_get_scorerddPKcS0_i
>>> 

make PyPI package

It would be helpful if this could just be installed via pip install pytorch-ctc or so.

Update Documentation

The API has changed to enable initialization of the scorer and decoders once for multiple decodings. Also adds kenlm support.

ToDos:

  • Document new API
  • Add acknowledgements
  • Document new scorers/installation requirements

SSL Error

I updated the certificate on my Linux machine, but it did not work. Any ideas?

$ pip install .
Processing /home/jennifer/git/ctcdecode
Complete output from command python setup.py egg_info:
zip_safe flag not set; analyzing archive contents...

Installed /tmp/pip-_3xikmb0-build/.eggs/wget-3.2-py3.6.egg
Traceback (most recent call last):
  File "/home/jennifer/anaconda3/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/home/jennifer/anaconda3/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/jennifer/anaconda3/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/jennifer/anaconda3/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/jennifer/anaconda3/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/home/jennifer/anaconda3/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/home/jennifer/anaconda3/lib/python3.6/http/client.py", line 1400, in connect
    server_hostname=server_hostname)
  File "/home/jennifer/anaconda3/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/home/jennifer/anaconda3/lib/python3.6/ssl.py", line 814, in __init__
    self.do_handshake()
  File "/home/jennifer/anaconda3/lib/python3.6/ssl.py", line 1068, in do_handshake
    self._sslobj.do_handshake()
  File "/home/jennifer/anaconda3/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-_3xikmb0-build/setup.py", line 55, in <module>
    os.path.join(this_file, "build.py:ffi")
  File "/home/jennifer/anaconda3/lib/python3.6/site-packages/setuptools/__init__.py", line 129, in setup
    return distutils.core.setup(**attrs)
  File "/home/jennifer/anaconda3/lib/python3.6/distutils/core.py", line 108, in setup
    _setup_distribution = dist = klass(attrs)
  File "/home/jennifer/anaconda3/lib/python3.6/site-packages/setuptools/dist.py", line 372, in __init__
    _Distribution.__init__(self, attrs)
  File "/home/jennifer/anaconda3/lib/python3.6/distutils/dist.py", line 281, in __init__
    self.finalize_options()
  File "/home/jennifer/anaconda3/lib/python3.6/site-packages/setuptools/dist.py", line 528, in finalize_options
    ep.load()(self, ep.name, value)
  File "/home/jennifer/anaconda3/lib/python3.6/site-packages/cffi/setuptools_ext.py", line 204, in cffi_modules
    add_cffi_module(dist, cffi_module)
  File "/home/jennifer/anaconda3/lib/python3.6/site-packages/cffi/setuptools_ext.py", line 49, in add_cffi_module
    execfile(build_file_name, mod_vars)
  File "/home/jennifer/anaconda3/lib/python3.6/site-packages/cffi/setuptools_ext.py", line 25, in execfile
    exec(code, glob, glob)
  File "/tmp/pip-_3xikmb0-build/build.py", line 24, in <module>
    'third_party/boost_1_63_0.tar.gz')
  File "/tmp/pip-_3xikmb0-build/build.py", line 14, in download_extract
    out=dl_path)
  File "/tmp/pip-_3xikmb0-build/.eggs/wget-3.2-py3.6.egg/wget.py", line 526, in download
  File "/home/jennifer/anaconda3/lib/python3.6/urllib/request.py", line 248, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/home/jennifer/anaconda3/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/home/jennifer/anaconda3/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/home/jennifer/anaconda3/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/home/jennifer/anaconda3/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/jennifer/anaconda3/lib/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/home/jennifer/anaconda3/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833)>

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-_3xikmb0-build/

Parallelize Decode

Use threads to parallelize the decoder at the utterance level (i.e. per item in the batch), at the beam level, or both.
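
Utterance-level parallelism amounts to decoding each batch item as an independent task. The sketch below shows that structure with a thread pool and a toy greedy decoder standing in for the beam search; both functions are assumptions for illustration. In pure Python the GIL limits any speedup for CPU-bound work, so the real gain would come from doing this with native threads inside the C++ code.

```python
from concurrent.futures import ThreadPoolExecutor


def greedy_decode(probs, labels, blank=0):
    """Toy per-utterance decoder: argmax per frame, then CTC collapse."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]
    out, prev = [], None
    for idx in best:
        if idx != prev and idx != blank:
            out.append(labels[idx])
        prev = idx
    return "".join(out)


def decode_batch(batch, labels, num_workers=4):
    """Decode each utterance in its own task; results keep batch order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(lambda p: greedy_decode(p, labels), batch))


labels = ["_", "a", "b"]
batch = [
    [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8]],
    [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1], [0.1, 0.1, 0.8]],
]
results = decode_batch(batch, labels)
print(results)
```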

segfault src/path_trie.cpp: No such file or directory.

Hello,

When I use a language model in binary format, I get a segfault. I tried running under gdb, and it reports that path_trie.cpp is missing. What could be the problem?

Thread 24 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff45fff700 (LWP 9951)]
PathTrie::get_path_trie (this=this@entry=0x7fff45ffebe0, new_char=new_char@entry=1, new_timestep=new_timestep@entry=0, reset=reset@entry=true) at /tmp/pip-qveo70c7-build/ctcdecode/src/path_trie.cpp:56
56 /tmp/pip-qveo70c7-build/ctcdecode/src/path_trie.cpp: No such file or directory.

Add support for time-alignments

Return a matrix of time-alignment information indicating the index in the original output at which each character occurred. This would make it possible to roughly align the audio with the output text.
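To make the request concrete, here is a minimal sketch of what such alignment information could look like for a single utterance, assuming a greedy (best-path) decoder rather than the library's beam search. The function name `greedy_decode_with_timesteps` and its return shape are hypothetical, not part of any existing API.

```python
def greedy_decode_with_timesteps(probs, blank=0):
    """Best-path CTC decode that also records where each label was emitted.

    probs is a list of frames; each frame is a list of per-label probabilities.
    Returns (labels, timesteps), where timesteps[i] is the index of the frame
    at which labels[i] first appeared.
    """
    labels, timesteps = [], []
    prev = None
    for t, frame in enumerate(probs):
        # Index of the most probable label in this frame.
        best = max(range(len(frame)), key=frame.__getitem__)
        # A new non-blank label starts here: keep it and its frame index.
        if best != blank and best != prev:
            labels.append(best)
            timesteps.append(t)
        prev = best
    return labels, timesteps
```

Multiplying each frame index by the acoustic model's frame stride (a value that depends on the model and is not given here) would give an approximate position in the audio.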

Import issue with _ctc_decode

I'm running into errors when trying to generate a trie. I have tried existing (deepspeech.pytorch) and clean conda environments, as well as virtualenvs with Python 2.7 and 3.5. The errors all seem to point to a dependency issue with importing _ctc_decode via pytorch_ctc.

(pytorch) ubuntu@ds-worker:~/deepspeech.pytorch$ python generate_lm_trie.py -h
Traceback (most recent call last):
  File "generate_lm_trie.py", line 1, in <module>
    import pytorch_ctc
  File "/home/ubuntu/miniconda2/envs/pytorch/lib/python2.7/site-packages/pytorch_ctc/__init__.py", line 4, in <module>
    from ._ctc_decode import lib as _lib, ffi as _ffi
SystemError: dynamic module not initialized properly

Directly importing in a Python shell:

>>> import _ctc_decode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/ubuntu/pytorch-ctc/build/lib.linux-x86_64-3.5/pytorch_ctc/_ctc_decode.so: undefined symbol: THIntTensor_set2d
>>>

Full build log:

generating build/ctc_decode/_ctc_decode.c
(already up-to-date)
running install
running build
running build_py
running build_ext
building 'pytorch_ctc._ctc_decode' extension
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c build/ctc_decode/_ctc_decode.c -o build/temp.linux-x86_64-3.5/build/ctc_decode/_ctc_decode.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/pytorch_ctc/src/cpu_binding.cpp -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/pytorch_ctc/src/cpu_binding.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/pytorch_ctc/src/util/status.cpp -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/pytorch_ctc/src/util/status.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/parallel_read.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/parallel_read.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/mmap.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/mmap.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/string_piece.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/string_piece.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/exception.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/exception.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/file_piece.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/file_piece.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/bit_packing.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/bit_packing.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/ersatz_progress.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/ersatz_progress.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/integer_to_string.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/integer_to_string.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/float_to_string.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/float_to_string.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/scoped.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/scoped.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/murmur_hash.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/murmur_hash.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/read_compressed.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/read_compressed.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/usage.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/usage.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/spaces.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/spaces.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/file.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/file.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/pool.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/pool.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/config.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/config.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/virtual_interface.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/virtual_interface.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/bhiksha.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/bhiksha.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/search_trie.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/search_trie.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/binary_format.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/binary_format.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/value_build.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/value_build.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/read_arpa.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/read_arpa.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/trie_sort.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/trie_sort.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/trie.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/trie.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/sizes.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/sizes.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/quantize.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/quantize.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/search_hashed.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/search_hashed.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/lm_exception.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/lm_exception.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/vocab.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/vocab.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/lm/model.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/model.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/bignum-dtoa.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/bignum-dtoa.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/diy-fp.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/diy-fp.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/cached-powers.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/cached-powers.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/fast-dtoa.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/fast-dtoa.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/fixed-dtoa.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/fixed-dtoa.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/double-conversion.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/double-conversion.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/strtod.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/strtod.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/gcc-5 -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include -I/home/ubuntu/ctc_py3/lib/python3.5/site-packages/torch/utils/ffi/../../lib/include/TH -Ithird_party/eigen3 -Ithird_party/utf8 -Ithird_party/kenlm -I/usr/include/python3.5m -I/home/ubuntu/ctc_py3/include/python3.5m -c /home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/bignum.cc -o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/bignum.o -std=c++11 -fPIC -w -O3 -DNDEBUG -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB -DINCLUDE_KENLM -DKENLM_MAX_ORDER=6
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/bin/g++-5 -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.5/build/ctc_decode/_ctc_decode.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/pytorch_ctc/src/cpu_binding.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/pytorch_ctc/src/util/status.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/parallel_read.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/mmap.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/string_piece.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/exception.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/file_piece.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/bit_packing.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/ersatz_progress.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/integer_to_string.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/float_to_string.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/scoped.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/murmur_hash.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/read_compressed.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/usage.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/spaces.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/file.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/pool.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/config.o 
build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/virtual_interface.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/bhiksha.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/search_trie.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/binary_format.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/value_build.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/read_arpa.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/trie_sort.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/trie.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/sizes.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/quantize.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/search_hashed.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/lm_exception.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/vocab.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/lm/model.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/bignum-dtoa.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/diy-fp.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/cached-powers.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/fast-dtoa.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/fixed-dtoa.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/double-conversion.o build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/strtod.o 
build/temp.linux-x86_64-3.5/home/ubuntu/pytorch-ctc/third_party/kenlm/util/double-conversion/bignum.o -lstdc++ -lz -lbz2 -llzma -o build/lib.linux-x86_64-3.5/pytorch_ctc/_ctc_decode.cpython-35m-x86_64-linux-gnu.so
running install_lib
copying build/lib.linux-x86_64-3.5/pytorch_ctc/_ctc_decode.so -> /home/ubuntu/ctc_py3/lib/python3.5/site-packages/pytorch_ctc
copying build/lib.linux-x86_64-3.5/pytorch_ctc/_ctc_decode.cpython-35m-x86_64-linux-gnu.so -> /home/ubuntu/ctc_py3/lib/python3.5/site-packages/pytorch_ctc
running install_egg_info
Removing /home/ubuntu/ctc_py3/lib/python3.5/site-packages/pytorch_ctc-0.1.egg-info
Writing /home/ubuntu/ctc_py3/lib/python3.5/site-packages/pytorch_ctc-0.1.egg-info
not modified: 'build/ctc_decode/_ctc_decode.c'
/usr/lib/python3.5/distutils/dist.py:261: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
/usr/lib/python3.5/distutils/dist.py:261: UserWarning: Unknown distribution option: 'setup_requires'
  warnings.warn(msg)

Pip install fails

Processing /workspace/speech_recognition/ctcdecode
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-req-build-ffdgasnw/setup.py", line 55, in <module>
        os.path.join(this_file, "build.py:ffi")
      File "/opt/conda/lib/python3.6/site-packages/setuptools/__init__.py", line 140, in setup
        return distutils.core.setup(**attrs)
      File "/opt/conda/lib/python3.6/distutils/core.py", line 108, in setup
        _setup_distribution = dist = klass(attrs)
      File "/opt/conda/lib/python3.6/site-packages/setuptools/dist.py", line 370, in __init__
        k: v for k, v in attrs.items()
      File "/opt/conda/lib/python3.6/distutils/dist.py", line 281, in __init__
        self.finalize_options()
      File "/opt/conda/lib/python3.6/site-packages/setuptools/dist.py", line 529, in finalize_options
        ep.load()(self, ep.name, value)
      File "/opt/conda/lib/python3.6/site-packages/cffi/setuptools_ext.py", line 204, in cffi_modules
        add_cffi_module(dist, cffi_module)
      File "/opt/conda/lib/python3.6/site-packages/cffi/setuptools_ext.py", line 49, in add_cffi_module
        execfile(build_file_name, mod_vars)
      File "/opt/conda/lib/python3.6/site-packages/cffi/setuptools_ext.py", line 25, in execfile
        exec(code, glob, glob)
      File "/tmp/pip-req-build-ffdgasnw/build.py", line 9, in <module>
        from torch.utils.ffi import create_extension
      File "/opt/conda/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 1, in <module>
        raise ImportError("torch.utils.ffi is deprecated. Please use cpp extensions instead.")
    ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-req-build-ffdgasnw/

I followed the exact instructions given in the README, including a recursive clone.
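The traceback above shows the build script failing at `from torch.utils.ffi import create_extension`. As a minimal sketch for confirming this before running the install, the check below tests whether the installed PyTorch still provides that entry point. The helper name `ffi_available` is hypothetical and not part of ctcdecode or PyTorch.

```python
import importlib


def ffi_available():
    """Return True if torch.utils.ffi.create_extension can be imported.

    Hypothetical helper for diagnosis only. It returns False when torch is
    not installed, when torch.utils.ffi raises ImportError on import (as in
    the traceback above), or when the module no longer exists.
    """
    try:
        mod = importlib.import_module("torch.utils.ffi")
    except ImportError:
        return False
    return hasattr(mod, "create_extension")


if __name__ == "__main__":
    if ffi_available():
        print("torch.utils.ffi is usable; the FFI-based build script can import it.")
    else:
        print("torch.utils.ffi is not usable; the FFI-based build script will fail.")
```

If this prints the second message, the error comes from the installed PyTorch version rather than from the clone itself.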

word_count_weight and valid_word_count_weight

Hi, I still cannot understand word_count_weight and valid_word_count_weight. How do they affect the scoring? I found that their default values are 0. and 1., respectively, in ctc_beam_scorer_klm.h.

How to understand the output scores?

Hi,

During decoding, I requested the top 5 best paths. Looking at the output scores returned by the decoder, I found two things that I don't understand:

  1. The scores are in ascending order, starting from the smallest one. Since these are the "top" N best, why is the first score the smallest?
  2. Some scores (the largest ones) are greater than 1. Since they are log probabilities, shouldn't they be less than 0?

So, how should I interpret the scores?

Thank you so much for any help
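One reading that would be consistent with both observations is that the scores are negative log-likelihoods rather than log probabilities, so that a smaller score means a more likely path. This is an assumption and is not confirmed anywhere in this thread. A minimal sketch under that assumption, with made-up score values:

```python
import math


def nll_to_prob(score):
    """Convert a score to a probability, ASSUMING it is a negative
    log-likelihood (natural log). This assumption is not confirmed."""
    return math.exp(-score)


# Hypothetical scores for illustration only, not real decoder output.
scores = [3.0, 0.5, 1.2]

# Under the assumption, ascending order puts the best path first.
ranked = sorted(scores)
probs = [nll_to_prob(s) for s in ranked]

for s, p in zip(ranked, probs):
    print("score=%.2f  prob=%.4f" % (s, p))
```

Under this reading, a score greater than 1 simply corresponds to a probability below exp(-1), and the first (smallest) score is the best hypothesis.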
