dspavankumar / keras-kaldi

Keras Interface for Kaldi ASR

License: GNU General Public License v3.0

Python 62.98% Shell 37.02%

deep-neural-networks speech-recognition

keras-kaldi's Issues

getBinaryLabels function not defined

File "/tools/ASR/steps_kt/dataGenSequences.py", line 134, in getNextSplitData
labelMat = self.getBinaryLabels(labels[uid])
AttributeError: 'dataGenSequences' object has no attribute 'getBinaryLabels'

exp/tri2b Data Processing

Hi,

I am trying to run the scripts inside a Kaldi folder where I have already run both the pdnn and Kaldi recipes successfully to train DNNs. However, run_kt.sh requires a tri2b folder, and it is not clear how to produce the preprocessed data that the script needs. Could you please explain how to create the data for running run_kt.sh?

subprocess error

anaconda3/lib/python3.6/subprocess.py", line 1326, in _execute_child
raise child_exception_type(errno_num, err_msg)
PermissionError: [Errno 13] Permission denied
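
An `[Errno 13] Permission denied` raised from `_execute_child` usually means the path handed to `Popen` exists but cannot be executed, for example a script without the executable bit or a directory. The traceback above does not show which command was launched, so the following is only a minimal diagnostic sketch; `diagnose_command` is a hypothetical helper, not part of keras-kaldi.

```python
import os
import shutil


def diagnose_command(cmd):
    """Explain in one line whether `cmd` could be launched by subprocess.

    Hypothetical helper for debugging; `cmd` is a program name or a path.
    """
    resolved = shutil.which(cmd)
    if resolved is not None:
        return "ok: " + resolved
    if os.path.isdir(cmd):
        return "is a directory: " + cmd
    if os.path.isfile(cmd) and not os.access(cmd, os.X_OK):
        # The file exists but lacks execute permission (try: chmod +x).
        return "not executable: " + cmd
    return "not found: " + cmd
```

Calling it on each command the failing script launches should show whether a `chmod +x` or a corrected path is needed.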

samples_per_epoch in steps_kt/train.py

Hi Mr Kumar,

Inside steps_kt/train.py, when you call m.fit_generator you set samples_per_epoch to trGen.numFeats, as follows:

h = [m.fit_generator (trGen, samples_per_epoch=trGen.numFeats, 
        validation_data=cvGen, nb_val_samples=cvGen.numFeats,
        nb_epoch=learning['minEpoch']-1, verbose=1)]

Is there a particular reason that samples_per_epoch is set this way?
My concern is that trGen.numFeats is 16,563,999 for my dataset, and the network is too slow to train.

My questions are:

What does trGen.numFeats stand for?
Is this the only way to set samples_per_epoch?

Any input and advice will be greatly appreciated!
Thanks in advance,
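
Judging by its name, trGen.numFeats appears to be the total number of feature frames in the training set, although the issue does not confirm this. In Keras 1, samples_per_epoch counts samples; in Keras 2 the corresponding argument, steps_per_epoch, counts batches. The sketch below shows the conversion. The `fraction` parameter is an assumption added for illustration: it shortens each epoch so that validation runs more often, and does not reduce the total work needed to see all the data.

```python
import math


def steps_per_epoch(num_samples, batch_size, fraction=1.0):
    """Number of generator batches that make up one epoch.

    num_samples: total training samples (assumed to be trGen.numFeats).
    batch_size:  samples per batch yielded by the generator.
    fraction:    hypothetical knob; values below 1.0 give shorter epochs.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return max(1, math.ceil(num_samples * fraction / batch_size))


# Example with the frame count from the question and an assumed batch size.
print(steps_per_epoch(16563999, 256))
```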

Can this code work on a tensorflow trained model

Hi Kumar
I see that line 75 of steps_kt/decode_seq.sh is
export KERAS_BACKEND=theano
so it seems that this code can only run on the Theano backend.

But in my case the model was trained with TensorFlow. Will that cause any problem during decoding?
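
Multi-backend Keras reads the KERAS_BACKEND environment variable when it is first imported, so the backend can be switched by changing that export line or by setting the variable before the import. Whether a model trained under one backend decodes identically under another is not answered in this thread and may depend on the layer types used. A minimal sketch of the Python-side equivalent, where `select_keras_backend` is a hypothetical helper:

```python
import os

# Backend names accepted by multi-backend Keras.
KNOWN_BACKENDS = ("theano", "tensorflow", "cntk")


def select_keras_backend(name):
    """Set KERAS_BACKEND; this must run before `import keras` to take effect."""
    if name not in KNOWN_BACKENDS:
        raise ValueError("unknown backend: %r" % (name,))
    os.environ["KERAS_BACKEND"] = name
    return name


select_keras_backend("tensorflow")
# import keras  # the import would now pick up the TensorFlow backend
```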

Data generator error when finishing epoch

At the end of the first epoch it crashed with the following traceback:

File "../python2.7/threading.py", line 801 in __bootstrap_inner
self.run
File ".../threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "../engine/training.py", line 606 in data_generator_task
generator_output = next(self._generator)
File "...dataGenerator.py", line 153 in next
x, y = self.getNextSplitData()
File "..dataGenerator.py", line 135, in getNextSplitData
return (numpy.vstack(featList), numpy.vstack(labelList))
File "...numpy/core/shape_base.py", line 234, in vstack
return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: need at least one array to concatenate

Looks like featList or labelList were empty? Not sure why this would happen. Any thoughts?

If relevant, I did have to change the call to fit_generator in train.py from samples_per_epoch=trGen.numFeats to steps_per_epoch=trGen.numFeats//learning['batchSize'] to make it compatible with Keras 2.0.

Thanks very much for releasing this code! :-)

Problem while running run_kt.sh

Hi, when I run sh run_kt.sh, the problem shown below occurs.
The paths are:
train=/kaldi-trunk/egs/timit/s5/data/train
test=/kaldi-trunk/egs/timit/s5/data/test
lang=/kaldi-trunk/egs/timit/s5/data/lang
gmm=/kaldi-trunk/egs/timit/s5/exp/tri2
exp=/kaldi-trunk/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_smbr
Are my paths set incorrectly, and is that what leads to the problem?

sudo sh run_kt.sh
steps/align_si.sh --nj 4 --cmd /kaldi-trunk/egs/timit/s5/data/train_cv05 /kaldi-trunk/egs/timit/s5/data/lang /kaldi-trunk/egs/timit/s5/exp/tri2 /kaldi-trunk/egs/timit/s5/exp/tri2_ali_cv05
steps/align_si.sh: empty argument to --cmd option
steps/align_si.sh --nj 4 --cmd /kaldi-trunk/egs/timit/s5/data/train_tr95 /kaldi-trunk/egs/timit/s5/data/lang /kaldi-trunk/egs/timit/s5/exp/tri2 /kaldi-trunk/egs/timit/s5/exp/tri2_ali_tr95
steps/align_si.sh: empty argument to --cmd option
Using TensorFlow backend.
/usr/bin/miniconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Traceback (most recent call last):
File "steps_kt/train.py", line 57, in <module>
compute_priors (exp, ali_tr, ali_cv)
File "/kaldi-trunk/egs/aishell/s5/steps_kt/compute_priors.py", line 33, in compute_priors
dim = read_output_feat_dim (exp)
File "/kaldi-trunk/egs/aishell/s5/steps_kt/compute_priors.py", line 26, in read_output_feat_dim
p = Popen (['am-info', exp+'/final.mdl'], stdout=PIPE)
File "/usr/bin/miniconda3/lib/python3.6/subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "/usr/bin/miniconda3/lib/python3.6/subprocess.py", line 1344, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'am-info': 'am-info'
steps_kt/decode.sh --nj 4 --add-deltas true --norm-vars true --splice-opts --left-context=5 --right-context=5 /kaldi-trunk/egs/timit/s5/data/test /kaldi-trunk/egs/timit/s5/exp/tri2/graph /kaldi-trunk/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_smbr /kaldi-trunk/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_smbr/decode
steps_kt/decode.sh: feature: splice(--left-context=5 --right-context=5) norm_vars(true) add_deltas(true)
steps_kt/decode.sh: line 84: JOB=1:4: command not found

steps/score_kaldi.sh --cmd '' /kaldi-trunk/egs/timit/s5/data/test /kaldi-trunk/egs/timit/s5/exp/tri2/graph /kaldi-trunk/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_smbr/decode
steps/score_kaldi.sh --cmd /kaldi-trunk/egs/timit/s5/data/test /kaldi-trunk/egs/timit/s5/exp/tri2/graph /kaldi-trunk/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_smbr/decode
steps/score_kaldi.sh: empty argument to --cmd option
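
The log shows two separate failures: `am-info` cannot be found (the FileNotFoundError), and `--cmd` receives an empty value. A likely but unconfirmed cause of the first is that the Kaldi binaries are not on PATH in the shell that runs the script, for instance because path.sh was not sourced or because sudo reset the environment. The sketch below checks for required tools before training starts. `missing_tools` is a hypothetical helper, and the tool list contains only the name that appears in the log.

```python
import shutil


def missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# 'am-info' is the binary that steps_kt/compute_priors.py tries to launch.
missing = missing_tools(["am-info"])
if missing:
    print("Not on PATH:", ", ".join(missing))
```

Running this from the same shell, and with the same use of sudo, as run_kt.sh shows whether the environment is the problem.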

Question about final.mdl

Hi Pavan,
Your work looks great, and I am quite interested in trying it.
I am not familiar with the nnet1 framework; I know nnet3 better.
I don't fully understand the pipeline.
As I understand it, final.mdl comes from the GMM training as is and is not changed.
This means that the .h5 neural network model is required at decoding time.

Am I correct? Would there be a way to recompute a final.mdl that is nnet3-compatible?

Also, did you notice slow training or anything similar?

And at decoding time, is there a big difference between GPU and CPU?

Thanks, and congrats again.

Different form of input?

Is it possible with this implementation to train an acoustic model on a different kind of input than audio frames? In my case, the inputs are spectrograms of audio files.

I am currently looking for a way to implement a CNN-HMM using the Kaldi interface. Training the CNN part is possible in Keras, but connecting it to Kaldi seems to cause some problems.

Is it possible to create such an acoustic model using your implementation and still decode using the Kaldi interface?

ImportError

I am trying to run train.py and it is giving me the error:

Traceback (most recent call last):
File "train.py", line 23, in <module>
from dataGenerator import dataGenerator
File "/home/niraj/projects/keras-kaldi-master/steps_kt/dataGenerator.py", line 20, in <module>
from subprocess import Popen, PIPE, DEVNULL
ImportError: cannot import name DEVNULL

How to resolve this error?
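
`subprocess.DEVNULL` was added in Python 3.3, so this ImportError indicates that the script is running under Python 2. Running it with Python 3 is the simplest fix. If Python 2 must be used, a common workaround is to fall back to an open handle on `os.devnull`, as in this minimal sketch (`run_quiet` is a hypothetical helper used only to demonstrate the fallback):

```python
import os
from subprocess import Popen, PIPE

try:
    from subprocess import DEVNULL  # available from Python 3.3
except ImportError:
    # Python 2 fallback: a writable handle on the null device.
    DEVNULL = open(os.devnull, "wb")


def run_quiet(args):
    """Run a command, discard its stderr, and return its stdout as bytes."""
    proc = Popen(args, stdout=PIPE, stderr=DEVNULL)
    out, _ = proc.communicate()
    return out
```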

Phone error rates below ~50%

Hi Kumar,
I got your code running on the TIMIT dataset,
but I can't reproduce your phone error rate.
Yours is 23.71% for the DNN (3 hidden layers of 1024 nodes, ReLU activations), but mine is only ~50%.
I used the latest Kaldi with run_kt.sh.
Would you post your log for reference?
Thanks.

By the way, I use tri2, and the script got 2000+ pdfs from final.mdl.

Problem: 'dataGenSequences' object has no attribute 'shape'

Hi, I'm trying to run run_kt_LSTM.sh, but it gives me this error:

File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_generator.py", line 377, in convert_to_generator_like
    num_samples = int(nest.flatten(data)[0].shape[0])
AttributeError: 'dataGenSequences' object has no attribute 'shape'

I don't know how to solve it.
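
The traceback suggests that this version of tf.keras did not recognise the dataGenSequences object as a generator and treated it as an array instead, which is why it looks for `.shape`. One possible workaround, not confirmed against this repository or this TensorFlow version, is to wrap the object in a real Python generator before passing it to the training call. The sketch assumes the data object exposes a `next()` method returning an `(x, y)` batch; `FakeBatchSource` is a hypothetical stand-in used only to make the example runnable.

```python
def as_generator(batch_source):
    """Yield (x, y) batches forever from an object that exposes next()."""
    while True:
        yield batch_source.next()


class FakeBatchSource:
    """Hypothetical stand-in for a data object with a next() method."""

    def __init__(self):
        self.count = 0

    def next(self):
        self.count += 1
        return ([self.count], [self.count % 2])


gen = as_generator(FakeBatchSource())
first_x, first_y = next(gen)
```

Because the wrapper never ends, the number of batches per epoch would still have to be passed explicitly to the training call.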

self.inputFeatDim in steps_kt/dataGenerator.py

Hello Mr. Kumar,

I noticed that you set

self.inputFeatDim = 429 ## IMPORTANT: HARDCODED. Change if necessary.

I am wondering how I can check the inputFeatDim of my dataset.

Thank you very much!
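
The value 429 is consistent with 13-dimensional base features with deltas and delta-deltas added (39 per frame) and spliced over 5 frames of left and right context (11 frames): 39 × 11 = 429. That reading is an assumption based on the decode options shown elsewhere in these issues, not something the code comment states. A minimal sketch of the arithmetic, with `spliced_feat_dim` as a hypothetical helper:

```python
def spliced_feat_dim(base_dim, left_context, right_context, add_deltas=True):
    """Input dimension after optional delta features and frame splicing.

    base_dim:   per-frame dimension of the raw features (e.g. 13 MFCCs).
    add_deltas: if True, deltas and delta-deltas triple the frame dimension.
    """
    per_frame = base_dim * 3 if add_deltas else base_dim
    num_frames = left_context + 1 + right_context
    return per_frame * num_frames


print(spliced_feat_dim(13, 5, 5))
```

The base dimension of an existing feature archive can be read with Kaldi's feat-to-dim tool and then passed through the same calculation.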

Training a CNN acoustic model from a model already trained in Keras

I've trained a CNN model in Keras which, given a context window of 50 frames, predicts the center phoneme of the utterance. The model is stored. Is it possible to train the acoustic model starting from a pretrained model?

If so, how should I format the train/test data? It is currently stored as NumPy arrays in h5 files.

Are there other things I should be aware of?

Any pretrained model available?

Hi Kumar,
I'm learning about speech recognition, and I would like to know whether there is any pretrained model that I can download to test it.