dspavankumar / keras-kaldi
Keras Interface for Kaldi ASR
License: GNU General Public License v3.0
File "/tools/ASR/steps_kt/dataGenSequences.py", line 134, in getNextSplitData
labelMat = self.getBinaryLabels(labels[uid])
AttributeError: 'dataGenSequences' object has no attribute 'getBinaryLabels'
Hi,
I am trying to run the scripts inside the Kaldi folder where I have already run both the PDNN and Kaldi recipes to train DNNs successfully. However, run_kt.sh requires a tri2b folder, and it is not clear how to produce the preprocessed data that the script expects. Could you please explain how to create the data needed to run run_kt.sh?
anaconda3/lib/python3.6/subprocess.py", line 1326, in _execute_child
raise child_exception_type(errno_num, err_msg)
PermissionError: [Errno 13] Permission denied
Hi Mr Kumar,
Inside steps_kt/train.py, when you call m.fit_generator, you set samples_per_epoch to trGen.numFeats, like the following:
h = [m.fit_generator (trGen, samples_per_epoch=trGen.numFeats,
validation_data=cvGen, nb_val_samples=cvGen.numFeats,
nb_epoch=learning['minEpoch']-1, verbose=1)]
Is there a particular reason that samples_per_epoch is set in this way?
My concern is that trGen.numFeats is 16,563,999 for my dataset, and the network is too slow to train.
My questions: what does trGen.numFeats stand for, and what about samples_per_epoch?
Any input and advice will be greatly appreciated!
Thanks in advance,
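For reference on the question above: in Keras 1.x, samples_per_epoch is the number of samples drawn from the generator before an epoch is declared finished, while Keras 2.x replaced it with steps_per_epoch, which counts batches instead. Judging by its name and the magnitude reported, trGen.numFeats appears to be the total number of training frames, though that reading is an assumption and not confirmed in this thread. The sketch below only shows the arithmetic linking the two settings. The helper name and the batch size of 256 are hypothetical.

```python
def steps_per_epoch(num_frames, batch_size):
    """Batches needed to see every frame once (ceiling division)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return (num_frames + batch_size - 1) // batch_size


num_feats = 16563999   # frame count reported in the question
batch_size = 256       # assumed value, not taken from the repository

steps = steps_per_epoch(num_feats, batch_size)
print(steps)
```

Passing a smaller value than this makes each epoch shorter, but an epoch then no longer covers the whole training set.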
Hi Kumar
I see that line 75 of steps_kt/decode_seq.sh sets
export KERAS_BACKEND=theano
It seems that this code can only run on the Theano backend.
But in my case the model was trained with TensorFlow. Will that cause any problem during decoding?
At the end of the first epoch it crashed with the following traceback:
File "../python2.7/threading.py", line 801 in __bootstrap_inner
self.run
File ".../threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "../engine/training.py", line 606 in data_generator_task
generator_output = next(self._generator)
File "...dataGenerator.py", line 153 in next
x, y = self.getNextSplitData()
File "..dataGenerator.py", line 135, in getNextSplitData
return (numpy.vstack(featList), numpy.vstack(labelList))
File "...numpy/core/shape_base.py", line 234, in vstack
return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: need at least one array to concatenate
Looks like featList or labelList were empty? Not sure why this would happen. Any thoughts?
If relevant, I did have to change the call to fit_generator in train.py from samples_per_epoch=trGen.numFeats to steps_per_epoch=trGen.numFeats//learning['batchSize'] to make it compatible with Keras 2.0.
Thanks very much for releasing this code! :-)
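The traceback above ends in numpy.vstack being called on an empty list, which is exactly the condition that raises "need at least one array to concatenate". Why the lists end up empty is not established in this thread. A minimal sketch of a guard that makes the condition explicit is shown below. The function name stack_split is hypothetical and is not part of dataGenerator.py.

```python
import numpy


def stack_split(feat_list, label_list):
    """Stack per-utterance arrays, or return None when the split is empty.

    numpy.vstack raises ValueError on an empty list, so the empty case
    is checked first and left for the caller to handle.
    """
    if not feat_list or not label_list:
        return None
    return numpy.vstack(feat_list), numpy.vstack(label_list)


# Two utterances with 3 and 2 frames, 4-dimensional features, 1 label column
feats = [numpy.zeros((3, 4)), numpy.ones((2, 4))]
labels = [numpy.zeros((3, 1)), numpy.ones((2, 1))]
x, y = stack_split(feats, labels)
print(x.shape, y.shape)
print(stack_split([], []))
```

A caller could log the utterance IDs of the split that came back empty, which may help find out whether the features or the alignments are missing for that split.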
Hi, when I run sh run_kt.sh, the following problem occurs.
The paths are:
train=/kaldi-trunk/egs/timit/s5/data/train
test=/kaldi-trunk/egs/timit/s5/data/test
lang=/kaldi-trunk/egs/timit/s5/data/lang
gmm=/kaldi-trunk/egs/timit/s5/exp/tri2
exp=/kaldi-trunk/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_smbr
Are my paths set incorrectly, and is that what leads to the problem?
sudo sh run_kt.sh
steps/align_si.sh --nj 4 --cmd /kaldi-trunk/egs/timit/s5/data/train_cv05 /kaldi-trunk/egs/timit/s5/data/lang /kaldi-trunk/egs/timit/s5/exp/tri2 /kaldi-trunk/egs/timit/s5/exp/tri2_ali_cv05
steps/align_si.sh: empty argument to --cmd option
steps/align_si.sh --nj 4 --cmd /kaldi-trunk/egs/timit/s5/data/train_tr95 /kaldi-trunk/egs/timit/s5/data/lang /kaldi-trunk/egs/timit/s5/exp/tri2 /kaldi-trunk/egs/timit/s5/exp/tri2_ali_tr95
steps/align_si.sh: empty argument to --cmd option
Using TensorFlow backend.
/usr/bin/miniconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Traceback (most recent call last):
File "steps_kt/train.py", line 57, in <module>
compute_priors (exp, ali_tr, ali_cv)
File "/kaldi-trunk/egs/aishell/s5/steps_kt/compute_priors.py", line 33, in compute_priors
dim = read_output_feat_dim (exp)
File "/kaldi-trunk/egs/aishell/s5/steps_kt/compute_priors.py", line 26, in read_output_feat_dim
p = Popen (['am-info', exp+'/final.mdl'], stdout=PIPE)
File "/usr/bin/miniconda3/lib/python3.6/subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "/usr/bin/miniconda3/lib/python3.6/subprocess.py", line 1344, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'am-info': 'am-info'
steps_kt/decode.sh --nj 4 --add-deltas true --norm-vars true --splice-opts --left-context=5 --right-context=5 /kaldi-trunk/egs/timit/s5/data/test /kaldi-trunk/egs/timit/s5/exp/tri2/graph /kaldi-trunk/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_smbr /kaldi-trunk/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_smbr/decode
steps_kt/decode.sh: feature: splice(--left-context=5 --right-context=5) norm_vars(true) add_deltas(true)
steps_kt/decode.sh: line 84: JOB=1:4: command not found
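The FileNotFoundError for 'am-info' in the log above means that Python could not find that executable on the PATH of the process that ran train.py. One quick way to confirm this before starting a run is sketched below, using shutil.which from the standard library (Python 3.3 and later). The helper name missing_tools and the list of tools checked are illustrative assumptions. Only am-info is taken from the traceback.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# am-info comes from the traceback; extend the list as needed
required = ["am-info"]
missing = missing_tools(required)
if missing:
    print("Not on PATH: " + ", ".join(missing))
else:
    print("All required tools found")
```

If am-info is reported as missing, it is worth checking that the recipe's path.sh has been sourced in the same shell. Running the script under sudo can also change the environment that the script sees.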
Hi Pavan,
Your work looks great, I am quite interested in trying it.
I am more familiar with the nnet3 framework than with nnet1.
I don't fully understand the whole pipeline.
I understand that final.mdl comes from the GMM training as is and is not changed.
This means that the .h5 neural network model is required at decoding time.
Am I correct? Would there be a way to recompute a final.mdl that is compatible with nnet3?
Also, did you notice slow training times or anything similar?
And at decoding time, is there a big difference between GPU and CPU?
Thanks, and congrats again.
Is it possible, given this implementation, to train an acoustic model on a different kind of input than audio frames? In my case, the inputs are spectrograms of audio files.
I am currently looking for a way to implement a CNN-HMM using the Kaldi interface. Training the CNN part is possible in Keras, but connecting it to Kaldi seems to cause some problems.
Is it possible to create such an acoustic model using your implementation, and still be able to decode using the Kaldi interface?
I am trying to run train.py and it is giving me the error:
Traceback (most recent call last):
File "train.py", line 23, in <module>
from dataGenerator import dataGenerator
File "/home/niraj/projects/keras-kaldi-master/steps_kt/dataGenerator.py", line 20, in <module>
from subprocess import Popen, PIPE, DEVNULL
ImportError: cannot import name DEVNULL
How can I resolve this error?
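subprocess.DEVNULL was added in Python 3.3, so this ImportError is what the import produces under Python 2. Running the scripts with Python 3 avoids it. If Python 2 has to be used, a common workaround is to fall back to an open handle on os.devnull, as in the sketch below. This is a general compatibility pattern and not something taken from the repository. Other parts of the code may still need Python 3.

```python
import os
import subprocess
import sys

try:
    from subprocess import DEVNULL  # available from Python 3.3
except ImportError:
    # Python 2 fallback: a writable handle on the null device
    DEVNULL = open(os.devnull, "wb")

# Example: run a child process and discard its standard output
status = subprocess.call(
    [sys.executable, "-c", "print('discarded')"],
    stdout=DEVNULL,
)
print(status)
```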
Hi Kumar
I got your code running on the TIMIT dataset,
but I can't reach the same phone error rate as you.
Yours is DNN (3 hidden layers of 1024 nodes, ReLU activations): 23.71%, but mine is only ~50%.
I used the latest Kaldi and ran run_kt.sh.
Would you post your log for reference?
Thanks.
By the way, I use tri2, and the script got 2000+ pdfs from final.mdl.
Hi, I'm trying to run the run_kt_LSTM.sh but it gives me this error:
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training_generator.py", line 377, in convert_to_generator_like
num_samples = int(nest.flatten(data)[0].shape[0])
AttributeError: 'dataGenSequences' object has no attribute 'shape'
I don't know how to solve it.
Hello Mr. Kumar,
I noticed that you set
self.inputFeatDim = 429 ## IMPORTANT: HARDCODED. Change if necessary.
How can I check the inputFeatDim of my dataset?
Thank you very much!
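One plausible reading of the hardcoded 429 is 13 base coefficients, tripled by deltas and delta-deltas, and then spliced over 11 frames (5 left, the current frame, 5 right). This matches the --add-deltas true and --left-context=5 --right-context=5 options that appear in the decode.sh log elsewhere on this page. The base dimension of 13 is an assumption. The sketch below only reproduces that arithmetic, and the function name is hypothetical.

```python
def spliced_feat_dim(base_dim, left_context, right_context, add_deltas=True):
    """Input dimension after optional deltas (order 2) and frame splicing."""
    per_frame = base_dim * 3 if add_deltas else base_dim
    return per_frame * (left_context + 1 + right_context)


# 13 base coefficients is an assumed value; use your own feature dimension
print(spliced_feat_dim(13, 5, 5))
```

The base dimension of a real dataset can be read with Kaldi's feat-to-dim tool on the feats.scp file, and the result substituted for 13 above.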
I've trained a CNN model in Keras which, given a context window of 50 frames, can predict the center phoneme of the utterance. The model is stored. Is it possible to train the acoustic model given a pretrained model?
And if yes, how should I format the train/test data? It is currently stored as numpy arrays in h5 files.
Are there other things I should be aware of?
Hi Kumar
I'm learning about speech recognition and I want to know if there is any pretrained model that I can download to test it?