
question: final.mdl (keras-kaldi) · 6 comments · closed

dspavankumar commented on August 17, 2024

from keras-kaldi.

Comments (6)

dspavankumar commented on August 17, 2024

Hello,

Thank you. As you mentioned, the final.mdl is replicated from the GMM directory. The neural network dnn.nnet.h5 has no transition model, so we need final.mdl during decoding (transitions are not trained in this setup).

I haven't used nnet3, so I can't compare them. However, nnet3 is implemented in C++, so it should be faster than Python.

Regarding decoding on CPU vs GPU, I used small (3-hidden-layer) models to test short utterances (less than five seconds). The GPU actually took a little longer, perhaps because of moving data in and out of its memory. With larger models and longer utterances, the forward pass would likely be considerably faster on the GPU. Training on the GPU was certainly several-fold faster than on the CPU.

To convert the models to nnet3, the DNN first needs to be converted into nnet3's raw format. The weights and biases of each DNN layer need to be printed from Python. The model can be loaded using the load_model() method of keras.models. Each layer in the model's layers list has a get_weights() method that returns the weight matrix and the bias vector as another list. After we get the DNN in raw format, we could feed nnet3-am-init with the available final.mdl and initialise the model with the trained weights and the GMM's transition model.
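
The weight-dumping step above can be sketched as follows. This is a minimal illustration, not the repo's actual export script: it builds a toy 40-input / 100-output model in place of the real dnn.nnet.h5 (which would instead be loaded with keras.models.load_model), and the layer sizes are made up.

```python
# Hedged sketch: print each layer's weight matrix and bias vector so
# they can be rewritten into nnet3's raw text format.
import numpy as np
from tensorflow import keras

# Toy stand-in for the trained DNN; in practice:
#   model = keras.models.load_model('dnn.nnet.h5')
model = keras.Sequential([
    keras.Input(shape=(40,)),                       # e.g. 40-dim input features
    keras.layers.Dense(128, activation='sigmoid'),
    keras.layers.Dense(100, activation='softmax'),  # e.g. 100 output targets
])

shapes = []
for layer in model.layers:
    params = layer.get_weights()        # [] for parameter-free layers
    if len(params) == 2:
        weights, bias = params          # kernel matrix first, then bias vector
        shapes.append((weights.shape, bias.shape))
        # Keras stores the kernel as (input_dim, output_dim); transpose it
        # if the target format expects rows to be the output dimension.
        print(weights.T.shape, bias.shape)
```

Note that Keras returns the kernel as (input_dim, output_dim), so a transpose may be needed depending on the layout the raw nnet3 text format expects.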

Thanks,
Pavan.


vince62s commented on August 17, 2024

Thanks.
Were you never tempted to do full DNN training directly from the filterbank features?


dspavankumar commented on August 17, 2024

I believe filterbanks perform similarly to MFCCs; they are just a linear transformation of untruncated MFCCs. I did test them some time ago, but I have no results at hand.
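
The "linear transformation" claim can be checked numerically: MFCCs are the DCT of log filterbank energies, so without coefficient truncation the DCT is invertible and the two representations carry the same information. A small sketch (the 23-bin size and random values are arbitrary, for illustration only):

```python
# Hedged numeric check: an orthonormal, untruncated DCT of log filterbank
# energies can be inverted exactly, so log fbanks and full MFCCs are
# linearly equivalent.
import numpy as np
from scipy.fftpack import dct, idct

rng = np.random.default_rng(0)
log_fbank = rng.standard_normal(23)           # stand-in for 23 log mel energies
mfcc = dct(log_fbank, type=2, norm='ortho')   # untruncated MFCCs
recovered = idct(mfcc, type=2, norm='ortho')  # inverse DCT
assert np.allclose(recovered, log_fbank)      # perfect reconstruction
```

In practice MFCC pipelines keep only the first ~13 coefficients, and it is that truncation (plus liftering) that makes the two feature types differ.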


vince62s commented on August 17, 2024

Oh yes, I know. I meant without the GMM/HMM transition model: features feeding the network directly.


dspavankumar commented on August 17, 2024

Like a sequence-to-sequence RNN? No, I haven't, but it's definitely interesting to look into.


vince62s commented on August 17, 2024

Yes, that's what I have in mind, but what I don't get is that most implementations of this use a CTC loss function; a purely seq2seq approach does not seem easily doable.

