
beer's Introduction

BEER: the Bayesian spEEch Recognizer

Beer is a toolkit that provides Bayesian machine learning tools for speech technologies.

Beer is currently under construction and many things are subject to change!

Requirements

Beer is built upon PyTorch and several other third-party packages. To make sure that all the dependencies are installed, we recommend creating a new Anaconda environment from the provided environment file:

   $ conda env create -f condaenv.yml

This will create a new environment named beer. Then you need to install PyTorch in the beer environment:

   $ source activate beer
   $ conda install -c pytorch pytorch

Note that it is necessary to install PyTorch as the last step.

Installation

Assuming that you have already created the beer environment, type in a terminal:

  $ source activate beer
  $ python setup.py install

Usage

Have a look at the recipe to get started with beer.


beer's Issues

HMM doesn't fit Variational distribution of VAE

Hi, Iondel!
I've checked the results of HMM-VAE, and I have some questions.
We know that the VAE is used to map features from a high-dimensional space to a low-dimensional one. Through the encoder of the VAE, we obtain its posteriors (in this line). So if we feed 600 data points into the encoder, we get 600 posteriors. We use their means as the new dataset, which serves as the observations of the HMM, the prior of the VAE.
As you can see, the posterior means can be covered by three normal distributions from the HMM. But the variances of those normal distributions stay large and never fit the posterior means during training.
I find this quite strange, because a simple HMM fits the data well; there is no reason it should break down when we introduce the VAE.
I tried to use another form of distribution to describe the posterior of the VAE, but I ran into trouble substituting NormalDiagonalCovariance with NormalWishart or something similar. That is the biggest problem: there is no concrete documentation about params, distributions, modelset, conjugate updates, and so on, and I get lost when stepping into the nested classes. I would appreciate it if you could point me to some theory or papers about this framework and its models.
Anyway, thanks very much for your attention and help!

Deprecation warning from YAML

When running the main recipe there is the following warning:
utils/nmi.py:16: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.

The same warning appears for mkphones.py.
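A minimal sketch of the fix, assuming PyYAML (the exact call sites in utils/nmi.py and mkphones.py may differ): pass an explicit Loader, or use yaml.safe_load.

```python
import yaml

doc = "phones: [a, b, c]"

# Passing an explicit Loader silences the YAMLLoadWarning; SafeLoader is the
# right choice unless the file deliberately serializes Python objects.
data = yaml.load(doc, Loader=yaml.SafeLoader)

# Equivalent shorthand:
same = yaml.safe_load(doc)

print(data == same)  # True
```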

mbn_babel.yml file in AUD recipe

The AUD recipe requires a config file named mbn_babel.yml for feature extraction, but the conf directory does not contain any such file. Am I missing something?

can't find file bayesmodel.py

There are many imports of the form from .bayesmodel import xxx, but I can't find bayesmodel.py in your repo, which causes lots of problems.

Refactoring parallel scripts

We need to merge the SGE/GNU Parallel scripts in such a way that there is only one decoding/training script, and the user chooses which parallel environment to use.

Do we need to keep the "mlpmodel" module ?

I'm moving the discussion to GitHub so we can keep track of it.

ibenes wrote:

  1. Reparametrization.
    I was aware that the for loop was going to be slower, but my primary concern was that the VAE should not know how to sample; in fact, it should not know what the encoder produces.

Using the "encoder output".sample() was the first step. Next, I'm planning to give the NormalDiagonalCovariance_MLP.sample() a parameter equivalent to nsamples in your example, restoring the speed of your solution.

  2. FixedNormal
    Actually, I first tried to construct a NormalDiagonalCovariance and use it as the latent model. Apart from being complicated, I think there is something wrong with it. And until I can tell you "here is an error" -- e.g. by a test -- I wanted a model I can understand :-)

  3. MLPModel in recipes
    Whoa! This didn't come to my mind, it may actually be cool.

  4. Some unorganized thoughts
    One thing I do not like about the VAE (as it is in the master) is how much it is coupled with a Gaussian distribution coming from the encoder.

The models.normal.NormalDiagonalCovariance is IMHO wrong; here are the clues: cov() and mean() directly do value.view(4, -1) instead of using _normalwishart_split_nparams(). Therefore it fails for latent dimensions other than 2, and I think it does something nonsensical even in the 2-dimensional case.

We currently have two kinds of distributions: Bayesian ones as latent models, and ordinary ones coming out of the encoder/decoder. Is there any chance to unify them? Also, PyTorch now has a pretty nice set of distributions. Unfortunately, I think we cannot really use it, can we?
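For reference, the reparametrized sampling with an nsamples parameter discussed above can be sketched as follows. This is a NumPy stand-in, not the actual beer/PyTorch code, and the function name is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def rsample(mu, log_var, nsamples=1):
    """Reparameterized sampling: z = mu + sigma * eps with eps ~ N(0, I).
    All randomness lives in eps, so gradients can flow through mu/log_var."""
    eps = rng.standard_normal((nsamples,) + mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Draw 4 samples from a 2-dimensional diagonal Gaussian in one vectorized
# call, avoiding a Python-level for loop over individual samples.
z = rsample(np.zeros(2), np.zeros(2), nsamples=4)
print(z.shape)  # (4, 2)
```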

Unable to create conda environment

conda version: 4.8.2
Platform: Windows 10 64-bit

Using the instructions provided in the README, the conda env create command fails with the following error:

Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:
  - ncurses=6.1
  - pango=1.42.4
  - readline=7.0
  - fribidi=1.0.5
  - glib=2.56.2
  - graphviz=2.40.1
  - gettext=0.19.8.1
  - expat=2.2.6
  - harfbuzz=1.8.8
  - graphite2=1.3.12
  - libedit=3.1.20170329
  - fontconfig=2.13.0
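The packages listed above are Linux builds pinned to exact versions, which are unavailable for win-64. One possible workaround (an untested assumption, since condaenv.yml was presumably generated on Linux) is to relax the exact version pins so conda can resolve Windows-compatible builds:

```shell
# Demonstrated on a small excerpt; apply the same sed to condaenv.yml itself.
printf -- '- ncurses=6.1\n- readline=7.0\n' > excerpt.yml

# Strip the trailing "=<version>" pin from each dependency line.
sed -E 's/=[0-9][^=]*$//' excerpt.yml
# - ncurses
# - readline
```

After relaxing the pins, rerun conda env create -f with the edited file; some Linux-only packages may still need to be removed from the list by hand.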

Failed to interpret mfcc.npz file as a pickle

While creating the dataset for the alffa database in the hshmm recipe, I encountered an error message:

File "/mnt/hdd/user/anaconda3/envs/beer/lib/python3.7/site-package/numpy/lib/npyio.py", line 440, in load
     return pickle.load(fid, **pickle_kwargs)
_pickle.UnpicklingError: A load persistent id instruction was encountered

together with another exception:

OSError: Failed to interpret file '/mnt/hdd/user/workspace/HMM/features/alffa/sw/train/mfcc.npz' as a pickle

One possible reason I can think of is that during feature extraction the log said:

utils/parallel/sge/parallel.sh: line 21: qsub: command not found
INFO: created archive from 0 feature files

but I am not sure whether this is the cause, since the code kept running after that message.

Excuse me if this is a minor issue; I am new to the community.
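The "created archive from 0 feature files" line suggests the qsub failure left the archive empty, which would explain why numpy cannot interpret it. A quick way to check whether an archive is valid and non-empty (the file name below is illustrative, not the recipe's actual path):

```python
import numpy as np

# Build a tiny valid archive the same way a feature-extraction step might,
# then read it back; a healthy mfcc.npz should behave the same way.
np.savez("demo_mfcc.npz", utt1=np.zeros((10, 13), dtype=np.float32))

arch = np.load("demo_mfcc.npz")
print(len(arch.files))     # 1  -> a count of 0 would mean an empty archive
print(arch["utt1"].shape)  # (10, 13)
```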

Parallel decoding

HMM decoding in parallel

There are currently two parallel systems (GNU Parallel / SGE).
For the purposes of the ZRC, GNU Parallel is probably the priority.

Originally posted by @iondel in #68

zrc2019 dataset

Hello, I'm new to AUD and have been following your README to run the beer system. But I can't find the zrc2019 dataset on Google. Could you provide me with a link to this dataset?
