montrealcorpustools / montreal-forced-aligner

Command line utility for forced alignment using Kaldi

Home Page: https://montrealcorpustools.github.io/Montreal-Forced-Aligner/

License: MIT License

Python 99.95% F* 0.02% Dockerfile 0.03%
kaldi forced-alignment grapheme-to-phone pronunciation-dictionary acoustic-model python

montreal-forced-aligner's Introduction

Montreal Forced Aligner

The Montreal Forced Aligner is a command line utility for performing forced alignment of speech datasets using Kaldi (http://kaldi-asr.org/).

Please see the documentation http://montreal-forced-aligner.readthedocs.io for installation and usage.

If you run into any issues, please check the mailing list for fixes/workarounds or to post a new issue.

Installation

You can install MFA either entirely through conda, or through a mix of conda (for the Kaldi and Pynini dependencies) and Python packaging (for MFA itself).

Conda installation

MFA is hosted on conda-forge and can be installed via:

conda install -c conda-forge montreal-forced-aligner

in your environment of choice.

Source installation

If you'd like to install a local version of MFA or want to use the development setup, the easiest way is to first create the dev environment from the yaml file in the repo root directory:

conda env create -n mfa-dev -f environment.yml

Alternatively, the dependencies can be installed via:

conda install -c conda-forge python=3.11 kaldi librosa biopython praatio tqdm requests colorama pyyaml pynini openfst baumwelch ngram

MFA can be installed in develop mode via:

pip install -e .[dev]

You should then be able to see appropriate output from mfa version.

Development

The test suite is run via tox -e py38-win or tox -e py38-unix, depending on the OS, and the docs are generated via tox -e docs.

montreal-forced-aligner's People

Contributors

a-coles, amogh-gulati, cveaux, errayeren, fncokg, g-thor, galaxiet, harshcasper, jofrhwld, lifeiteng, michaelasocolof, mmcauliffe, muhrifqii, ntt123, potipot, qwaker00, taras-sereda, tmestrou, vannawillerton

montreal-forced-aligner's Issues

Allow for graceful exit

Currently ctrl+c events are not handled very well and can leave hanging python processes.
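
One way to avoid orphaned workers can be sketched as follows. This is a minimal illustration, assuming the workers are child processes started with subprocess.Popen; managed_children is a hypothetical helper, not part of MFA.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def managed_children():
    """Yield a list for tracking child processes and stop them all on exit.

    Hypothetical helper: the cleanup in `finally` also runs when the body is
    interrupted by Ctrl+C (KeyboardInterrupt), so no child is left running.
    """
    children = []
    try:
        yield children
    finally:
        # Ask every child that is still running to stop.
        for proc in children:
            if proc.poll() is None:
                proc.terminate()
        # Wait briefly, then force-kill anything that ignored the request.
        for proc in children:
            try:
                proc.wait(timeout=5)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
```

A caller would append each Popen object to the yielded list; the KeyboardInterrupt still propagates after cleanup, so the program exits as usual.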

Running MFA in server mode

Hi creators of MFA! First thank you very much for creating such an excellent tool for forced alignment, with a better feature set and trainability than comparable tools like Gentle. I am trying to integrate MFA into one of our online services and have the following questions:

  1. How likely is MFA going to be maintained in the next couple of years?
  2. Currently the tool is mainly used via the command line interface, with a noticeable startup time for each run, which makes it suboptimal for real-time alignment tasks (e.g. where the speaker makes a recording in the browser to be aligned in real time on the server). I wonder if MFA can be run as a server listening on a certain port (as Gentle does)?
  3. Are there any examples on how to use the tool programmatically in Python (not via CLI)?

Thank you very much in advance!
Victor

wav files with 8k, not restarting, 0 division...

Hi all! Thanks for putting this together.

First I had problems when trying to align wav files that were 8k (ZeroDivisionError) then after converting the files the aligner was still stuck in the ZeroDivisionError. To fix it, I had to remove the TEMP file for the dataset I was using.

Can you please investigate this?
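
Until this is handled inside MFA, a corpus can be screened for low sample rates before alignment. The sketch below uses only the standard wave module; the function name and the 16 kHz threshold are assumptions for illustration, not values taken from MFA.

```python
import wave
from pathlib import Path


def find_low_rate_wavs(corpus_dir, min_rate=16000):
    """Return (path, sample_rate) pairs for WAV files below min_rate.

    min_rate is an assumed threshold; adjust it to what the model expects.
    """
    flagged = []
    for path in sorted(Path(corpus_dir).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
        if rate < min_rate:
            flagged.append((path, rate))
    return flagged
```

As the report notes, the temporary data from an earlier run still had to be removed after the files were converted.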

Adding per-frame/per-phone/per-word score for the alignment

Hi Michael,

Do you have plans to add per-frame, per-phone and per-word likelihoods to the alignment output?

This potentially will help the analysis of non-native or abnormal speech (for example pathological speech) with ASR.

TypeError: 'NoneType' object is not subscriptable Failed to execute script align

I have used mfa_align for a week without any problems, but yesterday it suddenly stopped working. This is the error message:

Traceback (most recent call last):
  File "aligner/command_line/align.py", line 186, in <module>
  File "aligner/command_line/align.py", line 146, in validate_args
  File "aligner/command_line/align.py", line 69, in align_corpus
TypeError: 'NoneType' object is not subscriptable
Failed to execute script align

I didn't change anything, what's the problem?

Ask confirmation for cleaning the output directory

I realize this is due to my own mistake (RTFM...), but I did not expect MFA to first remove all files and sub-directories from the specified output folder. I don't know if it is essential for the output folder to be empty, but I'd prefer a solution in which files are simply overwritten if they already exist. Even if it is essential to have an empty output directory, it might be good to ask the user (unless e.g., a -nowarn flag or so is set) confirmation for removing all contents of the folder. Otherwise, good work :)
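
The requested behaviour could look roughly like this. It is a sketch only: prepare_output_dir, its confirm callback and the assume_yes switch (standing in for the suggested -nowarn flag) are hypothetical names, not existing MFA options.

```python
import shutil
from pathlib import Path


def prepare_output_dir(path, confirm=input, assume_yes=False):
    """Create the output directory, asking before deleting existing contents.

    Returns True if the directory is ready, False if the user declined.
    """
    out = Path(path)
    if out.is_dir() and any(out.iterdir()):
        if not assume_yes:
            answer = confirm(f"{out} is not empty. Remove its contents? [y/N] ")
            if answer.strip().lower() not in ("y", "yes"):
                return False
        shutil.rmtree(out)
    out.mkdir(parents=True, exist_ok=True)
    return True
```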

Suggestion: output to command line

Hi,

I have a suggestion for an alternate output which might be easy to accommodate. If no output folder is given, MFA could output the .TextGrid to the command line (cout, stdout). It is great for testing. And on a more selfish note, this would make my life easier as I could avoid disk writes and folder deletion/creation, which can get tangled in permission issues.

Thanks,
Pif

Throw appropriate error when there is an issue generating features

Currently, when features aren't generated properly, there will be an exception thrown during initialization of monophone training, but during alignment from a pretrained model, some issues in feature generation don't throw any errors, and just result in an empty output directory.

Confidence Measure and Automatic Pronunciation Evaluation Using MFA

I am wondering if there is a way to use MFA to obtain phone-level alignment likelihoods. Say, if I have some audio files and transcriptions of non-native English speech. Then, the summation of MFA alignment likelihoods divided by the number of phones of the recognized words can probably be used as a distance measure in the deviation from the native reference model since the default acoustic and language model used in MFA are trained on native English speech.
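
The proposed measure, the sum of phone-level likelihoods divided by the number of phones, can be written down directly. The sketch assumes the scores are already available as (phone, log-likelihood) pairs; that input format is hypothetical, since the issue is asking whether MFA can provide such scores at all.

```python
def mean_phone_log_likelihood(phone_scores):
    """Average log-likelihood per phone for one utterance.

    phone_scores: list of (phone_label, log_likelihood) pairs (assumed format).
    """
    if not phone_scores:
        raise ValueError("no phones to score")
    total = sum(score for _, score in phone_scores)
    return total / len(phone_scores)
```

Under the issue's reasoning, a lower average would indicate a larger deviation from the native reference model.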

AttributeError: 'AcousticModel' object has no attribute 'is_tmpdir'

Good morning,

Thank you for your work.
I am trying this software for the first time. I read the documentation and downloaded the pretrained model for the dictionary and language, but I can't get the alignment to run.
Do you have any idea what causes this error, please?

Thanks

✘ maurice@MBP-de-Herve  ~/Downloads/montreal-forced-aligner/bin  ./mfa_align ../data ../pretrained_models/english.dict.txt english ../output
Setting up corpus information...
Number of speakers in corpus: 1, average number of utterances per speaker: 1.0
Traceback (most recent call last):
File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/command_line/align.py", line 186, in <module>
File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/command_line/align.py", line 144, in validate_args
File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/command_line/align.py", line 139, in align_included_model
File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/command_line/align.py", line 86, in align_corpus
File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/models.py", line 37, in __init__
ValueError: '/Users/maurice/Downloads/montreal-forced-aligner/pretrained_models/english.zip' is a bomb.
Failed to execute script align
Exception ignored in: <object repr() failed>
Traceback (most recent call last):
File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/models.py", line 69, in __del__
AttributeError: 'AcousticModel' object has no attribute 'is_tmpdir'
✘ maurice@MBP-de-Herve  ~/Downloads/montreal-forced-aligner/bin 

wave.Error: unknown format: 3

Hi! We recorded some WAVs in Audacity (44.1 kHz, mono, 32-bit) and the aligner (or possibly Python) is not happy about them. We've tried changing the bit depth to 24, to no avail. Any suggestions?

Thanks!

(in case it's needed, here's our command line output when we try to run the MFA:)

montreal-forced-aligner user$ bin/mfa_align [corpus] [dictionary] [english model] [output]
Setting up corpus information...
Traceback (most recent call last):
  File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/command_line/align.py", line 186, in <module>
  File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/command_line/align.py", line 146, in validate_args
  File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/command_line/align.py", line 84, in align_corpus
  File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/corpus.py", line 309, in __init__
  File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/corpus.py", line 151, in get_sample_rate
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/wave.py", line 499, in open
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/wave.py", line 163, in __init__
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/wave.py", line 143, in initfp
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/wave.py", line 260, in _read_fmt_chunk
wave.Error: unknown format: 3
Failed to execute script align
montreal-forced-aligner user$ 
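
For context: format tag 3 in a WAV header denotes IEEE floating-point samples, which the wave module in the Python 3.5 shown in the traceback does not read, since it accepts only PCM (tag 1). The sketch below reads the tag straight from the fmt chunk so such files can be spotted before alignment; wav_format_tag is a hypothetical helper, and re-exporting as 16-bit PCM is a likely workaround rather than a confirmed fix.

```python
import struct


def wav_format_tag(path):
    """Return the format tag from a WAV file's fmt chunk (1 = PCM, 3 = float)."""
    with open(path, "rb") as f:
        riff, _, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no fmt chunk found")
            chunk_id, size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                # The format tag is the first 16-bit field of the chunk.
                return struct.unpack("<H", f.read(2))[0]
            # Skip other chunks; chunk data is padded to an even length.
            f.seek(size + (size & 1), 1)
```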

Add acoustic models for more languages

List of models to train/upload:

  • GlobalPhone Japanese
  • GlobalPhone Arabic
  • NCHLT Afrikaans
  • NCHLT English
  • NCHLT isiNdebele
  • NCHLT isiXhosa
  • NCHLT isiZulu
  • NCHLT Setswana
  • NCHLT Sesotho sa Leboa
  • NCHLT Sesotho
  • NCHLT siSwati
  • NCHLT Tshivenda
  • NCHLT Xitsonga

Be clearer about which files failed to align and why

Currently if a file can't be aligned, no mention is made in the console. MFA should inspect the aligning logs to find which files could not be decoded even with the retry beam, and make some suggestions for why they could not be aligned (errors in transcription, too long for the beam size, increasing the beam, etc).

TextGrid import error after building from source

Hi,
After following the steps for building from source, I got the following error when trying to execute mfa_align.

Traceback (most recent call last):
  File "aligner/command_line/align.py", line 34, in <module>
    from aligner.corpus import Corpus
  File "/usr/local/lib/python2.7/site-packages/PyInstaller/loader/pyimod03_importers.py", line 389, in load_module
    exec(bytecode, module.__dict__)
  File "aligner/__init__.py", line 10, in <module>
    import aligner.aligner as aligner
  File "/usr/local/lib/python2.7/site-packages/PyInstaller/loader/pyimod03_importers.py", line 389, in load_module
    exec(bytecode, module.__dict__)
  File "aligner/aligner/__init__.py", line 2, in <module>
    from .trainable import TrainableAligner
  File "/usr/local/lib/python2.7/site-packages/PyInstaller/loader/pyimod03_importers.py", line 389, in load_module
    exec(bytecode, module.__dict__)
  File "aligner/aligner/trainable.py", line 10, in <module>
    from ..multiprocessing import (align, mono_align_equal, compile_train_graphs,
  File "/usr/local/lib/python2.7/site-packages/PyInstaller/loader/pyimod03_importers.py", line 389, in load_module
    exec(bytecode, module.__dict__)
  File "aligner/multiprocessing.py", line 8, in <module>
    from .textgrid import ctm_to_textgrid, parse_ctm
  File "/usr/local/lib/python2.7/site-packages/PyInstaller/loader/pyimod03_importers.py", line 389, in load_module
    exec(bytecode, module.__dict__)
  File "aligner/textgrid.py", line 5, in <module>
    from textgrid import TextGrid, IntervalTier
ImportError: cannot import name TextGrid
Failed to execute script align

I have installed all the prerequisites in the requirements.txt file and also checked that import textgrid works in another Python script.

Error running MFA on 10.12.2

Running mfa_align (using the precompiled English model) or mfa_train_and_align on one of my machines (macOS 10.12.2) is not working. Both commands yield the same error when applied to a few files from the LibriSpeech corpus. I've attached the terminal output to this message here: terminal-output.txt. Any help would be greatly appreciated!

Collapse multiple silence intervals

Currently, when aligning a long sound file, silences at the beginnings and ends of segments result in two silence intervals next to each other. These should be detected and collapsed.
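
The collapsing step could be sketched as below, assuming intervals are (start, end, label) tuples in time order. The set of labels treated as silence is an assumption, not MFA's actual label set.

```python
def collapse_silences(intervals, silence_labels=("", "sil", "sp")):
    """Merge runs of adjacent silence intervals into a single interval.

    intervals: list of (start, end, label) tuples sorted by start time.
    silence_labels: labels treated as silence (assumed for illustration).
    """
    merged = []
    for start, end, label in intervals:
        if merged and label in silence_labels and merged[-1][2] in silence_labels:
            # Extend the previous silence instead of adding a new interval.
            prev_start, _, prev_label = merged[-1]
            merged[-1] = (prev_start, end, prev_label)
        else:
            merged.append((start, end, label))
    return merged
```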

MFA only parses part of LibriSpeech dev set

I'm having trouble getting MFA working with LibriSpeech.

The LibriSpeech example provided on Read the Docs works fine:
bin/mfa_align -v ../LibriSpeech/pre/ ../librispeech-lexicon.txt english ../MFA_textgrids/
correctly creates textgrids in the MFA_textgrids folder.

However, when I try using the LibriSpeech dev set, only 2 speakers end up getting aligned. So it appears as if my preprocessing works, but only for 2 of the speakers. Here is what the console reads:

$ bin/mfa_align -v ../LibriSpeech/dev-clean/ ../librispeech-lexicon.txt english ../MFA_textgrids/
Setting up corpus information...
Number of speakers in corpus: 97, average number of utterances per speaker: 27.8659793814433
Creating dictionary information...
Using previous MFCCs
Number of speakers in corpus: 97, average number of utterances per speaker: 27.8659793814433
Done with setup.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:12<00:00,  6.32s/it]
Done! Everything took 22.379481077194214 seconds

Here is my preprocessing notebook as a gist:
https://gist.github.com/anonymous/9f0616895dc068ccaae840abc8d0a5b9

This is on Ubuntu 16 - and the example Librispeech works perfectly fine.
Thanks for the help!

generate new language dict failed.

I want to generate a Mandarin dictionary using the G2P model recommended in the tutorial. But when I run the example, I get errors like "FileNotFoundError: [Errno 2] No such file or directory: 'phonetisaurus-g2pfst'". Can anybody tell me how to fix this problem?
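
This kind of FileNotFoundError is what Python raises when it tries to launch an executable that cannot be found on PATH, so a quick check of the environment may help narrow things down. The helper name below is hypothetical.

```python
import shutil


def missing_executables(names):
    """Return the subset of executable names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # The tool name comes from the error message in the report above.
    print(missing_executables(["phonetisaurus-g2pfst"]))
```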

Very few files processed by the aligner

Hi,

I need to process a huge audio database with the Montreal Forced Aligner. I have 184 actor folders, each containing hundreds of audio files with their corresponding .lab files.
However, I run into a problem when I process this database with the pretrained English model that ships with the tool:
bin/mfa_align /path/to/librispeech/dataset /path/to/librispeech/lexicon.txt english ~/Documents/aligned_librispeech
With this command, only 3 of my 184 folders are processed by the aligner and contain .TextGrid files. The other folders do not exist in ~/Documents/aligned_librispeech.

My audio database is of good quality, so I don't understand why the aligner drops most of the data.

Any help would be useful. Thanks.
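
To see exactly which files were dropped, the corpus and output folders can be compared. This sketch assumes .wav audio and that the output mirrors the corpus folder structure with a .TextGrid suffix, which matches the description above but is not confirmed for every MFA version.

```python
from pathlib import Path


def missing_textgrids(corpus_dir, output_dir):
    """List audio files that have no matching TextGrid in the output folder."""
    corpus, output = Path(corpus_dir), Path(output_dir)
    missing = []
    for wav in sorted(corpus.rglob("*.wav")):
        # Assumed layout: same relative path, with the suffix swapped.
        expected = output / wav.relative_to(corpus).with_suffix(".TextGrid")
        if not expected.exists():
            missing.append(wav)
    return missing
```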

"--clean" might be a bit dangerous, and a permissions issue.

Great software! I do have a nagging issue: if the "-c" argument is used and a user enters "/" or "/Documents" as the output location, is there any protection against erasing the whole user folder or the whole Documents folder?

Also, it seems that whether or not "--clean" is used, 'mfa_align' deletes the whole output folder and replaces it with an identically named folder that has different permissions from the one it replaces. This is causing me some headaches. Thanks!

FileNotFoundError: [Errno 2]

Hello,

I am currently trying to run v1.0.0 on Ubuntu, but I am getting the following error:

bin/mfa_align ~/Downloads/LibriSpeech ~/Documents/librispeech-lexicon.txt english ~/Documents/aligned_speech
Setting up corpus information...
Number of speakers in corpus: 40, average number of utterances per speaker: 65.5
Creating dictionary information...
Setting up training data...
Calculating MFCCs...
Traceback (most recent call last):
File "aligner/command_line/align.py", line 186, in <module>
File "aligner/command_line/align.py", line 144, in validate_args
File "aligner/command_line/align.py", line 139, in align_included_model
File "aligner/command_line/align.py", line 93, in align_corpus
File "aligner/aligner/pretrained.py", line 71, in __init__
File "aligner/aligner/pretrained.py", line 117, in setup
File "aligner/aligner/base.py", line 80, in setup
File "aligner/corpus.py", line 970, in initialize_corpus
File "aligner/corpus.py", line 848, in create_mfccs
File "aligner/corpus.py", line 859, in _combine_feats
FileNotFoundError: [Errno 2] No such file or directory: '/Documents/MFA/LibriSpeech/train/mfcc/raw_mfcc.0.scp'
Failed to execute script align

Any thoughts on that? Seems that it was an issue in previous versions. I have tried the solution proposed (remove the train directory), but still cannot manage to run it successfully.

Waiting for your response,
Constantinos

Detect and report on malformed pronunciation dictionary lines

Background: I have used MFA to successfully train two aligners on the same corpus, one using eng.dict and one using a novel dictionary I wrote by applying appropriate phonological rules to eng.dict.

I attempted to train a third model, invoking:

bin/mfa_train_and_align ./CORAAL/CORAAL_DC1 ./AAL1.dict ./CORAAL_Aligned3 -o coraal3.zip -f -v

It tells me it's setting up the corpus information, creates the output directory, then crashes with the error:

Traceback (most recent call last):
  File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/command_line/train_and_align.py", line 165, in <module>
  File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/command_line/train_and_align.py", line 53, in align_corpus
  File "/Users/mmcauliffe/dev/Montreal-Forced-Aligner/aligner/dictionary.py", line 115, in __init__
IndexError: list index out of range
Failed to execute script train_and_align

I have attempted to rewrite the dictionary multiple times, but cannot seem to fix the error. The aligner still works with the other novel dictionary...I'm not sure what the problem could be.
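
An IndexError while loading the dictionary would be consistent with a line that has fewer fields than expected, for example a word with no pronunciation. A pre-check along those lines is sketched below; it assumes the usual whitespace-separated "WORD phone1 phone2 ..." layout, and which lines MFA actually rejects is not confirmed here.

```python
def find_malformed_lines(dictionary_path, encoding="utf-8"):
    """Return (line_number, reason) pairs for suspicious dictionary lines."""
    problems = []
    with open(dictionary_path, encoding=encoding) as handle:
        for number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                problems.append((number, "blank line"))
            elif len(fields) < 2:
                problems.append((number, f"no pronunciation for {fields[0]!r}"))
    return problems
```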

MFA didn't generate all the textgrid file for wav

I have trained my own acoustic model and aligned some wav files, but strangely it only generates TextGrid files for some of them and ignores the others (all of the audio was recorded in the same environment). Does anyone have the same problem?

UnicodeDecodeError

When running MFA using the pretrained model (~/montreal-forced-aligner/pretrained_models/english.zip), I kept encountering a Unicode decode error, as shown below:
Setting up corpus information...
Traceback (most recent call last):
File "aligner/command_line/align.py", line 140, in <module>
File "aligner/command_line/align.py", line 61, in align_corpus
File "aligner/corpus.py", line 279, in __init__
File "aligner/helper.py", line 13, in load_text
File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/codecs.py", line 321, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc9 in position 190: invalid continuation byte
Failed to execute script align

Here is the command line that I was using, where I specified the pretrained model, the corpus directory with wav and lab files, and an output directory:
$ bin/mfa_align /Users/ziweizh/Documents/ISU_ALT/2017_Spring/ENGL_515/Final_Project/Tools/montreal-forced-aligner/pretrained_models/english.zip /Users/ziweizh/Documents/ISU_ALT/2017_Spring/ENGL_515/Final_Project/Data/UIUC/Corpus2_withUTF8 /Users/ziweizh/Documents/ISU_ALT/2017_Spring/ENGL_515/Final_Project/Tools/montreal-forced-aligner/output

Even though I tried converting the encoding of the lab files using a shell command: $ find . -name '*.lab' -exec iconv --verbose -t utf-8 {} > {} ;
MFA still threw this decode error.
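
One thing to note about the quoted find command is that the "> {}" redirect is interpreted once by the shell rather than once per file, so the .lab files are probably not being rewritten as intended. A per-file conversion can be sketched in Python instead. The latin-1 fallback is an assumption (byte 0xc9 would be "É" in that encoding); the true source encoding is not stated in the report.

```python
from pathlib import Path


def reencode_to_utf8(path, fallback_encoding="latin-1"):
    """Rewrite a text file as UTF-8 if it is not valid UTF-8 already.

    Returns True if the file was rewritten, False if it was left untouched.
    fallback_encoding is an assumed guess at the original encoding.
    """
    path = Path(path)
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        text = raw.decode(fallback_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True
```

It could be applied to each file returned by Path(corpus_dir).rglob("*.lab"), ideally on a copy of the corpus.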

Compilation issue

Hi, thanks for this great tool! I'm trying to install it on a ppc64 machine with Ubuntu 16.04.
It didn't work from the tar.gz archive for Linux, as I got a "wrong executable format" error for bin/mfa_train_and_align (which I guess is because I'm on ppc64). Therefore I'm trying to compile it from source (I already have Kaldi installed). However, when I run freezing/freeze.sh I get this error message:

Failure: objcopy: Unable to recognise the format of the input file /home/support/alignement/Montreal-Forced-Aligner-1.0.0/build/train_and_align/train_and_align'

Can you tell me which format these files are in, so I can specify it manually to objcopy?

Thanks

Support features other than MFCCs

Kaldi has several feature types available for use; these should be options in MFA:

  • MFCC + pitch features
  • LDA on MFCC features
  • PLP features

The current default is MFCCs + deltas.

All features should be benchmarked and the default should be the best performing.

Separate dictionary from acoustic models

  • Code separation for command line
  • Update model generation code (add meta file to archive, remove dictionary)
  • Add validation for dictionaries and acoustic models
  • Update documentation and tests

Meta file in archive would include phone set information, version of the aligner that it was built with, which model architecture (GMM-HMM vs NN of some kind in the future), what kind of features it uses (MFCC vs +pitch vs PLP vs LDA), etc.
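
As an illustration of the idea, the metadata could be stored as a small JSON file inside the model archive. Everything here is hypothetical: the file name meta.json, the field names and the helper functions are placeholders, not the format MFA adopted.

```python
import json
import zipfile


def write_model_archive(archive_path, meta, files=()):
    """Write a zip archive containing meta.json plus (name, data) file pairs."""
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("meta.json", json.dumps(meta, indent=2, sort_keys=True))
        for name, data in files:
            archive.writestr(name, data)


def read_model_meta(archive_path):
    """Read the metadata back without extracting the rest of the archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return json.loads(archive.read("meta.json"))


# Example metadata covering the fields listed above (values are made up).
EXAMPLE_META = {
    "phones": ["AA", "AE", "B"],
    "version": "1.0.0",
    "architecture": "gmm-hmm",
    "features": "mfcc",
}
```

Reading the metadata first would let the aligner validate a dictionary's phone set against the model before any expensive processing starts.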

Basic grapheme-to-phoneme utilities

Add basic utilities for creating pronunciation dictionaries from trained G2P models

  • Command line utility for generating pronunciation dictionary from corpus to be aligned
  • Command line utility for generating G2P model from pronunciation dictionary
  • Documentation and tests
  • Archive format for G2P models

Dictionary check before calculating MFCCs

Calculating MFCCs can take a long time for a large corpus, and it is likely that many users will walk away from the computer before the prompt asking about fixing OOVs, which pauses the alignment.
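
The check itself is cheap and could run before feature generation. Below is a sketch of finding out-of-vocabulary words in a transcript; the tokenization (lowercased letter sequences with internal apostrophes) is an assumption and may differ from how MFA normalizes text.

```python
import re

# Letter sequences, optionally joined by apostrophes (e.g. "it's").
_WORD = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def find_oovs(transcript_text, known_words):
    """Return the sorted words in the transcript missing from known_words.

    known_words: set of lowercased dictionary words.
    """
    tokens = _WORD.findall(transcript_text.lower())
    return sorted(set(tokens) - set(known_words))
```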

Start time of word later than end time: textgrid creation fails

It seems the forced aligner can assign a chunk a start time that is later than its end time. This subsequently yields an error and a crash in textgrid/textgrid.py, as the textgrid package does not allow a start time later than the end time when creating a TextGrid. To reproduce this, you can use the wav file http://www.let.rug.nl/accents/wav/amazigh1.wav and a lab file with the content: Please call Stella. Ask her to bring these things with her from the store: Six spoons of fresh snow peas, five thick slabs of blue cheese, and maybe a snack for her brother Bob. We also need a small plastic snake and a big toy frog for the kids. She can scoop these things into three red bags, and we will go meet her Wednesday at the train station.
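
A guard before TextGrid creation would at least turn the crash into a readable report. This is a sketch assuming intervals arrive as (start, end, label) tuples; the helper name is hypothetical.

```python
def find_inverted_intervals(intervals):
    """Return (index, interval) pairs whose start time is after the end time."""
    return [
        (index, interval)
        for index, interval in enumerate(intervals)
        if interval[0] > interval[1]
    ]
```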

Kaldi nnet3 support

Hi Michael,

How difficult do you think it would be to allow using "nnet2" or "nnet3" models for alignment purposes?
I'm interested in using your tool with my own models, but my most powerful models are nnet ones.

Thanks!
Miguel
p.s.: Nice work on this tool! Seems very cool!

Compiling training graphs for Lexique fails

Error output from compile-graphs.log:

compile-train-graphs --read-disambig-syms=/data/mmcauliffe/temp/MFA/FR/dictionary/phones/disambig.int /data/mmcauliffe/temp/MFA/FR/mono/tree /data/mmcauliffe/temp/MFA/FR/mono/0.mdl /data/mmcauliffe/temp/MFA/FR/dictionary/L.fst ark:- ark:- 
ASSERTION_FAILED (compile-train-graphs[5.1.77~1-8b9e8]:CompileGraphs():training-graph-compiler.cc:194) : 'phone2word_fst.Start() != kNoStateId && "Perhaps you have words missing in your lexicon?"' 

[ Stack-Trace: ]

kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::TrainingGraphCompiler::CompileGraphs(std::vector<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl<float> >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl<float> > > > > const*, std::allocator<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl<float> >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl<float> > > > > const*> > const&, std::vector<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl<float> >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl<float> > > > >*, std::allocator<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl<float> >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl<float> > > > >*> >*)
kaldi::TrainingGraphCompiler::CompileGraphsFromText(std::vector<std::vector<int, std::allocator<int> >, std::allocator<std::vector<int, std::allocator<int> > > > const&, std::vector<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl<float> >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl<float> > > > >*, std::allocator<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl<float> >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl<float> > > > >*> >*)
main
__libc_start_main
compile-train-graphs() [0x44d159]

Possible reasons:

  • Special symbols in Lexique (° §)
  • Numbers (5 8 9)

Compiling training graphs works with the ProsodyLab French dictionary (https://github.com/prosodylab/prosodylab.dictionaries/blob/master/fr.dict, based on Lexique), as it contains no numbers or special symbols.
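
To test the two suspected causes, the dictionary can be scanned for entries containing digits or the special symbols mentioned above. The character set in the pattern comes from the list of possible reasons; whether these characters are really what breaks graph compilation is unconfirmed.

```python
import re

# Digits and the special symbols named as possible culprits.
SUSPECT_CHARS = re.compile(r"[0-9°§]")


def find_suspect_entries(dictionary_path, encoding="utf-8"):
    """Return (line_number, line) pairs containing digits or special symbols."""
    suspects = []
    with open(dictionary_path, encoding=encoding) as handle:
        for number, line in enumerate(handle, start=1):
            if SUSPECT_CHARS.search(line):
                suspects.append((number, line.rstrip("\n")))
    return suspects
```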
