The comse6998-speech-recognition from fagan2888

comse6998-speech-recognition's Introduction

Requires Kaldi and the Montreal Forced Aligner.
Create these folders in the root directory: org_audio, segmented_audio, train-result.
Run bash pipeline-train.sh to train.
The trained model is zipped to model.zip in the root directory.
Run bash pipeline-test.sh to test.
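
The setup steps above could be scripted roughly as follows. This is a minimal sketch, not part of the repository: the helper names ensure_dirs and run_script are hypothetical, and it assumes bash is on the PATH and that the two pipeline scripts sit in the root directory.

```python
import subprocess
from pathlib import Path

# Folder names taken from the setup steps above.
REQUIRED_DIRS = ("org_audio", "segmented_audio", "train-result")


def ensure_dirs(root="."):
    """Create the required folders under root if they are missing."""
    root = Path(root)
    created = []
    for name in REQUIRED_DIRS:
        path = root / name
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


def run_script(script, root="."):
    """Run one of the repository's shell scripts with bash, from root.

    check=True raises CalledProcessError if the script exits non-zero.
    """
    subprocess.run(["bash", script], cwd=str(root), check=True)
```

Under those assumptions, calling ensure_dirs(".") followed by run_script("pipeline-train.sh") and then run_script("pipeline-test.sh") mirrors the manual steps.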

background music: piano music played by Tim Shevlyakov

country music: Mama Tried

Note: the lexicon dictionary comes from kaldi/egs/tedlium/s5_r3/data/local/lang_nosp/align_lexicon.txt

Note: the kaldi-scp/*.scp files come from kaldi/egs/tedlium/s5_r3/data/train, kaldi/egs/tedlium/s5_r3/data/test, and kaldi/egs/tedlium/s5_r3/data/dev

Note: the text files come from kaldi/egs/tedlium/s5_r3/data/train/text, kaldi/egs/tedlium/s5_r3/data/dev/text, and kaldi/egs/tedlium/s5_r3/data/test/text

Note: only partial data (under org_audio, segmented_audio, train-result and text) is uploaded. Run kaldi/egs/tedlium/s5_r3/run.sh to get the full dataset, then run prepare_convert_to_wav.py to convert the sph files to wav.
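
For readers without access to prepare_convert_to_wav.py, the sph-to-wav step can be approximated as below. This is a sketch under stated assumptions and not the repository's script: it assumes the sph2pipe tool (the one Kaldi installs under its tools directory) is on the PATH, and the function names and directory arguments are hypothetical.

```python
import subprocess
from pathlib import Path


def sph_to_wav_command(sph_path, wav_path):
    """Build an sph2pipe command that writes a WAV file."""
    # -f wav: write a RIFF/WAV header; -p: force 16-bit linear PCM output.
    return ["sph2pipe", "-f", "wav", "-p", str(sph_path), str(wav_path)]


def convert_tree(sph_dir, wav_dir, dry_run=False):
    """Convert every .sph file under sph_dir into wav_dir, keeping file stems.

    With dry_run=True the commands are only built and returned, not executed.
    """
    wav_dir = Path(wav_dir)
    wav_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for sph in sorted(Path(sph_dir).rglob("*.sph")):
        cmd = sph_to_wav_command(sph, wav_dir / (sph.stem + ".wav"))
        commands.append(cmd)
        if not dry_run:
            # Raises CalledProcessError if sph2pipe fails on a file.
            subprocess.run(cmd, check=True)
    return commands
```

Running convert_tree with dry_run=True first is a cheap way to check which files would be converted before invoking sph2pipe on the full dataset.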
