sunpengfei1122 / e2e-sincnet

This project forked from tparcollet/e2e-sincnet

E2E-SincNet: Toward fully end-to-end speech recognition

License: Apache License 2.0

Shell 46.95% Perl 4.77% Python 45.76% Makefile 0.18% Dockerfile 0.59% MATLAB 1.49% M 0.25%

This repository is a fork of ESPnet and contains the code for the paper "E2E-SincNet: Toward Fully End-to-End Speech Recognition", to be presented at ICASSP 2020. E2E-SincNet is only partially integrated into ESPnet because of major differences in the input data pipeline. E2E-SincNet will also be part of the SpeechBrain toolkit. The provided code makes it possible to reproduce the results reported in the paper.

  1. TIMIT recipe: Ready to be used.

Installation

This repository is an enhanced fork of ESPnet, so the installation procedure is the same as for ESPnet. This makes it easier to deploy SincNet in existing setups.

ASR datasets

The current version of E2E-SincNet provides an ASR recipe for the TIMIT dataset: a script named run_sincnet.sh is available in egs/timit/asr1 to reproduce the results reported in the paper "E2E-SincNet: Toward Fully End-to-End Speech Recognition". Please note that the steps described in this README can be transposed to any recipe of the ESPnet toolkit. For example, WSJ results can be reproduced by following the same steps and modifying the run.sh script.

Run the experiments

The proposed integration of SincNet into ESPnet relies on a bridge between the input feature preparation of PyTorch-Kaldi and the standard ESPnet recipes. This tutorial uses the TIMIT experiment as an example. Four steps are needed:

  1. Run the standard Kaldi TIMIT recipe up to DNN training (there is no need to go further). This creates all the files needed by the feature pre-processing script.
  2. Go to egs/timit/local and open convert_sph_to_wav_kaldiscp_timit.py. Modify all the paths in this script to match your setup, then run python convert_sph_to_wav_kaldiscp_timit.py. This script converts all the TIMIT .WAV files to the correct format for further processing. PLEASE NOTE THAT THIS SCRIPT DUPLICATES THE WAV FILES, SO YOU NEED WRITE ACCESS.
  3. Go to egs/timit/local and open save_raw_fea.py. Modify all the paths in this script to match your setup. More precisely, you need to modify and call this script three times (python save_raw_fea.py) to generate the train, dev, and test raw input features. This script comes from PyTorch-Kaldi and will soon be modified so that step 1 becomes unnecessary.
  4. You can now launch the ESPnet recipe (run_sincnet.sh)!
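
Steps 2 to 4 above can be sketched as a small dry-run helper. This is only an illustration under stated assumptions: the helper names (Step, build_plan, run_plan) are hypothetical and not part of the repository, the scripts are assumed to take no command-line arguments (paths are edited inside them, as described above), and run_sincnet.sh is assumed to be launchable with bash from egs/timit/asr1. Step 1 is left out because it runs in the Kaldi tree.

```python
import subprocess
from pathlib import Path
from typing import List, NamedTuple


class Step(NamedTuple):
    cwd: Path        # directory the command is run from
    argv: List[str]  # command and its arguments
    note: str        # manual action expected before running


def build_plan(repo_root: str) -> List[Step]:
    """Return steps 2-4 of the list above as an ordered plan (hypothetical helper)."""
    root = Path(repo_root)
    local = root / "egs" / "timit" / "local"
    asr1 = root / "egs" / "timit" / "asr1"
    plan = [Step(local, ["python", "convert_sph_to_wav_kaldiscp_timit.py"],
                 "edit the paths inside the script first")]
    # save_raw_fea.py is edited and called once per data split.
    for split in ("train", "dev", "test"):
        plan.append(Step(local, ["python", "save_raw_fea.py"],
                         f"edit the script for the {split} split first"))
    # Assumption: the recipe script can be started with bash.
    plan.append(Step(asr1, ["bash", "run_sincnet.sh"], "launch the ESPnet recipe"))
    return plan


def run_plan(plan: List[Step], dry_run: bool = True) -> List[str]:
    """Print each step; execute it only when dry_run is False."""
    lines = []
    for i, step in enumerate(plan, start=1):
        line = f"[{i}] (cd {step.cwd} && {' '.join(step.argv)})  # {step.note}"
        print(line)
        lines.append(line)
        if not dry_run:
            # Pause so the manual edit named in the note can be made.
            input("Press Enter once the manual edit is done...")
            subprocess.run(step.argv, cwd=step.cwd, check=True)
    return lines
```

Calling run_plan(build_plan(".")) from the repository root prints the five commands in order without executing anything. Passing dry_run=False pauses before each command so that the script can be edited first.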

All the configuration files are customizable in the same manner as ESPnet.
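
As an optional sanity check after step 2 of the list above, the container format of an audio file can be read from its first bytes. The original TIMIT .WAV files are NIST SPHERE files, whose header starts with NIST_1A, whereas a RIFF WAVE file starts with RIFF and carries WAVE at byte offset 8. The sketch below assumes that the converted files are RIFF WAVE, which is not confirmed by this README. The function names are hypothetical and not part of the repository.

```python
def classify_header(head: bytes) -> str:
    """Classify the first bytes of an audio file as 'sphere', 'riff-wave', or 'unknown'."""
    if head.startswith(b"NIST_1A"):
        return "sphere"      # NIST SPHERE, as shipped with TIMIT
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "riff-wave"   # standard RIFF WAVE container
    return "unknown"


def audio_container(path: str) -> str:
    """Read the first 12 bytes of a file and classify its container format."""
    with open(path, "rb") as f:
        return classify_header(f.read(12))
```

Running audio_container on a file from the original corpus and on its converted copy should give two different answers if the conversion changed the container.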

Contributors

kan-bayashi, sw005320, kamo-naoyuki, hirofumi0810, shigekikarita, fhrozen, gtache, r9y9, b-flo, bobchennan, simpleoier, jnishi, takenori-y, mn5k, potato-inoue, xiaofei-wang, creatorscan, masao-someki, sas91, lumaku, jzmo, yosukehiguchi, m-wiesner, tparcollet, emrys365, zh794390558, xmhzz2018, unnonouno, sknadig, akreal
