Speech Transcriber

A web-app/library for transcribing speech

Installation

  1. Install Python 3.9
  2. Install ffmpeg
    • Windows: Download the zip and add ffmpeg/bin to the PATH environment variable
    • Linux: apt-get install ffmpeg
  3. pip install -r requirements.txt
  4. (Optional) Download punctuator model and save as INTERSPEECH-T-BRNN.pcl
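
The steps above can be sanity-checked with a short script. This is a minimal sketch, not part of the project: the function name `check_environment` is hypothetical, and it assumes the punctuator model is saved in the current working directory, which the steps above do not specify.

```python
import shutil
import sys
from pathlib import Path


def check_environment(model_path="INTERSPEECH-T-BRNN.pcl"):
    """Report which installation steps appear to be complete.

    model_path is an assumption: the README does not say where the
    punctuator model should be saved, so the current directory is used.
    """
    return {
        # Step 1: the interpreter running this script is Python 3.9.
        "python_3_9": sys.version_info[:2] == (3, 9),
        # Step 2: an ffmpeg executable can be found on PATH.
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        # Step 4 (optional): the punctuator model file is present.
        "punctuator_model": Path(model_path).is_file(),
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A missing `punctuator_model` entry is not an error, since that step is optional.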

Usage

Web app

Flask is required for the web app; install it first with pip install flask.

Then run python app.py to start the web app, which is served at http://localhost:5000/

CLI

python main.py --path filename --transcriber transcriber

  • --path: Path to the audio/video file to transcribe
  • --transcriber: Transcription model to use; choose from:
    • cmu_sphinx
    • librispeech
    • silero
    • vosk
    • wav2vec2
    • wav2vec2_commonvoice
    • whisper
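
For illustration, a command-line interface with these two options could be declared with argparse as below. This is a sketch under assumptions, not the project's actual main.py: the names `TRANSCRIBERS`, `build_parser`, and `parse_args` are hypothetical, and treating both options as required is a guess.

```python
import argparse

# Transcriber identifiers copied from the list above.
TRANSCRIBERS = [
    "cmu_sphinx",
    "librispeech",
    "silero",
    "vosk",
    "wav2vec2",
    "wav2vec2_commonvoice",
    "whisper",
]


def build_parser():
    """Build a parser that accepts --path and --transcriber."""
    parser = argparse.ArgumentParser(
        description="Transcribe speech from an audio/video file"
    )
    # Assumption: both options are mandatory.
    parser.add_argument(
        "--path", required=True, help="audio/video file to transcribe"
    )
    parser.add_argument(
        "--transcriber",
        required=True,
        choices=TRANSCRIBERS,  # argparse rejects any other value
        help="transcription model to use",
    )
    return parser


def parse_args(argv=None):
    """Parse argv (defaults to sys.argv[1:]) into a namespace."""
    return build_parser().parse_args(argv)
```

With this sketch, `parse_args(["--path", "speech.wav", "--transcriber", "vosk"])` returns a namespace whose `path` and `transcriber` attributes hold the two values, while an unlisted transcriber name makes argparse exit with a usage error.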

Transcription models

Transcription models were selected according to the following requirements:

  1. Must be supported in Python 3.9
  2. Must work locally (without calling an external API)
  3. Must have a straightforward installation process
    • Should not require building from source
    • Should not require additional OS libraries
    • Should not require manually downloading additional files

Below is a comparison of transcription model performance, produced using the LibriSpeech test-clean dataset and analysis script:

Name                 | Dependencies                                        | Model size | Average processing time | Score
-------------------- | --------------------------------------------------- | ---------- | ----------------------- | -----
Wav2Vec2 CommonVoice | speechbrain                                         | 1.18GB     | 3.351s                  | 0.87
Librispeech          | torch, transformers, torchaudio, librosa            | 113MB      | 0.558s                  | 0.85
Wav2Vec2             | torch, transformers, torchaudio, librosa            | 360MB      | 1.325s                  | 0.8
Whisper              | whisper                                             | 138MB      | 3.848s                  | 0.77
Vosk                 | vosk                                                | 67.7MB     | 1.206s                  | 0.76
Silero               | torch, transformers, torchaudio, librosa, omegaconf | 111MB      | 0.261s                  | 0.68
CMU Sphinx           | SpeechRecognition, pocketsphinx                     | 33.9MB*    | 1.123s                  | 0.55

*size of pocketsphinx package
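
The comparison can also be used to pick a model under a size or latency budget. The sketch below is illustrative only: `ModelResult`, `RESULTS`, and `best_model` are hypothetical names, the figures are copied from the comparison above, 1.18GB is taken as 1180MB, and it assumes that a higher score is better, which the comparison does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelResult:
    name: str           # CLI identifier of the transcriber
    size_mb: float      # model size in MB
    avg_seconds: float  # average processing time in seconds
    score: float        # assumption: higher is better


# Figures copied from the comparison above (1.18GB taken as 1180MB).
RESULTS = [
    ModelResult("wav2vec2_commonvoice", 1180.0, 3.351, 0.87),
    ModelResult("librispeech", 113.0, 0.558, 0.85),
    ModelResult("wav2vec2", 360.0, 1.325, 0.80),
    ModelResult("whisper", 138.0, 3.848, 0.77),
    ModelResult("vosk", 67.7, 1.206, 0.76),
    ModelResult("silero", 111.0, 0.261, 0.68),
    ModelResult("cmu_sphinx", 33.9, 1.123, 0.55),  # size of pocketsphinx package
]


def best_model(
    max_size_mb: Optional[float] = None,
    max_seconds: Optional[float] = None,
) -> Optional[ModelResult]:
    """Return the highest-scoring model within the given limits, or None."""
    candidates = [
        r
        for r in RESULTS
        if (max_size_mb is None or r.size_mb <= max_size_mb)
        and (max_seconds is None or r.avg_seconds <= max_seconds)
    ]
    return max(candidates, key=lambda r: r.score, default=None)
```

For example, `best_model(max_seconds=1.0)` selects `librispeech`, the highest-scoring entry whose average processing time is under one second.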
