parthiv11 / indicwhisper-jax


Optimizing speech-to-text transcription for citizen feedback calls in Hindi, English, and Hinglish using a converted Indic Whisper model on JAX for lightning-fast processing on TPUs

License: MIT License

Jupyter Notebook 100.00%

IndicWhisper With JAX (70x faster)

What is IndicWhisper?

IndicWhisper is a speech recognition model fine-tuned specifically for Indian languages. It outperforms other publicly available models on several benchmarks and transcribes speech in Indian languages for tasks such as voice commands and audio-file transcription.

What is IndicWhisper-JAX?

IndicWhisper-JAX is a version of IndicWhisper converted to run on JAX, a library for high-performance numerical computing. The conversion targets speed, particularly on TPUs and GPUs, which makes the model better suited to real-time transcription while keeping the accuracy of the original model.

Overview

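IndicWhisper is evaluated by Word Error Rate (WER, lower is better) on benchmarks for Indian languages. On the Hindi subset of the Vistaar benchmark shown below, it has the lowest average WER of the models compared (13.6) and the lowest WER on five of the seven test sets; Azure STT is slightly ahead on CommonVoice and Nvidia-large on MUCS.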
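For readers unfamiliar with the metric, the following is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` is illustrative and not part of this repository, and benchmark scoring usually normalizes text (punctuation, casing, numerals) first, which this sketch skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution (b -> x) and one deletion (d) over four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

The benchmark table reports this ratio as a percentage, so a value of 0.136 here would correspond to 13.6 in the table.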

Performance on Vistaar Benchmark (Hindi Subset)

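| Model         | Kathbath | Kathbath-Hard | FLEURS | CommonVoice | IndicTTS | MUCS | Gramvaani | Average |
|---------------|----------|---------------|--------|-------------|----------|------|-----------|---------|
| Google STT    | 14.3     | 16.7          | 19.4   | 20.8        | 18.3     | 17.8 | 59.9      | 23.9    |
| IndicWav2vec  | 12.2     | 16.2          | 18.3   | 20.2        | 15       | 22.9 | 42.1      | 21      |
| Azure STT     | 13.6     | 15.1          | 24.3   | 14.6        | 15.2     | 15.1 | 42.3      | 20      |
| Nvidia-medium | 14       | 15.6          | 19.4   | 20.4        | 12.3     | 12.4 | 41.3      | 19.4    |
| Nvidia-large  | 12.7     | 14.2          | 15.7   | 21.2        | 12.2     | 11.8 | 42.6      | 18.6    |
| IndicWhisper  | 10.3     | 12.0          | 11.4   | 15.0        | 7.6      | 12   | 26.8      | 13.6    |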
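
As a quick sanity check on the table above, the sketch below recomputes the Average column for the IndicWhisper row. It assumes the column is an unweighted mean over the seven test sets, which the table does not state explicitly but which matches this row after rounding. It also derives the relative reduction in average WER against the next-best row (Nvidia-large); the variable names are illustrative.

```python
from statistics import mean

# WER (%) per test set, copied from the IndicWhisper row of the table above.
indicwhisper_wer = {
    "Kathbath": 10.3,
    "Kathbath-Hard": 12.0,
    "FLEURS": 11.4,
    "CommonVoice": 15.0,
    "IndicTTS": 7.6,
    "MUCS": 12.0,
    "Gramvaani": 26.8,
}
reported_average = 13.6      # Average column, IndicWhisper row
nvidia_large_average = 18.6  # Average column, Nvidia-large row

# Assumption: the Average column is an unweighted mean across test sets.
macro_average = mean(indicwhisper_wer.values())

# Relative reduction in average WER versus the next-best model in the table.
relative_reduction = (nvidia_large_average - reported_average) / nvidia_large_average

print(f"recomputed average WER: {macro_average:.1f}")
print(f"relative reduction vs Nvidia-large: {relative_reduction:.1%}")
```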
  • For a quick start, try IndicWhisper-JAX on Kaggle.

Why and How to Use it?

IndicWhisper-JAX offers several advantages over traditional speech recognition models:

  1. Enhanced Performance: With JAX optimization, IndicWhisper-JAX achieves remarkable speed improvements, enabling real-time transcription of speech in Indian languages.

  2. Ease of Use: Integrating IndicWhisper-JAX into your projects is seamless. With pre-trained checkpoints and straightforward API usage, you can start transcribing audio files with minimal setup.

To use IndicWhisper-JAX in your projects, simply install the necessary dependencies and load the model checkpoint using the provided API. With its superior speed and accuracy, IndicWhisper-JAX empowers developers, researchers, and government agencies to leverage the power of speech recognition in Indian languages for various applications.
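
The install-and-load flow described above might look like the sketch below. It rests on assumptions the README does not confirm: that the converted checkpoint loads through the open-source `whisper-jax` package (whose pipeline class really is spelled `FlaxWhisperPipline`), and that you pass in the Hugging Face id of an IndicWhisper-JAX checkpoint. The helper name `transcribe` and the placeholder id in the usage comment are hypothetical.

```python
def transcribe(audio_path: str, checkpoint: str) -> str:
    """Transcribe one audio file with a Whisper-style JAX checkpoint.

    Assumes `whisper-jax` and JAX are installed and that `checkpoint` names
    a Flax Whisper checkpoint; neither is confirmed by this README.
    """
    # Imported lazily so the module can be imported without JAX installed.
    import jax.numpy as jnp
    from whisper_jax import FlaxWhisperPipline  # spelling as in whisper-jax

    # bfloat16 is a common choice on TPUs; the default dtype is float32.
    pipeline = FlaxWhisperPipline(checkpoint, dtype=jnp.bfloat16)

    # The first call JIT-compiles the model, so it is much slower than
    # subsequent calls on the same pipeline object.
    outputs = pipeline(audio_path, task="transcribe")
    return outputs["text"]


# Hypothetical usage; replace the placeholder with a real checkpoint id:
# text = transcribe("feedback_call.wav", "<hub-user>/<indicwhisper-jax-checkpoint>")
# print(text)
```

When transcribing many files, construct the pipeline once and reuse it rather than calling this helper per file, since each new pipeline repeats the compilation cost.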

Model Hosting

Hugging Face Indic Whisper JAX

The IndicWhisper-JAX models are hosted on Hugging Face's model hub.

Feel free to explore and utilize these models for your speech recognition tasks.

Acknowledgements

We extend our sincere gratitude to the following individuals and organizations for their contributions and support:

  • EkStep Foundation for their generous grant, which facilitated the establishment of the Centre for AI4Bharat at IIT Madras.
  • The Ministry of Electronics and Information Technology (NLTM) for its grant to support the creation of datasets and models for Indian languages under the Bhashini project.
  • The Centre for Development of Advanced Computing, India (C-DAC), for providing access to the Param Siddhi supercomputer for training our models.
  • Microsoft for its grant to create datasets, tools, and resources for Indian languages.
  • Contributors: Kaushal Bhogale, Sai Narayan Sundaresan, Abhigyan Raman, Tahir Javed, Mitesh Khapra, Pratyush Kumar.

Contributing

We welcome contributions from the community to further improve IndicWhisper. If you have any ideas, bug fixes, or enhancements, please feel free to submit a pull request.

Thank you for your interest in IndicWhisper! We hope it proves to be a valuable tool for your speech recognition needs in Indian languages.
