pythainlp / pythaiasr

Python Thai Automatic Speech Recognition

License: Apache License 2.0

Python 97.27% Dockerfile 2.73%
thai-language thai-nlp asr automatic-speech-recognition hacktoberfest hacktoberfest2022

pythaiasr's Introduction

PyThaiASR

Python Thai Automatic Speech Recognition

PyThaiASR is a Python package for automatic speech recognition with a focus on the Thai language. It provides an offline Thai automatic speech recognition model.

License: Apache-2.0 License

Google Colab: Link Google colab

Model homepage: https://huggingface.co/airesearch/wav2vec2-large-xlsr-53-th

Install

pip install pythaiasr

For Wav2Vec2 with a language model: if you want to use a wannaphong/wav2vec2-large-xlsr-53-th-cv8-* model with the language model, you need to install the extra dependencies with the following steps.

pip install pythaiasr[lm]
pip install https://github.com/kpu/kenlm/archive/refs/heads/master.zip

Usage

from pythaiasr import asr

file = "a.wav"
print(asr(file))

API

asr(data: str, model: str = _model_name, lm: bool=False, device: str=None, sampling_rate: int=16_000)
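  • data: path to a sound file, or a numpy array of the audio
  • model: name of the ASR model (see "Options for model")
  • lm: use a language model (not available for the airesearch/wav2vec2-large-xlsr-53-th model)
  • device: device to run the model on (for example "cpu" or "cuda")
  • sampling_rate: sample rate of the audio in Hz (default 16,000)
  • return: Thai text recognized by the ASR model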

Options for model

  • airesearch/wav2vec2-large-xlsr-53-th (default) - AI RESEARCH - PyThaiNLP model
  • wannaphong/wav2vec2-large-xlsr-53-th-cv8-newmm - Thai Wav2Vec2 with CommonVoice V8 (newmm tokenizer)
  • wannaphong/wav2vec2-large-xlsr-53-th-cv8-deepcut - Thai Wav2Vec2 with CommonVoice V8 (deepcut tokenizer)
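
As a minimal sketch of how the model and lm options fit together, the helper below builds keyword arguments for asr and rejects the one combination the parameter list rules out: lm=True with the default airesearch model. The names MODELS, asr_kwargs, and transcribe are hypothetical and not part of pythaiasr; only the model names and the asr parameters come from this README.

```python
# Hypothetical helper: only the model names and asr() parameters come from the README.
DEFAULT_MODEL = "airesearch/wav2vec2-large-xlsr-53-th"
MODELS = (
    DEFAULT_MODEL,
    "wannaphong/wav2vec2-large-xlsr-53-th-cv8-newmm",
    "wannaphong/wav2vec2-large-xlsr-53-th-cv8-deepcut",
)


def asr_kwargs(model=DEFAULT_MODEL, lm=False, sampling_rate=16_000):
    """Return keyword arguments for asr(), checking the documented options."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if lm and model == DEFAULT_MODEL:
        # The README states that lm is not available for the airesearch model.
        raise ValueError("lm=True is not supported with " + DEFAULT_MODEL)
    return {"model": model, "lm": lm, "sampling_rate": sampling_rate}


def transcribe(path, **options):
    """Transcribe a sound file after validating the options."""
    # Imported lazily so that asr_kwargs() can be used without pythaiasr installed.
    from pythaiasr import asr

    return asr(path, **asr_kwargs(**options))
```

Calling transcribe("a.wav", model="wannaphong/wav2vec2-large-xlsr-53-th-cv8-newmm", lm=True) would additionally need the pythaiasr[lm] extra and KenLM from the Install section.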

You can read about models from the list:

Docker

To use this inside of Docker do the following:

docker build -t <Your Tag name> .
docker run --entrypoint /bin/bash -it <Your Tag name>

You will then get access to an interactive shell environment where you can use Python with all packages installed.

pythaiasr's People

Contributors

cstorm125, finnkr, wannaphong

pythaiasr's Issues

Out of memory

I tested with a 1-minute WAV file, but it ran out of GPU memory.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 138.00 MiB (GPU 0; 5.93 GiB total capacity; 4.99 GiB already allocated; 126.19 MiB free; 5.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

How could I run with a larger file?
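
One possible workaround, sketched here under the assumption that memory use grows with input length, is to split the audio into shorter chunks and transcribe them one at a time. This is not a documented pythaiasr feature: split_into_chunks and transcribe_in_chunks are hypothetical helpers, and the transcribe argument stands in for a call such as lambda chunk: asr(chunk, sampling_rate=16_000) on a numpy array.

```python
def split_into_chunks(samples, sampling_rate=16_000, chunk_seconds=10.0):
    """Yield consecutive slices of `samples`, each at most `chunk_seconds` long."""
    chunk_size = int(sampling_rate * chunk_seconds)
    if chunk_size <= 0:
        raise ValueError("chunk_seconds must cover at least one sample")
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def transcribe_in_chunks(samples, transcribe, sampling_rate=16_000,
                         chunk_seconds=10.0, separator=" "):
    """Transcribe each chunk separately and join the non-empty results.

    `transcribe` is any callable that maps one chunk of samples to text.
    """
    texts = [
        transcribe(chunk)
        for chunk in split_into_chunks(samples, sampling_rate, chunk_seconds)
    ]
    return separator.join(text for text in texts if text)
```

The slicing works on lists and numpy arrays alike. Fixed-length cuts can split a word in the middle, so accuracy near chunk boundaries may suffer; cutting at silences would be a refinement.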

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Hello,

I always get this error when I run print(asr(file)). Any help with that? I'm running it in Colab.
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
296 _single(0), self.dilation, self.groups)
297 return F.conv1d(input, weight, bias, self.stride,
--> 298 self.padding, self.dilation, self.groups)
299
300 def forward(self, input: Tensor) -> Tensor:

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

SystemError: google/protobuf/pyext/descriptor.cc:358: bad argument to internal function

Great day to you and thanks a lot for your contribution and determination on this project.

In Colab, I have code

%%capture
!pip install pythaiasr
!pip -q install pydub

import IPython
from pythaiasr import asr
file = "/content/voice data/helloWeeHee.wav"
IPython.display.Audio(file)

#pythaiasr
asr(file, "airesearch/wav2vec2-large-xlsr-53-th")
asr(file, "wannaphong/wav2vec2-large-xlsr-53-th-cv8-newmm")

However, I hit an error. I tried some solutions from Stack Overflow, such as importing tensorflow and running !pip install pythaiasr[lm], but I still get:

SystemError: google/protobuf/pyext/descriptor.cc:358: bad argument to internal function

It may involve tensorflow, since I got the same output when importing tensorflow after pythaiasr.

Thank you.

live audio feed

Hi :)

Is there a way to pipe audio from a soundcard directly into the asr?
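
One way this might be approached, as a sketch only: have a recording tool write raw audio to a pipe and feed fixed-length windows to asr. The code below assumes 16-bit signed little-endian mono PCM on the input stream; pcm16_to_floats and windows_from_stream are hypothetical helpers, not part of pythaiasr, and whether asr accepts float samples in this range is an assumption.

```python
import array
import sys


def pcm16_to_floats(raw):
    """Convert raw 16-bit signed PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # the input is assumed to be little-endian
    return [s / 32768.0 for s in samples]


def windows_from_stream(stream, sampling_rate=16_000, window_seconds=5.0):
    """Yield lists of float samples, one window at a time, until the stream ends."""
    bytes_per_window = int(sampling_rate * window_seconds) * 2  # 2 bytes per sample
    while True:
        raw = stream.read(bytes_per_window)
        if len(raw) < 2:
            break  # end of stream, or a dangling single byte
        # Drop a trailing odd byte so the buffer holds whole samples only.
        yield pcm16_to_floats(raw[: len(raw) - len(raw) % 2])
```

With a recorder writing to standard output, for example arecord -f S16_LE -r 16000 -c 1 -t raw on Linux, the script could iterate over windows_from_stream(sys.stdin.buffer) and pass each window, converted to a numpy array, to asr.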

Add dockerfile

PyThaiASR needs a Dockerfile so that the project can be run in Docker.
