anwarvic / arabic-speech-recognition

This repository contains my attempt to use two well-known speech recognition frameworks (Kaldi and CMU Sphinx4) for the Arabic language, using the publicly available dataset "Arabic Corpus of Isolated Words".

Python 37.29% Shell 51.70% Perl 11.01%
speech-recognition cmusphinx cmu-sphinx kaldi-asr asr kaldi automatic-speech-recognition arabic arabic-language arabic-nlp

arabic-speech-recognition's Introduction

Arabic Speech Recognition

In this repository, I'm going to create an Automatic Speech Recognition (ASR) model for the Arabic language using two of the best-known free ASR frameworks:

  • Kaldi: One of the most widely used open-source ASR toolkits.
  • CMU-Sphinx: The famous framework by Carnegie Mellon University.

The repository has just two folders, "Kaldi" and "Sphinx". The Kaldi directory contains my Arabic ASR model built with Kaldi, and the Sphinx directory contains my Arabic ASR model built with CMU Sphinx4. Inside each directory there is a README.md that explains how to download, install and use the framework.

Download Dataset

The dataset is a small open-source corpus called the "Arabic Corpus of Isolated Words", made by the University of Stirling, located in the Central Belt of Scotland. It can be downloaded from the official website. The "Arabic speech corpus for isolated words" contains 9992 utterances (about 10,000) of 20 words spoken by 50 native male Arabic speakers. It was recorded at a 44100 Hz sampling rate with 16-bit resolution and is distributed as .wav files. The corpus is free for noncommercial use.
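To confirm that an extracted file matches the recording format stated above, the WAV header can be read with Python's standard `wave` module. This is only a minimal sketch: the helper names `wav_format` and `matches_corpus_format` are made up for illustration, and it assumes the files are plain PCM WAV that `wave` can open.

```python
import wave


def wav_format(path):
    """Return (sample_rate_hz, bits_per_sample, channels) read from a WAV header."""
    with wave.open(str(path), "rb") as wav:
        # getsampwidth() is in bytes, so multiply by 8 to get the bit depth.
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def matches_corpus_format(path, rate=44100, bits=16):
    """Check a file against the 44100 Hz / 16-bit format the corpus description states."""
    sample_rate, bit_depth, _channels = wav_format(path)
    return sample_rate == rate and bit_depth == bits
```

Running `matches_corpus_format` over every file before feature extraction is a cheap way to catch a truncated download or a file that was resampled by accident.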

After downloading and extracting the dataset, you will find 50 folders named "S" + speaker ID, like so: S01, S02, ... S50. Each folder contains around 200 audio files, and each file holds one speaker saying just one word. The file names carry information that we need. For example, the file "S01.02.03.wav" was recorded by the speaker whose ID is "01", on the second repetition ("02"), saying the word "03", which is "اثنان" (two). Each speaker has around 200 wav files: 20 different words, each said 10 times. The words are:

d = {
        "01": "صِفْرْ", 
        "02":"وَاحِدْ",
        "03":"إِثنَانِْ",
        "04":"ثَلَاثَةْ",
        "05":"أَربَعَةْ",
        "06":"خَمْسَةْ",
        "07":"سِتَّةْ",
        "08":"سَبْعَةْ",
        "09":"ثَمَانِيَةْ",
        "10":"تِسْعَةْ",
        "11":"التَّنْشِيطْ",
        "12":"التَّحْوِيلْ",
        "13":"الرَّصِيدْ",
        "14":"التَّسْدِيدْ",
        "15":"نَعَمْ",
        "16":"لَا",
        "17":"التَّمْوِيلْ",
        "18":"الْبَيَانَاتْ",
        "19":"الْحِسَابْ",
        "20":"إِنْهَاءْ"
        }
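The naming scheme described above can be decoded with a few lines of Python. The sketch below is not taken from the repository's scripts; `Utterance` and `parse_wav_name` are hypothetical names, and the field order (speaker, repetition, word) is assumed from the "S01.02.03.wav" example.

```python
from pathlib import Path
from typing import NamedTuple


class Utterance(NamedTuple):
    speaker_id: int   # 1..50, taken from the "S01" part
    repetition: int   # 1..10, the middle field
    word_id: str      # "01".."20", kept as a string so it can index the dictionary


def parse_wav_name(filename: str) -> Utterance:
    """Split a name such as 'S01.02.03.wav' into speaker, repetition and word."""
    stem = Path(filename).stem  # drops only the final ".wav"
    parts = stem.split(".")
    if len(parts) != 3 or not parts[0].startswith("S"):
        raise ValueError(f"unexpected file name: {filename!r}")
    speaker, repetition, word = parts
    return Utterance(int(speaker[1:]), int(repetition), word)
```

For the example file, `parse_wav_name("S01.02.03.wav")` gives speaker 1, repetition 2 and word ID "03", and `d[utt.word_id]` then looks up the spoken word in the dictionary above.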

arabic-speech-recognition's Issues

nnet models

This is the author by the way!!

In this repo, I created 8 models. All of them are flavors of the GMM/HMM model. What about deep learning?

CAN'T RUN

sudo python3.9 data_preparation.py
[sudo] password for osboxes:
Sorry, try again.
[sudo] password for osboxes:
Copying Train dataset: 0it [00:00, ?it/s]
Copying Train dataset: 0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/osboxes/apiai/kaldi/egs/arabic_corpus_of_isolated_words/data_preparation.py", line 355, in <module>
    obj.prepare_data()
  File "/home/osboxes/apiai/kaldi/egs/arabic_corpus_of_isolated_words/data_preparation.py", line 319, in prepare_data
    self.__create_spk2gender(self.TEST_DIR)
  File "/home/osboxes/apiai/kaldi/egs/arabic_corpus_of_isolated_words/data_preparation.py", line 120, in __create_spk2gender
    with open(os.path.join(group_dir, "spk2gender"), "w") as fout:
FileNotFoundError: [Errno 2] No such file or directory: '/home/osboxes/apiai/kaldi/egs/arabic_corpus_of_isolated_words/data/test/spk2gender'
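
The traceback ends in `open(..., "w")` failing, which most likely means the parent directory `data/test` did not exist when the file was written. The two `0it` progress lines also suggest that no audio files were found to copy, although the log alone cannot confirm why. A minimal sketch of a more defensive write step is shown below. It is an assumption about a possible fix, not the repository's actual `__create_spk2gender`; the helper name `write_spk2gender` and its parameters are hypothetical. The output follows Kaldi's `spk2gender` layout of one `<speaker-id> <gender>` pair per line, with `m` as the default because the corpus has male speakers only.

```python
import os


def write_spk2gender(group_dir, speaker_ids, gender="m"):
    """Write a Kaldi-style spk2gender file, creating group_dir if it is missing."""
    # Creating the directory first avoids the FileNotFoundError seen in the traceback.
    os.makedirs(group_dir, exist_ok=True)
    out_path = os.path.join(group_dir, "spk2gender")
    with open(out_path, "w") as fout:
        # Kaldi expects these files sorted by speaker ID.
        for speaker_id in sorted(speaker_ids):
            fout.write(f"{speaker_id} {gender}\n")
    return out_path
```

If the directory is missing because the dataset path is wrong, creating it only hides the symptom, so it is worth checking that the S01 ... S50 folders are where the script expects them before rerunning.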
