This repo will contain a list of useful resources for Mongolian NLP. Feel free to contribute.
DATASET
LJSpeech-like male voice TTS dataset created from the Mongolian Bible
- used in tugstugi/pytorch-dc-tts
- use dl_and_preprop_dataset.py to download the audio files
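Since the dataset is described as LJSpeech-like, a loader can be sketched around the original LJSpeech layout, where a `metadata.csv` file holds one clip per line as `id|text|normalized text`. Whether this dataset uses exactly that layout is an assumption, and the function name and sample rows below are made up for illustration:

```python
import csv
import io


def read_ljspeech_metadata(fileobj):
    """Yield (clip_id, text, normalized_text) from LJSpeech-style metadata.

    Assumes pipe-separated fields; this layout is an assumption for this dataset.
    """
    reader = csv.reader(fileobj, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        clip_id, text = row[0], row[1]
        # Fall back to the raw text when no normalized column is present.
        normalized = row[2] if len(row) > 2 else text
        yield clip_id, text, normalized


# Made-up sample rows standing in for a real metadata file.
sample = io.StringIO(
    "clip_0001|Сайн байна уу|Сайн байна уу\n"
    "clip_0002|Баярлалаа\n"
)
rows = list(read_ljspeech_metadata(sample))
```

With a real download, the same function would take `open(path, encoding="utf-8", newline="")` in place of the in-memory sample.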
DATASET
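Eduge news classification dataset
- used to train the Eduge.mn production news classifier
- 75K news articles in 9 categories:
  - урлаг соёл (arts and culture)
  - эдийн засаг (economy)
  - эрүүл мэнд (health)
  - хууль (law)
  - улс төр (politics)
  - спорт (sports)
  - технологи (technology)
  - боловсрол (education)
  - байгал орчин (environment)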
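For training a classifier on these nine categories, the labels are usually mapped to integer ids first. A minimal sketch, assuming the label strings in the dataset match the names listed above; the id order and the helper name `encode_labels` are arbitrary choices made here, not part of the dataset:

```python
# Category names as listed above; the integer order is an arbitrary assumption.
EDUGE_CATEGORIES = [
    "урлаг соёл",
    "эдийн засаг",
    "эрүүл мэнд",
    "хууль",
    "улс төр",
    "спорт",
    "технологи",
    "боловсрол",
    "байгал орчин",
]

LABEL_TO_ID = {label: i for i, label in enumerate(EDUGE_CATEGORIES)}
ID_TO_LABEL = {i: label for label, i in LABEL_TO_ID.items()}


def encode_labels(labels):
    """Map category names to integer ids, ignoring surrounding whitespace."""
    return [LABEL_TO_ID[label.strip()] for label in labels]


encoded = encode_labels(["спорт", " хууль "])
```

An unknown label raises `KeyError`, which makes mismatches between this list and the actual data visible early.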
DATASET
11-11.mn government agency complaint dataset
- 80K entries in 5 categories:
  - санал хүсэлт (suggestions and requests)
  - гомдол (complaint)
  - шүүмжлэл (criticism)
  - талархал (appreciation)
  - өргөдөл (petition)
DATASET
online news corpus
- 700 million words
- opendata.burtgel.gov.mn
DATASET
220K Mongolian personal names
DATASET
90K Mongolian clan/family names
DATASET
192K Mongolian company names
PYTORCH
tugstugi/pytorch-dc-tts
DEMO
Colab online demo
DATASET
LJSpeech-like male voice dataset created from the Mongolian Bible
TF
tugstugi/Tacotron-2, a fork of Rayhane-mamah/Tacotron-2 adapted for the Mongolian Bible dataset
DEMO
Colab online demo
DEMO
Speaker adaptation Colab online demo for the former Mongolian president Elbegdorj. The Tacotron model trained on the 5-hour Mongolian Bible dataset was fine-tuned on a 10-minute dataset created from one of Elbegdorj's speeches.
DEMO
HMM TTS online demo of the Mongolian National University
- 1x male and 2x female voices
DEMO
Yet another TTS online demo, possibly HMM-based, from “Мон Спийч Ай Ти” ХХК (Mon Speech IT LLC)
- 1x male and 1x female voice
DEMO
TTS online demo of the Inner Mongolian university, possibly Tacotron2-based
- 1x female voice
PRODUCT
NVDA/HTS screen reader developed by the Innovation Development Center for the Blind
- 1x female voice (Mongolian National University voice)
PYTORCH
tugstugi/mongolian-speech-recognition
- single voice demo
DEMO
Cyrillic to Mongolian script converter demo of the Inner Mongolian university
DEMO
Mongolian script OCR demo of the Inner Mongolian university
PYTORCH
tugstugi/bichig2cyrillic, a converter from Mongolian script to Cyrillic and back
PYTORCH
Mongolian script OCR (to be released)
PYTORCH
tugstugi/forced_aligner, a Mongolian forced alignment tool using Rayhane-mamah/Tacotron-2 and readbeyond/aeneas
DEMO
Colab online demo