grintaking19 / meshakkelaty.ai

A neural and statistical engine for accurately adding diacritics (Tashkeel) to Arabic text. First-place winner on Kaggle 🥇

Python 0.07% Jupyter Notebook 99.93%

meshakkelaty.ai's Introduction

AI Arabic Diacritization Engine - مِشَكِّلاتي.ai

Overview

Welcome to مِشَكِّلاتي.ai, an Arabic text diacritization (Tashkeel) engine built with neural and statistical techniques. The project aims to accurately predict and add diacritics to Arabic text, improving readability and understanding. The مِشَكِّلاتي.ai model achieved first place on Kaggle 🥇

Dual-Model Architecture

The مِشَكِّلاتي.ai diacritization system employs a dual-model architecture that consists of:

A neural stacked bidirectional Long Short-Term Memory (BiLSTM) model that captures sequential dependencies and context information within the Arabic text, inspired by this research paper, but on steroids!
A statistical post-processing model that operates on the output of the neural model to further refine the diacritization results, inspired by this research paper.
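
The README does not say how the statistical post-processing stage works. As a rough illustration of the general idea only, the sketch below assumes one simple form it could take: a lookup table built from diacritized training words that overrides the neural prediction when a bare word almost always appears with the same diacritized form. The function names and the `min_count` and `min_share` thresholds are hypothetical and are not taken from the repository.

```python
from collections import Counter, defaultdict

# Arabic diacritic marks (tashkeel), U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = frozenset(chr(c) for c in range(0x064B, 0x0653))


def strip_diacritics(word):
    """Return the word with all tashkeel marks removed."""
    return "".join(ch for ch in word if ch not in DIACRITICS)


def build_lookup(diacritized_words, min_count=2, min_share=0.9):
    """Map each bare word to its dominant diacritized form (hypothetical helper).

    A bare word enters the table only if its most frequent form was seen
    at least `min_count` times and accounts for at least `min_share` of
    all occurrences of that bare word.
    """
    counts = defaultdict(Counter)
    for word in diacritized_words:
        counts[strip_diacritics(word)][word] += 1

    lookup = {}
    for bare, forms in counts.items():
        best, n = forms.most_common(1)[0]
        if n >= min_count and n / sum(forms.values()) >= min_share:
            lookup[bare] = best
    return lookup


def post_process(neural_output_words, lookup):
    """Replace a neural prediction with the table entry when one exists."""
    return [lookup.get(strip_diacritics(w), w) for w in neural_output_words]
```

In this sketch, words that are ambiguous in the training data are left to the neural model, since they fail the `min_share` test and never enter the table.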

Usage

To use مِشَكِّلاتي.ai, follow these steps:

Clone the repository:
- git clone https://github.com/Omar-Al-Sharif/Meshakkelaty.ai.git
Install the necessary dependencies:
- pip install -r Meshakkelaty.ai/requirements.txt
Acquire your data and place it in the data directory under the names train.txt and val.txt.
Change to the scripts directory:
- cd Meshakkelaty.ai/scripts
Prepare your data by running:
- python tokenize_dataset.py
Train the neural model on your data:
- python train_neural_model.py
Train the statistical model on your data:
- python train_statistical_model.py
Put your input text inside:
- ../data/test_input.txt
Diacritize the input text by running:
- python predict.py
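
The steps above do not describe the data format the scripts expect. To show what diacritized Arabic text looks like at the character level, here is a minimal sketch that pairs each letter with the marks that follow it and recovers the undiacritized text. It is an illustration under assumptions and not a description of what tokenize_dataset.py or predict.py do. Both function names are hypothetical.

```python
# Arabic diacritic marks (tashkeel), U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = frozenset(chr(c) for c in range(0x064B, 0x0653))


def split_letters_and_marks(text):
    """Return a list of (character, marks) pairs.

    Each non-diacritic character starts a new pair; diacritics that
    follow it are appended to that pair's marks string. A mark with no
    preceding character is dropped.
    """
    pairs = []
    for ch in text:
        if ch not in DIACRITICS:
            pairs.append((ch, ""))
        elif pairs:
            letter, marks = pairs[-1]
            pairs[-1] = (letter, marks + ch)
    return pairs


def strip_diacritics(text):
    """Return the text with all tashkeel marks removed."""
    return "".join(letter for letter, _ in split_letters_and_marks(text))
```

A diacritization model can be framed as predicting the marks string for each letter, so pairs like these are one common way to derive inputs and labels from diacritized text.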
