quran-tajweed's Introduction

NOTE: This project is not actively maintained

Please consider using the Quran.com API instead.

quran-tajweed

Tajweed annotations for the Qur'an (riwayat hafs). The data is available as a JSON file with exact character indices for each rule, and as individual decision trees for each rule.

You can use this data to display the Qur'an with tajweed highlighting, refine models for Qur'anic speech recognition, or - if you enjoy decision trees - improve your own recitation.

The following tajweed rules are supported:

  • Ghunnah (ghunnah)
  • Idghaam...
    • With Ghunnah (idghaam_ghunnah)
    • Without Ghunnah (idghaam_no_ghunnah)
    • Mutajaanisain (idghaam_mutajaanisain)
    • Mutaqaaribain (idghaam_mutaqaaribain)
    • Shafawi (idghaam_shafawi)
  • Ikhfa...
    • Ikhfa (ikhfa)
    • Ikhfa Shafawi (ikhfa_shafawi)
  • Iqlab (iqlab)
  • Madd...
    • Regular: 2 harakat (madd_2)
    • al-Aarid/al-Leen: 2, 4, 6 harakat (madd_246)
    • al-Muttasil: 4, 5 harakat (madd_muttasil)
    • al-Munfasil: 4, 5 harakat (madd_munfasil)
    • Laazim: 6 harakat (madd_6)
  • Qalqalah (qalqalah)
  • Hamzat al-Wasl (hamzat_wasl)
  • Lam al-Shamsiyyah (lam_shamsiyyah)
  • Silent (silent)

This project was built using information from ReciteQuran.com, the Dar al-Maarifah tajweed masaahif, and others.

Using the tajweed JSON file

All the data you probably need is in output/tajweed.hafs.uthmani-pause-sajdah.json. It has the following schema:

[
    {
        "surah": 1,
        "ayah": 1,
        "annotations": [
            {
                "rule": "madd_6",
                "start": 245,
                "end": 247
            },
            ...
        ]
    },
    ...
]

The start and end indices of each annotation refer to the Unicode codepoint (not byte!) offset within the Tanzil.net Uthmani Qur'an text. Note that the encoding of the files available from Tanzil.net has changed slightly since the annotations were generated, so please use this copy of the Qur'an text file: quran-uthmani.txt (downloaded ca. Apr 6, 2017). If you use a different Qur'an text file, you must rebuild the data file from scratch (at your own risk); refer to the next section.
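A minimal sketch of applying these offsets in Python, where `str` slicing already counts codepoints. The sample entry, the rule name `example_rule`, and the placeholder text are made up for illustration. The sketch also assumes that offsets are relative to the ayah's own text and that `end` is exclusive; neither is stated explicitly in the schema above.

```python
import json

# Hypothetical data in the schema shown above; real entries come from
# output/tajweed.hafs.uthmani-pause-sajdah.json.
SAMPLE_JSON = """
[
  {"surah": 1, "ayah": 1,
   "annotations": [{"rule": "example_rule", "start": 2, "end": 4}]}
]
"""

def index_by_ayah(entries):
    """Map (surah, ayah) -> list of annotations."""
    return {(e["surah"], e["ayah"]): e["annotations"] for e in entries}

def annotated_spans(text, annotations):
    """Return (rule, substring) pairs.

    Assumes offsets count codepoints from the start of `text` and that
    `end` is exclusive. Python str slicing is codepoint-based, so no
    byte arithmetic is needed.
    """
    return [(a["rule"], text[a["start"]:a["end"]]) for a in annotations]

entries = json.loads(SAMPLE_JSON)
by_ayah = index_by_ayah(entries)

# Placeholder text (not a real ayah): alef, lam, meem, dal.
text = "\u0627\u0644\u0645\u062f"
spans = annotated_spans(text, by_ayah[(1, 1)])

# The byte offset of codepoint 2 differs, since each of these letters
# takes 2 bytes in UTF-8.
byte_start = len(text[:2].encode("utf-8"))
```

Looking entries up by `(surah, ayah)` rather than by list position avoids off-by-one mistakes when counting ayahs across surahs.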

This data file is licensed under a Creative Commons Attribution 4.0 International License, while the original Tanzil.net text file linked above is made available under the Tanzil.net terms of use.

Using the decision trees

tajweed_classifier.py is a script that takes Tanzil.net "Text (with aya numbers)"-style input via STDIN, and produces the tajweed JSON file (as described above) via STDOUT. It reads the decision trees from rule_trees/*.json. Note that the trees have been built to function best with the Madani text; they rely on the presence of pronunciation markers (e.g. maddah) that may not be present in other texts.
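To inspect the input before handing it to the classifier, a small parser can help. This is only a sketch: it assumes the "Text (with aya numbers)" export has one `surah|ayah|text` line per ayah and that lines starting with `#` are comments. Both are assumptions about the Tanzil.net format, and `parse_tanzil_lines` is a hypothetical helper, not part of this project.

```python
def parse_tanzil_lines(lines):
    """Yield (surah, ayah, text) from 'surah|ayah|text' lines.

    Assumed format; blank lines and lines starting with '#' are skipped.
    """
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        # Split on the first two pipes only, leaving the text intact.
        surah, ayah, text = line.split("|", 2)
        yield int(surah), int(ayah), text

# Placeholder lines, not real ayah text.
sample = [
    "1|1|\u0627\u0644\u0645\n",
    "1|2|\u0645\u062f\n",
    "\n",
    "# hypothetical trailing comment\n",
]
parsed = list(parse_tanzil_lines(sample))
```

The classifier itself reads the raw file on STDIN, for example `python tajweed_classifier.py < quran-uthmani.txt > tajweed.json`; the interpreter name and the output filename here are assumptions.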

Ruleset reference

The following are renderings of the decision trees used to determine where each tajweed annotation starts and stops. Attributes are grouped by the letters they belong to, a letter being defined as a base character (e.g. ل) plus any diacritics that follow (codepoints in the Mn category). Superscript/dagger alif is counted as a base character. The numbers prefixing each attribute indicate which letter the attribute belongs to: negative referring to previous letters, positive to future letters. Attributes starting with 0_... refer to the exact character being considered. Annotations do not always start or stop on letter boundaries. Refer to tajweed_classifier.py for the definition of each attribute.
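The letter grouping described above can be sketched roughly as follows. This is not the code from tajweed_classifier.py, which remains the authoritative definition. How spaces and a leading diacritic are handled here is an assumption, and `group_letters` is a hypothetical name.

```python
import unicodedata

# U+0670 has general category Mn, but is treated as a base character.
SUPERSCRIPT_ALIF = "\u0670"

def group_letters(text):
    """Split text into letters: a base character plus the Mn marks after it."""
    letters = []
    for ch in text:
        is_mark = unicodedata.category(ch) == "Mn" and ch != SUPERSCRIPT_ALIF
        if is_mark and letters:
            letters[-1] += ch   # diacritic attaches to the preceding base
        else:
            letters.append(ch)  # base character starts a new letter
    return letters

# dhal + fatha, superscript alif, lam + kasra, kaf + fatha
word = "\u0630\u064e\u0670\u0644\u0650\u0643\u064e"
letters = group_letters(word)
```

With letters grouped this way, an attribute prefixed `-1_` would describe `letters[i - 1]` and one prefixed `1_` would describe `letters[i + 1]`, relative to the letter at index `i` that contains the character being considered.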

ghunnah

Start: [image: ghunnah start decision tree]
End: [image: ghunnah end decision tree]

hamzat_wasl

Start: [image: hamzat_wasl start decision tree]
End: [image: hamzat_wasl end decision tree]

idghaam_ghunnah

Start: [image: idghaam_ghunnah start decision tree]
End: [image: idghaam_ghunnah end decision tree]

idghaam_mutajanisayn

Start: [image: idghaam_mutajanisayn start decision tree]
End: [image: idghaam_mutajanisayn end decision tree]

idghaam_mutaqaribayn

Start: [image: idghaam_mutaqaribayn start decision tree]
End: [image: idghaam_mutaqaribayn end decision tree]

idghaam_no_ghunnah

Start: [image: idghaam_no_ghunnah start decision tree]
End: [image: idghaam_no_ghunnah end decision tree]

idghaam_shafawi

Start: [image: idghaam_shafawi start decision tree]
End: [image: idghaam_shafawi end decision tree]

ikhfa

Start: [image: ikhfa start decision tree]
End: [image: ikhfa end decision tree]

ikhfa_shafawi

Start: [image: ikhfa_shafawi start decision tree]
End: [image: ikhfa_shafawi end decision tree]

iqlab

Start: [image: iqlab start decision tree]
End: [image: iqlab end decision tree]

lam_shamsiyyah

Start: [image: lam_shamsiyyah start decision tree]
End: [image: lam_shamsiyyah end decision tree]

madd_2

Start: [image: madd_2 start decision tree]
End: [image: madd_2 end decision tree]

madd_246

Start: [image: madd_246 start decision tree]
End: [image: madd_246 end decision tree]

madd_6

Start: [image: madd_6 start decision tree]
End: [image: madd_6 end decision tree]

madd_munfasil

Start: [image: madd_munfasil start decision tree]
End: [image: madd_munfasil end decision tree]

madd_muttasil

Start: [image: madd_muttasil start decision tree]
End: [image: madd_muttasil end decision tree]

qalqalah

Start: [image: qalqalah start decision tree]
End: [image: qalqalah end decision tree]

silent

Start: [image: silent start decision tree]
End: [image: silent end decision tree]

quran-tajweed's People

Contributors

cpfair

quran-tajweed's Issues

Start and end meaning in the data file

"end": 167,
"rule": "ghunnah",
"start": 164

Can anyone tell me what the numbers "167" and "164" mean here?
Are they timings for the rule? For example, does the ghunnah in the ayah audio start at 164 ms and end at 167 ms, giving a duration of 3 ms?

Hello, I need to run this code on another text version

Hello,
I ran this code on the same text from tanzil.net (with aya numbers) and got empty annotations for every ayah.
Do I need to make any changes?
Here is a sample of the output:
{
"surah": 2,
"ayah": 13,
"annotations": []
},
{
"surah": 2,
"ayah": 14,
"annotations": []
},

JavaScript usage, result differs from usual versions

Thanks for this phenomenal work.

I'm using JavaScript to parse the rules and apply CSS classes to colorize the verses.
However, I seem to be missing something: apart from the colour scheme, which obviously varies between editions, the result I get is quite different from anything I can find elsewhere.

This is what I get (using the original Uthmani JSON file available in this repository):

[screenshot: 2020-07-06_15h18_29]

And this is what one PDF version I found outputs:

[screenshot: 2020-07-06_15h19_05]

The code I'm using to parse the rules is the following:

const createTajweedClasses = (classesArray, rules, index) => {
  const ruleStart = parseInt(rules[index].start);
  const ruleEnd = parseInt(rules[index].end);
  const ruleType = rules[index].type;
  const limit = ruleEnd - 1;
  for (let i = ruleStart; i <= limit; i += 1) {
    classesArray[i] = styles[ruleType];
  }
};
...

const classesArray = [];
tajweedRules.forEach((rule, index) => createTajweedClasses(classesArray, tajweedRules, index));

// It will fill classesArray with [ { Char Position } => Class Name ] elements

Then I have a set of CSS rules to colorize everything:

.idghaam_shafawi {
  color: rgba(190, 0, 0, 1);
}

.idghaam_mutajaanisain {
  color: rgba(190, 0, 0, 1);
}

.idghaam_mutaqaaribain {
  color: rgba(190, 0, 0, 1);
}

.idghaam_ghunnah {
  color: rgba(190, 0, 0, 1);
}

.idghaam_no_ghunnah {
  color: rgba(150, 60, 60, 1);
}

.ghunnah {
  color: rgba(0, 180, 0, 1);
}

.qalqalah {
  color: rgba(8, 80, 170, 1);
}

.iqlab {
  color: rgba(140, 50, 170, 1);
}

.ikhfa {
  color: rgba(50, 180, 160, 1);
}

/* Ikhfa mim saakin */
.ikhfa_shafawi {
  color: rgba(50, 180, 160, 1);
}

.madd_6 {
  color: rgba(200, 0, 0, 1);
}

.madd_246 {
  color: rgba(255, 180, 60, 1);
}

.madd_2 {
  color: rgba(50, 180, 160, 1);
}

.madd_muttasil {
  color: rgba(255, 0, 0, 1);
}

.madd_munfasil {
  color: rgba(0, 221, 147, 1);
}

.hamzat_wasl {
  color: rgba(150, 252, 0, 1);
}

.lam_shamsiyyah {
  color: rgba(175, 0, 221, 1);
}

.silent {
  color: rgb(180, 180, 180);
}

And for each verse, I split all the characters this way to apply the rules (simplified version of the code):

ayah.split('').map((char, index) => applyClassOnChar(classesArray[index]));

This outputs a list of `span` elements with the appropriate class.

It would be great if you could share your thoughts on what is going wrong!
@cpfair @Ysajid
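For comparison, here is a rough Python sketch of the same idea: build a per-character class list from the annotations, then wrap each run of same-rule characters in one `span`. It assumes that `start`/`end` are codepoint offsets into the ayah text with `end` exclusive, and it reads the rule name from the `rule` key, which is the key named in the JSON schema shown earlier. `render_ayah` is a hypothetical helper, and the demo offsets are made up.

```python
from html import escape
from itertools import groupby

def render_ayah(text, annotations):
    """Return HTML with each run of same-rule characters in one <span>.

    If ranges overlap, the later annotation wins (an arbitrary choice
    made for this sketch).
    """
    classes = [None] * len(text)
    for ann in annotations:
        for i in range(ann["start"], min(ann["end"], len(text))):
            classes[i] = ann["rule"]

    parts = []
    # Group consecutive characters that share the same class.
    for cls, run in groupby(zip(classes, text), key=lambda pair: pair[0]):
        chunk = escape("".join(ch for _, ch in run))
        parts.append(f'<span class="{cls}">{chunk}</span>' if cls else chunk)
    return "".join(parts)

# Made-up offsets on a placeholder string, for illustration only.
demo = render_ayah("abcdef", [{"rule": "qalqalah", "start": 1, "end": 3}])
```

Grouping runs keeps the number of elements small compared with one `span` per character.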

Get unexpected result

I followed the instructions but got the wrong result.

Here is my code to extract the data:

const tajweedRules = require("./tajweed.hafs.uthmani-pause-sajdah.json");

const rules = tajweedRules[8].annotations; // 2:2

const text = "ذَٰلِكَ ٱلْكِتَٰبُ لَا رَيْبَ ۛ فِيهِ ۛ هُدًى لِّلْمُتَّقِينَ";

rules.forEach(rule => {
  const subText = text.substring(rule.start, rule.end);
  console.log(rule.start + " - " + rule.end + ": " + rule.rule);
  console.log(subText);
});

And here is the result:

[screenshot: Result]

Am I doing something wrong?

Thanks in advance

Python script returning empty results

The Python script tajweed_classifier.py returns empty results when run on the latest version of the Uthmani Qur'an text exported from tanzil.net, i.e. the output contains all surah and verse numbers without any annotations:

[
  {
    "annotations": [],
    "ayah": 1,
    "surah": 1
  },
  {
    "annotations": [],
    "ayah": 2,
    "surah": 1
  },
  {
    "annotations": [],
    "ayah": 3,
    "surah": 1
  },
  ...
]

Additionally, the Python scripts have no license (which automatically means that they're under a source-only license), making it impossible for us to reuse the scripts. Please consider adding a license.
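The README notes that the Tanzil.net encoding changed after the annotations were generated and that the trees rely on pronunciation markers. Whether that explains the empty output reported above is not confirmed. One way to look for such differences is to compare which codepoints each text file uses. The sketch below uses placeholder strings, and `codepoint_diff` is a hypothetical helper.

```python
import unicodedata

def codepoint_diff(reference_text, other_text):
    """Return the codepoints used by only one of the two texts.

    Whitespace is ignored. Each result is a sorted list of
    (code, Unicode name) pairs.
    """
    ref = {c for c in reference_text if not c.isspace()}
    other = {c for c in other_text if not c.isspace()}

    def describe(chars):
        return sorted(
            (f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>")) for c in chars
        )

    return describe(ref - other), describe(other - ref)

# Placeholder strings; in practice these would be the contents of the
# bundled quran-uthmani.txt and of the newer download.
only_ref, only_other = codepoint_diff(
    "\u0627\u0653\u0644",   # alef, maddah above, lam
    "\u0627\u0644\u06e1",   # alef, lam, U+06E1
)
```

A marker that appears in only one of the two lists would be a reasonable place to start investigating.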

Pause marks and sajdah signs

The repository description says:
Make sure to download the version with pause marks and sajdah signs, but without rub-el-hizb signs or me_quran tanween shapes. If you use different options or a different text entirely, you must rebuild the data file from scratch (at your own risk) - refer to the next section.
It looks like, when downloading the Qur'anic text from Tanzil with the checkboxes for pause marks and sajdah signs selected, the downloaded file doesn't include these symbols.

I guess this also happened when you were generating tajweed.hafs.uthmani-pause-sajdah.json. It works well with the version without pause marks, but positions are shifted after pause marks.

I have already reported the problem of the missing signs to tanzil.net.

[screenshot: 2018-01-05 at 12:24:53 am]
