dxcore35 / massive_pali-dictionary


Biggest Pali dictionary

License: GNU General Public License v3.0

Topics: pali, pali-tripitaka, dhamma, translation, dictionary, transcription, devanagari, burmese, thai-language, scholars

PALI dictionary (correction tool)

Proposal for the building of the tool for:

  • collection of all existing Pali language dictionaries
  • rating and correction of the translations by Pali scholars or monastics
  • producing dictionary files in various formats as output

NEED HELP:

  • Anybody willing to help with data cleaning (scripting experts, regex experts, database experts)
  • Anybody with the skills to build a serious database and a web UI on top of it
  • Any additional ideas, feedback, or new Pali dictionary datasets

DONE:

  • Building the database structure (only MS Access solution)
  • Building of standard tables (partly done)
  • Collection of all Pali dictionaries (partly done)

TODO:

  • Standardizing of the data
  • Building the UI
  • Correction and rating of the translations, mainly by Pali scholars or monastics
  • Building the functionality to output the data
  • Conversion of data to various formats (StarDict, Kindle, CSV, Apple Dictionary, ...)

1. Building the database structure:

Here is my proposal for the database structure. I started the project as a simple MS Access database, with queries executing VB scripts (mainly for transcribing words into different languages). But I soon realized that the project is too big to be built in MS Access. As I have no experience building databases any other way, work on the database stops here. This is the database structure from MS Access:

[Image: database structure diagram from MS Access]

2. Building of standard tables

Tables for diacritic transcription

    • Pali diacritic <> Slovak / Czech diacritic
    • Pali diacritic <> Burmese diacritic: Incomplete
    • Pali diacritic <> Thai diacritic: Incomplete
    • Pali diacritic <> Devanagari diacritic: Incomplete

Tables for cleaning / standardising of the text

    • Pali diacritic <> old ASCII Pali diacritic
    • Pali diacritic <> Pali without special characters
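
A minimal Python sketch of the "Pali without special characters" direction, assuming the text is Unicode and that dropping combining marks is an acceptable simplification. The function name `strip_diacritics` is hypothetical and not part of the MS Access database. The mapping is one-way and lossy, so the reverse direction would still need a lookup table.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with combining marks removed, e.g. for a plain search key."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode category Mn) and keep everything else.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(strip_diacritics("Tipiṭaka Pāḷi"))  # prints "Tipitaka Pali"
```

A key produced this way could sit next to the original spelling, so that a search for "pali" also finds "Pāḷi".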

Tables for phonetic transcription

Tables for grammar information

    • Indeclinable table: Incomplete
    • Part of speech: Incomplete

Tables for additional research and experiments

    • Frequency of all Pali words across the whole Tipitaka: to be checked
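
A word-frequency table like this could be rebuilt or checked with a short Python sketch. It assumes romanized Pali text as input and a deliberately naive tokenization (any run of letters counts as a word). The function name `word_frequencies` and the sample sentence are illustrative only.

```python
import re
import unicodedata
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count lowercase word forms in romanized Pali text."""
    # NFC keeps letters such as "ā" as single code points, so \w matches whole words.
    normalized = unicodedata.normalize("NFC", text).lower()
    return Counter(re.findall(r"\w+", normalized))


sample = "Namo tassa bhagavato arahato sammā sambuddhassa. Namo tassa."
freq = word_frequencies(sample)
print(freq.most_common(2))
```

This counts inflected forms separately. Grouping them under one headword would need the grammar tables listed above.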

3. Collection of all Pali dictionaries

So far I have collected the dictionaries listed below. Most of them are not standardized:

Czech / Slovak

  • Buddhist Dictionary (Buddhist Dictionary by NYANATILOKA MAHATHERA)

English

  • Pali > English Buddhist Dictionary (Buddhist Dictionary by NYANATILOKA MAHATHERA)
  • Pali <> English Concise Dictionary (Concise Pali-English Dictionary by A.P. Buddhadatta Mahathera)
  • English <> Pali Concise Dictionary (Concise Pali-English Dictionary by A.P. Buddhadatta Mahathera)
  • Pali > English PTS Dictionary (PTS Pali-English dictionary The Pali Text Society's Pali-English dictionary)
  • Pali > English Proper Names Dictionary (Buddhist Dictionary of Pali Proper Names by G P Malalasekera)
  • Pali > English Dictionary from VRI (Pali-Dictionary Vipassana Research Institute)
  • Pali > English GPBD_Glossary of Pali and Buddhist terms (unknown source, probably accesstoinsight.org)
  • Pali > English MettaNet-Lanka Dictionary (MettaNet-Lanka Dictionary)
  • Pali > English Buddhist Dictionary (Manual of Buddhist Terms and Doctrines)

Special:

  • Pali > English Buddhas India map (Details about the place with location and picture)
  • Pali > Tibetan, Sanskrit, English (PDOB_The Princeton dictionary of Buddhism)
  • Sanskrit > Pali, Tibetan, English (Buddhist Hybrid Sanskrit Dictionary)

Myanmar

  • Pali Roots Dictionary (Pali Roots Dictionary ဓါတ္အဘိဓာန္)
  • Tipiṭaka Pāḷi-Myanmar Dictionary (Tipiṭaka Pāḷi-Myanmar Dictionary တိပိဋက-ပါဠိျမန္မာ အဘိဓာန္)
  • Pali Myanmar Dictionary (Pali Word Grammar from Pali Myanmar Dictionary)
  • U Hau Sein’s Pāḷi-Myanmar Dictionary (U Hau Sein’s Pāḷi-Myanmar Dictionary ပါဠိျမန္မာ အဘိဓာန္(ဦးဟုတ္စိန္))

Chinese

  • パーリ语辞典 (パーリ语辞典 增补改订 日本水野弘元)
  • パーリ语辞典 (パーリ语辞典 日本水野弘元)
  • パーリ语辞典-勘误表 (水野弘元-巴利语辞典-勘误表 Bhikkhu Santagavesaka 覓寂尊者)
  • 巴利语入门 (巴利语入门释性恩(Dhammajīvī))
  • 巴利语字汇 (四念住课程开示集要巴利语字汇(葛印卡))
  • 巴利语汇解 (巴利语汇解&巴利新音译 玛欣德尊者)
  • 巴汉佛学辞汇 (巴利文-汉文佛学名相辞汇 翻译:张文明)
  • 巴汉词典 (巴汉词典Mahāñāṇo Bhikkhu编著)
  • 巴汉词典 (巴汉词典明法尊者增订)
  • 巴英术语汇编 (巴英术语汇编 法的医疗附 温宗堃)
  • 汉译パーリ语辞典 (汉译パーリ语辞典 黃秉榮譯)
  • 汉译パーリ语辞典 (汉译パーリ语辞典 李瑩譯)
  • Chinese > English (unknown source, probably an extract from CKJ Buddhism Dictionary)

Vietnamese

  • Pali Viet Abhi- Terms (Pali Viet Abhidhamma Terms)
  • Pali Viet Dictionary (Pali Viet Dictionary Bản dịch của ngài Bửu Chơn.)
  • Pali Viet Vinaya Terms (Pali Viet Vinaya Terms Từ điển các thuật ngữ về luật do tỳ khưu Giác Nguyên sưu tầm.)

4. Standardizing of the data

Almost all of the datasets are plain translations in JSON or CSV format, and some are in HTML. Almost all of the data needs to be standardized and cleaned.

For example, this is the entry for the word abbhantara from one of the dictionaries:

| Word | Translation |
| --- | --- |
| abbhantara | [nt.] the inside; interior. (adj.), inner; internal."" |

But this should be normalized into 4 columns and 4 rows, one row per translation:

| Word | Translation | Gender | Type |
| --- | --- | --- | --- |
| abbhantara | the inside | neuter | subject |
| abbhantara | interior | neuter | subject |
| abbhantara | inner | | adjective |
| abbhantara | internal | | adjective |

There is a lot of work to do here. But my assumption is that somebody familiar with scripting, regular expressions and JSON can do it quite quickly.
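
The normalization of the abbhantara entry could be sketched in Python as follows. This is only an illustration under stated assumptions: the raw string is reconstructed from the normalized rows above, the `MARKERS` mapping and the function name `normalize_entry` are hypothetical, and only the two abbreviations from this one entry are covered. Real dictionaries use many more abbreviations and inconsistent punctuation.

```python
import re

# Hypothetical mapping from a grammar abbreviation to (gender, type).
# "subject" follows the label used in the normalized table above.
MARKERS = {
    "nt.": ("neuter", "subject"),
    "adj.": ("", "adjective"),
}

# Matches markers written as [nt.] or (adj.) and captures the abbreviation.
MARKER_RE = re.compile(r"[\[(]([a-z]+\.)[\])]")


def normalize_entry(word, raw):
    """Split one raw translation string into (word, translation, gender, type) rows."""
    rows = []
    # With one capture group, split() returns
    # [text_before_first_marker, marker1, text1, marker2, text2, ...].
    parts = MARKER_RE.split(raw)
    for marker, text in zip(parts[1::2], parts[2::2]):
        gender, kind = MARKERS.get(marker, ("", "unknown"))
        for gloss in text.split(";"):
            gloss = gloss.strip(' .,"')  # remove stray punctuation and quotes
            if gloss:
                rows.append((word, gloss, gender, kind))
    return rows


RAW = '[nt.] the inside; interior. (adj.), inner; internal.""'
for row in normalize_entry("abbhantara", RAW):
    print(row)
```

Unknown markers are kept with the type "unknown" instead of being dropped, so they can be reviewed by hand later.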

FAR FUTURE

  • Building the UI
  • Correction and rating of the translations, mainly by Pali scholars or monastics
  • Building the functionality to output the data
  • Conversion of data to various formats (StarDict, Kindle, CSV, Apple Dictionary, ...)
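
Of the output formats listed, CSV is the simplest. A minimal sketch, assuming the data is already normalized into (word, translation, gender, type) rows as in section 4. The function name `rows_to_csv` is hypothetical. StarDict, Kindle and Apple Dictionary each need their own tooling and are not covered here.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize normalized rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Word", "Translation", "Gender", "Type"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    ("abbhantara", "the inside", "neuter", "subject"),
    ("abbhantara", "inner", "", "adjective"),
]
print(rows_to_csv(rows))
```

The `csv` module quotes any translation that contains a comma or a quote character, which hand-built string joins would get wrong.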
