Arabycia

An Arabic NLP tool built on NLTK, Pyaramorph, and Sinai-corpus. It supports:

  • Tokenization
  • Lemmatization
  • Segmentation
  • Transliteration
  • Reverse Transliteration
  • Sentence diacritization
  • Text Search
  • POS tagging
  • Translation
  • Ambiguity detection
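
The transliterated forms in the sample output further down (for example yasotaEiyd and fiy) follow the Buckwalter scheme, which maps each Arabic character to a single ASCII character. The following is a minimal, standalone sketch of transliteration and reverse transliteration under that scheme. It is not Arabycia's own code; the names to_buckwalter and from_buckwalter are hypothetical and chosen for illustration only.

```python
# Buckwalter transliteration: one ASCII character per Arabic code point.
# Each entry is (first code point, ASCII characters for the consecutive code points).
# Illustrative sketch only; this is not taken from Arabycia's source.
_RANGES = [
    (0x0621, "'|>&<}AbptvjHxd*rzs$SDTZEg"),  # hamza through ghain
    (0x0640, "_fqklmnhwYyFNKaui~o"),         # tatweel, feh through yeh, then diacritics
    (0x0670, "`{"),                          # dagger alif, alif wasla
]

AR_TO_BW = {
    chr(start + offset): ascii_char
    for start, ascii_chars in _RANGES
    for offset, ascii_char in enumerate(ascii_chars)
}
BW_TO_AR = {ascii_char: arabic for arabic, ascii_char in AR_TO_BW.items()}


def to_buckwalter(text: str) -> str:
    """Transliterate Arabic script to Buckwalter; other characters pass through."""
    return "".join(AR_TO_BW.get(ch, ch) for ch in text)


def from_buckwalter(text: str) -> str:
    """Reverse transliteration: Buckwalter ASCII back to Arabic script."""
    return "".join(BW_TO_AR.get(ch, ch) for ch in text)


print(to_buckwalter("مدينة"))  # mdynp
```

Because the mapping is one-to-one, reverse transliteration is just the inverted dictionary. Note that from_buckwalter treats every mapped ASCII character as Buckwalter, so it should only be given transliterated input, not ordinary Latin text.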

Usage

Input

text = 'يستعيد الكاتب في هذه الرواية كيف تحولت من مدينة للانوار الي مدينة للاشباح'
arabycia = Arabycia()
arabycia.set_raw_text(text)  # load the sentence to analyze
arabycia.analyze()           # run the analysis; its report is shown under Output

Output

Sentence :
يستعيد الكاتب في هذه الرواية كيف تحولت من مدينة للانوار الي مدينة للاشباح
With Diacritics :
يَسْتَعِيد الكاتِب فِي هٰذِهِ الرِوايَة كَيْفَ تَحَوَّلْتُ مِن مَدِينَة لِلأَنْوار إِلَى مَدِينَة لِلأَشْباح
POS :
sotaEiyd/VERB_IMPERFECT kAtib/NOUN fiy/PREP h`*ihi/DEM_PRON_F riwAy/NOUN kayofa/REL_PRON taHaw~al/VERB_PERFECT min/PREP madiyn/NOUN >anowAr/NOUN <ilaY/PREP madiyn/NOUN >a$obAH/NOUN

Word  : 	يَسْتَعِيد	yasotaEiyd	{isotaEAd_1 
POS   : 	ya/IV3MS+	sotaEiyd/VERB_IMPERFECT 
Gloss : 	recover;regain;reclaim

Word  : 	هٰذِهِ	h`*ihi	h`*A_1 
POS   : 	h`*ihi/DEM_PRON_F 
Gloss : 	this/these

Word  : 	لِلأَنْوار	lil>anowAr	nuwr_2 
POS   : 	li/PREP+Al/DET+	>anowAr/NOUN 
Gloss : 	lights

Word  : 	لِلأَشْباح	lil>a$obAH	$abaH_1 
POS   : 	li/PREP+Al/DET+	>a$obAH/NOUN 
Gloss : 	specters;shapes

Word  : 	الكاتِب	AlkAtib	kAtib_1 
POS   : 	Al/DET+	kAtib/NOUN 
Gloss : 	writer;author

Word  : 	فِي	fiy	fiy_1 
POS   : 	fiy/PREP 
Gloss : 	in

Word  : 	الرِوايَة	AlriwAyap	riwAyap_1 
POS   : 	Al/DET+	riwAy/NOUN	+ap/NSUFF_FEM_SG 
Gloss : 	story;novel

Word  : 	كَيْفَ	kayofa	kayofa_1 
POS   : 	kayofa/REL_PRON 
Gloss : 	how

Word  : 	تَحَوَّلْتُ	taHaw~alotu	taHaw~al_1 
POS   : 	taHaw~al/VERB_PERFECT	+tu/PVSUFF_SUBJ:1S 
Gloss : 	be changed;be transformed

Word  : 	مِن	min	min_1 
POS   : 	min/PREP 
Gloss : 	from

Word  : 	مَدِينَة	madiynap	madiynap_1 
POS   : 	madiyn/NOUN	+ap/NSUFF_FEM_SG 
Gloss : 	city

Word  : 	إِلَى	<ilaY	<ilaY_1 
POS   : 	<ilaY/PREP 
Gloss : 	to;towards

Word  : 	مَدِينَة	madiynap	madiynap_1  
POS   : 	madiyn/NOUN	+ap/NSUFF_FEM_SG 
Gloss : 	city
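
The POS line in the output above is a whitespace-separated sequence of form/TAG tokens. If that text needs to be processed further, it can be split into pairs as in the sketch below. This assumes the output has been captured as a string; parse_pos_line is a hypothetical helper and not part of Arabycia.

```python
def parse_pos_line(line: str) -> list[tuple[str, str]]:
    """Split a 'form/TAG form/TAG ...' line into (form, tag) pairs."""
    pairs = []
    for token in line.split():
        if "/" not in token:
            continue  # skip anything that is not a form/TAG token
        # Split on the last slash so the tag is always the final component.
        form, _, tag = token.rpartition("/")
        pairs.append((form, tag))
    return pairs


# A shortened copy of the POS line from the sample output.
pos_line = "sotaEiyd/VERB_IMPERFECT kAtib/NOUN fiy/PREP h`*ihi/DEM_PRON_F"
for form, tag in parse_pos_line(pos_line):
    print(f"{form}\t{tag}")
```

The forms are Buckwalter-transliterated stems, so characters such as `, *, > and $ are part of the form and must not be treated as separators.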

Input

text = 'يستجمع المؤرخ أفكاره'
arabycia = Arabycia()
arabycia.set_raw_text(text)
search_result = arabycia.text_search("جمع")
print(search_result)

Output

['يستجمع']
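
How text_search matches words internally is not documented here. As a rough point of comparison, a plain substring match over whitespace-separated words gives the same result for this example. The sketch below is an assumption made for illustration, not Arabycia's implementation, and substring_search is a hypothetical name.

```python
def substring_search(text: str, query: str) -> list[str]:
    """Return the whitespace-separated words of text that contain query."""
    return [word for word in text.split() if query in word]


text = "يستجمع المؤرخ أفكاره"
print(substring_search(text, "جمع"))
```

A substring match works only on surface forms. A search based on lemmas or roots could also match words where the query letters are not contiguous, which this sketch cannot do.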

Notes

Requirements

  • NLTK
  • Sinai-corpus

Contributors

mohabmes
