
kurdi's Introduction

Various Kurdi-related work done by Kurdish developers.

kurdi

Here are some hopefully useful files, scripts, and code chunks to share with Kurdi developers.

  1. kurdi_words.txt: a list of Kurdish words (currently 1,668,692), unique and alphabetically ordered (thanks to @dolanskurd). Note that in the bar chart below, each of (و) and (ی) is counted as both a vowel and a consonant.

  2. unicode_list.txt: a list of Unicode values for the Kurdish alphabet (Arabic script), following the standard accepted and published at http://unicode.ekrg.org/ku_unicodes.html

  3. gettext translations, including ku.po for Drupal. Most of the translations come from https://localize.drupal.org/translate/languages/ku (now almost dead).

  4. Data on health institutions throughout the KRG (names and lat/lng coordinates); see health.
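
The note in item 1 about (و) and (ی) being counted twice can be illustrated with a small sketch. The vowel set below is an assumption (a commonly cited set for Kurdish written in Arabic script) and the counts are made up for illustration; neither is taken from the repository.

```r
# Assumption: a commonly cited vowel set; the repository may use another one.
vowels <- c("ا", "ە", "ۆ", "ێ", "و", "ی")
# Letters that can also act as consonants (the w and y sounds).
dual <- c("و", "ی")

# Hypothetical per-letter counts, invented purely for this example.
counts <- setNames(c(10, 4, 6, 8), c("ا", "ب", "و", "ی"))

is_vowel <- names(counts) %in% vowels
# A letter is a consonant if it is not a vowel, or if it plays both roles.
is_consonant <- !is_vowel | names(counts) %in% dual

totals <- c(vowel = sum(counts[is_vowel]),
            consonant = sum(counts[is_consonant]))
print(totals)  # vowel 24, consonant 18
```

Because the dual letters appear in both groups, the two totals add up to more than the overall count, which is worth keeping in mind when reading a vowel/consonant chart.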

Now that we have a good, unique, and cleaned-up word list, we can compute some statistics on it (in R for now):

w = readLines("https://raw.githubusercontent.com/layik/kurdi/master/corpus/kurdi_words.txt")
## Warning in readLines("https://raw.githubusercontent.com/layik/kurdi/
## master/corpus/kurdi_words.txt"): incomplete final line found on 'https://
## raw.githubusercontent.com/layik/kurdi/master/corpus/kurdi_words.txt'
length(unique(w)) == length(w)
## [1] TRUE
length(w)
## [1] 1668692
# number of words containing ئا
length(grep("ئا", w))
## [1] 49401
# read in list of Kurdi chars
ku_v = readLines("https://raw.githubusercontent.com/layik/kurdi/master/corpus/letters_lines.txt")
message("Kurdish alphabet: ",  length(ku_v), " letters.")
## Kurdish alphabet: 34 letters.
# for each letter, count the words that contain it at least once
letters_used = sapply(ku_v, function(x){
  length(grep(x, w, fixed = TRUE))
})

# relabel ه (heh) as ھ (heh doachashmee) for display
names(letters_used)[names(letters_used) == 'ه'] = "ھ"
letters_used = sort(letters_used, decreasing = TRUE)

library(ggplot2)
ggplot() + geom_bar(aes(x=names(letters_used),y=letters_used), stat='identity') + xlab('Alphabet') + ylab('Frequency') + theme(axis.text.x = element_text(face = "bold", size = 18)) + scale_y_continuous(labels = scales::comma) + 
  scale_x_discrete(limits=names(letters_used))

letters_used['ە']
##       ە 
## 1255122
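
The counts above are the number of words containing each letter, not the number of times the letter occurs. As a minimal sketch of the difference, the following uses a tiny sample of three words in place of the full word list; `sample_words` and `words_with` are hypothetical names introduced only for this example.

```r
# Tiny sample standing in for the full word list `w`.
sample_words <- c("بابان", "باران", "زمان")

# Split every word into single characters and tabulate them.
chars <- unlist(strsplit(sample_words, split = ""))
occurrences <- table(chars)

# Count the words containing a letter, once per word (as grep() does above).
words_with <- function(letter, words) {
  sum(grepl(letter, words, fixed = TRUE))
}

total_alef <- as.integer(occurrences["ا"])    # every occurrence of ا
words_alef <- words_with("ا", sample_words)   # words containing ا
print(c(total = total_alef, words = words_alef))  # total 5, words 3
```

Which of the two counts is more useful depends on the question: occurrences describe how often a letter is typed, while per-word counts describe how widespread it is across the vocabulary.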

kurdi's People

Contributors

layik, ziryan

kurdi's Issues

Android Kurdi

  • Find out whether there is a CKB locale; my understanding is that there has been one as far back as 5.0.
  • Where are the locales?
  • Where is the equivalent of this?

Which steps should be taken to unify the Kurdish language?

Suppose someone gave you the choice of taking three steps to unify the writing of the Kurdish language in both the Arami (Arabic-based) and Latin scripts. What should those steps be?

  • Providing and unifying keyboards for the different systems in both Arami and Latin letters (Linux, Mac, Windows, Android, iOS, etc.)
  • Providing one comprehensive dictionary for both main dialects in both Latin and Arami letters
  • Providing (after development) a spell checker for the different systems to prevent spelling mistakes
  • What else?

Thanks

CLDR

Brother @jwtiyar, my understanding was that:

  1. you were trying to change some characters on Kurdi
  2. you sent a PR on a file called X
  3. they asked you to send the PR on file called Y

Why should we not just send the PR on file called Y?

Thanks, dear brother.
