Comments (20)
@Sixthtry you can use ask:shiftCodes to specify special uppercase codes, in case the automatic guessing fails.
from languagepack.
Done: https://github.com/AnySoftKeyboard/LanguagePack/tree/Turkish
Please make sure your branch is rebased on top of the latest https://github.com/AnySoftKeyboard/LanguagePack/tree/Turkish, and that you point your PR to my Turkish branch.
Also, how are you creating the dictionary?
I see. This is how to get the real gz file. First, clone the repo to your local machine:
git clone https://android.googlesource.com/platform/packages/inputmethods/LatinIME
Then cd to the dictionaries folder:
cd LatinIME/dictionaries
Find the Turkish dictionary tr_wordlist.combined.gz (it's 926338 bytes) and unzip it. Use the uncompressed file as the input for parseAospForEnglishDictionary.
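If no unzip tool is at hand, the decompression step can also be done with a few lines of Python. This is only a minimal sketch: the `gunzip` helper name and the output file name `tr_wordlist.combined` are assumptions for illustration, and the input path is simply the one from the clone above.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src: Path, dst: Path) -> int:
    """Decompress the gzip file `src` into `dst`; return the output size in bytes."""
    with gzip.open(src, "rb") as compressed, open(dst, "wb") as plain:
        shutil.copyfileobj(compressed, plain)
    return dst.stat().st_size


if __name__ == "__main__":
    # Path taken from the clone step above; the output name is an assumption.
    source = Path("LatinIME/dictionaries/tr_wordlist.combined.gz")
    if source.exists():
        size = gunzip(source, Path("tr_wordlist.combined"))
        print(f"wrote {size} bytes")
```

If `gzip.open` raises `gzip.BadGzipFile` (Python 3.8+) here, the input is not a real gz file.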
Hello @Sixthtry,
I think you have to focus on the Unicode first:
U+0130 = LATIN CAPITAL LETTER I WITH DOT ABOVE = İ
Which has as its lowercase U+0069 = LATIN SMALL LETTER I = i
You now have to convert both to the right decimal codes on the keyboard, starting with the small letter i.
Waiting for an answer.
Regards,
Hi @BoFFire,
In the layout, I have both Latin Small Letter I, which is [ i ] and unicode 105, and Latin Small Letter Dotless I, which is [ ı ] and unicode 305. Both work well when writing in lowercase.
The problem is that when I turn on uppercase writing, the dotted small i with unicode 105 should turn into "LATIN CAPITAL LETTER I WITH DOT ABOVE", which is [ İ ] and unicode 304, but instead it becomes "LATIN CAPITAL LETTER I", which is [ I ] and unicode 73.
Edit: Apparently, I explained this wrongly here. Corrected the letters.
(Sorry for being late. I wrote an answer on my phone, but the Git client crashed.)
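The behaviour described here matches the default, locale-independent Unicode case mapping, which can be reproduced outside the keyboard. The following is a small Python sketch of that mapping and of a Turkish-aware alternative. The helper names `turkish_upper` and `turkish_lower` are made up for illustration and are not part of AnySoftKeyboard.

```python
# Code points from the discussion above, in decimal.
print(ord("i"), ord("I"), ord("İ"), ord("ı"))  # 105 73 304 305

# Default mapping: dotted small i becomes dotless capital I (unicode 73).
print("i".upper() == "I")  # True

# Turkish pairs: i <-> İ and ı <-> I. Apply them before the default mapping.
TR_UPPER = {ord("i"): "İ", ord("ı"): "I"}
TR_LOWER = {ord("İ"): "i", ord("I"): "ı"}


def turkish_upper(text: str) -> str:
    """Uppercase `text` using the Turkish dotted/dotless i pairs."""
    return text.translate(TR_UPPER).upper()


def turkish_lower(text: str) -> str:
    """Lowercase `text` using the Turkish dotted/dotless i pairs."""
    return text.translate(TR_LOWER).lower()


print(turkish_upper("istanbul"))  # İSTANBUL
print(turkish_lower("ISPARTA"))   # ısparta
```

In the layout file itself, the same pairing is what an explicit uppercase code for the i key has to express.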
That was just what I needed. Thank you very much, sir. May I also ask you to open a Turkish branch? I am trying to build a nice dictionary, and then I will try to upload the files here somehow in the next few days (I am new to Git).
As I saw in a thread here, the "New Sicilian language pack" thread to be exact, I am collecting Wikipedia pages. I also did WordsfromAosp, but this is all I know about dictionaries.
Hey, umm,
I was trying to get as many Turkish pages as I could manually yesterday. Today I realized that there is a Wikipedia dump, as mentioned in the build.gradle comments: https://dumps.wikimedia.org/other/static_html_dumps/current. I downloaded the one for Turkish; it was around 300 MB, but it extracted to a 6.8 GB .tar file. There are .html files inside it, but they are in many separate files and I don't know how I can put them in build.gradle.
I also found another file in the Wikimedia dumps. It is at https://dumps.wikimedia.org/trwikisource/20171201/ and it is an .xml file (trwikisource-20171201-pages-articles.xml.bz2) which contains "Articles, templates, media/file descriptions, and primary meta-pages", as its title says. It is 70 MB after extraction.
I am wondering: will what I get from these two files be the same?
[I am sorry for asking so many questions, but I am doing this for the first time and I want it to be usable enough.]
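For the .xml.bz2 file, the article text can be read without extracting the archive first. The sketch below assumes the usual MediaWiki export layout, where each page's wikitext sits in a `<text>` element. The function name `iter_page_texts` is hypothetical, and what comes out is raw wikitext, not clean sentences.

```python
import bz2
import xml.etree.ElementTree as ET
from typing import Iterator


def iter_page_texts(dump_path: str) -> Iterator[str]:
    """Yield the raw wikitext of every <text> element in a .xml.bz2 dump."""
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Elements may carry an XML namespace prefix like "{...}text";
            # compare only the local part of the tag name.
            local_name = elem.tag.rsplit("}", 1)[-1]
            if local_name == "text":
                yield elem.text or ""
            elif local_name == "page":
                # Release the finished page's children to limit memory use.
                elem.clear()


if __name__ == "__main__":
    import os

    dump = "trwikisource-20171201-pages-articles.xml.bz2"
    if os.path.exists(dump):
        for number, text in enumerate(iter_page_texts(dump)):
            print(number, len(text))
            if number >= 4:
                break
```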
Wiki dumps are good, but there are two general problems with them:
- They include words in other languages, which sometimes get added to the generated dictionary
- They are not a "dialog" source, so the word frequencies might seem strange
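The first problem can be reduced somewhat before the dictionary is generated. As a rough sketch, and not what the LanguagePack build does, the Python below counts word frequencies and keeps only words spelled entirely with the 29 letters of the Turkish alphabet. The function name `count_turkish_words` is an assumption for illustration. Foreign words that happen to use only those letters still pass, and Turkish words with circumflex letters (â, î, û) are dropped.

```python
import re
from collections import Counter

# The 29 lowercase letters of the Turkish alphabet (no q, w or x).
TURKISH_LETTERS = set("abcçdefgğhıijklmnoöprsştuüvyz")

# Turkish-aware lowercasing: İ -> i and I -> ı, applied before str.lower().
_TR_LOWER = {ord("İ"): "i", ord("I"): "ı"}


def count_turkish_words(text: str) -> Counter:
    """Count words made up only of Turkish-alphabet letters."""
    counts: Counter = Counter()
    # [^\W\d_]+ matches runs of Unicode letters (no digits or underscores).
    for token in re.findall(r"[^\W\d_]+", text):
        word = token.translate(_TR_LOWER).lower()
        if set(word) <= TURKISH_LETTERS:
            counts[word] += 1
    return counts


sample = "Bir kitap ve bir window, İstanbul'da xerox."
print(count_turkish_words(sample).most_common())
```

The second problem, frequencies that do not match everyday writing, cannot be fixed by filtering; it needs a different kind of source text.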
I gave up on the AOSP word list, as it just does not generate any word list. I can see in an editor that it is a long file; I cannot read it, of course, but the code can't either. The parseAospForEnglishDictionary task generates a 2-line file from it which contains only: </wordlist>
@Sixthtry can you send me a link to that aosp file?
Also, you'll need to fix/complete the issues reported here: https://271-4270172-gh.circle-artifacts.com/0/lint_reports/app/checkstyle/checkstyle.html
https://android.googlesource.com/platform/packages/inputmethods/LatinIME/+/master/dictionaries/tr_wordlist.combined.gz Here is the AOSP wordlist link. I downloaded it from the [txt] link in the right corner.
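One possible reason a file saved this way fails to parse is offered here only as an assumption, since nothing in this thread confirms it: the [txt] view on googlesource.com may serve the file contents base64-encoded, in which case the saved file is not a real gzip stream. The Python sketch below checks for the gzip magic bytes and, if they are missing, tries a base64 decode. The helper names are hypothetical.

```python
import base64
import binascii
from typing import Optional

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(data: bytes) -> bool:
    """Return True if `data` starts with the gzip magic bytes."""
    return data[:2] == GZIP_MAGIC


def recover_gzip_bytes(data: bytes) -> Optional[bytes]:
    """Return gzip bytes from `data`, decoding base64 if needed, else None.

    Assumption: a non-gzip download might be base64 text wrapping the gz file.
    """
    if looks_like_gzip(data):
        return data
    try:
        # Remove line breaks and spaces before strict base64 decoding.
        decoded = base64.b64decode(b"".join(data.split()), validate=True)
    except binascii.Error:
        return None
    return decoded if looks_like_gzip(decoded) else None
```

Cloning the repository sidesteps the question entirely, because the file then arrives as stored.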
Thank you for being so helpful, that solved the problem. I want to ask just one last question: is there a document where I can find all the ask: and android: codes?
I don't see the ask:shiftCodes to specify special uppercase codes in https://github.com/AnySoftKeyboard/LanguagePack/blob/Turkish/src/main/res/xml/qwerty.xml ... Perhaps I am not reading the right file?
@MK8T
Okay, sorry about that, but there seems to be a problem with the file. I will re-upload it to GitHub manually then; it works on my keyboard.
@MK8T
Oh, I understand now. You are looking at the 'Turkish' branch of the original AnySoftKeyboard repo. My repo has not been merged into it yet. You should look here: https://github.com/Sixthtry/LanguagePack/blob/master/src/main/res/xml/qwerty.xml
Closing this in favor of #116