bentalagan / glaemscribe
Glaemscribe, the tolkienian languages/writings transcription engine.
Home Page: https://glaemscrafu.jrrvf.com/english/glaemscribe.html
License: Other
Qenya ui is disambiguated as uy, which breaks the disambiguation of qu and leads to wrong treatment of the sequence 'qui'. The rules should be reversed.
Thanks to Roman Rausch for noticing!
IDE v1.0.0 in Firefox
When I add a new character in the charset panel (let's say, default charset is sarati_eldamar, and I add char "38" as "SARATI_P_FINAL"), the font in the charset panel switches back to roman (apparently no style attribute in the HTML).
Workaround: Just changing the font name back and forth refreshes the panel ;-)
There are many tehtar that don't have a below counterpart, should we add them to the fonts?
They aren't defined in the standard unicode layout or in the Free Tengwar Font, and are somewhat specific (for now) to my Vietnamese mode, so my first thought is that I should add them manually on my own, but I don't know if there is a toolchain to build the sdfs? When I generated Tengwar Annatar GlaemUnicode for testing, I saw a big difference in size (mine is 70KB, yours is 97KB).
So if you don't want to add these specific tehtar to the standard fonts here, please give me some guidelines to build the sdfs. Thank you. :)
Documentation should state that conditions also accept the ! boolean operator for negation: it works well, yeah ;)
Hi!
Thank you for making this great framework! Would you consider creating a package.json file for npm compatibility? It would make it easier for us who use Glaemscribe to stay current.
The font SaratiEldamarRTLBarGlaemscrafu.ttf (for instance; not checked with other fonts yet) installs on Windows as "Sarati Eldamar RTL Bar", i.e. under the same name as the original Sarati Eldamar RTL Bar, so the system prompts to replace it. Some users might want to keep the original font for other applications.
You may want to consider reversing the order of sa-rinke and vowel-tehta for Dan Smith fonts. Right now it is:
consonant:sa-rinke:vowel-tehta
That puts the vowel-tehta above the sa-rinke, which looks weird. If you reverse the order you will get a better result:
consonant:vowel-tehta:sa-rinke
I used apsa, otso and tixe to test this (tixe doesn't change).
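To make the proposed ordering concrete, here is a minimal Ruby sketch. The token names are placeholders chosen for illustration and are not Glaemscribe charset names; the function only shows the swap being suggested, not how the engine or the Dan Smith charsets actually emit characters.

```ruby
# Token names are placeholders for illustration, not Glaemscribe charset names.
# Moves a sa-rinke that directly precedes a vowel tehta to just after it.
def reorder_sa_rinke(tokens)
  result = tokens.dup
  (0...result.length - 1).each do |i|
    if result[i] == :sa_rinke && result[i + 1] == :vowel_tehta
      result[i], result[i + 1] = result[i + 1], result[i]
    end
  end
  result
end

p reorder_sa_rinke(%i[consonant sa_rinke vowel_tehta])
```

A sequence without a vowel tehta after the sa-rinke is returned unchanged.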
In the raw tengwar side panel on the website, the "xanga" character is actually the "xungwe" character.
No charset has the definition for three-dots punctuation (U+02C6).
It seems that Halla isn't included in any virtual tehta below, though Telco and Ara are (even though Ara cannot hold a tehta below anyway).
I have another small problem. It turns out that my mode doesn't play well with Tengwar Annatar (and possibly some other fonts). I use the GEMINATE_SIGN to represent the labialization mark, but the DASH_INF_S doesn't appear well on Telco and Halla. I wonder if you could generate an XS size for just Telco and Halla? I know it could be bothersome because you might also have to generate an XS version for the other dashes and tildes. If you don't want to make it, I could make it on my own later on.
This discussion was originally started in #15.
Glaemscribe did not allow certain characters as input for a mode, either because they are invisible characters (word joiners and so on), which makes them extremely hard to write in the mode file, or because they are part of the mode language (* [ ] ( ) _ , and so on).
This is now fixed by commit 06c4fea with the following new feature: unicode vars can be used in rule source members and also in variable definitions. The syntax is {UNI_XXXX}, with XXXX being a hexadecimal value.
The following shortcuts have been added to the engine :
# Characters that are not easily entered or visible in a text editor
add_var("NBSP", "{UNI_A0}")
add_var("WJ", "{UNI_2060}")
add_var("ZWSP", "{UNI_200B}")
add_var("ZWNJ", "{UNI_200C}")
# The following characters are used by the mode syntax.
add_var("UNDERSCORE", "{UNI_5F}")
add_var("ASTERISK", "{UNI_2A}")
add_var("COMMA", "{UNI_2C}")
add_var("LPAREN", "{UNI_28}")
add_var("RPAREN", "{UNI_29}")
add_var("LBRACKET", "{UNI_5B}")
add_var("RBRACKET", "{UNI_5D}")
Will be released with the branch TolkiensReadingDay2018 .
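As an illustration of what the {UNI_XXXX} syntax denotes, the following Ruby sketch expands such placeholders into the characters they name. The helper `expand_unicode_vars` is hypothetical and is not claimed to be how the engine implements the feature; it only assumes, as stated above, that XXXX is a hexadecimal code point.

```ruby
# Hypothetical helper, not part of Glaemscribe's API: expands every
# {UNI_XXXX} placeholder (XXXX = hexadecimal code point) into its character.
def expand_unicode_vars(text)
  text.gsub(/\{UNI_([0-9A-Fa-f]+)\}/) { [Regexp.last_match(1).hex].pack("U") }
end

# A few of the shortcuts listed above, copied as name => placeholder pairs.
shortcuts = {
  "NBSP"       => "{UNI_A0}",
  "WJ"         => "{UNI_2060}",
  "UNDERSCORE" => "{UNI_5F}",
  "ASTERISK"   => "{UNI_2A}"
}

# Print each shortcut with the code point its placeholder resolves to.
shortcuts.each do |name, placeholder|
  char = expand_unicode_vars(placeholder)
  puts format("%-10s U+%04X", name, char.ord)
end
```

Printing the code point rather than the character itself keeps invisible characters such as the word joiner readable in the output.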
Not yet handled, should be added!
Thanks to Roman Rausch for noticing!
All is in the title, the font was released recently by Måns :
In the standard mapping of Tengwar Unicode, DQUOT_OPEN is the higher one, while DQUOT_CLOSE is the lower. This problem appears in all GlaemUnicode fonts.
hl & hr do not exist in early Qenya, but do exist in later Quenya, so they should be treated as units.
Thanks to Roman Rausch for noticing!
Could I possibly do something to merge these rules into 1 single rule, since the vowels are the only things that make them different?
_a [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_A_} [{_TONES_FULL_}]
_ă [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_A_BREVE_} [{_TONES_FULL_}]
_â [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_A_HAT_} [{_TONES_FULL_}]
_e [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_E_} [{_TONES_FULL_}]
_ê [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_E_HAT_} [{_TONES_FULL_}]
_i [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_I_} [{_TONES_FULL_}]
_ơ [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_O_HORN_} [{_TONES_FULL_}]
_ư [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_U_HORN_} [{_TONES_FULL_}]
_y [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_Y_} [{_TONES_FULL_}]
_ō [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> {_ZERO_} [{_FINALS_OPENED_}] {_OO_} [{_TONES_FULL_}]
Note: {_ZERO_}, {_FINALS_OPENED_}, {_TONES_FULL_} are the Tengwar counterparts of {NULL}, {FINALS}, {TONES_FULL}, which are Latin. The same goes for the others.
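Whether the mode syntax itself can express this as a single rule is not settled here. As a workaround outside the mode file, the ten lines could be generated from one template. The Ruby sketch below assumes only the vowel-to-variable pairs visible in the rules above; `vowel_rule` is a hypothetical helper, not a Glaemscribe feature.

```ruby
# Vowel table taken from the ten rules above: Latin vowel => Tengwar variable.
vowels = {
  "a" => "_A_", "ă" => "_A_BREVE_", "â" => "_A_HAT_",
  "e" => "_E_", "ê" => "_E_HAT_",   "i" => "_I_",
  "ơ" => "_O_HORN_", "ư" => "_U_HORN_",
  "y" => "_Y_", "ō" => "_OO_"
}

# Hypothetical generator: builds one rule line from the shared template.
def vowel_rule(latin, tengwar)
  "_#{latin} [{TONES_FULL}] [{FINALS}] --> 1, 4, 2, 3 --> " \
    "{_ZERO_} [{_FINALS_OPENED_}] {#{tengwar}} [{_TONES_FULL_}]"
end

# Emit the rules, ready to be pasted into the mode file.
vowels.each { |latin, tengwar| puts vowel_rule(latin, tengwar) }
```

The generated text still has to be pasted into the mode, so this only removes the duplication from the source that is maintained by hand.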
I see no problem at all with sharing my mode implementation for Vietnamese, which is my mother tongue. I haven't made any changes to the rules yet. I think sharing it in its current form might help you spot potential problems. Since Vietnamese has 6 tones and many vowels, could it be complex enough to reveal problems you did not anticipate?
Phonetic mode for Vietnamese.zip
This is an example text. Note: the parentheses indicate words written in other modes.
Please reposition any character that overrides the soft hyphen (U+00AD). Because soft hyphens are always hidden in modern text processors, any character mapped to this position is hidden as well.
Fonts affected: Tengwar Annatar, Tengwar Parmaite, Tengwar Elfica, Tengwar Sindarin (also Quenya).
Currently Tengwar Eldamar uses the position U+20AC instead.
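A check for this kind of collision can be sketched in Ruby as follows. The `{ glyph_name => code_point }` hash and the glyph names are made up for illustration and do not reflect the real layout of any of the fonts listed; the set of hidden code points is likewise an illustrative subset.

```ruby
# Code points that text processors typically do not render; U+00AD is the
# soft hyphen discussed above. The list is illustrative, not exhaustive.
def hidden_code_points
  { 0x00AD => "SOFT HYPHEN", 0x200B => "ZERO WIDTH SPACE" }
end

# Returns the names of glyphs mapped onto a hidden code point.
# `mapping` is a hypothetical { glyph_name => code_point } hash.
def hidden_glyphs(mapping)
  mapping.select { |_name, code_point| hidden_code_points.key?(code_point) }.keys
end

# Made-up sample mapping; the glyph names and positions are placeholders.
sample = { "GLYPH_A" => 0x00AC, "GLYPH_B" => 0x00AD, "GLYPH_C" => 0x20AC }
puts hidden_glyphs(sample).inspect
```

Running such a check over each font's mapping would list the glyphs that need a new position.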
New features are missing some doc :
Modes :
Charsets :
The integration link on the main page of this repo https://bentalagan.github.com/glaemscribe should be https://bentalagan.github.io/glaemscribe.
I've just found out that non-breaking space is missing in Annatar, Eldamar and Sindarin fonts. If I use that space, its font will be changed to the default and the spacing is not correct.
Also, the Glaemscribe processor seems to ignore the non-breaking space. I've added a definition for it in the charset and used it in the mode, but the output text contains only normal spaces.
I would like to use \outspace with a sequence of characters (\outspace CIRTH_SPACE CIRTH_PUNCT_MID_DOT CIRTH_SPACE, so there's a bit of space around the dot), which the documentation says is supported. However, it doesn't work and I get:
Mode has some errors:
45: Element 'outspace' should have 1 arguments.
A good example is better than an explanation: with the currently merged Annatar fonts, a word like lumbule would be transcribed as:
な&w&j$
Because the ligatured lambe is mapped within the Japanese unicode set, the word-wrap logic of web browsers will allow this word to be cut after the ligatured lambe at the end of a line.
Solution 1 (fastest): change the mapping of the offending characters. One good idea would probably be to stick to Tengwar Elfica's mapping.
Solution 2 (painful and a lot of work, but cool): start migrating the old generation fonts to OpenType with GSUB and GPOS tables, and use the Free Tengwar Font project mapping.
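To find which characters of a transcription are affected, a scan like the following Ruby sketch could be used. The ranges are an assumed, illustrative subset (Hiragana, Katakana, CJK Unified Ideographs) and the helper names are hypothetical; the exact set of characters after which a browser may break a line is defined by its line-breaking implementation, not by this list.

```ruby
# Ranges are an illustrative subset (Hiragana, Katakana, CJK Unified
# Ideographs); browsers may treat other blocks the same way.
def cjk_ranges
  [0x3040..0x309F, 0x30A0..0x30FF, 0x4E00..0x9FFF]
end

# Returns the characters of `text` that fall inside one of the ranges above.
def cjk_mapped_chars(text)
  text.each_char.select { |ch| cjk_ranges.any? { |range| range.cover?(ch.ord) } }
end

# The transcription of "lumbule" quoted above.
puts cjk_mapped_chars("な&w&j$").inspect
```

For the example word, only the first character (U+306A, in the Hiragana block) is reported.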