def-gthill / lexurgy Goto Github PK

A high-powered sound change applier

License: GNU General Public License v3.0

Kotlin 72.22% ANTLR 0.74% Java 26.94% Batchfile 0.01% Dockerfile 0.02% Shell 0.08%

lexurgy's People

bigyihsuan neta-elad bouri alexdcramer sirraide

lexurgy's Issues

Combining diacritics detach after sound changes.

When I ran a sound change like this [front mid vowel] => [central mid vowel] that affected a vowel with a tone diacritic already applied, e.g. é, the tone diacritic becomes detached from the vowel. For example, é => ə ́. This does affect sound changes down the line.
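The behavior the report expects can be sketched outside Lexurgy. The Python below is only an illustration of keeping combining marks attached when a base letter is replaced; `replace_base` is a hypothetical helper, not part of Lexurgy, and it assumes the segment is a single base letter followed by combining marks.

```python
import unicodedata

def replace_base(segment, new_base):
    """Swap the base letter of `segment`, keeping its combining marks attached.

    Hypothetical helper for illustration; assumes one base letter per segment.
    """
    # NFD splits a precomposed letter such as U+00E9 into "e" + U+0301.
    decomposed = unicodedata.normalize("NFD", segment)
    # Keep only the combining marks (non-zero canonical combining class).
    marks = "".join(ch for ch in decomposed if unicodedata.combining(ch))
    return new_base + marks

# e with acute -> schwa with the acute still directly attached
result = replace_base("\u00e9", "\u0259")
```

Here `result` is the schwa immediately followed by the combining acute, with no space in between.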

Optional Argument in Environment not matching

Minimal reproducible example:

label:
 a => c / a b b? _

Input:

aba
abba

Output

aba => abc
abba => abba

Bug:
abba should get changed to abbc, but instead is unaffected by the change.

'Report which rules apply' function, or 'trace rule' function, + total words changed

Being able to quickly see at a glance where & how many times rules apply to words is incredibly useful, especially when you're applying this to an entire lexicon of 100s+ words, and for quickly identifying abnormalities generated by a given rule.

It's about the only feature that Zompist SCA has over this program at this point - if this was added, his SCA would be rendered completely obsolete.

Is this something that could be looked into for the future?

Allow negated feature values in rule filters

We should be able to write a rule like this:

rule [vowel !low]:
    <rule expressions here>

This should restrict the rule to non-low vowels. But currently, negated feature values aren't allowed here.

syllable as filter

This is a request for an enhancement.

let's say we have

Feature (syllable) +stress
Feature +syllabic
Feature place(*noplace, labial, alveolar, velar)
Feature manner(*nomanner, nasal, plosive, fricative, liquid)
Feature frontness(*nofront, front, back)
Feature closeness(*noclose, close, open)

Diacritic ˈ (before) (floating) [+stress]

Symbol m [labial nasal]
Symbol n [alveolar nasal]
Symbol p [labial plosive]
Symbol t [alveolar plosive]
Symbol k [velar plosive]
Symbol s [alveolar fricative]
Symbol ɹ [alveolar liquid]
Symbol j [close front]
Symbol w [close back]
Symbol i [+syllabic close front]
Symbol u [+syllabic close back]
Symbol a [+syllabic open front]

Class sonorant {ɹ, j, w, m, n}
Class vowel {i, u, a}
Class obstruent {p, t, k, s}
Class consonant {@obstruent, @sonorant}

Syllables:
 @consonant? @vowel @sonorant?

word-initial-stress-assignment:
 <syl> => [+stress] / $ _

iterative-progressive-stress-assignment <syl> propagate:
 [] => [+stress] / [+stress] [] _

then Lexurgy gives the error

"<syl>" doesn't make sense in the line "iterative-progressive-stress-assignment <syl> propagate:" (line 34)

I understand that "filter by syllable" doesn't actually make sense, because everything's part of a syllable.
But it would be useful for readability if you were able to have [+stress] refer to a syllable instead of a phoneme.
Yes, this is already possible by writing

iterative-progressive-stress-assignment propagate:
 <syl> => [+stress] / <syl>&[+stress] <syl> _

but writing <syl>&[+feature] is way more clunky than being able to just have a <syl> filter and writing [+feature]

so might this be an option for future implementation?

confusing error: word boundary in one alternative

This rule:

stress:
  [vowel] => [stressed] / {$ [cons]?, A [cons]?} _

(there are two spaces at the beginning of the line, though github may not display them)

causes this error:

extraneous input ' ' expecting {<EOF>, NEWLINE} (Line 87, column 48)

The character at column 48 (one based) is '}' while the message is referencing a space.
The character at column 48 (zero based) is ' ' .

At any rate, what is wrong with this rule?

It is intended to add the [stressed] feature to the first vowel in the word after word-initial consonants: "$ [cons]?" or
after the suffix marker 'A' followed by potential consonants: "A [cons]?"

Rule lines with spaces only

Sometimes it is convenient to use "special" characters, such as -, +, ?, in the Romanized form of a language.

Lexurgy gives an error when I try to declare these in the deromanizer.

ALSO:
Many text editors leave lines with one or more spaces, followed by no text. Lexurgy does not like these either.

Several error messages from the Sample Declarations preset

"The feature value "rounded" is not defined"
This seems to be caused by a missing '+' in one symbol (Symbol ɒ [vowel back rounded open])

After that:
"The feature value "high" is not defined"
This seems to be caused by the Diphthongs referencing vowel height, which is defined in the beginning as 'Open' through 'Close', and doesn't define 'high' in any way.

After that:
"The feature value "labial" is not defined"
Which seems to be caused by 'Symbol ⱱ̟ [+voiced labial tap]', which... I'm unfamiliar with this symbol, but I think it's meant to say 'bilabial'.

After that:
"The feature value "fricative" is not defined"
Several symbols for lateral fricatives are defined as 'fricative', whereas (I think) they're meant to say nonSibilantFricative or something (e.g. 'Symbol ɬ [-voiced alveolar +lateral fricative]').

That's all that I noticed.

Two optional characters in a row doesn't work

Given the following change:

Class cons {p, t, k}

raise-a:
    a => e / i @cons? @cons? _

These words change as follows:

ia    => ie
ika   => ike
ikta  => ikta
iktpa => iktpa

I would expect the first three to change, but only the first two do.
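For reference, the expected results can be reproduced with a small backtracking matcher. This Python sketch is not Lexurgy's implementation; `matches` and `raise_a` are hypothetical names, and the environment is hard-coded as `i @cons? @cons?`. Each optional element is first tried as present and, if the rest of the match fails, as absent.

```python
CONS = set("ptk")

# The environment "i @cons? @cons?" as (allowed sounds, is_optional) pairs.
ENVIRONMENT = [({"i"}, False), (CONS, True), (CONS, True)]

def matches(pattern, text):
    """Return True if `pattern` matches all of `text`, backtracking over optionals."""
    if not pattern:
        return text == ""
    (allowed, optional), rest = pattern[0], pattern[1:]
    # First try consuming one sound with this element.
    if text and text[0] in allowed and matches(rest, text[1:]):
        return True
    # Otherwise skip the element, but only if it is optional.
    return optional and matches(rest, text)

def raise_a(word):
    """Apply a => e wherever the environment matches immediately before the a."""
    out = []
    for i, ch in enumerate(word):
        # Try every possible start of the environment that ends right before position i.
        if ch == "a" and any(matches(ENVIRONMENT, word[start:i]) for start in range(i + 1)):
            ch = "e"
        out.append(ch)
    return "".join(out)
```

With this matcher, `ia`, `ika` and `ikta` all change, while `iktpa` is left alone because three consonants intervene.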

Add literal-only floating diacritics

Floating diacritics should work even if we aren't using feature matrices.

Allow escaping special characters in romanizations

We should be able to use characters that Lexurgy assigns special meaning in romanizations. So we need an escaping mechanism for romanizations. See Issue #15.

Different syllabification rules at word boundary

Hi! Thanks a lot for the great work you are doing! I have a question/feature request about syllabification.

Is there a way to specify different syllabification rules based on phonological context like word boundary?

For example, I would like to be able to obtain the following: ʔbi.ni, biʔ.ni, ɾβa.na, βaɾ.na, and so on. But now, if I allow consonant clusters as onsets, I get cluster onsets everywhere because of the onset maximisation principle implemented in the syllabification function: ʔbi.ni, bi.ʔni, ɾβa.na, βa.ɾna. Another example is /s/, which in some languages behaves differently in word-initial position: Italian /spa.da/ 'sword' vs /as.pi.de/ 'type of snake' (at least according to some phonologists).

Is there a way to achieve this, or a workaround? Thanks again!

Filter by sound class

Rule filters currently have to be feature matrices. We should allow sound classes as rule filters as well.

Syllable feature goes to default when monovalent feature is deleted and another syllable changes

Initial situation

Feature (syllable) Tone(*neutral, low, high)
Feature (syllable) +stress

Diacritic L [low]
Diacritic H [high]
Diacritic ' (before) [+stress]

Syllables:
 b a

tone-assignment:
 <syl> => [high] / $ _
 [] => [low]

stress-assignment:
 <syl> => [+stress] / _ $

romanizer-initial:
 unchanged

Results:

baba   => baH.'baL
bababa => baH.baL.'baL

The problematic rule

stress-retraction:
 <syl>&[high] <syl>&[low] => [+stress] [-stress] / _ $

Results:

baba   => 'baH.ba
bababa => baH.baL.'baL

The syllable that lost its feature +stress also lost its low tone.

Unsuccessful patch

Even when the affected feature gets mentioned in the result side:

stress-retraction:
 <syl>&[high] <syl>&[low] => [+stress] [-stress low] / _ $

It still doesn't carry over.

Similar rules where everything works as expected

Only one syllable change

stress-deletion:
 <syl>&[low] => [-stress] / _ $

Results:

baba   => baH.baL
bababa => baH.baL.baL

The previous syllable gets deleted

pretonic-deletion:
 <syl>&[high] <syl>&[low] => * [-stress] / _ $

Results:

baba   => baL
bababa => baH.baL.'baL

The multivalent feature goes to default value

tone-deletion:
 <syl>&[-stress] <syl>&[+stress] => [] [neutral] / _ $

Results:

baba   => baH.'baL     => baH.'ba
bababa => baH.baL.'baL => baH.baL.'ba

rule with leading tab ignored, leading spaces (or no whitespace) required

Reading your comments on previous white-space issue, I think this bug report is moot

Rule Syntax:

<rule-name>:
    <old-sounds> => <new-sounds> / <environment>

The rule is ignored if the white-space before is a tab.

This is inconvenient for coders like me, who instinctively use a tab to indent a line. Of course, this only applies to rule files prepared externally, since one cannot embed tabs in the online UI.

The attached file w-avardic-v2.lsc.txt contains the rule:
w-avardic-v2-bugs.lsc.txt

final-d-loss:
	d => * / _ $

If the white-space before the "d" is a tab, the rule is ignored.

The word-list:
ʔabʔid
ʔabʔa
grunʔid
grunan
ʔaiʔid
ʔaiʔuid
ʔaiʔiad
ʔaiid
ʔaiud

produces, with d-loss rule with a tab:
ʔabʔid => háp'id
ʔabʔa => háp'a
grunʔid => grundid
grunan
ʔaiʔid => hétid
ʔaiʔuid => hékwid
ʔaiʔiad => hétyad
ʔaiid => héd
ʔaiud => háyud

but produces, with d-loss rule with spaces:
ʔabʔid => háp'i
ʔabʔa => háp'a
grunʔid => grundi
grunan
ʔaiʔid => héti
ʔaiʔuid => hékwi
ʔaiʔiad => hétya
ʔaiid => hé
ʔaiud => háyu

Add classes in class definitions

We should be able to define classes by combining other classes, e.g.

Class stop {p, t, k}
Class fricative {f, s}
Class obstruent {@stop, @fricative}
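The intended semantics amount to expanding `@name` references recursively. The Python sketch below illustrates that idea only; the `CLASSES` dictionary and the `expand` function are hypothetical and do not reflect how Lexurgy stores classes.

```python
# Class definitions as written above; "@name" refers to another class.
CLASSES = {
    "stop": ["p", "t", "k"],
    "fricative": ["f", "s"],
    "obstruent": ["@stop", "@fricative"],
}

def expand(name, classes, seen=()):
    """Return the sounds in class `name`, expanding nested @references in order."""
    if name in seen:
        # Guard against definitions that refer back to themselves.
        raise ValueError(f"class {name} is defined in terms of itself")
    sounds = []
    for member in classes[name]:
        if member.startswith("@"):
            sounds.extend(expand(member[1:], classes, seen + (name,)))
        else:
            sounds.append(member)
    return sounds

obstruents = expand("obstruent", CLASSES)
```

Under these assumptions `obstruents` holds p, t, k, f, s in that order.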

Captures don't work through syllable breaks

Say I have the input word
to.o
with explicit Syllables.
And I want to apply
[]$1 $1 => [+long] *
then it doesn't work. I just get the same in as out.

Sound laws not applying when they should

I am writing a sound change that is supposed to delete vowels between two unvoiced consonants unless they are stressed or long, but the unvoiced glottal stop is not taken into account for this. The glottal stop is also not taken into account for the next sound change, which is supposed to turn unvoiced consonants into ejective consonants before a glottal stop. I will paste my entire code here so you can pick through it and see if I made an error or if there's a bug.

Feature Type(*cons, vowel, obstruent, resonant, glide)
Feature Place(labial, coronal, dorsal, glottal)
Feature Manner(nasal, stop, affricate, fricative, approximant)
Feature Height(high, mid, low)
Feature Depth(front, central, back)
Feature Stress(*stressed, unstressed)
Feature Voice(*voiced, unvoiced, aspirated, ejective)
Feature Length(*short, long)
Feature Nasality(*oral, nasalized)

Diacritic ʰ [aspirated]
Diacritic ’ [ejective]
Diacritic ́ [stressed]
Diacritic ː [long]

#Proto Sata'ilun
Symbol a [low central vowel]
Symbol i [high front vowel]
Symbol u [high back vowel]
Symbol ā [long low central vowel]
Symbol ī [long high front vowel]
Symbol ū [long high back vowel]
Symbol m [labial nasal]
Symbol n [coronal nasal]
Symbol p [unvoiced labial stop]
Symbol t [unvoiced coronal stop]
Symbol k [unvoiced dorsal stop]
Symbol ' [unvoiced glottal stop]
Symbol s [unvoiced coronal fricative]
Symbol h [unvoiced glottal fricative]
#Intermediate symbols
Symbol b [voiced labial stop]
Symbol d [voiced coronal stop]
Symbol ɡ [voiced dorsal stop]

deromanizer:
y => j
ā => aː
ī => iː
ū => uː
' => ʔ

vowel-loss:
[vowel] => * / [unvoiced] _ [unvoiced]

intervocalic-stop-voicing:
[unvoiced stop !glottal] => [voiced] / [vowel] _ [vowel]

ejectives:
[unvoiced] => [ejective] / _ [glottal stop]

romanizer:
j => y
aː => ā
iː => ī
uː => ū
ʔ => '
[stressed vowel] => [vowel]

It outputs the following:
mitúpaw => mitpaw
muyháy
tulímpi
púyam
tálsan
walkuyuymánwa
niwahitáta => niwahtáta
námu
kanimawpíma
kanlaw'iman
āhimul
usūmilti
inī
ālu
ūyalu
lunkūmia
yāmūimyana
ihakiyukun => ihkiyuɡun
mūlka
layun
miukā => miuɡā
al
ta'ínki
ta'an

Propagate apparently doesn't work if the match is <syl>

Rules marked as propagate are only applied once if the match is <syl> (and further narrowing of the match such as <syl>&[unstressed] makes no difference either).

Minimal working demonstration adapted from the Advancedish example:

Feature type(*cons, vowel)

Feature (syllable) stress(*unstressed, secondary, primary)

Diacritic ˈ (before) [primary]
Diacritic ˌ (before) [secondary]

Symbol a [vowel]

Syllables:
  [cons]? [vowel]

primary-stress-second-last-syllable:
  <syl> => [primary] / _ <syl> $

add-secondary-stress propagate:
  # This rule appears to be applied only once despite propagate
  <syl> => [secondary] / _ <syl> {[primary], [secondary]}

Test input: papapapapapa
output: pa.pa.ˌpa.pa.ˈpa.pa
expected: ˌpa.pa.ˌpa.pa.ˈpa.pa (with secondary stress on the first syllable)

The issue isn’t related to the number of syllables either; you always get only one secondary stress, regardless of the number of syllables in the input (as long as that number is ≥4 obviously).
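The expected output follows from reading `propagate` as "reapply the rule until nothing changes". The Python sketch below models that reading on a list of per-syllable stress values. It is an illustration only, not Lexurgy's algorithm, and all function names are hypothetical.

```python
MARK = {"unstressed": "", "secondary": "\u02cc", "primary": "\u02c8"}

def add_secondary_once(stress):
    """One simultaneous pass of: <syl> => [secondary] / _ <syl> {[primary], [secondary]}."""
    new = list(stress)
    for i in range(len(stress) - 2):
        # The syllable two positions later must already carry some stress.
        if stress[i + 2] in ("primary", "secondary"):
            new[i] = "secondary"
    return new

def propagate(step, state):
    """Reapply `step` until the state stops changing."""
    while True:
        nxt = step(state)
        if nxt == state:
            return state
        state = nxt

def stress_word(syllables):
    stress = ["unstressed"] * len(syllables)
    if len(syllables) >= 2:
        stress[-2] = "primary"  # primary stress on the second-last syllable
    stress = propagate(add_secondary_once, stress)
    return ".".join(MARK[s] + syl for s, syl in zip(stress, syllables))

output = stress_word(["pa"] * 6)
```

For the six-syllable test word, the first pass stresses the third syllable and the second pass stresses the first, which gives the expected form with two secondary stresses.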

unexpected and unexplained results

Using these three files:

w-avardic-v2-bugs.lsc.txt
w-avardic.wli.txt
w-avardic_ev.wli.txt

As the _ev.wli file shows, the middle section (2 of 3) of forms produces forms with only some of the sound changes performed. Specifically, the rule "glottal-resolution" has not been applied. _ev.wli shows the forms expected.

It should be noted that .wli.txt is an extract of a longer .wli with many similar forms, which are changed as expected. The attached .wli.txt shows only the forms immediately before and immediately after the problematic forms.

I have not used any "hungry" wild-cards except in "stress-application", so I don't believe this is the issue. I am mystified by the failure occurring on only 9 forms out of 27 (and 9 forms out of the original 117).

intermediate romanization character counting.

let's say we have a file

Feature Vowelheight(open, openmid, mid, closemid, close)
Feature Vowelfrontness(front, central, back)
Feature Vowelrounding(*unrounded, rounded)
Feature Syllabicity(syllabic, static)
Feature Voicing(*voiced, unvoiced)
Feature Consonantplace(bilabial, alveolar)
Feature Consonantmanner(nasal, plosive)

Symbol i [syllabic close front]
Symbol u [syllabic close back rounded]
Symbol a [syllabic open central]
Symbol m̩ [syllabic bilabial nasal]
Symbol m [static bilabial nasal]
Symbol t [static unvoiced alveolar plosive]
Symbol b [static bilabial plosive]

demonstrational-rule-minus-one:
b => m / _ $

Romanizer-one:
 unchanged

demonstrational-rule-zero:
[syllabic] [nasal] => * [syllabic]

Romanizer-two:
 unchanged

demonstrational-rule-one:
t => * / _ $

and we input

ait
utib

then you'll see that, in the output, the "=>" symbols don't line up. This is, again, because of there being a combining diacritic there. It really doesn't matter, but, for documentation's sake, I'll say it anyway.
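One way to make the columns line up is to pad by displayed width rather than by code-point count. The Python sketch below is only an assumption about how that could be done, not a description of Lexurgy's output code. `display_width` and `pad` are hypothetical helpers, and every non-combining character is assumed to occupy one column.

```python
import unicodedata

def display_width(text):
    """Count characters that take up a column, skipping combining marks."""
    return sum(1 for ch in text if not unicodedata.combining(ch))

def pad(text, width):
    """Right-pad `text` with spaces to `width` displayed columns."""
    return text + " " * (width - display_width(text))

syllabic_m = "m\u0329"  # m with the syllabicity diacritic (two code points, one column)
row_a = pad("ai", 5) + "=> a"
row_b = pad("uti" + syllabic_m, 5) + "=> ..."
```

Padding with `len()` instead would give the second row one space too few, which is the misalignment described above.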

Intermediate romanizer directly before "Syllables: clear" clears the syllables before printing the romanization

For example:

...

Romanizer-after-changes:
 unchanged

Syllables:
 clear

Expected output:
pika => ˈpi.ka => ˈɸi.ɣa => ɸiɣa

Actual output:
pika => ˈpi.ka => ɸiɣa => ɸiɣa

Workaround: If you insert a null rule, then the output is the expected output above:

...

Romanizer-after-changes:
 unchanged

null-rule:
 unchanged

Syllables:
 clear

Feature Suggestion: A rule that runs after each rule

I would like to be able to define a rule that runs after each rule is run.

I propose the global keyword, or some other keyword, that can modify a rule (like propagate or feature/class filters).

Consider the following example:

insert-a:
    * => a / $ @C _

f-gem:
    @F$1 * => $1 $1 / @V _ @V

no-cc global:
    * => ə / @C _ @C

The no-cc rule will run after each rule. So, a word like ksen becomes:

ksen
(insert-a) => kasen
(no-cc)    => kasen    # no change because CC condition was not met
(f-gem)    => kassen
(no-cc)    => kasəsen

Such a global rule could allow for constant features, such as a tonal language forcing tones on any new vowel that appears.
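The proposed control flow can be sketched in Python. The three rules are approximated here with regular expressions over a tiny inventory (C = k, s, n; V = a, e; F = s). The `global` keyword does not exist in Lexurgy as far as this report indicates, and `run_with_global` is a hypothetical name.

```python
import re

C = "ksn"  # consonants occurring in the example word (assumed inventory)
V = "ae"   # vowels occurring in the example word (assumed inventory)

def insert_a(word):
    # * => a / $ @C _
    return re.sub(rf"^([{C}])", r"\1a", word)

def f_gem(word):
    # @F$1 * => $1 $1 / @V _ @V   (with F = {s})
    return re.sub(rf"(?<=[{V}])(s)(?=[{V}])", r"\1\1", word)

def no_cc(word):
    # * => ə / @C _ @C   (insert schwa at every consonant-consonant boundary)
    return re.sub(rf"(?<=[{C}])(?=[{C}])", "\u0259", word)

def run_with_global(word, rules, global_rule):
    """Apply each rule in order, running the global rule after every one."""
    history = []
    for name, rule in rules:
        word = rule(word)
        history.append((name, word))
        word = global_rule(word)
        history.append(("no-cc", word))
    return history

trace = run_with_global("ksen", [("insert-a", insert_a), ("f-gem", f_gem)], no_cc)
```

The resulting `trace` steps through kasen, kasen, kassen, kasəsen, matching the walkthrough above.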

Not applying a change.

I had the following input as the change:

`Feature voicing(unvoiced, voiced)
Feature place(labial, dental, alveolar, postalveolar, palatal, velar, glottal)
Feature manner(stop, fricative, nasal, approximant, affricate, flap)
Feature length(long, regular, short)
Feature diphthong(oy, longi, ow, longa, longo, oi, shorta)
Feature rhotic (rhotic, nonrhotic)
Feature height(low, mid, high, nearhigh, midlow, nearlow, midhigh)
Feature frontness(front, central, back)
Feature rounding (rounded, unrounded)
Feature vowelness (vowel, consonant)
Feature +stress

Symbol p [unvoiced labial stop consonant]
Symbol b [voiced labial stop consonant]
Symbol t [unvoiced dental stop consonant]
Symbol d [voiced dental stop consonant]
Symbol k [unvoiced velar stop consonant]
Symbol ʔ [unvoiced glottal stop consonant]
Symbol ɡ [voiced velar stop consonant]
Symbol f [unvoiced labial fricative consonant]
Symbol v [voiced labial fricative consonant]
Symbol θ [unvoiced dental fricative consonant]
Symbol ð [voiced dental fricative consonant]
Symbol s [unvoiced alveolar fricative consonant]
Symbol z [voiced alveolar fricative consonant]
Symbol x [unvoiced velar fricative consonant]
Symbol ɣ [voiced velar fricative consonant]
Symbol h [unvoiced glottal fricative consonant]
Symbol m [labial nasal consonant]
Symbol n [alveolar nasal consonant]
Symbol l [alveolar approximant consonant]
Symbol ɹ = [postalveolar approximant consonant]
Symbol t͡ʃ [unvoiced postalveolar affricate consonant]
Symbol d͡ʒ [voiced postalveolar affricate consonant]
Symbol ʃ [unvoiced postalveolar fricative consonant]
Symbol ʒ [voiced postalveolar fricative consonant]
Symbol ɾ [alveolar flap consonant]
Symbol ŋ [velar nasal consonant]
Symbol ɔ͡ɪ [regular oy vowel]
Symbol o͡i [regular oi vowel]
Symbol e͡ɪ [regular longa vowel]
Symbol t͡s [unvoiced alveolar affricate consonant]
Symbol o͡ʊ [regular longo vowel]
Symbol a͡ɪ [regular longi vowel]
Symbol a͡ʊ [regular ow vowel]
Symbol ɔ˞ [regular rhotic midlow back rounded vowel]
Symbol ɪ [regular nearhigh front unrounded nonrhotic vowel]
Symbol i [regular high front unrounded nonrhotic vowel]
Symbol æ [regular nearlow front unrounded nonrhotic vowel]
Symbol ɔ [regular midlow back rounded nonrhotic vowel]
Symbol e͡ə [regular vowel shorta]

Diacritic ː [long]
Diacritic ̥ [unvoiced]
Diacritic ̆ [short]
Diacritic ˈ (floating) [+stress]
diacritic ̩ [vowel]

class vowel {ɔ͡ɪ, o͡i, e͡ɪ, t͡s, o͡ʊ, a͡ɪ, a͡ʊ, iː, ă, ĭ, ĕ, ŏ, æ̆, ɑ̆, ɔ̆͡ɪ, ŏ͡i, ĕ͡ɪ, ŏ͡ʊ, ă͡ɪ, ă͡ʊ, ə̆, ɚ̆, ɛ̆, ɜ̆, ɝ̆, ɪ̆, ʊ̆, ʉ̆, ĭː, ɔ̆˞, ɔ˞, a, i, e, o, æ, ɑ, ə, ɚ, ɛ, ɜ, ɝ, ɪ, ʊ, ʉ, o}
class consvoiced {w, r, d, g, j, l, z, v, b, n, m, ŋ, ɹ, ʒ, ɫ, ð}

caught-cot-merger:
ɔ => ɑ
pre-eng-raising:
{ɪ, æ} => {i, e} / _ .* ŋ
prenasal-diphthognisation:
æ => e͡ə / _ [nasal]
u-fronting:
u => ʉ
diphthong-raising:
a͡ʊ => a͡u
ɔ͡ɪ => o͡i
a͡ɪ => a͡i
pre-voiceless-shortening:
{a, i, a͡ɪ, e, o, æ, ɑ, ɔ͡ɪ, o͡i, e͡ɪ, o͡ʊ, a͡ɪ, a͡ʊ, ə, ɚ, ɛ, ɜ, ɝ, ɪ, ʊ, ʉ, iː, ɔ˞, o, a͡i, a͡u, ɐ} => {ă, ĭ, ă͡ɪ, ĕ, ŏ, æ̆, ɑ̆, ɔ̆͡ɪ, ŏ͡i, ĕ͡ɪ, ŏ͡ʊ, ă͡ɪ, ă͡ʊ, ə̆, ɚ̆, ɛ̆, ɜ̆, ɝ̆, ɪ̆, ʊ̆, ʉ̆, i, ɔ̆˞, ŏ, ă͡i, ă͡u, ɐ̆} / _ .* [unvoiced]
l-assimilation:
gl => ɫ
g.l => .ɫ
ɫd => l
ɫ.d => l.
ɫɾ => l
ɫ.ɾ => l.
final-coda-obstruent-devoicing:
{[stop], [affricate], [fricative]} => [unvoiced] / [vowel] [consonant]* _ [consonant]* {., $}
open-o-restoration:
ɔ˞ => ɔ
ɔ̆˞ => ɔ̆`

It had trouble applying l-assimilation to wɛst.ko͡ʊst.

Add a "do nothing" rule

Sometimes a rule is syntactically required but the rule shouldn't do anything; this is most common for intermediate "romanizers" that should actually just emit the phonetic forms at that stage. Currently, you have to use a dummy rule like * => *. Instead, add a keyword that defines a "do nothing" rule, e.g.

Romanizer-old-examplish:
 unchanged

Confusing Error Message

So I was trying to apply a simple sound change, and I have tested this with multiple sound changes. The error message that I get is:
" => " doesn't make sense in the line "l => j / $ @consonant _" (line 14)
I'm confused as to why it's having trouble with its own divider.

ease of use: two-to-one curly bracket subrules

say I want to do:
demonstrational-rule-one:
a {e, o} => {e e, o o}
Then I have to say
demonstrational-rule-one:
{a e, a o} => {e e, o o}
lest it give me a "1 left, 2 right" error.
This isn't a major issue, but it did take me a while to figure out the first time around. It's not the most intuitive thing.
I do understand, however, that it might be hard to implement, seeing as the section inside the bracket is the thing the parser's trying to compare. I'd completely understand if this is unrealistic to attempt. In that case, it might be a good idea to mention this quirkiness in the documentation.

doubling stress diacritics when using nasalization diacritics in rules

I know how to fix this: declaring the combination as a Symbol. Regardless, this should work without having to do that.
Using
Symbol ɐ [vowellike syllabic open central lax]
Diacritic ˈ (floating) [stressed]
And
Romanizer-phonetic-twelve:
unchanged
vowel-ten:
n̩ => ɐ̃
[nasal syllabic] => n̩
ɐ => ɐ̃
Romanizer-phonetic-thirteen:
unchanged
I get
ɐˈnnɛ => ɐˈ̃ˈnnɛ => ɐˈ̃ˈnnɛ
mɐˈlɐʦɛˈrɪ => mɐˈ̃ˈlɐ̃ʦɛˈrɪ => mɐˈ̃ˈlɐ̃sɛˈrɪ
being respectively the two shown Romanizers and the final result. I removed the Then: from the subrules to fix double nasalization diacritics already, but it's still giving double stress marks. This is either a bug or a big mistake on my end.

Syllable features disappearing when using filter

With these sound changes:

Diacritic ˈ (before) [+stress]

Class vowel {a, i, u}
Class cons {p, t, k, x}

Syllables:
 @cons? @vowel @cons?

stress-assignement:
 <syl> => [+stress] / _ <syl> $

swapping @vowel:
 u a => a u

And these words:

katak
kuta
pakit
kutuka
akakti

We get:

ˈka.tak
ka.tu
ˈpa.kit
ku.ta.ku
a.ˈkak.ti

However by rewriting swapping not to use a filter:

swapping:
 u [] a => a [] u

We get the expected results with the stress mark still present:

ˈka.tak
ˈka.tu
ˈpa.kit
ku.ˈta.ku
a.ˈkak.ti

Word Boundaries in Syllabification Rules

Is there any way to define syllabification rules that are sensitive to word boundaries? E.g.,

Syllables: # Only allow nasal codas in word-final position
  [consonant]? [vowel]
  [consonant]? [vowel] [nasal]? $

Feature variables don't capture the default

This simple vowel harmony system behaves as expected:

Feature Type(*cons, vowel)
Feature Height(low, high)
Feature Depth(front, back)

Symbol a [vowel low back]
Symbol e [vowel low front]
Symbol i [vowel high front]
Symbol u [vowel high back]

harmony [vowel] propagate:
    [] => [$Height] / [$Height] _

But if I make low and front the default, this equivalent system fails to copy the default values:

Feature Type(*cons, vowel)
Feature Height(*low, high)
Feature Depth(*front, back)

Symbol a [vowel back]
Symbol e [vowel]
Symbol i [vowel high]
Symbol u [vowel high back]

harmony [vowel] propagate:
    [] => [$Height] / [$Height] _

Here's a sample set of input words:

kaki
putatu
ichegaku
epistrefu

The first sound change file produces the expected:

kake
pututu
ichiguku
epestrefa

The second file produces:

kaki
pututu
ichiguku
epistrifu

It copies the high value rightward across the word, but not the low value.
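The expected outputs can be checked with a small model in which a default value is copied just like any other value. This Python sketch only illustrates the intended semantics; `VOWELS`, `SYMBOL` and `harmonize` are hypothetical and unrelated to Lexurgy's internals.

```python
# Full feature values for each vowel, with the defaults spelled out.
VOWELS = {
    "a": ("low", "back"),
    "e": ("low", "front"),
    "i": ("high", "front"),
    "u": ("high", "back"),
}
# Reverse lookup from (height, depth) to the vowel symbol.
SYMBOL = {features: symbol for symbol, features in VOWELS.items()}

def harmonize(word):
    """Copy the first vowel's height onto every later vowel, keeping each vowel's depth."""
    out = []
    height = None
    for ch in word:
        if ch in VOWELS:
            if height is None:
                height = VOWELS[ch][0]  # the first vowel sets the height
            else:
                ch = SYMBOL[(height, VOWELS[ch][1])]
        out.append(ch)
    return "".join(out)

results = [harmonize(w) for w in ["kaki", "putatu", "ichegaku", "epistrefu"]]
```

Because "low" is treated as an ordinary value here, the model produces the first file's outputs whether or not low is marked as the default.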

Can't delete a repeater (*, +, ?)

"Invalid element types: RepeaterMatcher and NullEmitter". I do not have these sequences of characters in my file and I believe they aren't mentioned in the guide. I have no idea what to do with them.

Switching Word Order in Lexurgy SC

How do I switch word order in Lexurgy? I could try wordToSwitch !Q+$1 => $1 wordToSwitch but you cannot use more than one suffix in Lexurgy SC.

Make keywords case-insensitive

Case sensitivity in keywords seems overly picky. We should allow keywords that currently must be capitalized (Feature, Diacritic, Symbol, Deromanizer, Romanizer) to be lowercase, and keywords that currently must be all lowercase (floating, propagate) to be capitalized.

Add negated class notation

We should be able to match anything but a member of a class. For example, if we have Class stop {p, t, k}, then !@stop should match anything but p, t, or k.

location notification for unequal left and right element count

"found 2 elements on the left and 1 on the right at line 225, column 45" doesn't help much if you don't know at what rule this happened, especially since the current format doesn't display line and column numbers.

Suggestion: show only start and end stages

The current checkbox is to show only the end-stage after all sound changes are applied.

I'd like an option to view only the start and end stages, i.e. start => end, in the same format as with Show Stages turned on.

How do I write two suffixes at a time?

I don't mean language suffixes, but Lexurgy's suffixes like $1 and *, which are the two ones I'd like to use at the same time. I tried this under "shortening:":
{@vowel, !@vowel} *$1 {@vowel, !@vowel}$2 $$ $1 {@vowel, !@vowel}$3 => $1 $2 $3
What I mean is something like how "it is" turns into "it's" in English. But it may happen on a larger scale, e.g. anko ante becoming ankote. This is definitely not natural, but this is not a conlang, just practice with Lexurgy! Maybe I read the documentation wrong, but can you help me out? Thanks! 😄

Make romanizers and deromanizers more intuitive

In a romanizer, the last sequential subrule emits romanized text, which ignores all the phonetic declarations. This is confusing, especially when there's only one sequential subrule, and you have to put a dummy rule like * => * to make phonetic rules work. Make this distinction explicit; e.g. you have to put literal after the Romanizer or Then keyword to make that section of the rule ignore phonetic declarations.

(And vice versa for deromanizers.)

Error on lines with whitespace only

Lines that contain only whitespace and nothing else should be ignored, but they cause parse errors instead. See Issue #15.
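A pre-processing pass is one possible approach. The Python sketch below blanks out whitespace-only lines before parsing and keeps the line count unchanged, so that line numbers in later error messages still match the file. It is a hypothetical illustration, not the fix adopted in Lexurgy.

```python
def blank_whitespace_only_lines(source):
    """Turn lines containing only spaces or tabs into empty lines.

    The number of lines is preserved so that error positions stay accurate.
    """
    return "\n".join("" if not line.strip() else line for line in source.splitlines())

cleaned = blank_whitespace_only_lines("rule-one:\n    a => b\n   \t\nrule-two:\n    b => c")
```

After this pass the third line is empty, while the indented rule lines are untouched.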

Suggestion: Make it possible to define classes from features

For example, I would like to define the following:

# (currently produces an error saying that "[" makes no sense in this context)
Class sonorant {[vowel], [nasal], [lateral], [approximant]}

Or are there any other elegant ways to do this that I haven’t thought of?

not allowing syllable boundaries in output

I use syllable boundaries in my rules, so why can't I use them in the output of those rules?

Stress diacritic replaces vowel by the first vowel symbol declared

I've been trying to run a simple file on Lexurgy 0.8.2:

Feature Type(*cons, vowel)
Feature Stress(*unstressed, stressed)

Diacritic ' [stressed]

Symbol æ [vowel]
Symbol ɛ [vowel]
Symbol i [vowel]
Symbol ə [vowel]
Symbol o [vowel]

stress-first-syllable [vowel]:
[] => [stressed] / $ _

With the input əonə I get æ'onə, which is weird because it shouldn't replace the vowel, but only add the stress diacritic. If I make the symbol o the first one declared, then I get o'onə. I would expect to get ə'onə.

Adding diacritics fails if I've explicitly specified the default value

I have this example lsc file:

Feature Type(*cons, vowel)
Feature Height(*low, high)
Feature Depth(*front, back)
Feature Stress(*unstr, str)

Diacritic ́  [str]

Symbol a [vowel low back]
Symbol e [vowel low front]
Symbol i [vowel high front]
Symbol u [vowel high back]

stress [vowel]:
    [] => [str] / $ _

This fails with an error when I apply it to almost anything:

No combination of a symbol and diacritics has the matrix [vowel high str]

These features should produce í, but the explicit "front" in the definition of the i symbol seems to be preventing this.

Feature Suggestion: Applying a rule only on a syllable of given structure

I am not sure if there is a way to do this, but here we go.

I want to be able to have a rule only apply on syllables of given structure.

For example, let's say I want only CVC syllables to gain an additional r (CVrC) but ignore all other syllables (so CV.CV should not work, but CVC.CV will, for example).

Writing this as rules:

Feature Type(C, V)
Syllables:
[C]? [V] ([C] ([C]?))?
# rules...

In the current system I would need to do the following:

insert-r-on-cvc:
* => r / [C] [V] _ [C] // _ . [C]

This is verbose and error prone since you have two conditions you need to worry about.

I propose the following syntax:

insert-r-on-cvc  <syl>([C] [V] [C]):
* => r / [C] [V] _ [C]

which says that this rule only applies on syllables with form [C] [V] [C].

The syntax feels ugly; perhaps a better one can be thought up.

The feature value 'velar' doesn't show up!

I have some Lexurgy code (a little repetitive, so please just skim it!)

Feature type (consonant, vowel)
Feature height (low, lowermid, lowmid, mid, midhigh, high)
Feature frontness (front, central, back)
Feature rounded
Feature nasal
Feature manner (nasal, plosive, fricative, lateral) 
Feature place (labial, alveolar, postalveolar, dorsal)
Feature voiced
Diacritic ~ [+nasal]

Symbol i [high front -rounded -nasal vowel]
Symbol y [high front +rounded -nasal vowel]
Symbol u [high back -rounded -nasal vowel]
Symbol ɪ [midhigh front -rounded -nasal vowel]
Symbol ʊ [midhigh back -rounded -nasal vowel]
Symbol e [mid front -rounded -nasal vowel]
Symbol ə [mid central -rounded -nasal vowel]
Symbol o [mid back +rounded -nasal vowel]
Symbol ɛ [lowmid front -rounded -nasal vowel]
Symbol ʌ [lowmid back -rounded -nasal vowel]
Symbol ɔ [lowmid back +rounded -nasal vowel]
Symbol æ [lowermid front -rounded -nasal vowel]
Symbol a [low front -rounded -nasal vowel]
Symbol m [nasal labial +voiced consonant]
Symbol n [nasal alveolar +voiced consonant]
Symbol p [plosive labial -voiced consonant]
Symbol b [plosive labial +voiced consonant]
Symbol t [plosive alveolar -voiced consonant]
Symbol d [plosive alveolar +voiced consonant]
Symbol k [plosive dorsal -voiced consonant]
Symbol g [plosive dorsal +voiced consonant]
Symbol s [fricative labial -voiced consonant]
Symbol z [fricative labial +voiced consonant]
Symbol ∫ [fricative postalveolar -voiced consonant]
Symbol ʒ [fricative postalveolar +voiced consonant]
Symbol θ [fricative alveolar -voiced consonant]
Symbol ð [fricative alveolar +voiced consonant]
Symbol χ [fricative dorsal -voiced consonant]
Symbol l [lateral alveolar +voiced consonant]

There shouldn't be a problem (I hope!), but it gives me this error: The feature value "velar" is not defined. I didn't mention velar! I need a little help.

Intervocalic loss of /ɣ/ is not working

When I used this: ɣ => * / [vowel] _ [vowel] to remove /ɣ/ intervocalically, it didn't work.
What I wanted: eɣa => ea
The result: eɣa => eɣa

Allow two deromanizers with no other rules between them

Currently I have to put a dummy rule between deromanizers:

Deromanizer-abc:
    <rules>

dummy:
    a => a

Deromanizer-def:
    <rules>

I should be able to put one right after the other.

Syllable features disappearing when inserting segment in non-initial position

Same end result as #44, but different trigger.

.lsc file

Feature (syllable) +stress

Diacritic ˈ (before) [+stress]

Class vowel {a, i, u}
Class cons {p, t, k, x}

Syllables:
 @cons? @vowel @cons?

stress-assignement:
 <syl> => [+stress] / _ <syl> $

neutralisation:
 * @cons => x * / _ a

.wli file

katak
akatak

Results

ˈxa.xak
a.xa.xak

Expected

ˈxa.xak
a.ˈxa.xak

Remarks

Results are the same regardless of the placement of the stress mark (first, before, etc)

Program crashes when running with --compare-versions if the _ev file doesn't exist

If I run lexurgy with the --compare-versions flag before I have run it to create the _ev.wli file, it crashes. Running with -d gives the following error:

Applying changes to words in pereyan.wli
java.io.FileNotFoundException: pereyan_ev.wli (The system cannot find the file specified)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at kotlin.io.FilesKt__FileReadWriteKt.forEachLine(FileReadWrite.kt:190)
        at kotlin.io.FilesKt__FileReadWriteKt.readLines(FileReadWrite.kt:219)
        at kotlin.io.FilesKt__FileReadWriteKt.readLines$default(FileReadWrite.kt:217)
        at com.meamoria.lexurgy.WordlistsKt.loadList(Wordlists.kt:7)
        at com.meamoria.lexurgy.sc.SoundChangerJvmKt.changeFiles(SoundChangerJvm.kt:108)
        at com.meamoria.lexurgy.sc.SoundChangerJvmKt.changeFiles(SoundChangerJvm.kt:26)
        at com.meamoria.lexurgy.SC.run(MainJvm.kt:82)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:168)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:176)
        at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:16)
        at com.github.ajalt.clikt.core.CliktCommand.parse(CliktCommand.kt:258)
        at com.github.ajalt.clikt.core.CliktCommand.parse$default(CliktCommand.kt:255)
        at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:273)
        at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:298)
        at com.meamoria.lexurgy.MainJvmKt.main(MainJvm.kt:118)

It would be better to simply display an error about the file not existing, or even just print a warning and then go on to run the sound changes without writing a comparison file.
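The suggested fallback could look like the sketch below. It is written in Python purely for illustration (Lexurgy itself is Kotlin), and `load_previous_output` is a hypothetical function, not the one in the stack trace. It warns and returns `None` when the comparison file is missing, so the caller can go on without writing a comparison.

```python
from pathlib import Path

def load_previous_output(path):
    """Return the lines of the previous output file, or None if it does not exist."""
    file = Path(path)
    if not file.is_file():
        # Warn instead of crashing; the caller can skip the comparison step.
        print(f"Warning: {file} not found; running without version comparison")
        return None
    return file.read_text(encoding="utf-8").splitlines()

previous = load_previous_output("no_such_file_ev.wli")
```

With `previous` set to `None`, the sound changes can still be applied and only the comparison is skipped.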
