rism-digital / pae-code-spec
Issue tracker and website for the Plaine and Easie specification
Home Page: https://plaine-and-easie.info/
Not sure if this is a Muscat thing, a Verovio thing, or a P&E thing, so I'll start discussion here.
The guidelines say:
12. Change of clef, key or time signature:
Use % to change the clef, $ to change the key, and @ to change the time signature. Follow this with the new indication (clef, key, or time), followed by a space.
Examples:
%C-1 '2A
%C-1 $xFC '8B
@3/2 '1C
$nBE $xFC
I'm interested in the last example. With the new Verovio validation, if you enter something like 4CDEF/$nBE $xFC AB''C
then it's not happy because you can't change the key signature more than once in a measure:
But adding the required bar line 4CDEF/$nBE /$xFC AB''C
isn't very satisfying optically:
But maybe you can just leave out the cancelled key signature and go directly to the new one:
4CDEF/$xFC AB''C
The previous Verovio version (still visible on muscat-training) didn't print the cancelled key signature anyway, so I guess we haven't missed this feature. I am wondering whether Verovio should be adjusted, or whether we can add a shortcut to the guidelines that allows going directly to the new key signature (analogous to going directly to the new clef or meter).
Currently the specification does not say anything about disallowing beam nesting.
The MARC21 031 $2 field uses the value pe to indicate that the data in the field is Plaine & Easie code. This code is taken from the MARC21 Incipit Scheme Source Codes.
Depending on the changes to the specification, we may need to indicate a new value for this field (e.g., p2, if we want to keep it to two characters).
This issue is simply to mark this as a consideration when updating the specification, and a reminder to assess whether this is necessary later in the process.
So far RISM catalogers haven't had to declare the initial octave or duration if beginning in the first octave after Middle C with quarter notes. Therefore, CDEF displays as intended even if '4CDEF would, strictly speaking, be required.
Not sure if this is a Verovio thing or a spec thing, or whether it is even desirable to have these defaults built in. Or maybe that's just been local RISM practice for years?
Even {CD} defaults to 8th notes at the beginning of a measure.
The sentence "If the music is written for a transposing instrument, notate the incipit at sounding pitch." was removed from the clef section, but this should be added to the encoding guidelines.
I found 76 cases with repeated sharps/flats in the key signature (see list, the real case number might be higher). Examples are: $bBEE, $xFCGF, but also $xFCG[G]
Should we correct those entries and modify the paec-grammar in order to detect them?
What about $xFCG[G]? We probably have no way of knowing which G is the superfluous one.
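A minimal sketch of how such entries could be detected, in case it helps the discussion. It only counts note letters and ignores brackets, so it flags $xFCG[G] without deciding which G should go. The function name is hypothetical and is not part of paec-grammar.

```python
import re
from collections import Counter


def repeated_key_sig_notes(key_sig: str) -> list[str]:
    """Return the note letters that occur more than once in a key signature.

    Hypothetical helper, not part of paec-grammar. Only the uppercase
    note letters A-G are counted, so the lowercase accidental markers
    (x, b, n), the '$' delimiter and square brackets are ignored.
    """
    letters = re.findall(r"[A-G]", key_sig)
    counts = Counter(letters)
    return sorted(letter for letter, count in counts.items() if count > 1)


# Examples taken from the cases listed above.
for example in ("$bBEE", "$xFCGF", "$xFCG[G]", "$xFCG"):
    print(example, repeated_key_sig_notes(example))
```

An empty list means no letter is repeated. Whether a repeated letter inside brackets should be treated differently is left open here.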
P&E says:
but Verovio is not as strict: both ,8qB and ,q8B seem to be allowed.
Could you please clarify the order of elements here? Rhythm-q or should it be q-rhythm?
Currently the key signature can be empty. However, this makes it unclear whether the key signature is missing, or whether it is a completely natural key signature.
The section should be clarified to say that encoders should provide a value, and that a value of n can be given if no key signature is present.
It should also clarify that parsers should assume a value of n if no key signature is present.
The rests section does not actually spell out the form of a rest with a duration. This should be added.
We should have provision for hosting any 'legacy' specifications, in addition to any 'new' versions. This is so that people can refer back to former versions of the spec if they have legacy data.
This will mean a decision about what to call the current version, to differentiate between it and a changed version. I'm not sure if calling it "Version 1" is appropriate?
Time signatures should be an optional field, since they are not necessary.
Split off from #11
The spec should explicitly state that key signatures with consecutive accidentals requiring square brackets must be included within a single square bracket. Consecutive square brackets are not allowed. For example: $xF[CG] is allowed, $x[F]C[G] is allowed, but $xF[C][G] is forbidden.
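As a rough illustration of the rule, the check below rejects adjacent bracket groups. It also rejects unbalanced, nested, or empty brackets, which is an extra assumption that goes beyond what this issue asks for. The function name is hypothetical.

```python
import re


def key_sig_brackets_ok(key_sig: str) -> bool:
    """Check the bracket rule for a key signature (hypothetical helper).

    Consecutive groups such as [C][G] are rejected. As an additional
    assumption, anything that is not a simple, non-empty group of note
    letters in one pair of brackets is rejected as well.
    """
    if "][" in key_sig:
        return False
    # Remove every well-formed group; leftover brackets indicate a problem.
    without_groups = re.sub(r"\[[A-G]+\]", "", key_sig)
    return "[" not in without_groups and "]" not in without_groups


for example in ("$xF[CG]", "$x[F]C[G]", "$xF[C][G]"):
    print(example, key_sig_brackets_ok(example))
```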
The prose of the octave values section can be improved to set out the form and number of apostrophes and commas.
This should also be the place where it is specified that omitting the octave marker will keep the following notes in the same octave, and to switch octaves you need to explicitly give a new octave marker.
Currently the "music notation" section launches straight into the octave symbols description, but it would be good to give an overview of the expected form of notation within the "body" of the PAE encoding.
Perhaps we can re-employ and update the visual structure provided by Massimo earlier (attached).
Currently the 'note names' section contains only the note letter names.
<section>
<h4>Note Names</h4>
<p>C, D, E, F, G, A, B</p>
</section>
A brief prose description should be added indicating how they combine with other parts of a note, among other things.
In accordance with RISM's general policy that cataloguers should reflect what they see on the source, a number of records include sharp signs in positions where we would today use a natural. See e.g.
https://muscat.rism.info/admin/sources/452507507
This causes no particular difficulty for the (sufficiently intelligent) human user, but I wonder whether the IT side can also handle such irregularities when searching for incipits or comparing them. If not, we should amend the Muscat guidelines (and potentially the PAE specs) to explicitly state that such cases should be normalized.
Summary: As a software developer, I would like to see neumatic notation encoded differently, to make it easier to parse.
Description:
Currently neume notation is indicated by using the special 7. note duration value. This means that parsers need to be prepared to encounter this value in any stream of notes, which makes parsing the note values more difficult even though encountering neume notation in the middle of a common notation note stream is highly unlikely.
A possible solution is to indicate neume notation by using the second character of a clef indicator, similar to mensural notation. For example, the colon : could be used, e.g., G:2. Other characters could be used as well.
A preliminary check of the existing RISM incipits indicates that we currently have no incipits encoded this way, so the impact of this change will be small on this dataset. We should inquire more widely to assess the broader impact.
In the spec (V1) there is no explanation of how to encode tied chords (the problem is already mentioned in V2). Some users have worked around this by applying their own rules:
A non-normative section on serialization formats (MARC, single-line, multi-line, JSON) should be added to the spec to document the different input formats that may be encountered.
@lpugin: I think we need to decide if n MAY be given (for example, when changing from xFCG to nxF). If yes, we also need to decide the allowed order(s).
@ahankinson : We should handle this in the section on inline changes to key signatures.
From #54
'nC vs n'C
We should determine who are the current editors / former editors / authors of the specification.
ReSpec allows one to specify the people involved in making the specification. Editors should be those who are currently involved in reviewing PRs and making changes, so we will need to determine who should be added here. Currently I have pulled over those listed on the IAML PAE page, but we can adjust this when we figure out who is currently involved.
Former editors can be specified for those people who have significant input, but are no longer active in maintaining the spec.
Authors (if necessary) can be used to acknowledge a fundamental role in creation.
The spec should explicitly state whether clef, key signature, or time signature data are required fields, or if they can be omitted.
The form and use of time signatures should be explicitly defined. This will need to be tied to the type of notation being rendered; for example, the "o" and "o." are only valid in mensural notation.
As well, the field should be restricted to a single time signature in the "time signature" field, eliminating the use of semi-colon separated values.
We should determine whether we want to use RFC2119/RFC8174 keywords in the language of the specification:
"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
Usually specs use a subset of these -- e.g., just MUST and MUST NOT (requirements) SHOULD (best practice), MAY (optional).
The spec currently says this about multi-measure rests:
measure rest (followed by number of measures and a bar line)
This should be made more explicit and state that a multi-measure rest MUST be followed by a bar line.
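A sketch of how a validator could enforce this, under two assumptions that are not stated in this issue: that a measure rest is written with = followed by the number of measures, and that a bar line starts with / or, for a repeat ending, with ://. The function name is hypothetical.

```python
import re

# Assumption: a multi-measure rest is '=' followed by one or more digits.
_MULTI_REST = re.compile(r"=\d+")


def multi_rests_missing_bar_line(pae: str) -> list[str]:
    """Return every multi-measure rest not directly followed by a bar line."""
    problems = []
    for match in _MULTI_REST.finditer(pae):
        following = pae[match.end():]
        # Assumption: bar lines begin with '/' or with '://' (repeat end).
        if not following.startswith(("/", "://")):
            problems.append(match.group(0))
    return problems


print(multi_rests_missing_bar_line("'4C/=5/'4D"))   # rest is closed by a bar line
print(multi_rests_missing_bar_line("'4C/=5 '4D"))   # rest is not closed
```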
We should decide if we want ordered sharps/flats in the key signature or allow arbitrary sequences. For example: do we tolerate D major written as $xCF, or must it be $xFC?
I found 225 cases with the wrong order of sharps/flats (including cases like $Bb), see attached list. The real number could be slightly higher than this, since grammatically wrong key signatures are excluded from the analysis.
The current paec-grammar version does not check the order of sharps/flats.
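If we do decide to require an order, a check along these lines might be enough. It is only a sketch: it assumes the conventional order of sharps (F C G D A E B) and flats (B E A D G C F), ignores square brackets, and only tests the relative order, so a signature with a gap such as $xFG would still pass. The function name is hypothetical and is not part of paec-grammar.

```python
SHARP_ORDER = "FCGDAEB"
FLAT_ORDER = "BEADGCF"


def key_sig_in_order(key_sig: str) -> bool:
    """Check that the accidentals of a key signature follow the usual order.

    Hypothetical helper. Brackets are ignored, and an empty signature or
    a plain 'n' counts as valid.
    """
    body = key_sig.lstrip("$").replace("[", "").replace("]", "")
    if body in ("", "n"):
        return True
    kind, notes = body[0], body[1:]
    order = {"x": SHARP_ORDER, "b": FLAT_ORDER}.get(kind)
    if order is None or not notes:
        return False
    if any(note not in order for note in notes):
        return False
    positions = [order.index(note) for note in notes]
    # Strictly increasing positions: correct order and no repeated letters.
    return all(a < b for a, b in zip(positions, positions[1:]))


for example in ("$xFC", "$xCF", "$Bb", "$bBEA"):
    print(example, key_sig_in_order(example))
```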
The current specifications of PAE support the encoding of fermata with the ( and ) characters.
From the current specifications
fermata (include only one note or rest; accidentals or octave symbols must be outside the parentheses. See also "Special rhythmic groupings" below.)
The main problems with this approach are:
The proposal would be to use a single character for encoding fermata since there is no need to have a start and end delimiter. It would avoid overlap with tuplets, it would also work better with chords and would be more in line with the current encoding of trills.
Some possible characters could be:
?
*
c (corona)
h (hold)
p (pause and point d'orgue)
Placement of the sign should be after the note / chord.
"In the RISM guidelines for the field Key signature (031 $n), it says "Add missing sharps or flats in brackets." You use this if, for example, the piece is clearly in A major but the source only has xFC. You enter xFC[G]. This is a very old RISM rule that we've had since CD-ROM days. In the CD-ROM the sharp/flat in the key signature would display in brackets. In Muscat, the [G] can be entered but it does not display in brackets. Can you change it so that the sharps/flats are in brackets?"
The specification should be adjusted to allow square brackets in the key signature.
Potentially: Remarks, Comments, Commentary
While the version 2 specification will help clarify the boundaries of the Plaine & Easie code, it will also create a hole. Previous versions were part specification, part encoding guideline. In Version 2 the guidelines are largely going to feature less in the specification, but there is likely still an audience for them.
So we should create a second document that can contain implementation guides, best practices, or other guidelines. Like "record transposing instruments in a non-transposed way" or "Should contain 3-4 measures", etc.
As discussed on the 21.10.2021 call, the PAE code spec will be split into two versions: v1 will be the current spec and will contain only enhancements and clarifications to the existing spec.
v2 will contain backwards-incompatible changes.
A landing page will route people to the right place.
We should decide on an appropriate copyright license for the PAE code specifications. There currently is not one assigned.
If we want a "public domain" license, we can choose the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication.
If we retain some "rights" and use an actual copyright license (even other Creative Commons licenses) then we will need to decide who these rights are assigned to. IAML? RISM? The Editors?
Currently the "Rhythmic values" section has no prose to begin the section, and only a table giving the values.
The friendliness and clarity of this section could probably be improved by actually describing the field and its possible values first before showing examples.
P&E allows compound meters—not meters that change later on, but things like this:
Could you please clarify the validity and display of this? Meters like c; 3/4 (starts in c, changes to 3/4 later) are obviously not allowed but c3/2 and similar are present throughout the RISM data. Currently, the Verovio validator gives you an error message with c3/2: "The input contains one or more character(s) 'c3/2'." But it should be allowed.
See also rism-digital/muscat#809, where other meters are listed that do not display and trigger a warning with the new validation.
The new PAE spec has a section that spells out the different ways Plaine and Easie can be represented: MARC/UNIMARC, Single-string, multi-line string, and JSON.
Currently the MARC and UNIMARC codes for the fields are given at the start of each section. These should be moved to the MARC and UNIMARC sections to provide users with a single point of reference for the different possible fields.
To replace content moved to the "encoding guidelines".
Summary: As a software developer, I would like ties to be encoded with the underscore (_) rather than the plus (+), so that I can use PAE more easily in a web-based environment.
Background: The plus character, when it occurs in a URL, is used to represent a blank space. When a PAE string is used as a query parameter this can lead to some problems. Since a space can be interpreted as a "+" this means the following PAE code:
'C 'C 'D 'E 'F
can become:
'C+'C+'D+'E+'F
This may incorrectly be interpreted as a series of tied notes.
Ideally all ties would be encoded using the proper URL encoding for plus signs, %2B. However, this can lead to problems when encoding and unencoding multiple times:
'C+'C 'D 'E 'F -->
'C%2B'C%20'D%20'E%20'F -->
'C%2B'C+'D+'E+'F -->
'C%2B'C%2B'D%2B'E%2B'F
When the final stage is encountered, it is not possible to tell which notes were originally tied together, and which were not.
Most characters will be URL encoded when placed in a query parameter, and these are treated as either URL encoded (%2B) or, in some browsers, are used as a raw value. However, the difference with the plus sign is that the literal interpretation of it, and the URL encoding of it, can be ambiguous as to what is meant.
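The ambiguity can be reproduced with Python's standard urllib.parse module. This is only an illustration of form-style decoding, where + means a space. The helper names are hypothetical, and the underscore example uses the tie character proposed in this issue, not current PAE.

```python
from urllib.parse import quote, quote_plus, unquote_plus


def decode_query_value(raw: str) -> str:
    """Decode a query parameter value the form-style way ('+' becomes a space)."""
    return unquote_plus(raw)


def encode_query_value(pae: str) -> str:
    """Percent-encode a PAE string so that '+' is sent as %2B and ' ' as %20."""
    return quote(pae, safe="")


tied = "'C+'C 'D"      # first two notes tied
untied = "'C 'C 'D"    # no ties

# A raw '+' is read as a space, so the tie is lost on decoding.
print(decode_query_value(tied))

# Form-style encoding writes spaces as '+', which looks like ties if read literally.
print(quote_plus(untied))

# Percent-encoding once and decoding once keeps the tie.
print(decode_query_value(encode_query_value(tied)))

# The proposed underscore is not touched by form-style decoding.
print(decode_query_value("'C_'C 'D"))
```

The problem described above only appears when the number of encode and decode steps does not match, which is hard to guarantee across clients and servers.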
What does the future of $s look like? In Plaine & Easie officially it is allowed:
5. Coded validity note
A one-character coded validity note can be introduced by a '~' at the end of the code.
Accepted characters are:
? a mistake in the incipit has not been corrected
+ a mistake in the incipit has been corrected
t incipit has been transcribed into modern notation
https://www.iaml.info/plaine-easie-code#toc-5
But starting with Muscat, we don't want to use it any more:
Validity (031 $s)
Do not enter anything into this field! (It is only used for old data.)
The reasoning is because someone seeing a "+" etc. in a record wouldn't intuitively know what was meant, and if changes were made then you would just say so in a note, without the encoded character.
I see that $s is only used ca. 2,500 times. I haven't checked for ~ at the end of the code though.
I'd like to propose a Muscat validation that creates a warning if $s is filled out (we don't want the field to be used, but people might accidentally use it) but I thought I'd ask here first.
Currently the spec does not explicitly state which clef shapes are valid.
This should be amended to say that clef shapes should be one of three values: C, G, or F.
Double dots do not exist in mensural notation. However, the specs are not explicit about the fact that having 2..8 (which would mean 𝅗𝅥..𝅘𝅥𝅮 in CMWN) is not supported for mensural notation. This should be clarified.
The specifications are currently not clear about a possible requirement regarding the order of these, except maybe through the table of signs.
We need to decide if we want to restrict the code validity to a specific order, and if yes which one. The one given in the table looks logical. That is:
Although the pitch and octave of every note in PAE is individually encoded, the absence of a clef field means that it cannot be properly drawn.
Clef should therefore be a required field in PAE.
Follows the discussion in #11
Should spell out the possible values of bar and measure lines.
We have many incipits where + has been used for encoding ligatures instead of ties.
7.
)We need to clarify if this is supported or not. If yes, tooling needs to be adjusted accordingly. We should also consider adjusting the specifications to use another sign.
(We also have cases where + has been used to represent slurs. This is simply wrong.)
In some places in the PAE specification the value for the single-string representation is given, e.g., for key signature it says:
Begin this field with the character $
However, this is only necessary when using the "single-string" representation. All other representations, including MARC, do not require this.
The exception to this is when discussing clef/keysig/timesig changes, so references to the field delimiters will be kept in that section.
The current specifications of PAE support the encoding of chords with the ^ character.
From the current specifications
Enter chords from the highest to the lowest note, each one separated by ^.
Example:
''2D^'A^xF
There are several issues with this approach:
The proposal would be to change the chords to a container markup. To preserve some graphical similarity with the current ^, a proposal would be to use < and >. These characters are not used elsewhere in PAE.
Example: 2<''D'AxF>
Some rules would be:
< chord start delimiter
> chord end delimiter
It would need to be decided if fermata and trills should be outside or inside chord delimiters.
The version field should be added to the various PAE representations, with values of pe or pe2.
If version is empty or missing, then pe is assumed.
What this looks like:
In multi-line strings:
@version:pe2
In JSON:
{ "version": "pe2" }
In MARC21:
$2pe2
(subject to approval and addition to the Musical Incipit Scheme Codes list)
For the single-string representation, we have a couple of options:
pe2 at the beginning; if the string starts with pe2 then it's version 2; if it starts with pe, or starts with a clef, then it's version 1
;pe2 or ;pe. I don't think this is necessary, but it could help if the fields get out-of-order.
Related to #5
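To make the defaulting rule concrete, here is a small sketch for the two representations whose form is already fixed in this proposal (JSON and multi-line). The function names are hypothetical. The single-string case is left out because the options above are still undecided.

```python
import json

DEFAULT_VERSION = "pe"  # assumed when the version is empty or missing


def version_from_json(text: str) -> str:
    """Read the version from a JSON representation such as {"version": "pe2"}."""
    data = json.loads(text)
    return data.get("version") or DEFAULT_VERSION


def version_from_multiline(text: str) -> str:
    """Read the version from a multi-line representation (an '@version:' line)."""
    prefix = "@version:"
    for line in text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].strip() or DEFAULT_VERSION
    return DEFAULT_VERSION


print(version_from_json('{"version": "pe2"}'))
print(version_from_multiline("@version:pe2"))
print(version_from_multiline(""))  # no version line, falls back to the default
```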
Currently the accidentals section does not contain any prose setting out the form of accidentals and how they should be employed. (position in a pitch string, etc.)