
pae-code-spec's People

Contributors

ahankinson, jenniferward, lpugin

pae-code-spec's Issues

Clarify whether `n` is a valid accidental type in key signatures.

Not sure if this is a Muscat thing, a Verovio thing, or a P&E thing, so I'll start discussion here.
The guidelines say:

12. Change of clef, key or time signature:
Use % to change the clef, $ to change the key, and @ to change the time signature. Follow this with the new indication (clef, key, or time), followed by a space.

Examples:
%C-1 '2A
%C-1 $xFC '8B
@3/2 '1C
$nBE $xFC

I'm interested in the last example. With the new Verovio validation, if you enter something like 4CDEF/$nBE $xFC AB''C then it's not happy, because you can't change the key signature more than once in a measure.

But adding the required bar line, 4CDEF/$nBE /$xFC AB''C, isn't very satisfying visually.

But maybe you can just leave out the cancelled key signature and go directly to the new one:
4CDEF/$xFC AB''C

The previous Verovio version (still visible on muscat-training) didn't print the cancelled key signature anyway, so I guess we haven't missed this feature. I am wondering whether Verovio should be adjusted, or whether we can add a shortcut to the guidelines to go directly to the new key signature (analogous to going directly to the new clef or meter).

Clarify beam nesting

Currently the specification does not say anything about disallowing beam nesting.

Determine a new version code for 031$2

The MARC21 031 $2 field uses the value pe to indicate that the data in the field is Plaine & Easie code. This code is taken from the MARC21 Incipit Scheme Source Codes.

Depending on the changes to the specification, we may need to indicate a new value for this field. (e.g., p2 if we want to keep it to two characters)

This issue is simply to mark this as a consideration when updating the specification, and a reminder to assess whether this is necessary later in the process.

Default values in incipits?

So far RISM catalogers haven't had to declare initial octave or duration if beginning in the first octave after Middle C with quarter notes. Therefore, CDEF displays as intended even if '4CDEF would, strictly speaking, be required.

Not sure if this is a Verovio thing or a spec thing, if it is even desirable to have these defaults built in? Or maybe that's just been local RISM practice for years?

Even {CD} defaults to 8th notes at the beginning of a measure.

Add note about transposing instruments

The sentence "If the music is written for a transposing instrument, notate the incipit at sounding pitch." was removed from the clef section, but this should be added to the encoding guidelines.

Repeated sharps or flats in key signature

I found 76 cases with repeated sharps/flats in the key signature (see list; the real number of cases might be higher). Examples are $bBEE and $xFCGF, but also $xFCG[G].

Should we correct those entries and modify the paec-grammar in order to detect them?

What about $xFCG[G]? We probably have no way of knowing which G is the superfluous one.
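A check for such repeats could be sketched roughly as follows. This is only an illustration in Python, not the paec-grammar itself: the function name `repeated_accidentals` is hypothetical, and the sketch assumes that note letters are uppercase A-G and that square brackets can simply be ignored when counting.

```python
import re
from collections import Counter


def repeated_accidentals(keysig):
    """Return the note letters that occur more than once in a key signature.

    Square brackets are ignored, so "$xFCG[G]" reports "G".
    """
    body = keysig.lstrip("$")
    # Uppercase letters are note names; the lowercase x/b/n prefix is skipped.
    letters = re.findall(r"[A-G]", body)
    counts = Counter(letters)
    return sorted(letter for letter, count in counts.items() if count > 1)


print(repeated_accidentals("$bBEE"))
print(repeated_accidentals("$xFCG[G]"))
```

Such a check can say which letter is repeated, but not which occurrence is the superfluous one, so cases like $xFCG[G] would still need a human decision.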

Clarification with order of elements in ornaments

P&E says:

  1. Grace notes and ornaments
    g = acciaccatura (without rhythmic value, precedes the note name)
    q = appoggiatura (with rhythmic value, precedes the note name)

but Verovio is not as strict: both ,8qB and ,q8B seem to be allowed.
Could you please clarify the order of elements here? Rhythm-q or should it be q-rhythm?

Make key signature a recommended field

Currently the key signature can be empty. However, this makes it unclear whether the key signature is missing, or whether it is a completely natural key signature.

The section should be clarified to say that encoders should provide a value, and that a value of n can be given if no key signature is present.

It should also clarify that parsers should assume a value of n if no key signature is present.

Split off from #11
Related to #26

Provide form of rests

The rests section does not actually spell out the form of a rest with a duration. This should be added.

Host legacy and new versions simultaneously

We should have provision for hosting any 'legacy' specifications, in addition to any 'new' versions. This is so that people can refer back to former versions of the spec if they have legacy data.

This will mean a decision about what to call the current version, to differentiate between it and a changed version. I'm not sure if calling it "Version 1" is appropriate?

Clarify use of square brackets with more accidentals in key signature

The spec should explicitly state that key signatures with consecutive accidentals requiring square brackets must be included within a single square bracket. Consecutive square brackets are not allowed. For example: $xF[CG] is allowed, $x[F]C[G] is allowed, but $xF[C][G] is forbidden.
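One way to express this rule is a regular expression with a negative lookahead. The Python sketch below is an assumption about how a validator might do it, not the wording of the spec: the names `KEYSIG_PATTERN` and `brackets_are_valid` are hypothetical, and it only covers sharp (x) and flat (b) signatures.

```python
import re

# Optional "$", one accidental type (x or b), then note letters that are
# either bare or grouped in square brackets. The negative lookahead (?!\[)
# rejects a bracket group that is directly followed by another one.
KEYSIG_PATTERN = re.compile(r"\$?[xb](?:[A-G]|\[[A-G]+\](?!\[))+")


def brackets_are_valid(keysig):
    """Return True if the key signature has no consecutive bracket groups."""
    return KEYSIG_PATTERN.fullmatch(keysig) is not None


for example in ("$xF[CG]", "$x[F]C[G]", "$xF[C][G]"):
    print(example, brackets_are_valid(example))
```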

Improve octave values prose

The prose of the octave values section can be improved to set out the form and number of apostrophes and commas.

This should also be the place where it is specified that omitting the octave marker will keep the following notes in the same octave, and to switch octaves you need to explicitly give a new octave marker.

Add prose introduction to the "Music notation" section

Currently the "music notation" section launches straight into the octave symbols description, but it would be good to give an overview of the expected form of notation within the "body" of the PAE encoding.

Perhaps we can re-employ and update the visual structure provided by Massimo earlier (attached).

[Attached image: pae-code]

Add prose to note name section

Currently the 'note names' section contains only the note letter names.

<section>
            <h4>Note Names</h4>
            <p>C, D, E, F, G, A, B</p>
</section>

A brief prose description should be added indicating how they combine with other parts of a note, among other things.

Clarify methodology when encoding incipits

In accordance with RISM's general policy that cataloguers should reflect what they see on the source, a number of records include sharp signs in positions where we would today use a natural. See e.g.
https://muscat.rism.info/admin/sources/452507507

This causes no particular difficulty for the (sufficiently intelligent) human user, but I wonder whether the IT side can also handle such irregularities when searching for incipits or comparing them. If not, we should amend the Muscat guidelines (and potentially the PAE specs) to explicitly state that such cases should be normalized.

Encoding Neumatic notation

Summary: As a software developer, I would like to see neumatic notation encoded differently, to make it easier to parse.

Description:

Currently neume notation is indicated by using the special 7. note duration value. This means that parsers need to be prepared to encounter this value in any stream of notes, which makes parsing the note values more difficult even though encountering neume notation in the middle of a common notation note stream is highly unlikely.

A possible solution is to indicate neume notation by using the second character of a clef indicator, similar to mensural notation. For example, the colon : could be used, e.g., G:2. Other characters could be used as well.

A preliminary check of the existing RISM incipits indicates that we currently have no incipits encoded this way, so the impact of this change will be small on this dataset. We should inquire more widely to assess the broader impact.

Specify how to encode tied chords

In the spec (V1) there is no explanation of how to encode tied chords (the problem is already mentioned in V2). Some users worked around this by applying their own rules:

  • adding a + after each note in the chord, e.g. C+^E+^G+C^E^G
  • adding a single + between the chords, e.g. C^E^G+C^E^G

The second approach only works if the two chords are identical, whereas the first gives more flexibility when not all the notes in the chord are tied. However, it would need to be specified whether + or ^ should directly follow the note.

Add a section on serialization formats

A non-normative section on serialization formats (MARC, single-line, multi-line, JSON) should be added to the spec to document the different input formats that may be encountered.

Decide on editors / editorial process

We should determine who are the current editors / former editors / authors of the specification.

ReSpec allows one to specify the people involved in making the specification. Editors should be those who are currently involved in reviewing PRs and making changes, so we will need to determine who should be added here. Currently I have pulled over those listed on the IAML PAE page, but we can adjust this when we figure out who is currently involved.

Former editors can be specified for those people who have significant input, but are no longer active in maintaining the spec.

Authors (if necessary) can be used to acknowledge a fundamental role in creation.

Explicitly spell out the form of time signatures

The form and use of time signatures should be explicitly defined. This will need to be tied to the type of notation being rendered; for example, the "o" and "o." are only valid in mensural notation.

As well, the field should be restricted to a single time signature in the "time signature" field, eliminating the use of semi-colon separated values.

Use RFC2119/RFC8174 keywords

We should determine whether we want to use RFC2119/RFC8174 keywords in the language of the specification:

"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"

Usually specs use a subset of these -- e.g., just MUST and MUST NOT (requirements) SHOULD (best practice), MAY (optional).

Order of sharps or flats in key signature

We should decide whether we want ordered sharps/flats in the key signature or allow arbitrary sequences. For example: do we tolerate D major written as $xCF, or must it be $xFC?

I found 225 cases with wrong order of sharps/flats (including cases like $Bb), see attached list. The real number of those could be slightly higher than this, since grammatically wrong key signatures are excluded from the analysis.

The current paec-grammar version does not check the order of sharps/flats.
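If ordering were to be enforced, the check could look something like the following Python sketch. It is not part of paec-grammar; the function name `is_canonically_ordered` is hypothetical, and it assumes the conventional order of sharps (F C G D A E B) and flats (B E A D G C F), with square brackets ignored.

```python
SHARP_ORDER = "FCGDAEB"
FLAT_ORDER = "BEADGCF"


def is_canonically_ordered(keysig):
    """Return True if the accidentals follow the conventional order."""
    body = keysig.lstrip("$").replace("[", "").replace("]", "")
    # The signature must start with the accidental type, so "$Bb" fails here.
    if not body or body[0] not in "xb":
        return False
    order = SHARP_ORDER if body[0] == "x" else FLAT_ORDER
    letters = body[1:]
    if any(letter not in order for letter in letters):
        return False
    positions = [order.index(letter) for letter in letters]
    # Strictly increasing positions also rule out repeated letters.
    return all(a < b for a, b in zip(positions, positions[1:]))


for example in ("$xFC", "$xCF", "$Bb"):
    print(example, is_canonically_ordered(example))
```

Note that this sketch only tests relative order; whether gaps in the sequence (such as a sharp signature that skips C) should also be rejected is a separate decision.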

Encoding of fermata

The current specifications of PAE support the encoding of fermata with the ( and ) character.

From the current specifications

fermata (include only one note or rest; accidentals or octave symbols must be outside the parentheses. See also "Special rhythmic groupings" below.)

The main problems with this approach are:

  • fermata uses the same characters as the special rhythmic groupings (tuplets), which causes some additional validation / parsing problems. Only the analysis of the content gives the meaning of the signs, which is not very good practice.
  • there are some contradictions with trill and tie encoding when saying that only a note or a rest should be included since the trill and tie characters are supposed to follow the note immediately. In other words, encoding a note with a fermata and a trill is currently impossible when following strictly the current PAE specifications.

The proposal would be to use a single character for encoding fermata, since there is no need for a start and an end delimiter. This would avoid the overlap with tuplets, would work better with chords, and would be more in line with the current encoding of trills.

Some possible characters could be:

  • ?
  • *
  • c (corona)
  • h (hold)
  • p (pause and point d'orgue)

The sign should be placed after the note / chord.

Allow square brackets in key signature

From rism-digital/muscat#808

"In the RISM guidelines for the field Key signature (031 $n), it says "Add missing sharps or flats in brackets." You use this if, for example, the piece is clearly in A major but the source only has xFC. You enter xFC[G]. This is a very old RISM rule that we've had since CD-ROM days. In the CD-ROM the sharp/flat in the key signature would display in brackets. In Muscat, the [G] can be entered but it does not display in brackets. Can you change it so that the sharps/flats are in brackets?"

The specification should be adjusted to allow square brackets in key signature

Create encoding "guidelines"

While the version 2 specification will help clarify the boundaries of the Plaine & Easie code, it will also create a hole. Previous versions were part specification, part encoding guideline. In Version 2 the guidelines will feature far less in the specification, but there is likely still an audience for them.

So we should create a second document that can contain implementation guides, best practices, or other guidelines. Like "record transposing instruments in a non-transposed way" or "Should contain 3-4 measures", etc.

Split spec into v1 and v2

As discussed on the 21.10.2021 call, the PAE code spec will be split into two versions: v1 will be the current spec and will contain only enhancements and clarifications to the existing spec.

v2 will contain backwards-incompatible changes.

A landing page will route people to the right place.

Document license and copyright

We should decide on an appropriate copyright license for the PAE code specifications. There currently is not one assigned.

If we want a "public domain" license, we can choose the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication.

If we retain some "rights" and use an actual copyright license (even other Creative Commons licenses) then we will need to decide who these rights are assigned to. IAML? RISM? The Editors?

Improve Rhythmic values section

Currently the "Rhythmic values" section has no prose to begin the section, and only a table giving the values.

The friendliness and clarity of this section could probably be improved by actually describing the field and its possible values first before showing examples.

Clarification for certain meters

P&E allows compound meters. These are not meters that change later on, but combined indications such as c3/2.

Could you please clarify the validity and display of this? Meters like c; 3/4 (starts in c, changes to 3/4 later) are obviously not allowed but c3/2 and similar are present throughout the RISM data. Currently, the Verovio validator gives you an error message with c3/2: "The input contains one or more character(s) 'c3/2'." But it should be allowed.

See also rism-digital/muscat#809, where other meters are listed that do not display and trigger a warning with the new validation.

Move MARC and UNIMARC references to the "Representations" section.

The new PAE spec has a section that spells out the different ways Plaine and Easie can be represented: MARC/UNIMARC, Single-string, multi-line string, and JSON.

Currently the MARC and UNIMARC codes for the fields are given at the start of each section. These should be moved to the MARC and UNIMARC sections to provide users with a single point of reference for the different possible fields.

Use underscore instead of plus for tied notes

Summary: As a software developer, I would like ties to be encoded with the underscore (_) rather than the plus (+), so that I can use PAE more easily in a web-based environment.

Background: The plus character, when it occurs in a URL, is used to represent a blank space. When a PAE string is used as a query parameter this can lead to some problems. Since a space can be interpreted as a "+" this means the following PAE code:

'C 'C 'D 'E 'F

can become:

'C+'C+'D+'E+'F

This may incorrectly be interpreted as a series of tied notes.

Ideally all ties would be encoded using the proper URL encoding for plus signs, %2B. However, this can lead to problems when encoding and unencoding multiple times:

'C+'C 'D 'E 'F  -->
'C%2B'C%20'D%20'E%20'F -->
'C%2B'C+'D+'E+'F  -->
'C%2B'C%2B'D%2B'E%2B'F

When the final stage is encountered, it is not possible to tell which notes were originally tied together, and which were not.

Most characters are URL encoded when placed in a query parameter, and are then either handled in their encoded form (e.g., %2B) or, in some browsers, passed through as raw values. The plus sign differs in that its literal form and its URL-encoded form can be ambiguous as to what is meant.
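The ambiguity can be reproduced with Python's standard `urllib.parse` module. The sketch below is only an illustration: the helper `through_query_string` and the parameter name `q` are hypothetical, and the apostrophe is kept unescaped to match the examples above.

```python
from urllib.parse import parse_qs, quote


def through_query_string(pae, escape):
    """Pass a PAE string through a query parameter and return what the server sees."""
    # With escape=False the string is placed in the URL as-is, which is the
    # situation in which a literal "+" gets read as a space.
    value = quote(pae, safe="'") if escape else pae
    return parse_qs("q=" + value)["q"][0]


# A tie written with "+" is lost when the string is not escaped...
print(through_query_string("'C+'C", escape=False))
# ...and survives only if every client escapes it as %2B.
print(through_query_string("'C+'C", escape=True))
# An underscore is an unreserved URL character and survives either way.
print(through_query_string("'C_'C", escape=False))
print(through_query_string("'C_'C", escape=True))
```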

Subfield $s, coded validity note

What does the future of $s look like? In Plaine & Easie officially it is allowed:

5. Coded validity note
A one-character coded validity note can be introduced by a '~' at the end of the code.
Accepted characters are:

?	a mistake in the incipit has not been corrected
+	a mistake in the incipit has been corrected
t	incipit has been transcribed into modern notation

https://www.iaml.info/plaine-easie-code#toc-5

But starting with Muscat, we don't want to use it any more:

Validity (031 $s)
Do not enter anything into this field! (It is only used for old data.)

The reasoning is because someone seeing a "+" etc. in a record wouldn't intuitively know what was meant, and if changes were made then you would just say so in a note, without the encoded character.

I see that $s is only used ca. 2,500 times. I haven't checked for ~ at the end of the code though.

I'd like to propose a Muscat validation that creates a warning if $s is filled out (we don't want the field to be used, but people might accidentally use it), but I thought I'd ask here first.

Clarify required clef shapes

Currently the spec does not explicitly state which clef shapes are valid.

This should be amended to say that clef shapes should be one of three values: C, G, or F.

No double dots support for mensural notation

Double dots do not exist in mensural notation. However, the specs do not explicitly state that 2..8 (which in CMWN would mean 𝅗𝅥..𝅘𝅥𝅮) is not supported in mensural notation. This should be clarified.

Make clef a required field

Although the pitch and octave of every note in PAE is individually encoded, the absence of a clef field means that it cannot be properly drawn.

Clef should therefore be a required field in PAE.

Follows the discussion in #11

Use of `+` for something else than ties

We have many incipits where + has been used for encoding ligatures instead of ties.

  • For representing ligatures in mensural notation
  • For representing ligatures with stem-less notes (7.)

We need to clarify if this is supported or not. If yes, tooling needs to be adjusted accordingly. We should also consider adjusting the specifications to use another sign.

(We also have cases where + has been used to represent slurs. This is simply wrong.)

Move field references for clefs, keysig, timesig, etc. to single string representation section.

In some places in the PAE specification the value for the single-string representation is given, e.g., for key signature it says

Begin this field with the character $

However, this is only necessary when using the "single-string" representation. All other representations, including MARC, do not require this.

The exception to this is when discussing clef/keysig/timesig changes, so references to the field delimiters will be kept in that section.

Encoding chord

The current specifications of PAE support the encoding of chords with the ^ character.

From the current specifications

Enter chords from the highest to the lowest note, each one separated by ^.

Example: ''2D^'A^xF

There are several issues with this approach:

  • Parsing chords is problematic because the start of a chord is given retrospectively when the second note is encountered
  • there are no clear specifications for tied chords. Since chords have to be encoded from the highest note to the lowest, the tie between two chords will always be (except for unison chords) between two notes with a different pitch, which is problematic.
  • chords can have conflicting note durations
  • how to apply a trill or a fermata to chord is not clearly defined

The proposal would be to change the chords to a container markup. To preserve some graphical similarity with the current ^, a proposal would be to use < and >. These characters are not used elsewhere in PAE.

Example: 2<''D'AxF>

Some rules would be:

  • Duration encoding should be given before the < chord start delimiter
  • Chord delimiters should contain only pitch related information (octave, accidental and pitch name)
  • Ties should follow > chord end delimiters

It would need to be decided if fermata and trills should be outside or inside chord delimiters
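To illustrate how the container form simplifies parsing, here is a rough Python sketch of a parser for the proposed syntax. Everything in it is an assumption made for illustration: the names `CHORD`, `PITCH` and `parse_chord` are hypothetical, the tie is assumed to be written with + after the closing delimiter, and fermata and trills are left out because their placement is still undecided.

```python
import re

# Duration (digits and optional dots), then <pitches>, then an optional tie.
CHORD = re.compile(r"(\d*\.*)<([^<>]+)>(\+?)")
# One pitch: octave marks, accidentals, then the note letter.
PITCH = re.compile(r"[',]*[xbn]*[A-G]")


def parse_chord(token):
    """Split a container-style chord into duration, pitches and tie flag."""
    match = CHORD.fullmatch(token)
    if match is None:
        raise ValueError("not a container-style chord: " + token)
    duration, body, tie = match.groups()
    pitches = PITCH.findall(body)
    # Anything inside the delimiters that is not pitch information is an error.
    if "".join(pitches) != body:
        raise ValueError("unexpected content inside chord delimiters: " + body)
    return {"duration": duration, "pitches": pitches, "tied": tie == "+"}


print(parse_chord("2<''D'AxF>"))
```

Because the start delimiter is seen before the first pitch, the parser knows it is inside a chord from the outset, and the duration and tie apply to the chord as a whole.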

Add a new field to representations, `version`

The version field should be added to the various PAE representations, with values of pe or pe2.

If version is empty or missing, then pe is assumed.

What this looks like:

In multi-line strings:

@version:pe2

In JSON:

{ "version": "pe2" }

In MARC21:

$2pe2 (subject to approval and addition to the Musical Incipit Scheme Codes list).

For the single-string representation, we have a couple options:

  • Simply put pe2 at the beginning; if the string starts with pe2 then it's version 2; if it starts with pe, or starts with a clef, then it's version 1
  • Give it a delimiter at the beginning: ;pe2 or ;pe. I don't think this is necessary, but it could help if the fields get out-of-order.
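Either option could be handled by a small detection step before parsing. The Python sketch below is a guess at what that might look like, not an agreed design: the function name `detect_version` is hypothetical, and the sample strings are made up for illustration.

```python
def detect_version(single_string):
    """Return "pe2" or "pe" for a single-string PAE representation."""
    text = single_string.lstrip()
    # Second option: tolerate an optional leading ";" delimiter.
    if text.startswith(";"):
        text = text[1:]
    if text.startswith("pe2"):
        return "pe2"
    # A "pe" prefix, a leading clef, or anything else counts as version 1.
    return "pe"


print(detect_version("pe2%G-2$xFC@3/4 '4CDEF"))
print(detect_version("%G-2$xFC@3/4 '4CDEF"))
```

Defaulting to pe keeps existing data, which carries no version marker, valid as version 1.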

Related to #5

Add prose and examples to 'Note names' section

Currently the 'note names' section contains only the note letter names.

<section>
            <h4>Note Names</h4>
            <p>C, D, E, F, G, A, B</p>
</section>

A brief prose description should be added indicating how they combine with other parts of a note, among other things.

Add prose for Accidentals section

Currently the accidentals section does not contain any prose setting out the form of accidentals and how they should be employed. (position in a pitch string, etc.)
