johentsch / ms3


A parser for annotated MuseScore 3 files.

Home Page: https://ms3.readthedocs.io

License: GNU General Public License v3.0

Python 98.42% TeX 1.58%
corpus corpus-data corpus-generator corpus-processing corpus-tools musescore musescore2 musescore3 musescore4 music-score

ms3's Introduction

ms3 - Parsing MuseScore 3 and 4

Welcome to ms3, a Python library for parsing MuseScore files.

Statement of need

Here comes a list of functionalities to help you decide if this library could be useful for you.

  • parses MuseScore 3 and 4 files, dispensing with lossy conversion to musicXML. The file formats in question are
    • uncompressed *.mscx files,
    • compressed *.mscz files,
  • extracts and processes the information contained in one or many scores in the form of DataFrames:
    • notes (start, duration, pitch etc.) and/or rests,
    • measures (time signature, lengths, repeat structure etc.)
    • labels, such as
      • guitar/Jazz chord labels
      • arbitrary annotation labels
      • expanded harmony labels following the DCML annotation standard
      • cadences (part of the same annotation syntax)
      • form_labels (annotation standard currently in press)
    • chords, that is, onset positions that have musical markup attached, e.g. dynamics, lyrics, slurs, 8va signs...
    • metadata from the respective fields, but also score statistics, such as length, number of notes, etc.
  • stores the extracted information in a uniform and interoperable tabular format (*.tsv)
  • writes information from tabular *.tsv files into MuseScore files, especially
    • chord and annotation labels
    • metadata
    • header information (title, subtitle, etc.)
    • note coloring
  • uses a locally installed or standalone MuseScore executable for
    • batch-converting files to any output format supported by MuseScore (mscz, mscx, mp3, midi, pdf etc.)
    • on-the-fly converting any file that MuseScore can read (including MuseScore 2, cap, capx, midi, and musicxml) to parse it
  • offers its functionality via the convenient ms3 commandline interface.

View the full documentation at https://ms3.readthedocs.io.

For a demo video (using an old, pre-1.0.0 version) on YouTube, click here

Installation

ms3 requires Python >= 3.10 (type python3 --version to check). Once you have switched to a virtual environment that has Python 3.10 installed you can pip-install the library via one of the two commands:

python3 -m pip install ms3
pip install ms3

If successful, the installation will make the ms3 commands available in your PATH (try by typing ms3).

Quick demo

Parsing a single score

import ms3
score = ms3.Score('musescore_file.mscz')

Parsing a corpus

import ms3
corpus = ms3.Corpus('score_directory')
corpus.parse()

Parsing several corpora

import ms3
corpora = ms3.Parse('my_research_corpora')
corpora.parse()

Making Changes & Contributing

This project uses pre-commit to ensure code quality. If you are a developer, please make sure to install it before making any changes:

cd ms3
pip install -e ".[dev]" # includes "pip install pre-commit"
pre-commit install

Acknowledgements

Development of this software tool was supported by the Swiss National Science Foundation within the project “Distant Listening – The Development of Harmony over Three Centuries (1700–2000)” (Grant no. 182811). This project is being conducted at the Latour Chair in Digital and Cognitive Musicology, generously funded by Mr. Claude Latour.

Project generated with PyScaffold

ms3's People

Contributors

arinalozhkina, faroit, github-actions[bot], johentsch, leobruneau

ms3's Issues

Error IndexError: tuple index out of range when parsing score

Hi!
We are facing an error when parsing an mscx file and still have no clue what could be causing the issue.
The error takes place at line 292 in annotations.py when trying to expand labels.
I believe it may have to do with repetition signs, but the file looks correct on inspection.
The code used to parse the score is:
import pandas as pd
import ms3

# file_path: path to the .mscx file in question
msc3_score = ms3.score.Score(file_path.strip(), logger_cfg={'level': 'ERROR'})
harmonic_analysis = msc3_score.mscx.expanded
mn = ms3.parse.next2sequence(msc3_score.mscx.measures.set_index('mc').next)
mn = pd.Series(mn, name='mc_playthrough')
harmonic_analysis = ms3.parse.unfold_repeats(harmonic_analysis, mn)
Any advice on this would be really appreciated!
I'll leave the example file here to run some tests.
Archivos.zip
Thanks in advance!

Spanners continuing on next line

Describe the bug
A spanner (e.g. a pedal line) that has ID 0, when being continued on the next system gets ID 0, 1.

ms3 version
1.2.2

To Reproduce
Create a pedal that spans over more than one line and look at the output of ms3 extract -C.

ms3 logger overwrites default logger [BUG]

Describe the bug
Importing ms3 sets the logging level and it cannot be changed back somehow?

ms3 version
1.2.8

To Reproduce

$ python
> import logging
> logging.basicConfig(level=logging.DEBUG)
> logging.debug("test")  # Logs correctly

New session:

$ python
> import logging
> import ms3
> logging.basicConfig(level=logging.DEBUG)
> logging.debug("test")  # Nothing is printed
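
One possible explanation, offered as an assumption rather than a confirmed diagnosis of ms3, is that the root logger already has a handler by the time basicConfig() is called. In that case basicConfig() silently does nothing. The sketch below simulates this with a stand-in for the library and shows that force=True (available since Python 3.8) reconfigures the root logger anyway.

```python
import logging

root = logging.getLogger()

# Stand-in for a library that configures the root logger at import time.
# This is an assumption about the cause, not confirmed ms3 behaviour.
root.addHandler(logging.NullHandler())
root.setLevel(logging.WARNING)

# basicConfig() is a no-op when the root logger already has handlers.
logging.basicConfig(level=logging.DEBUG)
level_without_force = root.level  # unchanged

# force=True removes the existing root handlers and applies the new config.
logging.basicConfig(level=logging.DEBUG, force=True)
level_with_force = root.level
```

If this is indeed the cause, passing force=True in the second session above might serve as a workaround until the library stops touching the root logger.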

Problem with peculiar voltas

Context: 1st and 2nd endings where 1st ending doesn't have an endRepeat as it jumps into the bar after the 2nd ending.

There are cases where the situation gets more complicated in terms of the writing: the 1st ending (b. 113 in the example) has a note tied to the next sounding bar (b. 115-->114):
image

ms3 understands that this 'next sounding bar' is a 3rd volta.
image

Score snippet creator [FEATURE]

Describe the solution you'd like

From a Score object it should be possible to create a snippet score from measure x to measure y.

Concrete ToDos (Optional)

There is some old code from the 2019 MuseScore 2 parser that could simply be adapted.

Section breaks

Section breaks reset the measure numbers (MN) to 1 (or 0). Currently this cannot happen because it would lead to ambiguous values in the MN column. Pieces containing section breaks should probably receive an additional column, section, so that the mn column can be disambiguated. This problem should be solved in conjunction with #5.
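
The idea of a disambiguating section column can be sketched as follows. This is only an illustration in plain Python: the function name add_section_index is hypothetical, and it assumes that a section break is recognisable as a drop in the measure number.

```python
def add_section_index(mns):
    """Pair every measure number with the index of the section it belongs to.

    A new section is assumed to start whenever the measure number drops,
    i.e. when the numbering is reset to 1 (or 0).
    """
    result = []
    section = 0
    previous = None
    for mn in mns:
        if previous is not None and mn < previous:
            section += 1
        result.append((section, mn))
        previous = mn
    return result


# Three sections; the first one starts with an upbeat (MN 0).
pairs = add_section_index([0, 1, 2, 3, 1, 2, 1, 2, 3])
```

The (section, mn) pairs are unique even though the measure numbers alone are not.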

Don't assign measure number 0 to incipits

Incipits should be considered as markup, not part of the music. Right now they are excluded from bar count and get MN 0 but that makes them indistinguishable from upbeat measures. Suggestion: Replace 0 by x or missing value
image

Abbreviated writing, tremolos, harmonics and 8va

ms3 does not acknowledge abbreviated writing for repeated notes, tremolos, the real sound of harmonics or 8va lines, no matter if those come from an XML file or are engraved in MuseScore directly.

A passage such as this one:

image

would be processed as:

image

Multimeasure rests lead to wrong duration_qb [BUG]

Describe the bug

When using the shortcut "m", MuseScore displays multimeasure rests within a single bar, indicating the number of full bars without notes:

image

In mscx.measures(), this leads to some confusing outputs in the duration_qb column (also act_dur and quarterbeats as a consequence): In both parsed versions of the above score (with or without m activated), the number of rested measures is correct. However, in scores where multimeasure rest was activated, the duration of the second full measure in the multirest is abnormally high. As if it were a single very long measure:

mmr_bug

The duration of the multimeasure rest is superfluously added to a single bar. If someone were to sum the duration values, they could get drastically different piece durations depending on whether or not the m shortcut was used. The m shortcut should intuitively only affect how the music is displayed within MuseScore.

ms3 version
1.2.7 (first discovered in 1.2.3)

To Reproduce

I think any score with multimeasure rests should do. I can gladly send my example files if needed!

Logging gets into an infinite loop

Bug found in the current version of the development branch.

Desired behaviour

If the function receives an invalid value, it should display an error message:

>>> from ms3 import roman_numeral2fifths
>>> roman_numeral2fifths('V/V', logger=None)
ERROR    root -- /home/hentsche/PycharmProjects/ms3/src/ms3/utils.py (line 2292) split_scale_degree(): 
	 V/V is not a valid scale degree.

Actual behaviour

The code gets into an infinite loop:

File .../python3.10/logging/__init__.py:1622, in Logger._log(self, level, msg, args, exc_info, extra, stack_info, stacklevel)
   1620     elif not isinstance(exc_info, tuple):
   1621         exc_info = sys.exc_info()
-> 1622 record = self.makeRecord(self.name, level, fn, lno, msg, args,
   1623                          exc_info, func, extra, sinfo)
   1624 self.handle(record)

File .../ms3/logger.py:136, in config_logger.<locals>.make_record_with_extra(name, level, fn, lno, msg, args, exc_info, func, extra, sinfo)
    127 def make_record_with_extra(name, level, fn, lno, msg, args, exc_info, func, extra, sinfo):
    128     """
    129     Rewrites the method of record logging to pass extra parameter.
    130     Returns
   (...)
    134                             _message_type_full - name of message type accordingly to enum class MessageType
    135     """
--> 136     record = original_makeRecord(name, level, fn, lno, msg, args, exc_info, func, extra=extra, sinfo=sinfo)
    137     if extra is None:
    138         record._message_id = ()

File .../ms3/logger.py:136, in config_logger.<locals>.make_record_with_extra(name, level, fn, lno, msg, args, exc_info, func, extra, sinfo)
    127 def make_record_with_extra(name, level, fn, lno, msg, args, exc_info, func, extra, sinfo):
    128     """
    129     Rewrites the method of record logging to pass extra parameter.
    130     Returns
   (...)
    134                             _message_type_full - name of message type accordingly to enum class MessageType
    135     """
--> 136     record = original_makeRecord(name, level, fn, lno, msg, args, exc_info, func, extra=extra, sinfo=sinfo)
    137     if extra is None:
    138         record._message_id = ()

    [... skipping similar frames: config_logger.<locals>.make_record_with_extra at line 136 (2963 times)]

RecursionError: maximum recursion depth exceeded

Fixes tried

It is, however, working when a logger name is specified:

>>> from ms3 import roman_numeral2fifths
>>> roman_numeral2fifths('V/V', logger='test_logger')
ERROR    test_logger -- /home/hentsche/PycharmProjects/ms3/src/ms3/utils.py (line 2292) split_scale_degree(): 
	 V/V is not a valid scale degree.
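
The traceback shows make_record_with_extra calling itself through original_makeRecord. Independently of the root cause, one alternative to overriding makeRecord is a logging.Filter that supplies the default attribute. The following is a sketch and not ms3's implementation: the class names are hypothetical, and only the attribute name _message_id is taken from the traceback.

```python
import logging


class DefaultMessageId(logging.Filter):
    """Give every record a _message_id, defaulting to an empty tuple."""

    def filter(self, record):
        if not hasattr(record, "_message_id"):
            record._message_id = ()
        return True  # never suppress the record


class ListHandler(logging.Handler):
    """Collects records in memory so that they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


handler = ListHandler()
handler.addFilter(DefaultMessageId())

logger = logging.getLogger("message_id_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(handler)

logger.error("V/V is not a valid scale degree.")
logger.error("Message with an explicit id.", extra={"_message_id": (1, 2)})

message_ids = [record._message_id for record in handler.records]
```

Because nothing is monkey-patched, configuring the same logger several times cannot chain wrappers around makeRecord.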

Include information on columns for generating expressive frictionless schemas [FEATURE]

Scope: anything that can be expressed according to this frictionless jsonSchema

  • Field information
    • name
    • type
    • title
    • description (copy from docs)
    • constraints (required [= notna], unique, pattern, enum, minLength, maxLength, minimum, maximum)
    • (example)
  • primaryKey

Mapping between Python (and pandas) object types to property sets:

  • Fraction: string + pattern constraint
  • Tuple: array
  • int: integer + required=True
  • Int64: integer + required=False (default)
  • Categorical (pandas): string + enum constraint
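
The mapping above could be expressed roughly as follows. This is a sketch under assumptions: the function name field_descriptor and the regular expression for fractions are hypothetical, and the two example columns (mc, act_dur) merely illustrate the output shape.

```python
# Hypothetical pattern for fractions such as "3/4" or "-1/2" stored as strings.
FRACTION_PATTERN = r"-?\d+(/\d+)?"


def field_descriptor(name, dtype, categories=None):
    """Build a Table Schema field descriptor for one column."""
    if dtype == "Fraction":
        field = {"type": "string", "constraints": {"pattern": FRACTION_PATTERN}}
    elif dtype == "Tuple":
        field = {"type": "array"}
    elif dtype == "int":
        # Non-nullable integers must always be present.
        field = {"type": "integer", "constraints": {"required": True}}
    elif dtype == "Int64":
        # Nullable integers: required defaults to False, so no constraint.
        field = {"type": "integer"}
    elif dtype == "Categorical":
        field = {"type": "string", "constraints": {"enum": list(categories or [])}}
    else:
        raise ValueError(f"No mapping defined for dtype {dtype!r}")
    return {"name": name, **field}


schema = {
    "fields": [
        field_descriptor("mc", "int"),
        field_descriptor("act_dur", "Fraction"),
    ],
    "primaryKey": ["mc"],
}
```

Titles, descriptions and further constraints from the list above could be merged into the returned dictionaries in the same way.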

Incorrect unfolding if a jump is to happen only the third time [BUG]

Take this example
incorrect_unfolding.zip

measure  1  2  3   4     5   6             7
jump        𝄋  |:  to 𝄌  :|  D.S. al Coda  𝄌

Expected unfolding

1 2 3 4 5 3 4 5 6 2 3 4 7

Actual unfolding

1 2 3 4 5 3 4 7

Suggested solution

The next column for m. 4 needs to read 5 5 7 rather than 5 7.
If the pointer hits an end-repeat (m. 5), every repeated bar could get its next bar prepended. Or only "non-default" bars, such as m. 4 in the example.
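
The effect of the suggested change can be checked with a toy model of the unfolding. This is not ms3's next2sequence(); it assumes that the n-th visit to a measure follows the n-th entry of its next list, that the last entry is reused once the list is exhausted, and that -1 marks the end of the piece.

```python
def unfold(next_column, start=1):
    """Follow the next pointers and return the sequence of played measures."""
    visits = {mc: 0 for mc in next_column}
    sequence = []
    mc = start
    while mc != -1:
        sequence.append(mc)
        options = next_column[mc]
        # n-th visit -> n-th option; stick to the last one afterwards
        index = min(visits[mc], len(options) - 1)
        visits[mc] += 1
        mc = options[index]
    return sequence


current = {1: [2], 2: [3], 3: [4], 4: [5, 7], 5: [3, 6], 6: [2], 7: [-1]}
proposed = dict(current)
proposed[4] = [5, 5, 7]  # the suggested next column for m. 4

actual_unfolding = unfold(current)
expected_unfolding = unfold(proposed)
```

Under these assumptions the model reproduces both the actual unfolding (1 2 3 4 5 3 4 7) and the expected one.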

notes given as MIDI values only

When extracting the notes, it would be useful to have their names besides the midi indices, the same as the ambitus is referred to through midi indices and note names.
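
A minimal sketch of such a conversion is shown below. The function name midi2name is hypothetical, and the sketch assumes the convention that MIDI 60 is C4. Since a MIDI number does not encode spelling, it cannot tell C# from Db and always uses sharps.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi2name(midi):
    """Convert a MIDI pitch number to a note name with octave (60 -> 'C4')."""
    octave, pitch_class = divmod(midi, 12)
    return f"{NOTE_NAMES[pitch_class]}{octave - 1}"


names = [midi2name(m) for m in (21, 60, 61, 69)]
```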

Voltas without endRepeat

There are cases where voltas actually don't have endRepeats because they serve for a dal segno/da capo. This needs to be taken into account when computing the next column.

For example:
image

get rid of warnings emitted by squash_staves()

In most cases they are unnecessary, e.g. here when calling ms3 extract -N on peri_euridice:

WARNING euridice_3_venere_e_orfeo:MeasureList -- bs4_measures.py (line 575) squash_staves(): mc 3887 300 Name: mc, dtype: int64: The values ['-1' '-2' '-3' '-4' '-5' '-6' '-7' '-8' '-9'] in 'voice/BarLine/linked/indexDiff' of staff [ 4 5 6 7 8 9 10 11 12] are lost.

The problem occurs when the information contained in the simultaneous <Measure> nodes for all staves is to be compressed into a single value in the measures DataFrame. Most values that cause a warning are disregarded anyways, the warnings are rather cautionary. They have, however, proven useful in the past for detecting special cases, for example some of the movements of Scarlatti op. 1 where the two upper staves are in 12/8 and the two lower staves are in 4/4 meter, currently leading to wrong temporal positions.

The solution could involve keeping a list of XML components (such as voice/BarLine/linked/indexDiff in the example above) that are to be ignored, and to collect the differing values for the different staves for relevant cases like the differing time signatures in Corelli. This might be a call for an overall change in the XML -> DataFrame conversion because it would also solve the problem of lyrics with several verses (#32 )

Imported labels not showing on score

When importing a set of annotations using the load_annotations and attach_labels methods, they are not shown on the score, although they are stored in the file as the expand function works afterwards. Would it be interesting to add an option whereby the user can choose whether to show them or not?

Documentation - JOSS Paper Reference

There is a reference labeled as "to be submitted" in the JOSS submission:

McLeod, A., & Rohrmeier, M. (To be submitted). An integrated system for harmonic analysis including chord tone alterations. Transactions of the International Society for Music Information Retrieval.

As it is unclear when this will be the case and the submission also doesn't have a DOI, I'd suggest removing it.

Parsing error

Error produced while parsing all the scores in a folder.
Commands executed:

p = ms3.Parse("name of the folder", key = 'all')
p.parse_mscx()

ERROR   Did25M-Va_lusingando-1730-Sarro[2.17][0159] -- parse.py (line 639) _parse():
        Traceback (most recent call last):
          File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\parse.py", line 632, in _parse
            score = Score(path, read_only=read_only, logger_name=self.logger.name, level=level)
          File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\score.py", line 125, in __init__
            self._parse_mscx(mscx_src, read_only=read_only)
          File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\score.py", line 195, in _parse_mscx
            self._mscx = MSCX(self.full_paths['mscx'], read_only=read_only, parser=self.parser, logger_name=ln)
          File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\score.py", line 376, in __init__
            self.parse_mscx()
          File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\score.py", line 387, in parse_mscx
            self._parsed = _MSCX_bs4(self.mscx_src, read_only=self.read_only, logger_name=self.logger.name)
          File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\bs4_parser.py", line 59, in __init__
            self.parse_measures()
          File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\bs4_parser.py", line 89, in parse_measures
            self.parse_mscx()
          File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\bs4_parser.py", line 69, in parse_mscx
            self.soup = bs4.BeautifulSoup(file.read(), 'xml')
          File "C:\Users\user\AppData\Local\Programs\Python36\lib\encodings\cp1252.py", line 23, in decode
            return codecs.charmap_decode(input,self.errors,decoding_table)[0]
        UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 40076: character maps to <undefined>

And finally:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\parse.py", line 494, in parse_mscx
    self.collect_annotations_objects_references(ids=ids)
  File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\parse.py", line 154, in collect_annotations_objects_references
    if 'annotations' in score._annotations:
AttributeError: 'NoneType' object has no attribute '_annotations'

p

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\parse.py", line 900, in __repr__
    return self.info(return_str=True)
  File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\parse.py", line 438, in info
    detached = sum(True for id in parsed_ids if self._parsed_mscx[id].has_detached_annotations)
  File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\parse.py", line 438, in <genexpr>
    detached = sum(True for id in parsed_ids if self._parsed_mscx[id].has_detached_annotations)
AttributeError: 'NoneType' object has no attribute 'has_detached_annotations'

p.parsed

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\parse.py", line 62, in parsed
    return {k: score.full_paths['mscx'] for k, score in self._parsed_mscx.items()}
  File "C:\Users\user\AppData\Local\Programs\Python36\lib\site-packages\ms3\parse.py", line 62, in <dictcomp>
    return {k: score.full_paths['mscx'] for k, score in self._parsed_mscx.items()}
AttributeError: 'NoneType' object has no attribute 'full_paths'
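
The first traceback ends in cp1252.py, which suggests that the file was opened with the Windows default encoding. Assuming the score is stored as UTF-8, the failure can be reproduced and avoided as sketched below; the file content is a made-up stand-in. The byte 0x9d is, for example, the last byte of a UTF-8 encoded right double quotation mark, and cp1252 leaves it undefined.

```python
import os
import tempfile

# Made-up stand-in for score XML containing a right double quotation mark.
xml = "<text>\u201dtr\u201d</text>"
with tempfile.NamedTemporaryFile("wb", suffix=".mscx", delete=False) as tmp:
    tmp.write(xml.encode("utf-8"))
    path = tmp.name

# Reading with cp1252 fails on the undefined byte 0x9d.
try:
    with open(path, encoding="cp1252") as file:
        file.read()
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Passing the encoding explicitly makes the result platform-independent.
with open(path, encoding="utf-8") as file:
    content = file.read()

os.remove(path)
```

The later AttributeError tracebacks appear to be follow-up errors: the score that failed to parse seems to be stored as None and is then accessed like a Score object.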

Endless loop

ms3.parse.next2sequence() should check whether the next column contains the value -1 somewhere to prevent an endless loop.
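
Such a check could look like the following sketch. The function name check_has_end is hypothetical, and the next column is modelled as a plain dictionary mapping each measure count to a list of successors.

```python
def check_has_end(next_column):
    """Raise a ValueError if no measure points to -1, the end of the piece."""
    if not any(-1 in options for options in next_column.values()):
        raise ValueError("The next column never contains -1; unfolding would not terminate.")


check_has_end({1: [2], 2: [1, -1]})  # passes silently

try:
    check_has_end({1: [2], 2: [1]})  # measures point at each other forever
    raised = False
except ValueError:
    raised = True
```

The presence of -1 is necessary but not sufficient for termination, so a cap on the number of steps might still be a useful second safeguard.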

Export to .dez format [FEATURE]

Describe the solution you'd like
ms3 transform command should have an option to export expanded harmony tables to the JSON format .dez of Algomus' Dezrann app. In addition, the converter should be a standalone script.

Concrete ToDos (Optional)

  • Write a first draft (667c2a1 summarizing work by LC & JH)
  • #49
  • correctly treat first and second endings (#50)
  • correctly treat anacrusis (see K283.1 for example)
  • separate localkey segments (#50)
  • separate phrases (#50)
  • extract cadences (#50)
  • #48
  • add parameters for choosing which feature goes into which line (three above, three below) (#50)
  • add a test case for a set of labels issued elsewhere, e.g. .rntxt converted to a DataFrame (not planned)
  • simplify the script according to the fact that quarterbeats_all_endings are now included with all facets by default (v2.2.0)

`ms3 convert` doesn't recognize absolute path

I was trying to use

ms3 convert -m usr/bin/mscore3 -f some/input/path/file.cap -t mscx -o ~/some/output/path

but ms3 tried to save the file to

some/input/path/~/some/output/path

which is obviously not valid.
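
The reported path looks as if the unexpanded ~ was treated as a relative path and joined to the input directory. A sketch of a resolution step that avoids this, using only the standard library, is given below; the function name resolve_output_dir is hypothetical and not part of ms3.

```python
import os


def resolve_output_dir(input_dir, output_dir):
    """Expand ~ and keep absolute paths; resolve relative ones against input_dir."""
    expanded = os.path.expanduser(output_dir)
    if os.path.isabs(expanded):
        return expanded
    return os.path.normpath(os.path.join(input_dir, expanded))


absolute = resolve_output_dir("some/input/path", os.path.abspath("output"))
relative = resolve_output_dir("some/input/path", "output")
home_based = resolve_output_dir("some/input/path", "~/some/output/path")
```

In this sketch only genuinely relative paths are resolved against the input directory; whether that is the intended behaviour of ms3 convert is for the maintainers to decide.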

Adding or converting to other formats

Hi guys,
in the past I converted the MuseScore format to LilyPond and to synthv, but I struggled with XML parsing.
Here are my repos:

You can hear the result here: https://www.youtube.com/watch?v=a3G_8BG2l7Q

Would this library help me skip the part where I have to convert the mscz/mscx to objects? Where should I start?
Would you be interested in adding something like that?
Thanks

Repeats unfolding functionality

A repeat-unfolding functionality would be very welcome. This would make measure counts more accurate and would avoid problems when an organ point is opened (but not closed) before the return to the beginning or the segno.

KeyError on "extended" in concat branch

I get the following error with the current concat branch:

File "/home/andrew/anaconda3/envs/harmony/lib/python3.7/site-packages/ms3/parse.py", line 1005, in <dictcomp>
    matching_candidates = {wh: {(key, i): self.fnames[key][i] for key, i in ids if (key, i) in lists[wh]} for wh in what[1:]}
KeyError: 'extended'

I use it like this:

# Add musescore and tsv suffixes to filename match
filename_regex = re.compile(basename + "\\.(mscx|tsv)")

# Parse scores and tsvs
parse = Parse(annotations_dir, file_re=filename_regex)
parse.add_dir(labels_dir, key="labels", file_re=filename_regex)
parse.parse()

# Write annotations to score
parse.add_detached_annotations("MS3", "labels")
parse.attach_labels(staff=2, voice=1, check_for_clashes=False)

# Write score out to file
parse.store_mscx(root_dir=labels_dir, suffix="_inferred", overwrite=True)

[JOSS] reference rendering

👋 @johentsch here are a few more small issues about the references in the JOSS paper:

  1. The following paper (I guess ISMIR 23) should probably get a year even though it was just accepted. Otherwise, try to change the rendering so that inPress isn't repeated twice.
image
  2. I guess the company is called NPC Imaging, so please change the capitalization here.
image
  3. The colon in "bigger tent" should be outside of the hyphens.
image

this is part of the JOSS review openjournals/joss-reviews#5195

Unclear distinction between grace notes and appoggiaturas

ms3 seems to distinguish between grace notes and appoggiaturas depending on their duration, although this is not always accurate from a musicological point of view.

In the example:

  • the C semiquaver in b. 1 and the D crotchet in b. 3 in the voice are treated as grace notes (grace16 & grace4 respectively)
  • the D and B quavers in the violins in b. 3 are retrieved as appoggiature.
    image

This file comes from a previous XML, but the same happens when the music is engraved in MuseScore directly (different notes but same durations).

Strange logging behavior

If I am using a logger outside of the ms3 code, but then use ms3, the logging stops. For example:

import logging
from ms3 import Parse

logging.basicConfig(filename="test.log", level=logging.DEBUG, filemode="w")

logging.debug("Log message")  # This correctly goes to test.log

p = Parse("Kozeluh")
...
# Do some ms3 parsing things
...

logging.debug("Log message 2")  # This message doesn't go to test.log, or stdout, or anywhere else I can find.

Documentation - JOSS Paper

Here are some small comments regarding the JOSS paper submission:

  • l.25-l.27: please clarify or add an example where the lossy conversion through musicXML poses a problem in the proposed workflow. In the paper, it is not clear why this could not be solved with MusicXML. I would guess that especially when you need to edit the MuseScore file in ms3, an intermediate step through MusicXML would make certain operations impossible or destroy some parts of the score layout.

  • l.29-30: DataFrames are not necessarily feature matrices. The feature matrices extracted by ms3 are represented as DataFrames.

  • l.31: "... to enable version control" sounds a bit like TSV is required for version control. It definitely simplifies it and it is also much easier to see differences between the annotations across different commits. Could you make this point a bit clearer?

  • l. 35 "write back" -> "save"

  • L.35: To me the term "data translocation" is not clear. Do you mean conversion?
