pretty-midi's Introduction

pretty_midi contains utility functions and classes for handling MIDI data in a format that is easy to modify and extract information from.

Documentation is available here. You can also find a Jupyter notebook tutorial here (click here to load in Colab).

pretty_midi is available via pip or via the setup.py script. In order to synthesize MIDI data using fluidsynth, you need the fluidsynth program and pyfluidsynth.

If you end up using pretty_midi in a published research project, please cite the following report:

Colin Raffel and Daniel P. W. Ellis. Intuitive Analysis, Creation and Manipulation of MIDI Data with pretty_midi. In Proceedings of the 15th International Conference on Music Information Retrieval Late Breaking and Demo Papers, 2014.

Example usage for analyzing, manipulating and synthesizing a MIDI file:

import pretty_midi
# Load MIDI file into PrettyMIDI object
midi_data = pretty_midi.PrettyMIDI('example.mid')
# Print an empirical estimate of its global tempo
print(midi_data.estimate_tempo())
# Compute the relative amount of each semitone across the entire song, a proxy for key
total_velocity = sum(sum(midi_data.get_chroma()))
print([sum(semitone)/total_velocity for semitone in midi_data.get_chroma()])
# Shift all notes up by 5 semitones
for instrument in midi_data.instruments:
    # Don't want to shift drum notes
    if not instrument.is_drum:
        for note in instrument.notes:
            note.pitch += 5
# Synthesize the resulting MIDI data using sine waves
audio_data = midi_data.synthesize()

Example usage for creating a simple MIDI file:

import pretty_midi
# Create a PrettyMIDI object
cello_c_chord = pretty_midi.PrettyMIDI()
# Create an Instrument instance for a cello instrument
cello_program = pretty_midi.instrument_name_to_program('Cello')
cello = pretty_midi.Instrument(program=cello_program)
# Iterate over note names, which will be converted to note number later
for note_name in ['C5', 'E5', 'G5']:
    # Retrieve the MIDI note number for this note name
    note_number = pretty_midi.note_name_to_number(note_name)
    # Create a Note instance for this note, starting at 0s and ending at .5s
    note = pretty_midi.Note(velocity=100, pitch=note_number, start=0, end=.5)
    # Add it to our cello instrument
    cello.notes.append(note)
# Add the cello instrument to the PrettyMIDI object
cello_c_chord.instruments.append(cello)
# Write out the MIDI data
cello_c_chord.write('cello-C-chord.mid')

pretty-midi's People

Contributors

a-pillay, ajk4, almostimplemented, apmcleod, areeves87, bzamecnik, carlthome, cflamant, craffel, daviddiazguerra, douglaseck, drunkwcodes, gulnazaki, jalammar, jvlmdr, kevinzakka, maezawa-akira, michalskibinski109, mxkrn, nintorac, pkirlin, rafaelvalle, saisimon, sccds, tengyifei, tomrolb, vug, yao-lirong, yaph, yawjalik

pretty-midi's Issues

Timing issues in fluidsynth

For some MIDI files, some instruments have slightly different timing and get out of sync over time when using the fluidsynth synthesis function, due to rounding errors when accumulating the current sample (observed by @dkario)

get_pitch_class_transition_matrix returns unexpected results for some sequences.

It's hard to do simple next-note transition probabilities using get_pitch_class_transition_matrix for certain midi files.

This sequence:
Note(start=0.000000, end=0.200000, pitch=60, velocity=100)
Note(start=0.250000, end=0.450000, pitch=61, velocity=100)
Note(start=0.500000, end=0.700000, pitch=60, velocity=100)
Note(start=0.750000, end=0.950000, pitch=62, velocity=100)
Note(start=1.000000, end=1.200000, pitch=67, velocity=100)
Yields get_pitch_class_transition_matrix()
[[ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

By only changing the offset time (making the notes less staccato) I get different results:
Note(start=0.000000, end=0.237500, pitch=60, velocity=100)
Note(start=0.250000, end=0.487500, pitch=61, velocity=100)
Note(start=0.500000, end=0.737500, pitch=60, velocity=100)
Note(start=0.750000, end=0.987500, pitch=62, velocity=100)
Note(start=1.000000, end=1.237500, pitch=67, velocity=100)
Yields get_pitch_class_transition_matrix()
[[ 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

This is due to the use of a hardcoded time_thresh:
# use 20hz(0.05s) as the maximum time threshold for transitions
time_thresh = 0.05

I imagine this is desired behavior for more complicated files. I believe the right thing to do is to compute vertical grouping first (chunking nearby notes into chords based on a time threshold) and then, in a second step, calculate next-note transition probabilities by moving horizontally through the chunked sequence. I guess you still might want to drop a transition based on a long pause. But in many kinds of music this is an odd claim to make. It amounts to claiming that a staccato performance of a piece should yield different note transition probabilities than the same piece played legato.
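A minimal sketch of the two-step idea described above, assuming notes are given as plain (start, pitch) tuples rather than pretty_midi.Note objects. The function name grouped_transition_matrix and the onset_thresh parameter are hypothetical and are not part of pretty_midi. Onsets that fall within the threshold of a group's first onset are chunked into one chord, and transitions are then counted between consecutive groups, so note end times never enter the result.

```python
def grouped_transition_matrix(notes, onset_thresh=0.05):
    """Count pitch-class transitions between consecutive onset groups.

    notes: iterable of (start_time, midi_pitch) tuples (hypothetical format).
    Returns a 12x12 list of lists indexed as matrix[from_class][to_class].
    """
    # Step 1: vertical grouping. Notes whose onsets lie within
    # onset_thresh of the group's first onset are treated as one chord.
    groups = []  # each entry is (first_onset, set_of_pitch_classes)
    for start, pitch in sorted(notes):
        if groups and start - groups[-1][0] <= onset_thresh:
            groups[-1][1].add(pitch % 12)
        else:
            groups.append((start, {pitch % 12}))

    # Step 2: horizontal pass. Count a transition from every pitch class
    # in one group to every pitch class in the next group.
    matrix = [[0] * 12 for _ in range(12)]
    for (_, previous), (_, current) in zip(groups, groups[1:]):
        for source in previous:
            for target in current:
                matrix[source][target] += 1
    return matrix


# Onsets and pitches of the five-note sequence from this issue.
sequence = [(0.0, 60), (0.25, 61), (0.5, 60), (0.75, 62), (1.0, 67)]
matrix = grouped_transition_matrix(sequence)
```

Because only onsets are used, the staccato and legato versions of the sequence give the same four transitions (C to C#, C# to C, C to D, D to G). Dropping transitions across long pauses would need an additional, separate threshold.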

I don't think this warrants a fix(!) I just wanted to communicate my thoughts. Feel free to close out "Working as intended"

Tempo change on non-zero track warning causes an error now

This warning causes an error:

warnings.warn(("Tempo change events found on non-zero tracks."
    "  This is not a valid type 0 or type 1 MIDI "
    "file.  Timing may be wrong.", RuntimeWarning))

Should be

warnings.warn(("Tempo change events found on non-zero tracks."
    "  This is not a valid type 0 or type 1 MIDI "
    "file.  Timing may be wrong."), RuntimeWarning)

Thanks to @hkmogul for finding this

get_all_events method

Create a method which returns a list of all events, both for Instrument and PrettyMIDI objects. It should probably return a list of lists, where each inner list has the event's time, data, and probably the event itself.

Time is not a float. Note timings are sometimes float, sometimes np.float64

https://github.com/craffel/pretty-midi/blob/master/pretty_midi/containers.py#L17
In fact there's an explicit test which raises an error if time is a float. It seems to be an np.float64. This is just a documentation issue. However it raises a question: do you intend for note.start and note.end to be np.float64 or to be float? The documentation says float, but in fact you get np.float64 values. This has caused us some problems in going back and forth between protos and PrettyMIDI instances. I believe the transformation from float to np.float64 happens when timing is manipulated via self.__tick_to_time = np.zeros(max_tick + 1) (this default constructor does yield an np array of type np.float64).

In [1]: import pretty_midi

In [2]: x = pretty_midi.Note(10, 10, 1.00, 1.00)

In [3]: type(x.start)
Out[3]: float

In [4]: mf = pretty_midi.PrettyMIDI('/tmp/example.mid')
Reading /tmp/example.mid

In [5]: type(mf.instruments[0].notes[0].start)
Out[5]: numpy.float64

Utility functions

Most functions should have an inverse:

  • midi note number to hz
  • midi note number to note name
  • percussion note number to drum name
  • program number to instrument name for general MIDI
  • program number to instrument class for general MIDI (should not have inverse)
  • pitch bend value to absolute pitch in semitones
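As an illustration of what "should have an inverse" means for the first two items, here is a standalone sketch of two function pairs. It assumes the usual conventions that MIDI note 69 is A4 at 440 Hz and that note 60 is named C4; the function bodies are written for this example and are not copied from pretty_midi.

```python
import math
import re

PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F',
                 'F#', 'G', 'G#', 'A', 'A#', 'B']


def note_number_to_hz(note_number):
    """Convert a (possibly fractional) MIDI note number to Hz."""
    return 440.0 * 2.0 ** ((note_number - 69) / 12.0)


def hz_to_note_number(frequency):
    """Inverse of note_number_to_hz; returns a fractional note number."""
    return 12.0 * math.log2(frequency / 440.0) + 69


def note_number_to_name(note_number):
    """Convert an integer MIDI note number to a name such as 'C4'."""
    octave = note_number // 12 - 1
    return PITCH_CLASSES[note_number % 12] + str(octave)


def name_to_note_number(name):
    """Inverse of note_number_to_name (sharps only in this sketch)."""
    match = re.fullmatch(r'([A-G]#?)(-?\d+)', name)
    if match is None:
        raise ValueError('Cannot parse note name {!r}'.format(name))
    pitch_class, octave = match.groups()
    return PITCH_CLASSES.index(pitch_class) + 12 * (int(octave) + 1)
```

A round trip through either pair should return the starting value, which is also a convenient property to check in unit tests.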

Notes go missing when they have overlapping start/end times

This may be a playback bug, not a pretty_midi bug, but Douglas reports

"This plays back really freaky if the multiplier is >= 1.0, fine if
it's < 1.0. Although at faster tempos it still gets kinda freaky and
has volume variations where it shouldn't. But that's probably a
playback problem, not an encoding problem?"

pm = pretty_midi.PrettyMIDI()

pm.instruments.append(pretty_midi.Instrument(56))

beat_dur = 0.2
note_dur = beat_dur * 1.0
time = 0.0

for beat in range(100):
    note = 60
    pm.instruments[0].notes.append(
        pretty_midi.Note(90, note, time, time + note_dur))
    time += beat_dur

midi_filename = "borked.mid"
pm.write(midi_filename)

synthesize() clips notes when they overlap in time and pitch

If the MIDI file contains two notes (in the same instrument) that have the same pitch and overlap in time, the second of the two will not get synthesized. I.e., say I have a C4 from 1-3s, and another C4 from 2.5-4s: the second note gets "killed" during synthesis.

Admittedly two overlapping notes by the same instrument is not physically possible for all instruments (it is for a guitar for example though), but I think that the correct behavior should be to give the onset of the second note priority over the offset of the first note?
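One way to read "give the onset of the second note priority" is as a preprocessing step: before synthesis, any note that is still sounding when another note of the same pitch starts is cut off at that new onset. The sketch below assumes notes are plain (start, end, pitch) tuples, and the function name truncate_same_pitch_overlaps is hypothetical. It is not how pretty_midi's synthesis is implemented.

```python
def truncate_same_pitch_overlaps(notes):
    """End each note no later than the next onset of the same pitch.

    notes: list of (start, end, pitch) tuples (hypothetical format).
    Returns a new sorted list of tuples; the input is not modified.
    """
    # Collect notes per pitch, ordered by start time.
    by_pitch = {}
    for start, end, pitch in sorted(notes):
        by_pitch.setdefault(pitch, []).append([start, end, pitch])

    result = []
    for same_pitch in by_pitch.values():
        # Compare each note with the next note of the same pitch.
        for current, following in zip(same_pitch, same_pitch[1:]):
            if current[1] > following[0]:
                # The next onset wins over the current note's offset.
                current[1] = following[0]
        result.extend(tuple(note) for note in same_pitch)
    return sorted(result)


# The example from this issue: C4 from 1-3s and another C4 from 2.5-4s.
notes = [(1.0, 3.0, 60), (2.5, 4.0, 60)]
trimmed = truncate_same_pitch_overlaps(notes)
```

With this input the first C4 is shortened to end at 2.5s and the second C4 is kept in full.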

Don't collapse instruments by program number

When loading in a MIDI file, a single Instrument instance is created for all events on all channels and all tracks which have the same program number. This is problematic primarily because pitch bend and other control events will get collapsed onto a single instrument, when they shouldn't be. For example:

import pretty_midi
pm = pretty_midi.PrettyMIDI()
i = pretty_midi.Instrument(0)
i.notes.append(pretty_midi.Note(100, 36, 0.0, 1.0))
i.notes.append(pretty_midi.Note(100, 40, 0.0, 1.0))
i.notes.append(pretty_midi.Note(100, 45, 0.0, 1.0))
pm.instruments.append(i)
i = pretty_midi.Instrument(0)
i.notes.append(pretty_midi.Note(100, 66, 0.0, 1.0))
for n in range(8000):
    i.pitch_bends.append(pretty_midi.PitchBend(n, (n + 10)/8000.))
pm.instruments.append(i)
pm.write('test.mid')
pm2 = pretty_midi.PrettyMIDI('test.mid')
print(len(pm2.instruments))
print(len(pm2.instruments[0].notes))
print(len(pm2.instruments[0].pitch_bends))

yields

1
4
8000

but

import midi
print len(midi.read_midifile('test.mid'))

yields 3. I.e., this MIDI file has 3 tracks (one timing track, one "chord" track with no pitch bends, and one "single note" track with a single note and many pitch bends), but on loading, the two instrument tracks get merged into one Instrument which has all four notes and all pitch bends, meaning that the "chord" gets pitch bent when it shouldn't. So, we should create separate Instrument instances for all channels and tracks.
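The difference between the two behaviours comes down to the key used to look up an Instrument while parsing. The sketch below uses hypothetical (track, channel, program, kind) tuples in place of real MIDI events, and group_events is an illustrative helper, not a pretty_midi function.

```python
from collections import defaultdict

# Hypothetical parsed events: (track, channel, program, kind).
# Track 1 holds the three-note chord, track 2 the bent single note.
events = [
    (1, 0, 0, 'note'), (1, 0, 0, 'note'), (1, 0, 0, 'note'),
    (2, 0, 0, 'note'), (2, 0, 0, 'pitch_bend'), (2, 0, 0, 'pitch_bend'),
]


def group_events(events, key):
    """Group events into instrument-like buckets using the given key."""
    groups = defaultdict(list)
    for event in events:
        groups[key(event)].append(event)
    return dict(groups)


# Keying on program number alone merges both tracks into one bucket.
by_program = group_events(events, key=lambda e: e[2])

# Keying on (program, channel, track) keeps the tracks separate, so the
# pitch bends stay with the single note they were written for.
by_source = group_events(events, key=lambda e: (e[2], e[1], e[0]))
```

In the first grouping the chord notes share a bucket with the pitch bends; in the second they do not.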

MIDI files with corrupt pitch values are not handled correctly

Sometimes, midi reads in a MIDI file which has NoteOnEvents with data[0] > 127:

In [1]: import midi
In [2]: m = midi.read_midifile("data/clean_midi/mid/Celine Dion/That's The Way It Is.mid")
In [3]: for t in m:
   ...:     for e in t:
   ...:         if type(e) == midi.NoteOnEvent:
   ...:            if e.data[0] > 127:
   ...:                print e
   ...:
midi.NoteOnEvent(tick=4656, channel=7, data=[253, 75])
midi.NoteOnEvent(tick=0, channel=7, data=[254, 75])
midi.NoteOnEvent(tick=48, channel=7, data=[253, 0])
midi.NoteOnEvent(tick=0, channel=7, data=[254, 0])

This is because midi just loads in data via ord (https://github.com/vishnubob/python-midi/blob/master/src/fileio.py#L94), so as long as the argument is an 8-bit char it will happily set a data value to a number greater than 127. So, this likely happens for other events too, not just for data[0] of NoteOnEvents. This can create issues later on. Either midi should raise an exception, we should raise an exception, or we should issue a warning and ignore those events with invalid data values.
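A sketch of the third option (warn and ignore), assuming events are reduced to plain (tick, data) tuples. The helper names has_valid_data_bytes and drop_invalid_events are hypothetical; the only fact relied on is that MIDI data bytes must lie in the range 0 to 127.

```python
import warnings


def has_valid_data_bytes(data):
    """Return True if every data byte is a valid 7-bit MIDI value."""
    return all(0 <= value <= 127 for value in data)


def drop_invalid_events(events):
    """Keep only events whose data bytes are valid, warning on the rest.

    events: list of (tick, data) tuples (hypothetical format).
    """
    valid = []
    for tick, data in events:
        if has_valid_data_bytes(data):
            valid.append((tick, data))
        else:
            warnings.warn(
                'Ignoring event at tick {} with invalid data {}'.format(
                    tick, data),
                RuntimeWarning)
    return valid


# Two of the corrupt events from this issue plus one valid event.
events = [(4656, [253, 75]), (0, [254, 75]), (0, [60, 100])]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    kept = drop_invalid_events(events)
```

Here the two events with data[0] above 127 are dropped with a RuntimeWarning each, and the valid event is kept.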

hooks to modify/save metadata?

I'm finding myself wanting to be able to generate midi files with embedded metadata (author, software used, version number, etc). Is that possible, and/or within the scope of pretty-midi?

Writing and reading back in should be roughly lossless

I.e. if you write a MIDI file out and read it back in, you should get at least the same collection of notes, time signature changes, tempo changes, pitch bends, key signatures, etc. (all of the data that pretty_midi stores)

Control changes

Allow for control changes to be stored, just like pitch bends.

Get "track" names for instruments

@craffel is it possible to obtain the MIDI "track" names for the midi_data.instruments ?

I need to convert a set of MIDI files to JAMS files, but I only want to extract the notes of one "track" (or instrument) from each MIDI file, and I know the name of that track. I've been playing around with pretty_midi but haven't been able to get at this information. As an example, if I import one of these MIDI files into GarageBand I can see the track names: http://i.imgur.com/ziGf3I7.png

Is there a way to extract this information using pretty_midi?

Additional features for key/time signatures

cc @rafaelvalle

I'm writing unit tests for the whole library, and found a few things I think should be changed/fixed with the time/key signature changes.

  1. They currently aren't being written out in PrettyMIDI.write; they should be included here. I started to do this myself, but I was unsure how to convert from our KeySignature/TimeSignature to python-midi data/events.
  2. Related - I think it would be useful, for writing out and maybe in general, if midi_key_to_key_number got moved to a separate function in utilities.py and called something like mode_accidentals_to_key_number; instead of taking in a midi.event.KeySignature it would just take in num_accidentals and mode. Then, we should also have a key_number_to_mode_accidentals, which will be handy for writing out, I think.
  3. midi_key_to_key_number and key_name_to_key_number don't have a Returns section in their docstring; I overlooked this when merging.
  4. When constructing a PrettyMIDI object without a MIDI file, key_changes and time_signature_changes aren't getting created because they're created in the _load_metadata function which isn't called in that case. So, we need to create empty lists for them manually, as is done for instruments; https://github.com/craffel/pretty-midi/blob/master/pretty_midi/pretty_midi.py#L96

I'll update if I find anything else!
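A sketch of what the two conversion functions proposed in item 2 could look like. The function names come from the proposal; the argument order, the key numbering (0 to 11 for major keys starting at C, 12 to 23 for minor keys starting at C) and the mode encoding (0 for major, 1 for minor) are assumptions made for this example. Positive accidental counts are sharps and negative counts are flats.

```python
def mode_accidentals_to_key_number(mode, num_accidentals):
    """Convert (mode, number of sharps/flats) to a key number in [0, 24)."""
    if mode not in (0, 1):
        raise ValueError('mode must be 0 (major) or 1 (minor)')
    if not -7 <= num_accidentals <= 7:
        raise ValueError('num_accidentals must be between -7 and 7')
    # Each added sharp moves the major tonic up a fifth (7 semitones).
    # The relative minor tonic lies 9 semitones above the major tonic.
    offset = 0 if mode == 0 else 9
    tonic = (7 * num_accidentals + offset) % 12
    return tonic + 12 * mode


def key_number_to_mode_accidentals(key_number):
    """Convert a key number in [0, 24) back to (mode, num_accidentals).

    Enharmonic keys are ambiguous, so this sketch always returns a count
    between -5 and 6 (for example, 6 sharps rather than 6 flats).
    """
    if not 0 <= key_number < 24:
        raise ValueError('key_number must be in [0, 24)')
    mode, tonic = divmod(key_number, 12)
    # Map a minor tonic to the tonic of its relative major.
    major_tonic = tonic if mode == 0 else (tonic + 3) % 12
    # 7 is its own inverse modulo 12, so this undoes the step of fifths.
    num_accidentals = (7 * major_tonic) % 12
    if num_accidentals > 6:
        num_accidentals -= 12
    return mode, num_accidentals
```

Under these assumptions, one sharp in major mode gives key number 7 (G major) and no accidentals in minor mode gives key number 21 (A minor).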

bend_range causes out of bounds error in get_piano_roll

When an instrument has a pitch bend which is past the last note, this causes an out of bounds error:

# Column indices effected by the bend
bend_range = np.r_[int(start_bend.time*fs):int(end_bend.time*fs)]
# Construct the bent part of the piano roll
bent_roll = np.zeros(piano_roll[:, bend_range].shape)

because the piano roll is initialized to only be big enough for notes (ignores pitch bends). Should use get_end_time instead.

Add fs parameter to get_piano_roll/get_chroma

Right now, piano rolls (and therefore chroma matrices) are created by first sampling at 100 Hz, then computing the mean to aggregate over the supplied time intervals. If you want a higher sampling rate, or even just a different one, it's simpler and more principled to just use a different sampling rate and not do averaging. The averaging is mostly for longer, non-uniform intervals (like beats).

estimate_tempo can give an index out of bounds error

In [1]: import pretty_midi

In [2]: a = pretty_midi.PrettyMIDI()

In [3]: a.estimate_tempo()
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-3-aa51704cfe91> in <module>()
----> 1 a.estimate_tempo()

/usr/local/lib/python2.7/site-packages/pretty_midi-0.0.1-py2.7.egg/pretty_midi/pretty_midi.pyc in estimate_tempo(self)
    342                 Estimated tempo, in bpm
    343         '''
--> 344         return self.estimate_tempii()[0][0]
    345
    346     def get_beats(self):

IndexError: index 0 is out of bounds for axis 0 with size 0

Writing out and reading back in can lead to max tick error

With this file: http://www.jsbach.net/midi/bwv988/988-v04.mid
This code:

import pretty_midi
midi_data = pretty_midi.PrettyMIDI('988-v04.mid')
midi_data.write('/tmp/test.mid')
new_midi_data = pretty_midi.PrettyMIDI('/tmp/test.mid')

results in

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

pretty_midi/pretty_midi.pyc in __init__(self, midi_file, resolution, initial_tempo)
     66             if max_tick > MAX_TICK:
     67                 raise ValueError(('MIDI file has a largest tick of {},'
---> 68                                   ' it is likely corrupt'.format(max_tick)))
     69 
     70             # Create list that maps ticks to time in seconds

ValueError: MIDI file has a largest tick of 268435458, it is likely corrupt

Reported by @douglaseck.

Notes disappear when writing

Seems to be an issue either with how midi.write_midifile is being called or midi.write_midifile itself, because the notes are being collected correctly beforehand.

In [4]: for i in xrange(10):
    pm = pretty_midi.PrettyMIDI('test.mid')
    print pm.get_onsets().size
    pm.write('test.mid')
   ...:
4359
3518
3490
3483
3480
3478
3476
3475
3475
3474

Spotted by @hkmogul

Add intro to docs

The docs could use an intro and usage examples, also a reference to a paper to cite if/when it's available.

PrettyMIDI.fluidsynth fails when all of self.instruments have no notes

In [1]: import pretty_midi
In [2]: pm = pretty_midi.PrettyMIDI()
In [3]: pm.instruments.append(pretty_midi.Instrument(0, 0))
In [4]: pm.instruments.append(pretty_midi.Instrument(0, 0))
In [5]: pm.fluidsynth()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-8fb00d507fbe> in <module>()
----> 1 pm.fluidsynth()

pretty_midi/pretty_midi.pyc in fluidsynth(self, fs, sf2_path)
    717             synthesized[:waveform.shape[0]] += waveform
    718         # Normalize
--> 719         synthesized /= np.abs(synthesized).max()
    720         return synthesized
    721

numpy/core/_methods.pyc in _amax(a, axis, out, keepdims)
     24 # small reductions
     25 def _amax(a, axis=None, out=None, keepdims=False):
---> 26     return umr_maximum(a, axis, None, out, keepdims)
     27
     28 def _amin(a, axis=None, out=None, keepdims=False):

ValueError: zero-size array to reduction operation maximum which has no identity

Should return np.array([]).

Empty array dtype issue in latest numpy

Due to this change in numpy 1.10:

https://github.com/numpy/numpy/blob/master/doc/release/1.10.0-notes.rst#default-casting-rule-change

empty arrays are cast as type np.float64 and cannot be added to arrays of type np.int16, which is the type used by all calls to get_piano_roll (in both pretty_midi.py and instrument.py). I think this can be solved by forcing all arrays to explicitly be of type np.int16 (even the empty ones). MWE to reproduce the error:

import jams
import pretty_midi
import numpy as np

print(jams.__version__)
print(pretty_midi.__version__)
print(np.__version__)

jam = jams.load('./jams/TRAAAZF12903CCCF6B.jams')
ann = jam.search(namespace='beat')[0]

midi_md5 = ann.annotation_metadata.annotator.midi_md5

midi_object = pretty_midi.PrettyMIDI(
    'mid_aligned/TRAAAZF12903CCCF6B/{}.mid'.format(midi_md5))

piano_roll = midi_object.get_piano_roll()

is_drum referenced before assignment

  File "build/bdist.macosx-10.9-x86_64/egg/pretty_midi/pretty_midi.py", line 179, in _load_instruments
UnboundLocalError: local variable 'is_drum' referenced before assignment

Should be moved up, was broken in 4a6a714
