Comments (2)
When I check the values of ref_intervals and est_intervals, they are very different:
```
ref_intervals:
[[  0.98046875   1.08723958]
 [  0.99739583   1.25260417]
 [  1.09375      1.16536458]
 ...
 [384.79557292 388.55338542]
 [384.79817708 388.61067708]
 [384.80989583 388.52864583]]

est_intervals:
[[387.07030113 387.36055057]
 [387.07030113 387.5475941 ]
 [387.07030113 387.5475941 ]
 ...
 [146.40265896 146.64646848]
 [362.68555193 362.89453152]
 [307.87372971 308.01433333]]
```
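As a quick sanity check for this kind of mismatch, one can compare the overall time spans of the two interval arrays; a span ratio far from 1 suggests the two are in different units (e.g. frames vs. seconds). This is my own sketch, not part of basic-pitch, and `interval_scale_ratio` is a hypothetical helper:

```python
import numpy as np

def interval_scale_ratio(ref_intervals, est_intervals):
    """Ratio between the overall time spans of two (N, 2) interval arrays.

    A value far from 1 hints at a unit mismatch between the two arrays
    (e.g. one in frames, the other in seconds).
    """
    ref_span = float(np.max(ref_intervals) - np.min(ref_intervals))
    est_span = float(np.max(est_intervals) - np.min(est_intervals))
    # guard against a zero span to avoid dividing by zero
    return max(ref_span, est_span) / max(min(ref_span, est_span), 1e-9)
```

When both arrays cover the same recording in the same units, the ratio is close to 1; in the output above, both arrays are at least on the same scale, so the mismatch is more likely in how the notes are decoded than in the units themselves.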
which I think is one of the reasons why the previous result was so far from what is reported in the paper. After modifying the functions in basic-pitch/basic_pitch/experiments/run_evaluation.py:

- changed the minimum note length from 58.0 to 127.70, following the inconsistent minimum note length discussed in issue #93
- modified the `model_inference` function as follows:
```python
def model_inference(audio_path, model, save_path, minimum_note_length=127.70):
    output = run_inference(audio_path, model)
    frames = output["note"]
    onsets = output["onset"]
    # frames: (13678, 88), onsets: (13678, 88)
    # convert the minimum note length from milliseconds to model frames
    min_note_len = int(np.round(minimum_note_length / 1000 * (AUDIO_SAMPLE_RATE / FFT_HOP)))
    estimated_notes = note_creation.output_to_notes_polyphonic(
        frames,
        onsets,
        onset_thresh=0.5,
        frame_thresh=0.3,
        infer_onsets=True,
        min_note_len=min_note_len,  # required; the function raises an error if not provided
        max_freq=None,  # required; the function raises an error if not provided
        min_freq=None,  # required; the function raises an error if not provided
    )
    # notes are (start_idx, end_idx, pitch_midi, amplitude); frame indices are
    # converted to seconds below via times_s
    pitch = np.array([n[2] for n in estimated_notes])
    pitch_hz = librosa.midi_to_hz(pitch)
    estimated_notes_with_pitch_bend = note_creation.get_pitch_bends(output["contour"], estimated_notes)
    times_s = note_creation.model_frames_to_time(output["contour"].shape[0])
    estimated_notes_time_seconds = [
        (times_s[note[0]], times_s[note[1]], note[2], note[3], note[4])
        for note in estimated_notes_with_pitch_bend
    ]
    midi = note_creation.note_events_to_midi(estimated_notes_time_seconds, save_path)
    intervals = np.array([[times_s[note[0]], times_s[note[1]]] for note in estimated_notes_with_pitch_bend])
    return intervals, pitch_hz, midi  # midi added to the return value for use in the evaluation
```
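To see what the milliseconds-to-frames conversion in `model_inference` actually does, here is the same arithmetic in isolation, assuming basic-pitch's constants are `AUDIO_SAMPLE_RATE = 22050` and `FFT_HOP = 256` (my reading of the repo, not verified against every version):

```python
import numpy as np

# assumed values of basic-pitch's constants
AUDIO_SAMPLE_RATE = 22050
FFT_HOP = 256

def ms_to_frames(ms):
    """Convert a duration in milliseconds to a number of model frames."""
    # frame rate is AUDIO_SAMPLE_RATE / FFT_HOP ≈ 86.13 frames per second
    return int(np.round(ms / 1000 * (AUDIO_SAMPLE_RATE / FFT_HOP)))
```

With these constants, 58.0 ms comes out to about 5 frames and 127.70 ms to about 11 frames, so the change roughly doubles the shortest note the decoder will keep.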
- in the `main` function, instead of using the intervals and pitch_hz returned from `model_inference`, I used:
```python
_, _, midi = model_inference(audio_path, model, save_path)
est_notes = io.load_notes_from_midi(midi=midi)
if est_notes is None:
    est_intervals = []
    est_pitches = []
else:
    est_intervals, est_pitches, _ = est_notes.to_mir_eval()
```
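For context on the metrics below: mir_eval matches estimated notes to reference notes within onset, offset, and pitch tolerances. The following is my own much-simplified, onset-only version of that matching (greedy rather than mir_eval's optimal bipartite matching), just to illustrate where precision, recall, and F-measure come from:

```python
def onset_scores(ref_onsets, est_onsets, tol=0.05):
    """Greedy one-to-one onset matching within `tol` seconds.

    A simplified stand-in for mir_eval's matching, for illustration only.
    """
    est = sorted(est_onsets)
    used = [False] * len(est)
    matched = 0
    for r in sorted(ref_onsets):
        for j, e in enumerate(est):
            # match each reference onset to the first unused estimate within tolerance
            if not used[j] and abs(r - e) <= tol:
                used[j] = True
                matched += 1
                break
    precision = matched / len(est) if est else 0.0
    recall = matched / len(ref_onsets) if ref_onsets else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f
```

The full metrics additionally require the pitch to match (the `no_offset` variants) and the offset to match (the plain `Precision`/`Recall`/`F-measure`), which is why the offset-sensitive scores below are much lower than the onset-only ones.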
I finally got results that are close to those reported in the paper:
```
{'Precision': 0.11997030494604051,
 'Recall': 0.11606390831628464,
 'F-measure': 0.11663329326696836,
 'Average_Overlap_Ratio': 0.8401297548289717,
 'Precision_no_offset': 0.7436669014704781,
 'Recall_no_offset': 0.6548245337432261,
 'F-measure_no_offset': 0.6874150165838026,
 'Average_Overlap_Ratio_no_offset': 0.4262920646319229,
 'Onset_Precision': 0.8259000078273144,
 'Onset_Recall': 0.721544837754125,
 'Onset_F-measure': 0.7601824436965499,
 'Offset_Precision': 0.5818535280932536,
 'Offset_Recall': 0.504137416529927,
 'Offset_F-measure': 0.5329684074137423}
```
from basic-pitch.
Hi @xinzuan. The training branch is still a work in progress, so don't rely on it too heavily. Regarding your issue, it's possible that there was a difference in units between the estimated and reference timestamps and frequency values, and that your solution took care of that difference.
Related Issues (20)
- The program exits unexpectedly
- Can't detect vocals?
- Crash when evaluating a particular recording
- Question: would basic-pitch be capable of being adapted to work on realtime audio input from a mobile device mic?
- Transition from setup.py to pyproject.toml
- Can I use a converted tensorflow-lite model with basic-pitch? Do I need to adapt the basic_pitch_predict function in some way?
- Piano roll, web demo, tweak nice to have
- 'add_slot' failing in Windows 11 installation
- LLVM error
- Looking forward to suggestions to improve output MIDI quality
- [Bug] AttributeError: module 'scipy.signal' has no attribute 'gaussian'
- TypeError: predict() missing 1 required positional argument: 'model_or_model_path'
- Realtime detection
- Pitch bend returning to 1365 instead of 0 after initial pitch bend
- Training with custom data
- Support for stream in/out?
- Inconsistent minimum note length
- Can't produce wav and csv on the CLI
- Nice to have multiple instruments supported