Could you elaborate on how you got the dataset into the desired folder structure (as suggested in settings.py)?
The MedleyDB dataset does not ship with a train/test split, but settings.py defines paths for both
MEDLEY_TRAIN_FEATURE_BASEPATH and MEDLEY_TEST_FEATURE_BASEPATH.
The dataset I downloaded from the MedleyDB website has the following structure:
V1
V1/sound_track_name
V1/sound_track_name/sound_track_name_RAW
V1/sound_track_name/sound_track_name_RAW/{multiple_RAW files}
V1/sound_track_name/sound_track_name_STEMS
V1/sound_track_name/sound_track_name_STEMS/{multiple_STEM files}
V1/sound_track_name/sound_track_name_MIX.wav
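For reference, this is roughly how I'm inspecting the downloaded layout (the `root` path and the helper name are just my own, not anything from settings.py):

```python
import os

def list_medleydb_tracks(root):
    """Walk the MedleyDB V1 folder and report which expected pieces
    (_MIX.wav file, _STEMS dir, _RAW dir) exist for each track."""
    tracks = {}
    for name in sorted(os.listdir(root)):
        track_dir = os.path.join(root, name)
        if not os.path.isdir(track_dir):
            continue
        tracks[name] = {
            "mix": os.path.isfile(os.path.join(track_dir, name + "_MIX.wav")),
            "stems": os.path.isdir(os.path.join(track_dir, name + "_STEMS")),
            "raw": os.path.isdir(os.path.join(track_dir, name + "_RAW")),
        }
    return tracks
```

Every track I checked this way has exactly the three entries above and nothing else.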
So, as you can see, I only have the _RAW, _STEMS, and _MIX wav files. MedleyDB suggests installing a Python library called 'medleydb' for annotations and metadata.
Could you tell me whether we need to set the paths in settings.py before running data_prep.py?
If so, how do we get the dataset into the desired structure?
If not, then I think data_prep.py expects a different dataset structure.
In short, I want to know what needs to be done between downloading the dataset and running data_prep.py.
Hi,
Nice work you've done.
I am trying to reproduce some of the experiments, also in conjunction with the paper you mention in your work.
I have some questions about the window size.
In your thesis you say a window size of 3 seconds is preferred, although you only experiment with sizes up to 5 seconds. Were your experiments with durations > 5 seconds much worse, or did you simply decide not to try such a configuration?
Another question, about the threshold for the activation data: I am currently using a threshold of 0.5 to assign the label 1 to each instrument, as suggested in the paper.
Based on your work, would you say that for a 5 s window a threshold of 0.4 or 0.45 would be more appropriate?
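To make the question concrete, this is a sketch of how I'm currently binarizing the activations for one window (averaging the per-frame confidences over the window is my own choice; only the 0.5 threshold comes from the paper):

```python
import numpy as np

def window_labels(activations, threshold=0.5):
    """Binarize per-instrument activation confidences for one window.

    activations: array of shape (frames, instruments), values in [0, 1].
    Returns a 0/1 label per instrument: 1 if the mean activation over
    the window reaches the threshold (mean aggregation is an assumption).
    """
    mean_act = np.asarray(activations).mean(axis=0)
    return (mean_act >= threshold).astype(int)
```

With a lower threshold like 0.4, instruments whose mean activation sits between 0.4 and 0.5 would flip from label 0 to label 1, which is why I'm asking whether that's what you'd recommend for longer windows.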