This project is based on scikit-learn. The classifier is a multilayer perceptron (MLP) with two hidden layers of 110 neurons each, and it reaches 74.2% accuracy under 5-fold cross-validation.
The confusion matrix is shown below:
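As a rough illustration of how such a score can be obtained, here is a minimal sketch of 5-fold cross-validation with scikit-learn's `MLPClassifier`. The data is synthetic, and the feature scaling, `max_iter`, shuffling, and random seeds are assumptions made for the example, not settings taken from this project.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real features: 10 classes, 30 samples each.
rng = np.random.RandomState(0)
n_classes, per_class, n_features = 10, 30, 20
y = np.repeat(np.arange(n_classes), per_class)
centers = rng.normal(scale=3.0, size=(n_classes, n_features))
X = centers[y] + rng.normal(size=(y.size, n_features))

# Two hidden layers of 110 neurons each.
# Scaling and max_iter are assumptions, not this project's settings.
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(110, 110), max_iter=500, random_state=0),
)

# Stratified folds keep the class balance the same in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv)

print("fold accuracies: %s" % scores)
print("mean accuracy: %.3f" % scores.mean())
```

The reported figure corresponds to the mean of the five fold accuracies; the numbers printed by this sketch say nothing about the real dataset.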
The best performing class is classical:
The worst performing class is rock:
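To make "best" and "worst" performing class concrete, the sketch below computes per-class accuracy (recall) from a confusion matrix. The labels and predictions are invented for illustration and use only three genres; they are not this project's results.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Illustrative subset of genres with made-up labels and predictions.
genres = ["classical", "jazz", "rock"]
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 1, 0, 1])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])

# Per-class recall: correct predictions divided by the row total.
recall = cm.diagonal() / cm.sum(axis=1).astype(float)

best = genres[int(np.argmax(recall))]
worst = genres[int(np.argmin(recall))]
print(cm)
print("best: %s, worst: %s" % (best, worst))
```

In this toy data every classical track is recognized, while most rock tracks are confused with other genres.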
- Python 2.7
- matplotlib
- scikit-learn
- FFmpeg (demo.py only)
GTZAN
We are grateful that the author provides this dataset for free, but it has some flaws; see
An analysis of the GTZAN music genre dataset
and
The GTZAN dataset: Its contents, its faults, their effects on evaluation, and its future use
Avoid using it if you have an alternative.
- 10 genres
- each genre has 100 soundtracks
Please use -h to see descriptions and options. X.npy and y.npy are the outputs of feature_extraction.py: X holds the extracted features and y holds the corresponding labels.
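The two files can be read back with `numpy.load`. The sketch below writes placeholder arrays to a temporary directory first so that it runs on its own; the feature count (8), the integer label encoding, and the ordering of the labels are assumptions, not the actual layout produced by feature_extraction.py.

```python
import os
import tempfile

import numpy as np

# Placeholder files standing in for the real X.npy and y.npy.
tmp_dir = tempfile.mkdtemp()
x_path = os.path.join(tmp_dir, "X.npy")
y_path = os.path.join(tmp_dir, "y.npy")
np.save(x_path, np.zeros((1000, 8)))             # 8 features is hypothetical
np.save(y_path, np.repeat(np.arange(10), 100))   # assumed integer labels

# Load features and labels, then check that they line up.
X = np.load(x_path)
y = np.load(y_path)
assert X.shape[0] == y.shape[0]

# With 10 genres of 100 tracks each, every label should occur 100 times.
counts = np.bincount(y)
print("samples: %d, features: %d" % X.shape)
print("tracks per genre: %s" % counts)
```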
Please use -h to see descriptions and options. The table below lists the accuracy of each classifier for different combinations of features.
Please use -h to see descriptions and options. model.out is the trained model.
Please use -h to see descriptions and options. It uses X.npy and model.out to predict the genre of a given music file.
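The sketch below shows one way a trained model could be stored in a file such as model.out and reused for prediction. It assumes the model is serialized with `pickle`, which this document does not confirm. It also guesses that the training features are needed at prediction time to scale new inputs the same way as the training data. The data is synthetic, and audio decoding and feature extraction are left out.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic training features: 3 classes, 40 samples each, 6 features.
rng = np.random.RandomState(0)
y = np.repeat(np.arange(3), 40)
centers = rng.normal(scale=4.0, size=(3, 6))
X = centers[y] + rng.normal(size=(y.size, 6))

# Fit a scaler on the training features (assumed use of X at predict time).
scaler = StandardScaler().fit(X)
model = MLPClassifier(hidden_layer_sizes=(110, 110), max_iter=500, random_state=0)
model.fit(scaler.transform(X), y)

# Save the model; pickle is an assumption about the model.out format.
model_path = os.path.join(tempfile.mkdtemp(), "model.out")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Later: load the model and predict the class of one new feature vector.
with open(model_path, "rb") as f:
    loaded = pickle.load(f)

new_track = centers[1].reshape(1, -1)  # hypothetical features of a new track
predicted = loaded.predict(scaler.transform(new_track))
print("predicted class index: %d" % predicted[0])
```

A pickled model should only be loaded from a trusted source, and generally with the same scikit-learn version that wrote it.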