Raga Detection Using Machine Learning

This repository contains code for an end-to-end model for raga and tonic identification on audio samples.

Note: This repository currently contains only inference code. The training code and a large amount of experimental code are available at https://github.com/VishwaasHegde/E2ERaga, although that repository is not well maintained.

Getting Started

Requires python==3.6.9

Download and install Anaconda for easier package management

Install the requirements by running pip install -r requirements.txt

Model

Create an empty folder called model inside the SPD_KNN folder
Download the pitch model from here and place it in the 'model' folder
Download the tonic models (Hindustani and Carnatic) from here and place them in the 'model' folder
Download the Carnatic raga models from here and place them in 'data\RagaDataset\Carnatic\model' (create the folders if they do not exist)
Download the Hindustani raga models from here and place them in 'data\RagaDataset\Hindustani\model' (create the folders if they do not exist)
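
The folder layout described above can be created with a short script. This is only a sketch: it assumes that repo_root is the SPD_KNN folder, and the helper name create_model_dirs is hypothetical (it is not part of this repository). The model files themselves still have to be downloaded and copied in manually.

```python
from pathlib import Path

# Relative folders named in the steps above. Path parts are used instead of
# backslashes so the sketch also works outside Windows.
MODEL_DIRS = [
    Path("model"),
    Path("data", "RagaDataset", "Carnatic", "model"),
    Path("data", "RagaDataset", "Hindustani", "model"),
]


def create_model_dirs(repo_root):
    """Create the expected (empty) model folders under repo_root.

    Hypothetical helper: repo_root is assumed to be the SPD_KNN folder.
    Existing folders are left untouched.
    """
    created = []
    for relative in MODEL_DIRS:
        target = Path(repo_root) / relative
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Calling create_model_dirs(".") from inside the SPD_KNN folder creates any folders that are missing and returns their paths.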

Data

I do not have permission to upload the datasets; they must be obtained by request from here: https://compmusic.upf.edu/node/328

Run Time Input

E2ERaga supports audio samples recorded at runtime

Steps to run:

Run the command python main.py --runtime=True --tradition=h --duration=30
You can change the tradition (Hindustani or Carnatic) by choosing h or c, and set the recording length in seconds with --duration
Once you run this command, a prompt appears - Press 1 to start recording or press 0 to exit:
Enter your choice and record for the chosen duration
After this, the raga label and the tonic are output
The tonic can optionally be supplied, for example --tonic=D to specify the pitch D as the tonic.
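
If the recording mode is launched from another Python script, the command line above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the function name build_runtime_command is hypothetical, and the only flags used are the ones listed in the steps above (--runtime, --tradition, --duration, --tonic).

```python
import sys


def build_runtime_command(tradition="h", duration=30, tonic=None, script="main.py"):
    """Return the argument list for a runtime-recording run.

    Hypothetical helper; it only formats the flags documented above.
    """
    if tradition not in ("h", "c"):
        raise ValueError("tradition must be 'h' (Hindustani) or 'c' (Carnatic)")
    if duration <= 0:
        raise ValueError("duration must be a positive number of seconds")

    command = [
        sys.executable,  # the interpreter that is running this script
        script,
        "--runtime=True",
        "--tradition={}".format(tradition),
        "--duration={}".format(duration),
    ]
    if tonic is not None:
        # Optional: supply the tonic instead of letting the model estimate it
        command.append("--tonic={}".format(tonic))
    return command
```

The returned list can be passed to subprocess.run(command, check=True) from inside the repository folder; the recording prompt is then answered in the same terminal.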

File input

E2ERaga also supports pre-recorded audio files, whose path is provided at runtime

Steps to run:

Run the command python main.py --runtime_file=<audio_file_path> --tradition=<h/c>

Example: python test_sample.py --runtime_file=data/sample_data/Ahira_bhairav_27.wav --tradition=h
The model supports wav and mp3 files; with mp3 there is a delay because the file is converted to wav format internally
After this, the raga label and the tonic frequency are output
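
To run the file mode over several recordings, the supported files in a folder can be collected first and one command built per file. This is a sketch under assumptions: find_supported_audio and build_file_command are hypothetical names, and the script name is left as a parameter because the steps above mention both main.py and test_sample.py.

```python
import sys
from pathlib import Path

# File types the README lists as supported
SUPPORTED_EXTENSIONS = {".wav", ".mp3"}


def find_supported_audio(folder):
    """Return the wav/mp3 files directly inside folder, sorted by path."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def build_file_command(audio_path, tradition, script="main.py"):
    """Return the argument list for a file-input run (hypothetical helper)."""
    if tradition not in ("h", "c"):
        raise ValueError("tradition must be 'h' (Hindustani) or 'c' (Carnatic)")
    if Path(audio_path).suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError("only wav and mp3 files are supported")
    return [
        sys.executable,
        script,
        "--runtime_file={}".format(audio_path),
        "--tradition={}".format(tradition),
    ]
```

Each command can then be executed with subprocess.run(command, check=True); mp3 files are expected to take longer because of the internal conversion to wav.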

Demo videos:

Live Raga Prediction

Hindustani Raga Embedding cosine similarity obtained from the model

Carnatic Raga Embedding cosine similarity obtained from the model
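
For readers unfamiliar with the measure used in the two similarity demos, cosine similarity between two embedding vectors u and v is their dot product divided by the product of their lengths. The sketch below shows the computation in plain Python; it does not load the model's actual embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Pairwise similarities for a {label: vector} mapping.

    Returns the sorted labels and a square matrix in that label order.
    """
    labels = sorted(embeddings)
    matrix = [
        [cosine_similarity(embeddings[a], embeddings[b]) for b in labels]
        for a in labels
    ]
    return labels, matrix
```

A value close to 1 means two raga embeddings point in nearly the same direction, and a value close to 0 means they are nearly orthogonal.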

Acknowledgments:

The model uses CREPE to find the pitches of the audio. I would like to thank Jong Wook for clarifying my questions.
I also thank CompMusic and Sankalp Gulati for providing the datasets.
