Raga Detection Using Machine Learning

This repository contains code for an end-to-end model for raga and tonic identification on audio samples.

Note: This repository currently contains only inference code. The training code and a lot of experimental code are available at https://github.com/VishwaasHegde/E2ERaga, but that repository is not well maintained.

Getting Started

Requires python==3.6.9

Download and install Anaconda for easier package management

Install the requirements by running pip install -r requirements.txt

Model

  1. Create an empty folder called 'model' inside the SPD_KNN folder
  2. Download the pitch model from here and place it in the 'model' folder
  3. Download the tonic models (Hindustani and Carnatic) from here and place them in the 'model' folder
  4. Download the Carnatic raga models from here and place them in 'data\RagaDataset\Carnatic\model' (create the folders if they do not exist)
  5. Download the Hindustani raga models from here and place them in 'data\RagaDataset\Hindustani\model' (create the folders if they do not exist)
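
The folder layout described in the steps above can be created in one go. The following is a minimal sketch, not part of the repository: the helper name `create_model_dirs` is hypothetical, and it assumes you pass the path of your local SPD_KNN checkout. It only creates the empty folders; the model files still have to be downloaded by hand.

```python
from pathlib import Path

# Folder names are taken from the steps above; the helper itself is
# illustrative and not part of the repository.
MODEL_DIRS = [
    Path("model"),
    Path("data") / "RagaDataset" / "Carnatic" / "model",
    Path("data") / "RagaDataset" / "Hindustani" / "model",
]


def create_model_dirs(repo_root):
    """Create the empty model folders under repo_root and return their paths."""
    created = []
    for relative in MODEL_DIRS:
        target = Path(repo_root) / relative
        # parents=True builds intermediate folders; exist_ok=True makes
        # the call safe to repeat.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

For example, calling `create_model_dirs(".")` from inside the SPD_KNN folder creates all three folders if they are missing.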

Data

  1. I do not have permission to upload the datasets; they have to be obtained by request from here: https://compmusic.upf.edu/node/328

Run Time Input

E2ERaga supports audio samples which can be recorded at runtime

Steps to run:

  1. Run the command python main.py --runtime=True --tradition=h --duration=30
  2. You can change the tradition (Hindustani or Carnatic) by choosing h or c, and set the recording duration in seconds
  3. Once you run this command, a prompt appears: Press 1 to start recording or press 0 to exit:
  4. Enter your choice and record for the chosen duration
  5. The raga label and the tonic are then printed
  6. The tonic can optionally be supplied with --tonic=D, which specifies the pitch D as the tonic.
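
If you want to start a recording run from another Python script, the command from step 1 can be assembled programmatically. This is a rough sketch under assumptions: the helper names `build_record_command` and `run_recording` are hypothetical, the flags are only those shown in the steps above, and it assumes `python` on your PATH is the environment with the requirements installed.

```python
import subprocess


def build_record_command(tradition="h", duration=30, tonic=None):
    """Return the argument list for a runtime-recording run of main.py."""
    if tradition not in ("h", "c"):
        raise ValueError("tradition must be 'h' (Hindustani) or 'c' (Carnatic)")
    if duration <= 0:
        raise ValueError("duration must be a positive number of seconds")
    command = [
        "python", "main.py",
        "--runtime=True",
        "--tradition={}".format(tradition),
        "--duration={}".format(duration),
    ]
    if tonic is not None:
        # Optional flag from step 6, for example tonic="D".
        command.append("--tonic={}".format(tonic))
    return command


def run_recording(tradition="h", duration=30, tonic=None):
    """Run main.py; the recording prompt still appears in the terminal."""
    # check=True raises CalledProcessError if main.py exits with an error.
    return subprocess.run(build_record_command(tradition, duration, tonic), check=True)
```

`build_record_command("c", 45, tonic="D")` yields the argument list for a 45-second Carnatic recording with D as the tonic. `run_recording` has to be called from the folder that contains main.py.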

File input

E2ERaga supports recorded audio samples which can be provided at runtime

Steps to run:

  1. Run the command python main.py --runtime_file=<audio_file_path> --tradition=<h/c>

    Example: python test_sample.py --runtime_file=data/sample_data/Ahira_bhairav_27.wav --tradition=h

  2. The model supports wav and mp3 files; mp3 files take longer because they are converted to wav format internally

  3. The raga label and the tonic frequency are then printed
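
To guard against unsupported files before launching the model, the extension can be checked while building the command. This is an illustrative sketch only: `build_file_command` is a hypothetical helper, and the `script` parameter is left configurable because the steps above name main.py while the example uses test_sample.py.

```python
from pathlib import Path

# The two formats named in step 2.
SUPPORTED_EXTENSIONS = {".wav", ".mp3"}


def build_file_command(audio_path, tradition, script="main.py"):
    """Return the argument list for running the model on one audio file."""
    path = Path(audio_path)
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError("unsupported audio format: {}".format(path.suffix))
    if tradition not in ("h", "c"):
        raise ValueError("tradition must be 'h' (Hindustani) or 'c' (Carnatic)")
    return [
        "python", script,
        "--runtime_file={}".format(path.as_posix()),
        "--tradition={}".format(tradition),
    ]
```

The returned list can be passed to `subprocess.run(...)` from the folder that contains the script.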

Demo videos:

Live Raga Prediction

Demo

Hindustani Raga Embedding cosine similarity obtained from the model


Carnatic Raga Embedding cosine similarity obtained from the model

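
For readers unfamiliar with the measure, cosine similarity between two embedding vectors is their dot product divided by the product of their lengths. The sketch below shows the generic computation in plain Python; it is not the repository's code, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Pairwise similarities for a dict mapping label -> embedding vector."""
    labels = list(embeddings)
    return {
        a: {b: cosine_similarity(embeddings[a], embeddings[b]) for b in labels}
        for a in labels
    }
```

A value near 1 means two embeddings point in almost the same direction, 0 means they are orthogonal, and -1 means they point in opposite directions.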

Acknowledgments:

  1. The model uses CREPE to find the pitches in the audio. I would like to thank Jong Wook for clarifying my questions
  2. I also thank CompMusic and Sankalp Gulati for providing the datasets
