License

Genre-It

The mobile app that recognizes music genres using Deep Learning

Description

How it works

Why CNN and LSTM?

  • A CNN makes sense because spectrograms look like images, each with its own distinct patterns.
  • RNNs excel at understanding sequential data by making the hidden state at time t dependent on the hidden state at time t-1.
  • Spectrograms have a time component, and RNNs can do a much better job of identifying the short-term and longer-term temporal features in a song.
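To illustrate the recurrence described above, here is a minimal pure-Python sketch of a vanilla RNN step, where the hidden state at time t depends on the input frame at time t and the hidden state at time t-1. It is not the project's code: the function names, the toy weights, and the tiny frame size are all made up for illustration, and the actual model uses a gated recurrent layer rather than this plain update.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One vanilla RNN step: h_t = tanh(W_x x_t + W_h h_prev + b)."""
    h_t = []
    for i in range(len(h_prev)):
        total = b[i]
        total += sum(w_x[i][j] * x_t[j] for j in range(len(x_t)))
        total += sum(w_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(total))
    return h_t


def run_rnn(frames, w_x, w_h, b):
    """Feed frames (one per time step) through the recurrence, starting from zeros."""
    h = [0.0] * len(b)
    for x_t in frames:
        h = rnn_step(x_t, h, w_x, w_h, b)
    return h


# Toy data (hypothetical): 3 time steps, 2 frequency bins, hidden size 2.
frames = [[0.1, 0.5], [0.4, 0.2], [0.9, 0.0]]
w_x = [[0.5, -0.3], [0.1, 0.8]]
w_h = [[0.2, 0.0], [-0.4, 0.3]]
b = [0.0, 0.0]

final_state = run_rnn(frames, w_x, w_h, b)
reversed_state = run_rnn(frames[::-1], w_x, w_h, b)
```

Because each step reuses the previous hidden state, `final_state` and `reversed_state` differ even though both runs see the same frames. This order sensitivity is what lets a recurrent layer pick up temporal structure.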

Parallel CNN-RNN Model

  • The model passes the input spectrogram through CNN and RNN layers in parallel, concatenates their outputs, and sends the result through a dense layer with softmax activation to perform classification.
  • The convolutional block consists of a 2D convolution layer followed by a 2D max pooling layer. There are 5 such convolution and max pooling blocks. The final output is flattened into a tensor of shape (None, 256).
  • The recurrent block starts with a 2D max pooling layer of pool size (4, 2) to reduce the size of the spectrogram before the recurrent layer. This reduction was done primarily to speed up processing. The reduced image is sent to a bidirectional GRU with 64 units. The output of this layer is a tensor of shape (None, 128).
  • The outputs of the convolutional and recurrent blocks are then concatenated, resulting in a tensor of shape (None, 384). Finally, a dense layer with softmax activation produces the class probabilities.
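The shape arithmetic above can be traced with a short pure-Python sketch. The README does not state the input size, the filter counts, or the pool sizes of the convolutional blocks, so `INPUT_SHAPE` and `CONV_BLOCKS` below are hypothetical values chosen only so that the flattened width comes out at 256. The (4, 2) pool, the 64 GRU units, and the 8 output classes come from the text. The batch dimension (`None`) is omitted.

```python
def pool2d(shape, pool):
    """Shape after 2D max pooling with stride equal to the pool size."""
    height, width, channels = shape
    return (height // pool[0], width // pool[1], channels)


def conv_block(shape, filters, pool):
    """A 'same'-padded convolution keeps height and width; pooling then shrinks them."""
    height, width, _ = shape
    return pool2d((height, width, filters), pool)


# Assumptions, not taken from the project: input size, filters, conv pool sizes.
INPUT_SHAPE = (128, 128, 1)
CONV_BLOCKS = [(16, (2, 2)), (32, (2, 2)), (64, (2, 2)), (64, (2, 2)), (64, (4, 4))]
GRU_UNITS = 64   # from the text
NUM_GENRES = 8   # from the text


def cnn_branch_width(input_shape):
    """Flattened width after the five convolution + max pooling blocks."""
    shape = input_shape
    for filters, pool in CONV_BLOCKS:
        shape = conv_block(shape, filters, pool)
    height, width, channels = shape
    return height * width * channels


def rnn_branch(input_shape):
    """Sequence shape fed to the GRU, and the width of its bidirectional output."""
    height, width, channels = pool2d(input_shape, (4, 2))
    sequence_shape = (height, width * channels)  # which axis is time is an assumption
    # Bidirectional: forward and backward outputs are concatenated.
    return sequence_shape, 2 * GRU_UNITS


cnn_width = cnn_branch_width(INPUT_SHAPE)                # 256
sequence_shape, rnn_width = rnn_branch(INPUT_SHAPE)      # (32, 64), 128
merged_width = cnn_width + rnn_width                     # 384
dense_weights = merged_width * NUM_GENRES + NUM_GENRES   # weights + biases of the final layer
```

Under these assumptions the two branches contribute 256 and 128 features, which is where the concatenated width of 384 comes from.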

Model accuracy

  • The accuracy of the model is around 51%.
  • The main reason is the small number of spectrograms available for training, which is very little data for a deep neural network.
  • The FMA dataset is challenging and has a few classes that are easy to confuse with one another. The top leaderboard score on FMA Genre Recognition has a test F1 score of around 0.63.
  • This accuracy is much better than random guessing, which would be around 0.125 (one in eight genres), and could be improved if additional datasets and hardware were available.

Genres

1. Electronic
2. Experimental
3. Folk
4. Hip-Hop
5. Instrumental
6. International
7. Pop
8. Rock
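As a rough sketch of how a softmax output over these eight genres can be turned into a prediction, the snippet below applies softmax to a vector of raw scores and picks the most probable genre. The function names and the example scores are made up, and it assumes the output index order follows the numbered list above, which the README does not confirm.

```python
import math

# Genre names from the list above; the index order is an assumption.
GENRES = [
    "Electronic", "Experimental", "Folk", "Hip-Hop",
    "Instrumental", "International", "Pop", "Rock",
]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_genre(logits):
    """Return the most probable genre and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return GENRES[best], probs[best]


# Made-up scores for one clip, for illustration only.
example_logits = [0.2, -1.0, 0.1, 2.5, 0.0, -0.3, 1.1, 0.4]
genre, confidence = predict_genre(example_logits)

# Chance level for uniform random guessing over the eight genres.
chance_level = 1 / len(GENRES)
```

With eight genres, `chance_level` is 0.125, the random-guessing baseline quoted in the accuracy section.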

Getting Started

Prerequisites

Installing

  • pip install -r requirements.txt

Built with

Tested on

  • Linux Debian
  • Running on Windows machines could cause issues due to missing codecs.

Datasets used for training

Authors

  • Hanna Sababa — Full Stack Developer — hb20007
  • Chris Peppos — Full Stack Developer, Music Expert — ChrisPeppos
  • Karlen Avogian — Deep Learning, Cyber Security — kosnet2

Acknowledgments

  • Kudos to Hack{Cyprus} for organizing the event.
  • Kudos to anyone whose code and ideas were used in our project.
  • Kudos to the whole open source community for allowing us to achieve the impossible.
