ronggong / jingjusyllabicsegmentaion

Code for the paper: Score-informed Syllable Segmentation for A Cappella Singing Voice with Convolutional Neural Networks

Home Page: https://arxiv.org/pdf/1707.03544.pdf

Python 99.26% Shell 0.74%
onset-detection singing syllable viterbi-algorithm cnn-model

Jingju Singing Syllable Segmentation

The code in this repo aims to help reproduce the results in the work:

Jordi Pons, Rong Gong, and Xavier Serra. 2017. Score-informed Syllable Segmentation for A Cappella Singing Voice with Convolutional Neural Networks. In 18th International Society for Music Information Retrieval Conference. Suzhou, China.

This paper introduces a new score-informed method for segmenting jingju a cappella singing voice into syllables. The proposed method estimates the most likely sequence of syllable boundaries given the estimated syllable onset detection function (ODF) and the score. In the paper, we first examine the structure of jingju syllables and propose a definition of the term “syllable onset”. Then, we identify the challenges that jingju a cappella singing poses. We propose using a score-informed Viterbi algorithm instead of thresholding the onset function, because the available musical knowledge can inform the Viterbi algorithm and help overcome the identified challenges. In addition, we investigate how to improve the syllable ODF estimation with convolutional neural networks (CNNs). We propose a novel CNN architecture that efficiently captures different time-frequency scales for estimating syllable onsets. The proposed method outperforms the state of the art in syllable segmentation for jingju a cappella singing. We further provide an analysis of the segmentation errors, which points to possible research directions.
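To make the idea of score-informed decoding concrete, here is a minimal sketch in Python. It is not the paper's Viterbi formulation or the code in this repo: it is a simplified dynamic program that assumes a Gaussian prior on syllable durations taken from the score, and the names `decode_boundaries` and `sigma_ratio` are hypothetical.

```python
import numpy as np


def decode_boundaries(odf, expected_durs, sigma_ratio=0.35, eps=1e-6):
    """Pick len(expected_durs) - 1 inner boundary frames by dynamic programming.

    odf           : 1-D array of onset strength per frame, values in [0, 1]
    expected_durs : syllable durations in frames derived from the score,
                    assumed to be rescaled so that they sum to len(odf)
    sigma_ratio   : hypothetical tolerance; standard deviation of the duration
                    prior as a fraction of the expected duration
    """
    odf = np.asarray(odf, dtype=float)
    T, N = len(odf), len(expected_durs)
    log_odf = np.log(odf + eps)

    def dur_logp(d, mu):
        # Unnormalised Gaussian log-prior for a syllable lasting d frames.
        return -0.5 * ((d - mu) / (sigma_ratio * mu)) ** 2

    # best[i, t]: best score when syllable i ends (exclusive) at frame t.
    best = np.full((N, T + 1), -np.inf)
    back = np.zeros((N, T + 1), dtype=int)
    best[0, 1:] = dur_logp(np.arange(1, T + 1), expected_durs[0])

    for i in range(1, N):
        for end in range(i + 1, T + 1):
            starts = np.arange(i, end)  # candidate onset frames of syllable i
            cand = (best[i - 1, starts] + log_odf[starts]
                    + dur_logp(end - starts, expected_durs[i]))
            k = int(np.argmax(cand))
            best[i, end] = cand[k]
            back[i, end] = starts[k]

    # Backtrack from the last frame to recover the inner boundaries.
    bounds, end = [], T
    for i in range(N - 1, 0, -1):
        end = int(back[i, end])
        bounds.append(end)
    return bounds[::-1]


# Toy ODF with two true onsets (frames 10 and 25) and a spurious peak (frame 3).
toy_odf = np.zeros(40)
toy_odf[[3, 10, 25]] = [0.9, 1.0, 1.0]
print(decode_boundaries(toy_odf, [12, 12, 16]))  # -> [10, 25]
```

In this toy case, plain thresholding would report three onsets, while the duration prior from the three-syllable score rejects the spurious peak at frame 3.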

Steps to reproduce the experiment results

  1. Clone this repository
  2. Download the jingju a cappella singing dataset, scores, and syllable boundary annotations from https://goo.gl/y0P7BL
  3. Set the dataset_root_path variable in src/filePath.py to the location of the downloaded dataset
  4. Python 2.7.9 and Essentia 2.1-beta3 were used in the paper; install the Python dependencies from requirements.txt.
  5. Set the mth_ODF, layer2, fusion, and filter_shape variables in src/parameters.py
  6. Run python onsetFunctionCalc.py to produce the experiment results for the above parameter settings
  7. Run python eval_demo.py to produce the evaluation results
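
As an illustration of what a boundary evaluation typically computes, the sketch below matches detected boundaries to annotated ones within a time tolerance and reports precision, recall, and F-measure. It is an assumption-laden example, not the logic of eval_demo.py: the function name `match_boundaries`, the 0.05 s tolerance, and the greedy matching are all hypothetical choices.

```python
def match_boundaries(detected, reference, tolerance=0.05):
    """Greedy one-to-one matching of boundary times given in seconds.

    Returns (precision, recall, f_measure). The tolerance value is an
    assumption for illustration, not the setting used in the paper.
    """
    detected, reference = sorted(detected), sorted(reference)
    i = j = hits = 0
    while i < len(detected) and j < len(reference):
        diff = detected[i] - reference[j]
        if abs(diff) <= tolerance:
            hits += 1  # each detection and each annotation is used once
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # detection is too early for this and all later annotations
        else:
            j += 1  # annotation has no detection close enough
    precision = float(hits) / len(detected) if detected else 0.0
    recall = float(hits) / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Four detections against three annotations; the one at 1.80 s is a false alarm.
print(match_boundaries([0.50, 1.02, 1.80, 2.51], [0.52, 1.00, 2.50]))
```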

Steps to train CNN acoustic models

  1. Follow steps 1 to 4 of "Steps to reproduce the experiment results"
  2. Run python trainingSampleCollection.py to calculate the mel-band features
  3. The CNN model training code is located in the localDLScripts folder. Use the scripts that match your computing configuration (CPU or GPU).
  4. Pre-trained models are located in cnnModels folders

Dependencies

numpy scipy matplotlib essentia scikit-learn cython keras theano hyperopt
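
A quick way to see which of these packages are missing from an environment is sketched below. The helper `missing_dependencies` is hypothetical and not part of this repo. The only non-obvious detail is that scikit-learn is imported as `sklearn` and cython as `Cython`.

```python
import importlib

# Package name as listed above -> module name used in an import statement.
IMPORT_NAMES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "essentia": "essentia",
    "scikit-learn": "sklearn",
    "cython": "Cython",
    "keras": "keras",
    "theano": "theano",
    "hyperopt": "hyperopt",
}


def missing_dependencies(import_names=IMPORT_NAMES):
    """Return the sorted package names whose module cannot be imported."""
    missing = []
    for package, module in sorted(import_names.items()):
        try:
            importlib.import_module(module)
        except Exception:
            # A broken install can raise errors other than ImportError.
            missing.append(package)
    return missing
```

Calling `missing_dependencies()` with no arguments checks the full list above. It only tests importability and does not verify the versions pinned in requirements.txt.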

License

GNU Affero General Public License version 3
