1050265390 / melody-extraction-with-melodic-segnet

This project forked from bill317996/melody-extraction-with-melodic-segnet

The source code of "A Streamlined Encoder/Decoder Architecture for Melody Extraction"

License: MIT License

Python 100.00%

Melody-extraction-with-melodic-segnet

Dependencies

Requires the following packages:

  • python 3.6
  • pytorch 0.4.1
  • numpy
  • scipy
  • pysoundfile
  • pandas

Usage

predict_on_audio.py

Extracts the melody from an audio file. The output is a .txt file of time (sec) and frequency (Hz).
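To work with the result afterwards, the output file can be read back into Python. The following is a minimal sketch, assuming each row holds a time and a frequency separated by whitespace and that a frequency of 0 marks an unvoiced frame; neither detail is stated here, and the helper name `parse_melody_lines` is hypothetical.

```python
def parse_melody_lines(lines):
    """Parse 'time frequency' rows into (seconds, hertz) tuples.

    Assumes whitespace-separated columns; the actual delimiter used by
    predict_on_audio.py is not documented and may differ.
    """
    contour = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        contour.append((float(fields[0]), float(fields[1])))
    return contour


# Made-up rows standing in for the contents of an output .txt file.
sample = ["0.000 0.0", "0.010 220.5", "", "0.020 221.0"]
contour = parse_melody_lines(sample)

# Assumption: 0 Hz means "no melody in this frame".
voiced = [(t, f) for t, f in contour if f > 0]
```

An open file handle can be passed in place of `sample`, since the function only iterates over lines.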

usage: predict_on_audio.py [-h] [-fp FILEPATH] [-t MODEL_TYPE]
                           [-gpu GPU_INDEX] [-o OUTPUT_DIR] [-e EVALUATE]

optional arguments:
  -h
  -fp filepath            Path to input audio(.wav) (default: train01.wav)
  -t model_type           Model type: vocal or melody (default: vocal)
  -gpu gpu_index          Assign a gpu index for processing.
                          It will run with cpu if None. (default: 0)
  -o output_dir           Path to output folder (default: ./output/)
  -e evaluate             Path to ground-truth (default: None)
  -m mode                 CFP mode: std or fast (default: std)
                          fast mode: uses sr=22050 and hop=512 (faster)
                          std mode : uses sr=native_sample_rate and hop=256 (more accurate)
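The trade-off between the two modes can be seen from the time step implied by each setting. The sketch below computes hop / sample rate for both, assuming one analysis frame per hop and, for std mode, a native sample rate of 44100 Hz (an assumed value; the real rate depends on the input file).

```python
def frame_period_ms(sample_rate, hop_size):
    """Return the time between consecutive analysis frames in milliseconds."""
    return 1000.0 * hop_size / sample_rate


# fast mode: values taken from the option description above.
fast_ms = frame_period_ms(22050, 512)

# std mode: hop=256 at the file's native rate; 44100 Hz is only an assumption.
std_ms = frame_period_ms(44100, 256)

print(f"fast: {fast_ms:.2f} ms per frame")
print(f"std : {std_ms:.2f} ms per frame")
```

Under these assumptions fast mode advances about 23 ms per frame and std mode about 6 ms, so std mode processes roughly four times as many frames for the same audio.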

evaluate.py

Evaluates our results on three datasets: ADC2004, MIREX05, and MedleyDB. The output is a .csv file of evaluation metrics (mir_eval).

usage: evaluate.py [-h] [-dd DATA_DIR] [-t MODEL_TYPE] [-gpu GPU_INDEX]
                   [-o OUTPUT_DIR] [-ds DATASET]

optional arguments:
  -h
  -dd data_dir          Path to the dataset folder (default:
                        Dataset/MedleyDB/Source/)
  -t model_type         Model type: vocal or melody (default: vocal)
  -gpu gpu_index        Assign a gpu index for processing.
                        It will run with cpu if None. (default: 0)
  -o output_dir         Path to output folder (default: ./output/)
  -ds dataset           Dataset to evaluate (default: Mdb_vocal)
                        Must be ADC2004, MIREX05, Mdb_vocal, or Mdb_melody2
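For readers unfamiliar with melody evaluation metrics, the sketch below illustrates the idea behind one common metric, raw pitch accuracy: the fraction of reference-voiced frames whose estimate lies within half a semitone (50 cents) of the reference. It is a simplified stand-in, not the mir_eval implementation and not the code in evaluate.py. It assumes both sequences are already on the same time grid and that 0 Hz marks an unvoiced frame.

```python
import math


def raw_pitch_accuracy(ref_hz, est_hz, tolerance_cents=50.0):
    """Simplified raw pitch accuracy over two frame-aligned sequences.

    Illustrative only: no time-grid resampling is done, which a full
    evaluation library would normally handle.
    """
    if len(ref_hz) != len(est_hz):
        raise ValueError("sequences must be frame-aligned")
    voiced = 0
    correct = 0
    for ref, est in zip(ref_hz, est_hz):
        if ref <= 0:
            continue  # reference is unvoiced, frame is not counted
        voiced += 1
        # Distance between estimate and reference in cents (1200 per octave).
        if est > 0 and abs(1200.0 * math.log2(est / ref)) < tolerance_cents:
            correct += 1
    return correct / voiced if voiced else 0.0


# Made-up frames: exact match, one semitone off, unvoiced, slightly sharp.
reference = [440.0, 440.0, 0.0, 220.0]
estimate = [440.0, 466.16, 100.0, 221.0]
accuracy = raw_pitch_accuracy(reference, estimate)
```

Here two of the three voiced reference frames are within tolerance, so `accuracy` is 2/3.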

data_arrangement.py

Prepares the data for training.

usage: data_arrangement.py [-h] [-df DATA_FOLDER] [-t MODEL_TYPE]
                           [-o OUTPUT_FOLDER]

optional arguments:
  -h, --help            show this help message and exit
  -df DATA_FOLDER, --data_folder DATA_FOLDER
                        Path to the dataset folder (default:
                        ./data/MedleyDB/Source/)
  -t MODEL_TYPE, --model_type MODEL_TYPE
                        Model type: vocal or melody (default: vocal)
  -o OUTPUT_FOLDER, --output_folder OUTPUT_FOLDER
                        Path to output folder (default: ./data/)

training.py

Prepare the h5py file with data_arrangement.py before training.

usage: training.py [-h] [-fp FILEPATH] [-t MODEL_TYPE] [-gpu GPU_INDEX]
                   [-o OUTPUT_DIR] [-ep EPOCH_NUM] [-lr LEARN_RATE]
                   [-bs BATCH_SIZE]

optional arguments:
  -h, --help            show this help message and exit
  -fp FILEPATH, --filepath FILEPATH
                        Path to input training data (h5py file) and validation
                        data (pickle file) (default: ./data/)
  -t MODEL_TYPE, --model_type MODEL_TYPE
                        Model type: vocal or melody (default: vocal)
  -gpu GPU_INDEX, --gpu_index GPU_INDEX
                        Assign a gpu index for processing. It will run with
                        cpu if None. (default: 0)
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Path to output folder (default: ./train/model/)
  -ep EPOCH_NUM, --epoch_num EPOCH_NUM
                        The number of epochs (default: 100)
  -lr LEARN_RATE, --learn_rate LEARN_RATE
                        The learning rate (default: 0.0001)
  -bs BATCH_SIZE, --batch_size BATCH_SIZE
                        The batch size (default: 50)
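When running several training configurations, it can help to assemble the command line programmatically. The sketch below builds an argument list from the flags and defaults listed above; the helper `build_training_command` is hypothetical and not part of this repository, and it assumes the command is launched from the repository root.

```python
import sys


def build_training_command(filepath="./data/", model_type="vocal", gpu_index=0,
                           output_dir="./train/model/", epoch_num=100,
                           learn_rate=0.0001, batch_size=50):
    """Return an argv list for training.py using the documented flags."""
    if model_type not in ("vocal", "melody"):
        raise ValueError("model_type must be 'vocal' or 'melody'")
    return [
        sys.executable, "training.py",
        "-fp", filepath,
        "-t", model_type,
        "-gpu", str(gpu_index),
        "-o", output_dir,
        "-ep", str(epoch_num),
        "-lr", str(learn_rate),
        "-bs", str(batch_size),
    ]


# Example: a short melody-model run with otherwise default settings.
command = build_training_command(model_type="melody", epoch_num=10)
```

The resulting list can be passed to `subprocess.run(command, check=True)`; a list is used rather than a single string so that paths containing spaces need no quoting.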
