rerrayne / bc_learning_pytorch

bc_learning_pytorch's Introduction

BC learning for sounds PyTorch Port

This is a PyTorch port of Between-class Examples for Deep Sound Recognition. The dataset generation code was taken from the original repo.

Implementation of Learning from Between-class Examples for Deep Sound Recognition by Yuji Tokozume, Yoshitaka Ushiku, and Tatsuya Harada (ICLR 2018).

This also contains training of EnvNet: Learning Environmental Sounds with End-to-end Convolutional Neural Network (Yuji Tokozume and Tatsuya Harada, ICASSP 2017).¹

Contents

Between-class (BC) learning
- We generate between-class examples by mixing two training examples belonging to different classes with a random ratio.
- We then input the mixed data to the model and train the model to output the mixing ratio.
Training of EnvNet on ESC-50, ESC-10 [1], and UrbanSound8K [2] datasets
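The mixing step described above can be sketched in a few lines. This is a minimal illustration only, assuming plain linear interpolation of two equal-length waveforms; the mixing formula used in the paper and in this repo may differ, and the function name `mix_between_class` and its parameters are hypothetical, not taken from the code base.

```python
import random


def mix_between_class(x1, y1, x2, y2, num_classes, rng=random):
    """Mix two examples from different classes with a random ratio.

    x1, x2: equal-length sequences of samples (hypothetical waveform format).
    y1, y2: integer class indices, which must differ.
    Returns the mixed waveform and a soft label holding the mixing ratio.
    """
    if y1 == y2:
        raise ValueError("between-class mixing needs two different classes")
    if len(x1) != len(x2):
        raise ValueError("both waveforms must have the same length")

    r = rng.random()  # mixing ratio drawn uniformly from [0, 1)

    # Assumption: simple linear interpolation of the raw samples.
    mixed = [r * a + (1.0 - r) * b for a, b in zip(x1, x2)]

    # The training target is the mixing ratio spread over the two classes.
    label = [0.0] * num_classes
    label[y1] = r
    label[y2] = 1.0 - r
    return mixed, label


# Usage: mix a class-0 clip with a class-2 clip in a 3-class problem.
mixed, label = mix_between_class([0.2, -0.1, 0.4], 0, [0.0, 0.3, -0.5], 2, 3)
```

The soft label sums to 1, so it can be compared against the model's output distribution with a divergence-based loss.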

Setup

Install PyTorch.
Prepare datasets following this page.

Training

Template:

  python main.py --dataset [esc50, esc10, or urbansound8k] --netType [envnet or envnetv2] --data path/to/dataset/directory/ (--BC) (--strongAugment)

Recipes:

Standard learning of EnvNet on ESC-50 (around 29% error²):

  python main.py --dataset esc50 --netType envnet --data path/to/dataset/directory/

Notes:
- Please check opts.py for other command line arguments.

See also

Between-class Learning for Image Classification (github)

References

[1] Karol J. Piczak. ESC: Dataset for environmental sound classification. In ACM Multimedia, 2015.

[2] Justin Salamon, Christopher Jacoby, and Juan Pablo Bello. A dataset and taxonomy for urban sound research. In ACM Multimedia, 2014.

bc_learning_pytorch's Issues

Not Able to replicate results on ESC-50

I have been running the experiments in this repo and have not been able to replicate the ESC-50 results (BC only, EnvNet-v2, without strongAugment) mentioned in the paper. Any assistance would be appreciated.

Input of KLDivLoss

I believe that the input of KLDivLoss should be log-probabilities.
https://pytorch.org/docs/stable/nn.html?highlight=kldivloss#torch.nn.KLDivLoss

In the code, the output of the model is the softmax.
I think the last layer is supposed to be LogSoftmax().
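
The point raised here can be checked numerically without PyTorch. The sketch below is a pure-Python illustration that mirrors the pointwise formula documented for torch.nn.KLDivLoss, target * (log(target) - input), where input is expected to hold log-probabilities. The helper names are hypothetical and the logits are made-up values; this does not reproduce the repo's model code.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def softmax(logits):
    """Softmax probabilities, derived from log_softmax."""
    return [math.exp(v) for v in log_softmax(logits)]


def kl_div_from_log_probs(model_input, target):
    """Sum of target * (log(target) - model_input), skipping zero targets.

    This equals KL(target || model) only when model_input holds
    log-probabilities.
    """
    return sum(
        p * (math.log(p) - x) for p, x in zip(target, model_input) if p > 0
    )


logits = [2.0, 1.0, 0.1]   # made-up model outputs before normalization
target = [0.7, 0.3, 0.0]   # soft label, e.g. a BC mixing ratio

# Log-probabilities as input: a valid, non-negative KL divergence.
kl_correct = kl_div_from_log_probs(log_softmax(logits), target)

# Plain softmax probabilities as input: the same formula no longer
# computes a KL divergence and can even come out negative.
kl_wrong = kl_div_from_log_probs(softmax(logits), target)

print("log-prob input:", kl_correct)
print("prob input:    ", kl_wrong)
```

If the model's last layer emits softmax probabilities, the loss value is therefore not the intended divergence, which may be worth ruling out when results cannot be replicated.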
