This project forked from jungjee/dcasenet

Author's repository for reproducing DcaseNet, an integrated pre-trained DNN that performs acoustic scene classification, audio tagging, and sound event detection. Implemented using PyTorch.

License: MIT License

Shell 7.39% Python 92.61%

dcasenet's Introduction

Overview

This GitHub project includes a PyTorch implementation for reproducing the experiments and DNN models used in the paper DcaseNet: An integrated pretrained deep neural network for detecting and classifying acoustic scenes and events, accepted for presentation at IEEE ICASSP 2021.

DcaseNet is a DNN that jointly performs acoustic scene classification (ASC), audio tagging (TAG), and sound event detection (SED). It is trained in two phases: first, the three tasks are trained jointly; then, the model is fine-tuned for each task.
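As a rough illustration of the two-phase schedule, the sketch below lists the runs it implies: one joint run over the selected tasks, followed by one fine-tuning run per task that starts from the joint weights. The function name `plan_two_phase` and the tuple layout are hypothetical and are not taken from this repository's code; the real experiments are configured in train.sh.

```python
from typing import List, Optional, Sequence, Tuple

# Task abbreviations as used in this README.
TASKS = ("ASC", "TAG", "SED")

# (phase, tasks trained in this run, name of the run to initialise from)
Run = Tuple[str, Tuple[str, ...], Optional[str]]


def plan_two_phase(
    joint_tasks: Sequence[str] = TASKS,
    finetune_tasks: Sequence[str] = TASKS,
) -> List[Run]:
    """List the training runs implied by the two-phase schedule.

    Hypothetical helper for illustration only.
    """
    for task in (*joint_tasks, *finetune_tasks):
        if task not in TASKS:
            raise ValueError(f"unknown task: {task}")

    # Phase 1: a single model trained jointly on all selected tasks.
    plan: List[Run] = [("joint", tuple(joint_tasks), None)]

    # Phase 2: one fine-tuned model per task, initialised from phase 1.
    for task in finetune_tasks:
        plan.append(("finetune", (task,), "joint"))
    return plan


if __name__ == "__main__":
    for phase, tasks, init_from in plan_two_phase():
        print(phase, tasks, init_from)
```

With the default arguments this yields four runs: one joint run and three fine-tuning runs.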

Usage

Environment Setting

We used Nvidia GPU Cloud (NGC) for conducting our experiments. Training was done on one Nvidia Titan RTX GPU. Our settings are available in launch_nvidia-gpu-cloud.sh.

Train

  1. Download three datasets (DCASE 2020 challenge Task 1-a, DCASE 2019 challenge Task 2, and DCASE 2020 challenge Task 3) and configure their directories.
  2. (Optional) Enter the virtual environment using NGC.
  3. Set parameters in train.sh.
  4. Run train.sh.

If you prefer to use the pre-trained joint DcaseNet and only fine-tune, remove the 'Joint' experiment from train.sh and copy the Joint weights into your 'save_dir'.
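The copy step can be done by hand, but a minimal Python sketch is given here for convenience. The helper name `copy_joint_weights` is hypothetical, and the sketch assumes that the checkpoint can keep its original file name inside 'save_dir'; check train.sh for the file name and layout the scripts actually expect.

```python
import shutil
from pathlib import Path


def copy_joint_weights(joint_weights: Path, save_dir: Path) -> Path:
    """Copy a pre-trained joint checkpoint into save_dir.

    Hypothetical helper: keeps the original file name, which is an
    assumption about what the fine-tuning scripts look for.
    """
    joint_weights = Path(joint_weights)
    save_dir = Path(save_dir)
    if not joint_weights.is_file():
        raise FileNotFoundError(f"joint weights not found: {joint_weights}")

    # Create save_dir (and any parents) if it does not exist yet.
    save_dir.mkdir(parents=True, exist_ok=True)

    target = save_dir / joint_weights.name
    shutil.copy2(joint_weights, target)  # copies data and file metadata
    return target
```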

Evaluation

  1. Download three datasets (DCASE 2020 challenge Task 1-a, DCASE 2019 challenge Task 2, and DCASE 2020 challenge Task 3) and configure their directories.
  2. Set parameters in evaluate_trained_models.sh.
  3. Run evaluate_trained_models.sh.
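Since both training and evaluation depend on the three dataset directories being configured, it can help to check them before launching a script. The sketch below is one possible way to do so; the helper name `find_missing_dirs`, the dictionary keys, and the example paths are all placeholders and not part of this repository.

```python
from pathlib import Path
from typing import Dict, List


def find_missing_dirs(dataset_dirs: Dict[str, str]) -> List[str]:
    """Return the names of datasets whose directory does not exist.

    Hypothetical helper; it only checks that each path is a directory,
    not that the dataset inside it is complete.
    """
    return [
        name
        for name, path in dataset_dirs.items()
        if not Path(path).is_dir()
    ]


if __name__ == "__main__":
    # Placeholder paths: replace with the directories set in your scripts.
    dirs = {
        "DCASE 2020 Task 1-a": "/path/to/task1a",
        "DCASE 2019 Task 2": "/path/to/task2",
        "DCASE 2020 Task 3": "/path/to/task3",
    }
    missing = find_missing_dirs(dirs)
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("All dataset directories found.")
```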

Windows

DCASENetShellScriptBuilder contains a simple GUI program that generates a script that can be run on Windows. After you tick a few checkboxes and set the dataset directories, the generated script trains and evaluates the models. This program was provided by yeongsoo and will not be maintained further.

The program has three rows of checkboxes:

  1. The tasks on which to conduct joint training. If none is checked, the pre-trained DcaseNet, jointly trained on all three tasks, is used.
  2. The tasks on which to perform fine-tuning. Checking more than one task trains a separate DcaseNet for each fine-tuning task. Checking at least one task is recommended.
  3. The tasks on which to perform evaluation. It is recommended to check the same tasks as in the row above.

Below the checkboxes, there are text boxes for setting the directories of the downloaded datasets and the directory in which to save trained models. Note that each dataset directory should be the folder produced by unzipping the downloaded dataset, as the code in this repo expects that layout.

DCASENetShellScriptBuilder

Email [email protected] for other details :-).

BibTeX

This repository provides the code for reproducing the paper below.

@inproceedings{jung2021dcasenet,
  title={DCASENet: An integrated pretrained deep neural network for detecting and classifying acoustic scenes and events},
  author={Jung, Jee-weon and Shim, Hye-jin and Kim, Ju-ho and Yu, Ha-Jin},
  booktitle={Proc. ICASSP},
  pages={621--625},
  year={2021},
  organization={IEEE}
}

TO-DO

Log

  • 2020.09.24. : Initial commit
  • 2020.10.18. : Overall validation & refactoring (thanks to yeongsoo)
  • 2020.11.04. : Added file trees & finished refactoring
