
waverdeep / wavesimsiam


Self-Supervised Learning for General Audio Representation of Raw Waveform

License: MIT License

Python 100.00%
self-supervised-learning audio-representation-learning byol pytorch simsiam


WaveSimSiam

Self-Supervised Learning for General Audio Representation of Raw Waveform

Description

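Self-supervised learning is a methodology for learning the general representations contained in data. In this project, we propose WaveSimSiam, a model based on SimSiam that learns general representations from raw audio waveforms. For WaveSimSiam, we designed an Augmentation Layer and an Encoding Layer suited to raw audio waveform data so that the model can learn general audio representations. To evaluate the proposed model, we carried out real-world classification and recognition tasks using the linear-classifier evaluation protocol. WaveSimSiam achieved better results than previous papers on most of these tasks.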
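
As a rough illustration of the SimSiam objective that WaveSimSiam builds on, the sketch below computes the symmetric negative cosine similarity between the two augmented views of one sample. It is not taken from this repository: the function names `negative_cosine` and `simsiam_loss` are hypothetical, the vectors are plain Python lists, and the stop-gradient that SimSiam applies to the target branch is only noted in a comment because there is no autograd here.

```python
import math
from typing import Sequence


def negative_cosine(p: Sequence[float], z: Sequence[float]) -> float:
    """Negative cosine similarity between a prediction p and a target z.

    Assumes both vectors are non-zero and have the same length.
    """
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def simsiam_loss(p1, z1, p2, z2) -> float:
    """Symmetric SimSiam loss for one pair of augmented views.

    p1, p2 are predictor outputs and z1, z2 are encoder/projector outputs.
    In an autograd framework, z1 and z2 would be detached (stop-gradient)
    before this call; plain floats carry no gradient, so nothing is needed.
    """
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)
```

The loss reaches its minimum of -1 when each prediction points in the same direction as the other view's target, and is 0 when they are orthogonal.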

Getting Started

Datasets

Dependencies

  • Linux Ubuntu, Nvidia Docker, Python
  • adamp 0.3.0
  • scikit-learn 1.0.2
  • numpy 1.21.6
  • tensorboard 2.8.0
  • torch 1.12.0
  • torchvision 0.13.0
  • torchaudio 0.12.0
  • tqdm 4.63.1
  • sox 1.4.1
  • soundfile 0.10.3
  • natsort 8.1.0
  • WavAugment
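
To confirm that an environment matches the versions listed above, a small check along these lines may help. It is a sketch, not part of the project: `PINNED` and `check_versions` are hypothetical names, and it assumes each package's installed distribution name matches the name in the list. WavAugment is left out because no version is given for it.

```python
from importlib import metadata

# Pinned versions copied from the dependency list above.
PINNED = {
    "adamp": "0.3.0",
    "scikit-learn": "1.0.2",
    "numpy": "1.21.6",
    "tensorboard": "2.8.0",
    "torch": "1.12.0",
    "torchvision": "0.13.0",
    "torchaudio": "0.12.0",
    "tqdm": "4.63.1",
    "sox": "1.4.1",
    "soundfile": "0.10.3",
    "natsort": "8.1.0",
}


def check_versions(pinned, get_version=metadata.version):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Prefix comparison so local build tags such as "1.12.0+cu113" still match.
        if installed is None or not installed.startswith(expected):
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

Running the script prints one line per package that is missing or at a different version, and prints nothing when everything matches.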

Pretext Task

  1. Create your own configuration file. (There is a pretext example in the config folder.)
  2. Edit the following parts of train.py:
...
...

os.environ['CUDA_VISIBLE_DEVICES'] = '1'  # index of the CUDA device to train on


...
...


def main():
    parser = argparse.ArgumentParser(description='waverdeep - WaveSimSiam')
    parser.add_argument("--configuration", required=False,
                        default='./config/write down your configuration file name')
                        
...
...

  3. Then start pretext task training:
python train.py
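
Editing train.py for every run can be avoided by passing both settings on the command line. The sketch below is one possible way to do that, not the project's current behaviour: `--configuration` is the argument train.py already defines, while the `--gpu` flag and the helper names `parse_args` and `select_device` are hypothetical additions. It also makes `--configuration` required instead of relying on the placeholder default.

```python
import argparse
import os


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description='waverdeep - WaveSimSiam')
    # --configuration already exists in train.py; here the path must be supplied.
    parser.add_argument("--configuration", required=True,
                        help="path to a configuration file in ./config")
    # --gpu is a hypothetical flag, not part of the original script.
    parser.add_argument("--gpu", default="0",
                        help="value assigned to CUDA_VISIBLE_DEVICES")
    return parser.parse_args(argv)


def select_device(gpu: str) -> None:
    # Must be set before CUDA is first initialised to take effect.
    os.environ["CUDA_VISIBLE_DEVICES"] = gpu


def main(argv=None):
    args = parse_args(argv)
    select_device(args.gpu)
    return args
```

With this in place, a run would be started with `python train.py --configuration <path to your configuration file> --gpu 1`, assuming train.py calls `main()` at the bottom of the file.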

Downstream Task

Currently, only transfer learning is implemented in this project.

  1. Create your own configuration file. (There is a transfer learning example in the config folder.)
  2. Edit the following parts of train.py:
...
...

os.environ['CUDA_VISIBLE_DEVICES'] = '1'  # index of the CUDA device to train on


...
...


def main():
    parser = argparse.ArgumentParser(description='waverdeep - WaveSimSiam')
    parser.add_argument("--configuration", required=False,
                        default='./config/write down your configuration file name')
                        
...
...

  3. Then start transfer learning training:
python train.py

Authors

waverDeep

Version History

* Initial Release

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

Acknowledgments
