johnmartinsson / adaptive-change-point-detection

The official implementation of A-CPD from the paper "From Weak to Strong Sound Event Labels using Adaptive Change-Point Detection and Active Learning".

License: Apache License 2.0

Languages: Python 96.89%, Shell 3.11%

Topics: active-learning, annotation, bioacoustics, machine-learning, sound-event-detection

adaptive-change-point-detection's People

Contributors: johnmartinsson

adaptive-change-point-detection's Issues

Cleanup code: data generation

Simplify the data generation so that a single script generates the soundscapes and embeddings with the default settings.

Idea: improve embedding time resolution

It may be feasible to improve the time resolution of the embeddings by zeroing out the first and last second of each 3 second segment, so that the resulting embedding reflects only the centre second; see the sketch below.

  • Try this with BirdNET-Analyzer and see what happens.
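
A minimal sketch of the idea, assuming 48 kHz audio (BirdNET's input rate) and a 1-D `segment` array holding one 3 second window; the `embed` call in the final comment stands in for a BirdNET-Analyzer embedding call and is not a real function:

    import numpy as np

    def centre_second(segment, sample_rate=48000):
        """Zero out all but the middle second of a 3 second segment."""
        out = np.zeros_like(segment)
        out[sample_rate:2 * sample_rate] = segment[sample_rate:2 * sample_rate]
        return out

    # embedding = embed(centre_second(segment))  # hypothetical embedding call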

Foreground sounds in background sounds?

It seems the ME dataset may not be properly annotated: there appear to be foreground sounds occurring in the background recordings.

This may become an issue for the active learning approach. A faulty oracle means the active learning method may end up increasing the label noise in the annotated dataset, which would be detrimental to model performance.

  • Choose a set of known background noises, e.g., 30 seconds of rain, wind, traffic, city noise, et cetera.

No probas before 1.5 seconds

The BirdNET embeddings start at 1.5 seconds, meaning that we have no probability estimate before this point and can therefore not detect peaks at the beginning of the file.

This is problematic. A temporary solution is to generate soundscapes that never have events before 1.5 seconds.
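
Each embedding covers a 3 second window, so if the embedding is assigned to its window centre, the first timestamp lands at 1.5 seconds. A small sketch of the arithmetic, using a 10 second file purely as an example:

    segment_length = 3.0  # BirdNET segment length in seconds
    overlap = 0.0         # overlap between consecutive segments
    hop = segment_length - overlap
    file_length = 10.0    # example recording length

    # (start, centre) of every full segment in the recording
    centres = [(i * hop, i * hop + segment_length / 2)
               for i in range(int((file_length - segment_length) / hop) + 1)]
    print(centres)  # [(0.0, 1.5), (3.0, 4.5), (6.0, 7.5)]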

Cleanup code: data visualization

Move the visualization code to a separate script that produces the main figures efficiently.

Maybe also move the AL simulation code to a separate script and store the most important results, which can then be loaded by the visualization scripts. This depends on how much time it requires.

Add description of data generation and embedding pre-computing to README

DRAFT.

Run experiments on modified data

Environment

Check requirements.txt for the requirements. In particular we need:

- Scaper, and
- BirdNET-Analyzer.

Scaper is used to generate the soundscapes from the source data, and BirdNET-Analyzer is used to pre-compute the embeddings that the method operates on.
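
As a minimal sketch of the Scaper side, something like the following could generate a single 30 second soundscape; the paths, labels, durations and distributions are placeholders for illustration, not the settings used in the experiments:

    import scaper

    # Placeholder source directories with one sub-folder per label.
    fg_path = './scaper_source_files/foreground'
    bg_path = './scaper_source_files/background'

    sc = scaper.Scaper(duration=30.0, fg_path=fg_path, bg_path=bg_path,
                       random_state=0)
    sc.ref_db = -50

    # One background track spanning the whole soundscape.
    sc.add_background(label=('choose', []),
                      source_file=('choose', []),
                      source_time=('const', 0))

    # One foreground event at a random onset and SNR (all values assumed).
    sc.add_event(label=('choose', []),
                 source_file=('choose', []),
                 source_time=('const', 0),
                 event_time=('uniform', 0, 27),
                 event_duration=('const', 3.0),
                 snr=('uniform', 6, 30),
                 pitch_shift=None,
                 time_stretch=None)

    # Writes the audio plus a JAMS file with the reference annotations.
    sc.generate('soundscape.wav', 'soundscape.jams')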

Pre-compute the embeddings

TODO: describe the last part of

scripts/generate_scaper_data.sh

Run simulations / experiments

TODO: update main script with proper default results folder

If everything is set up properly, you should be able to run everything by simply writing:

    python main.py

This should run:

  • active learning annotation simulation,
  • model training and prediction, and
  • evaluation of models trained with annotations.

Generate the dataset

TODO: describe how to generate the data in more detail.

Produce source files

TODO: add the DOIs and links to the datasets.

- NIGENS dataset,
- TUT Rare Events dataset, and
- DCASE Few-shot bioacoustic dataset.

In produce_source_material.py you need to set the correct data paths:

    tut_base_dir    = '<path>/TUT_rare_sed_2017/TUT-rare-sound-events-2017-development/data/source_data/'
    nigens_base_dir = '<path>/NIGENS/'
    dcase_base_dir  = '<path>/Development_Set/Validation_Set/ME/'

Generate audio recordings

TODO:

Extract the embeddings

TODO: explain how to generate the embeddings in more detail.

Set up BirdNET-Analyzer v2.4 (https://github.com/kahst/BirdNET-Analyzer), then run:

python embeddings.py --i ./scaper_source_files/BV/AMRE/train_soundscapes/ --o ./scaper_source_files/BV/AMRE/train_soundscapes/ --threads 8 --batchsize 16 --overlap 0
python embeddings.py --i ./scaper_source_files/BV/AMRE/test_soundscapes/ --o ./scaper_source_files/BV/AMRE/test_soundscapes/ --threads 8 --batchsize 16 --overlap 0

This will generate the embeddings for the train_soundscapes and the test_soundscapes and store them in the respective directories. Embeddings are computed for 3 second segments with the specified overlap.
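
A hedged loader sketch for the resulting files, assuming BirdNET-Analyzer writes one text file per recording where each line holds a start time, an end time, and comma-separated embedding values in tab-separated fields; verify the exact file naming and layout against your own output before relying on it:

    import numpy as np

    def load_embeddings(path):
        """Load (starts, ends, vectors) from an assumed BirdNET embeddings file."""
        starts, ends, vectors = [], [], []
        with open(path) as f:
            for line in f:
                start, end, values = line.strip().split('\t')
                starts.append(float(start))
                ends.append(float(end))
                vectors.append(np.array(values.split(','), dtype=float))
        return np.array(starts), np.array(ends), np.stack(vectors)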
