Deep learning models have emerged as a powerful tool in avian bioacoustics for assessing environmental health. To maximize the potential of cost-effective and minimally invasive passive acoustic monitoring (PAM), models must analyze bird vocalizations across a wide range of species and environmental conditions. However, data fragmentation challenges the evaluation of generalization performance. Therefore, we introduce the BirdSet benchmark.
The simplest way to install BirdSet is with conda and pip:

```bash
conda create -n birdset python=3.10
pip install -e .
```
You can also use the devcontainer, configured as a git submodule:

```bash
git submodule update --init --recursive
```
Or use Poetry:

```bash
poetry install
poetry shell
```
First, you have to download the background noise files for augmentations:

```bash
python resources/utils/download_background_noise.py
```
We provide all experiment YAML files used to generate our results under `birdset/configs/experiment/birdset_neurips24`. For each dataset, we specify the parameters for all training scenarios: DT, MT, and LT. The experiments for DT with the dedicated subset can be run with a single line:

```bash
python birdset/train.py experiment="birdset_neurips24/DT/$Model"
```
Experiments for the training scenarios MT and LT are harder to reproduce since they require more extensive training times.
Additionally, the datasets are quite large (90 GB for XCM and 480 GB for XCL). Therefore, we provide the best model checkpoints via Hugging Face in the experiment files to avoid the need for retraining. These checkpoints can be evaluated by running the evaluation script, which automatically downloads the model and performs inference on the test datasets:
```bash
python birdset/eval.py experiment="birdset_neurips24/$EXPERIMENT_PATH"
```
As the model EAT is not yet implemented in Hugging Face Transformers, its checkpoints are available for download from the tracked experiments on Weights & Biases (LT_XCL_eat).
If you want to start the large-scale trainings and download the large training datasets, you can also run the XCM and XCL trainings via the experiment YAML files:

```bash
python birdset/train.py experiment="birdset_neurips24/$EXPERIMENT_PATH"
```
After training, the best model checkpoint is saved based on the validation loss and can then be used for inference:
```bash
python birdset/eval.py experiment="birdset_neurips24/$EXPERIMENT_PATH" module.model.network.local_checkpoint="$CHECKPOINT_PATH"
```
```python
from birdset.datamodule.base_datamodule import DatasetConfig
from birdset.datamodule.birdset_datamodule import BirdSetDataModule

# initiate the data module
dm = BirdSetDataModule(
    dataset=DatasetConfig(
        data_dir='data_birdset/HSN',  # specify your data directory!
        dataset_name='HSN',
        hf_path='DBD-research-group/BirdSet',
        hf_name='HSN',
        n_classes=21,
        n_workers=3,
        val_split=0.2,
        task="multilabel",
        classlimit=500,
        eventlimit=5,
        sampling_rate=32000,
    ),
)

# prepare the data (download dataset, ...)
dm.prepare_data()

# setup the dataloaders
dm.setup(stage="fit")

# get the dataloaders
train_loader = dm.train_dataloader()
```
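The `classlimit` and `eventlimit` parameters above cap how much data is kept (our reading: at most `classlimit` samples per class and `eventlimit` detected events per recording). A simplified, hedged illustration of a per-class cap, not BirdSet's actual implementation:

```python
from collections import defaultdict

def apply_class_limit(samples, classlimit):
    """Keep at most `classlimit` samples per label (illustrative only)."""
    counts = defaultdict(int)
    kept = []
    for sample in samples:
        if counts[sample["label"]] < classlimit:
            counts[sample["label"]] += 1
            kept.append(sample)
    return kept

samples = [{"label": "sparrow"}] * 4 + [{"label": "warbler"}] * 2
limited = apply_class_limit(samples, classlimit=3)
print(len(limited))  # 5: the 4 sparrow samples are capped at 3
```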
```python
from lightning import Trainer
from birdset.modules.multilabel_module import MultilabelModule

min_epochs = 1
max_epochs = 5
trainer = Trainer(min_epochs=min_epochs, max_epochs=max_epochs, accelerator="gpu", devices=1)

model = MultilabelModule(
    len_trainset=dm.len_trainset,
    task=dm.task,
    batch_size=dm.train_batch_size,
    num_epochs=max_epochs,
)

trainer.fit(model, dm)
```
Logs will be written to Weights & Biases by default.
To enhance model performance, we mix in additional background noise downloaded from DCASE18. To download the files and convert them to the correct format, run the notebook `download_background_noise.ipynb` in the `notebooks` folder.
Our experiments are defined in the `configs/experiment` folder. To run an experiment, use the following command in the directory of the repository:

```bash
python birdset/train.py experiment="EXPERIMENT_PATH"
```
Replace `EXPERIMENT_PATH` with the path to the desired experiment YAML config relative to the `experiment` directory. For example, here is a command for training an EfficientNet on HSN:

```bash
python birdset/train.py experiment="local/HSN/efficientnet.yaml"
```
Our datasets are shared via Hugging Face Datasets in our BirdSet repository. First, log in to Hugging Face with:

```bash
huggingface-cli login
```
For a detailed guide to using the BirdSet data pipeline and its many configuration options, see our comprehensive BirdSet Data Pipeline Tutorial.
The datamodules are defined in `birdset/datamodule` and their configurations are stored under `configs/datamodule`.
`base_datamodule` is the main class that can be inherited for specific datasets. It is responsible for preparing the data in the function `prepare_data` and loading the data in the function `setup`. `prepare_data` downloads the dataset, applies preprocessing, creates validation splits, and saves the data to disk. `setup` initiates the dataloaders and configures data transformations.
The following steps are performed in `prepare_data`:

- Data is downloaded from Hugging Face Datasets with `_load_data`
- Data gets preprocessed with `_preprocess_data`
- Data is split into train, validation, and test sets with `_create_splits`
- The length of the dataset is saved for later access
- Data is saved to disk with `_save_dataset_to_disk`
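The steps above can be sketched as follows. This is a plain-Python illustration of the flow; the helper bodies are placeholders and do not reflect BirdSet's actual implementation:

```python
# Illustrative sketch of the prepare_data flow; helper bodies are stand-ins.
class DataModuleSketch:
    def __init__(self):
        self.len_trainset = None

    def _load_data(self):
        # stand-in for the Hugging Face Datasets download
        return [{"split": "train"}, {"split": "train"}, {"split": "test"}]

    def _preprocess_data(self, data):
        return data  # decoding, event extraction, label mapping, ...

    def _create_splits(self, data):
        return {
            "train": [d for d in data if d["split"] == "train"],
            "test": [d for d in data if d["split"] == "test"],
        }

    def _save_dataset_to_disk(self, splits):
        self._saved = splits  # stand-in for writing the dataset to disk

    def prepare_data(self):
        data = self._load_data()
        data = self._preprocess_data(data)
        splits = self._create_splits(data)
        self.len_trainset = len(splits["train"])  # saved for later access
        self._save_dataset_to_disk(splits)

dm = DataModuleSketch()
dm.prepare_data()
print(dm.len_trainset)  # 2
```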
The following steps are performed in `setup`:

- Data is loaded from disk with `_get_dataset`, in which the transforms are applied
Data transformations are the operations applied to the data during training, e.g. augmentations. The transformations are added to the Hugging Face dataset with `set_transform`.
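The key property of `set_transform` is that the transform runs lazily, on access, rather than rewriting the stored data. A minimal plain-Python sketch of this pattern (not the Hugging Face implementation):

```python
# Sketch of a set_transform-style lazy transform: the function is stored
# once and applied per item at access time, leaving stored rows untouched.
class LazyDataset:
    def __init__(self, rows):
        self.rows = rows
        self.transform = None

    def set_transform(self, fn):
        # register the transform; nothing is recomputed here
        self.transform = fn

    def __getitem__(self, idx):
        item = self.rows[idx]
        return self.transform(item) if self.transform else item

ds = LazyDataset([{"audio": [1, 2]}, {"audio": [3]}])
# e.g. an "augmentation" that doubles every sample value on access
ds.set_transform(lambda item: {"audio": [2 * v for v in item["audio"]]})
print(ds[0]["audio"])  # [2, 4]
```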