cheggan / metaaudio-a-few-shot-audio-classification-benchmark

56.0 3.0 10.0 3.69 MB

A new comprehensive and diverse few-shot acoustic classification benchmark.

Python 100.00%

few-shot acoustics deep-learning pytorch python meta-learning

metaaudio-a-few-shot-audio-classification-benchmark's Introduction

Hi 👋 My name is Calum Heggan

PhD Candidate investigating Representation and Few-Shot Learning for Acoustics

🌍 I'm based in Edinburgh
🖥️ See my portfolio at my web page here
✉️ You can contact me at [email protected]
🧠 I'm learning about self-supervised representation learning

metaaudio-a-few-shot-audio-classification-benchmark's People

Forkers

wendonggan chester-w-xie kennedy12335 dibyakantimahapatra racheltlw banalasaritha enescigdem ankitshah009 boneseva liam-kelley

metaaudio-a-few-shot-audio-classification-benchmark's Issues

Problems with full_stack_KAGGLE.py

In folder_sort_KAGGLE18.py, the function create_class_folders uses the variable main_dir even though it is neither declared in that function nor passed into it.

Also, the main function in folder_sort_KAGGLE18.py does not return anything, which causes full_stack_KAGGLE.py to fail.

About the accuracy of 5-way 5-shot

Hello researcher!
Recently, I have been running experiments on 5-way 5-shot learning. I used the example code for ESC-50 and modified the configuration file by setting k_shot to 5, keeping the rest of the settings the same. The accuracy is shown in the following figure.

The accuracy of 97% differs from the values reported in other papers. Does it also differ from your own results, and are there any aspects I should pay attention to? Thank you.

about proto_step_fixed

Hello researcher,

I would like to inquire about the proto_step_fixed function in proto_steps.py, specifically lines 67 and 68:

x_queries = embeddings[q_queries*n_way:]
y_queries = y_batch[q_queries*n_way:]

Shouldn't it be [-(q_queries*n_way):] instead?
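
The difference between the two slices can be checked on a toy batch. The sketch below assumes (this is not confirmed by the issue) that a batch holds n_way*k_shot support samples followed by n_way*q_queries query samples. Plain Python lists stand in for the embedding tensor, and the helper names are hypothetical.

```python
# Toy check of the two query slices. Assumed layout (not confirmed by the
# repository): support samples first, then query samples.
def split_queries(batch, n_way, q_queries):
    """Return the query part of `batch` using both candidate slices."""
    from_front = batch[q_queries * n_way:]      # slice quoted from proto_steps.py
    from_back = batch[-(q_queries * n_way):]    # slice proposed in this issue
    return from_front, from_back


def make_batch(n_way, k_shot, q_queries):
    """Label each position 's' (support) or 'q' (query)."""
    return ["s"] * (n_way * k_shot) + ["q"] * (n_way * q_queries)


# k_shot == q_queries: both slices select the same items.
front, back = split_queries(make_batch(5, 1, 1), n_way=5, q_queries=1)
print(front == back)

# k_shot != q_queries: the front slice also keeps support samples.
front, back = split_queries(make_batch(5, 5, 1), n_way=5, q_queries=1)
print(len(front), len(back))
print(front.count("s"), back.count("s"))
```

Under this assumed layout the two slices agree only when k_shot equals q_queries, which may be why the 1-shot, 1-query configuration does not expose the difference.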

Unable to run example for MAML_ESC

Hi! Great repo!

I'm trying to run your example and was able to complete Step 1 'Getting and Formatting the Data' with 1 change:
In line 45 of to_spec.py I had to change '160000' to '80000' since the audio data had a shape of (80000,).

However, on step 2 I followed the instructions and ran BaseLooper.py, but I am getting the following error: 'IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)'. It is raised in stat_recorder_class.py, in the portion of the update function shown below, when calculating the mean across dimensions (0,1,2):

def update(self, data):
    """
    data: ndarray, shape (nobservations, ndimensions)
    """
    # initialize stats and dimensions on first batch
    if self.nobservations == 0:
        print('SHAPE', data.shape)  # I printed the shape, it is 'SHAPE torch.Size([64000, 157])'
        self.mean = data.mean(dim=self.red_dims, keepdim=True)
        self.std = data.std(dim=self.red_dims, keepdim=True)
        self.nobservations = data.shape[0]
        self.ndimensions = data.shape[1]

Here are the params I am passing in from the yaml file (no changes made except for the task type and data path):

{'base': {'n_way': 5, 'k_shot': 1, 'q_queries': 1, 'cuda': 0, 'num_repeats': 1, 'task_type': 'MAML_ESC_TRIAL_6Nov22Hybrid_global_1_runs', 'seed': 927}, 'models': ['Hybrid'], 'hyper': {'inner_train_steps': 8, 'inner_val_steps': 8, 'inner_lr': 0.4, 'meta_lr': 0.001, 'min_lr': 0.001, 'T_max': 1}, 'training': {'epochs': 200, 'episodes_per_epoch': 10, 'train_batch_size': 50, 'val_tasks': 200, 'test_tasks': 1000, 'trans_batch': False, 'eval_spacing': 10, 'break_threshold': 500, 'warm_up': 100000, 'num_workers': 4}, 'data': {'variable': False, 'name': 'ESC', 'norm': 'global', 'type': 'variable_spec', 'fixed': True, 'fixed_path': 'dataset_/splits/ESC_paper_splits.npy', 'data_path': 'C:/Users/Rachel Tan/Documents/GitHub/MetaAudio-A-Few-Shot-Audio-Classification-Benchmark/ESC-50-master/ESC_spec'}, 'split': {'train': 0.7, 'val': 0.1, 'test': 0.2}}

Any idea what could be the issue, or how to debug the example?
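
The error message itself indicates that the tensor being reduced is 2-D: the valid dims are [-2, 1], yet the reduction asks for dim 2. A minimal sketch of that check follows. The helper name is hypothetical, and the 4-D shape is only an illustrative guess at what a reduction over (0,1,2) might expect.

```python
def invalid_reduction_dims(shape, red_dims):
    """Return the entries of `red_dims` that do not exist for `shape`.

    A tensor with n dimensions accepts dim indices in the range [-n, n - 1].
    """
    ndim = len(shape)
    return [d for d in red_dims if not -ndim <= d <= ndim - 1]


# Shape printed in the issue, and the reduction dims named in the issue.
spec_shape = (64000, 157)
red_dims = (0, 1, 2)

print(invalid_reduction_dims(spec_shape, red_dims))  # dim 2 does not exist here

# Hypothetical 4-D input (batch, channel, freq, time), for comparison only.
print(invalid_reduction_dims((50, 1, 128, 157), red_dims))
```

If the check reports a missing dim, a reasonable first step is to confirm whether the data reaching update has lost its batch or channel dimensions, or whether the configured data type and norm settings match the files on disk.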

How to implement Meta-Curvature

Hello researcher!
I was wondering if your Meta-Curvature is also implemented using the learn2learn library, similar to the sample code for MAML.

I am currently trying to reproduce Meta-Curvature using the following code:
gbml = l2l.algorithms.GBML(model,transform=MetaCurvatureTransform,lr=params['hyper']['inner_lr'],adapt_transform=False,first_order=True)

However, the results I'm getting are not satisfactory.

Request for Updates on SimpleShot Code for ESC-50 Dataset

Hello researcher!
I am a university student currently studying few-shot learning. I was wondering whether you still plan to update the sample code. I'm very interested in your implementation of SimpleShot on the ESC-50 dataset, but I'm having trouble getting started.
Looking forward to your response, and I wish you a wonderful day!

How to reproduce the results

Hello, researcher!

Recently, I have been attempting to reproduce my previous experimental results but encountered some issues. I would like to seek advice regarding the matter of seed.

My understanding is that setting the seed to the same value should yield the same accuracy in training for two different runs. However, this is not the case for me. I made some modifications to the code in BaseLooper.py to ensure that training uses the same seed, as shown below:

seed = np.random.randint(low=0, high=1000)
set_seed(seed)
params['base']['seed'] = seed

Changed to:

seed = params['base']['seed']
set_seed(seed)

Is my understanding of the seed incorrect?
Thank you!
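
Fixing the seed value is necessary for repeatable runs, but it may not be sufficient: each library keeps its own random state, and some GPU kernels are non-deterministic unless told otherwise. The sketch below is a hypothetical helper, not the repository's set_seed, which may already do some or all of this. NumPy and PyTorch are seeded only if they can be imported.

```python
import random


def seed_everything(seed):
    """Seed every random number generator that is importable here.

    Hypothetical helper; the repository's own set_seed may differ.
    """
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)           # silently ignored without a GPU
        torch.backends.cudnn.deterministic = True  # prefer deterministic kernels
        torch.backends.cudnn.benchmark = False     # disable autotuned kernel choice
    except ImportError:
        pass


seed_everything(927)
first = [random.random() for _ in range(3)]
seed_everything(927)
second = [random.random() for _ in range(3)]
print(first == second)
```

Even with all generators seeded, data loading with several worker processes (num_workers greater than 0) can introduce further variation, so comparing two runs with num_workers set to 0 is one way to narrow down the source.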

to_spec.py IndentationError

Hello respected researcher! I encountered the following issue while running to_spec.py, and after trying to troubleshoot the problem, I found that both .wav and .npy files are being generated in the same folder (Sorted). Can you please let me know if there's something I did wrong?

Traceback (most recent call last):
  File "full_stack_ESC.py", line 27, in <module>
    from to_spec import main as to_spec_main
  File "/home/mitlab/kuo/ESC-50-master/to_spec.py", line 45
    if audio_data.shape[0] < 160000:
    ^
IndentationError: unindent does not match any outer indentation level

Training for more than n_way>5

Hi, when training MAML on the ESC 20 dataset, changing n_way to any value other than 5 (for example, a value greater than 5) gives me:
    episode_classes = np.random.choice(
  File "mtrand.pyx", line 965, in numpy.random.mtrand.RandomState.choice
ValueError: Cannot take a larger sample than population when 'replace=False'
What could be the reason?
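
np.random.choice with replace=False raises this error when it is asked for more distinct items than the population contains, so one thing to check is whether any split holds fewer classes than n_way. The sketch below assumes the split fractions are applied to classes, which may not hold when a fixed splits file is used. The helper names are hypothetical.

```python
def classes_per_split(num_classes, split):
    """Number of classes each split receives, assuming class-level splits."""
    return {name: round(num_classes * frac) for name, frac in split.items()}


def check_n_way(n_way, num_classes, split):
    """Return the splits that are too small to sample `n_way` distinct classes."""
    counts = classes_per_split(num_classes, split)
    return [name for name, count in counts.items() if count < n_way]


# Split fractions as they appear in the configs quoted on this page.
split = {"train": 0.7, "val": 0.1, "test": 0.2}

print(classes_per_split(50, split))  # ESC-50 has 50 classes
print(check_n_way(5, 50, split))
print(check_n_way(10, 50, split))
```

Under these assumptions the validation split would hold only 5 classes, which would explain a failure for n_way greater than 5 but not for smaller values.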

I would like to run experiments using the VoxCeleb1 dataset, but it no longer seems to be publicly available. Could you provide the dataset?

Environment configuration Question

Dear author, I have benefited from reading your work, but I have run into a problem with the environment configuration. As a novice I have not been able to solve it, and I hope to get your advice.

How can I run the MAML example for a predefined support set and a query

Hi, first, thank you for your great work. I am wondering how I can change the setup so that, for evaluation, I can put my n-way k-shot support set (for example, 3 classes with 2 examples each) in one folder and the query in another folder, and then run your code.
Assume that I already have a pre-trained network for 3-way 2-shot. Should I change test_tasks: 1000 to 1? Should I write a new data loader instead of the fast data loader?

Loading of BirdClef2020(pruned)

Hi,

I have an issue concerning the loading time of BirdClef2020(pruned):
I want to do batch training (not episodic training). The training set size is 44542, so with a batch size of 128 I load ~347 batches per epoch. The dataset is in .wav format; I use torchaudio.load to load the files and torch.utils.data.DataLoader for the dataloader.
My dataset class does minimal processing: for each file, I use librosa.get_duration to know in advance whether to pad the audio to 5s if it is short or to randomly crop 5s if it is long.
The problem I am encountering is that loading batches takes a long time, even when I tune num_workers and prefetch_factor. A full loading pass over the training set, without any compute, takes very long.

I also tried storing the whole dataset as numpy arrays in one h5 file, but one pass over the training set is still slow (~15 mins).

Any tip/advice/suggestion would be appreciated, thank you.

train_batch_size represents what?

Hello researcher!
I've been studying the ProtoNet example you provided recently, and I changed the dataset to ESC-50. However, I encountered some issues in the process. When I change the train_batch_size parameter to any number greater than 1, the program fails to run. I do not encounter this problem in the MAML example. I would like to ask what the train_batch_size parameter means and whether there is a mistake in my configuration. Below are my configuration file and the error output.

config
base:
  n_way: 5
  k_shot: 1
  q_queries: 1
  distance: 'l2'
  task_type: 'PROTO_VAR_Kaggle_5_second_'
  cuda: 0
  num_repeats: 1
  out_dim: 128

models: ['Hybrid']

hyper:
  initial_lr: 0.0005 #0.01 #0.005
  # The lowest lr that is ever hit
  min_lr: 0.0001
  # Patience for val loss
  patience: 100000
  # Factor of lr reduction - new_lr = lr*factor
  factor: 0.1
  # Number of episodes to warm up for before using scheduler
  scheduler_warm_up: 20

training:
  epochs: 1000 #1500
  episodes_per_epoch: 10
  train_batch_size: 5 # 10/20/50

  # How many tasks we want at each step
  val_tasks: 200
  test_tasks: 10000
  break_threshold: 1000

  # Episodes between validation steps
  eval_spacing: 100

  trans_batch: False

  # Number workers for the dataloaders
  num_workers: 4

data:
  variable: False
  name: 'ESC' # Kaggle_18
  norm: 'global'
  type: 'spec' # rawtospec/spec/variable_spec
  fixed: True

  fixed_path: 'dataset_/splits/ESC_paper_splits.npy'
  data_path: '/home/mitlab/kuo/ESC-50-master/ESC_spec'

# Split percentages for train/val/test
split:
  train: 0.7
  val: 0.1
  test: 0.2

Traceback (most recent call last):
  File "BaseLooperProto.py", line 106, in <module>
    pre, post, loss, post_std = single_run_main(params=params,
  File "/home/mitlab/kuo/Proto_Kaggle18/ProtoMain.py", line 155, in single_run_main
    pre, post, loss, post_std = fit(
  File "/home/mitlab/kuo/Proto_Kaggle18/fit_proto.py", line 103, in fit
    val_loss, val_pre, val_post, val_post_std = validation_step(valLoader, learner,
  File "/home/mitlab/kuo/Proto_Kaggle18/fit_proto.py", line 208, in validation_step_fixed
    x, y = prep_batch(batch, params['training']['train_batch_size'])
  File "/home/mitlab/kuo/Proto_Kaggle18/all_prep_batches.py", line 41, in prep_batch_fixed
    x = x.reshape(meta_batch_size, (n_way*k_shot + n_way*q_queries),
RuntimeError: shape '[5, 10, 128, 157]' is invalid for input of size 200960
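
The numbers in the error can be checked by hand: 200960 = 10 × 128 × 157, which is exactly one task of 10 samples, while the reshape asks for 5 tasks. A minimal sketch of that arithmetic follows; the helper name is hypothetical.

```python
import math


def implied_meta_batch_size(numel, n_way, k_shot, q_queries, feature_shape):
    """How many tasks a flat batch of `numel` values actually contains."""
    per_task = (n_way * k_shot + n_way * q_queries) * math.prod(feature_shape)
    tasks, remainder = divmod(numel, per_task)
    if remainder:
        raise ValueError("batch size is not a whole number of tasks")
    return tasks


# Values taken from the config and the RuntimeError above.
tasks = implied_meta_batch_size(
    numel=200960, n_way=5, k_shot=1, q_queries=1, feature_shape=(128, 157)
)
print(tasks)  # number of tasks in the batch; the reshape asked for 5
```

Since the traceback shows validation_step_fixed passing train_batch_size to prep_batch, one possible reading (an assumption, not confirmed by the source) is that the validation loader yields one task per batch, so the reshape only succeeds when train_batch_size is 1.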