ENH: Add AVA models and datasets · vak (open, 9 comments)

NickleDave avatar NickleDave commented on June 21, 2024
ENH: Add AVA models and datasets

from vak.

Comments (9)

NickleDave avatar NickleDave commented on June 21, 2024

Hey @marisbasha! Sorry you're running into this issue. It's probably something we haven't explained clearly enough.

If that's the case, where should I download the data from?

Just checking, did you already download the "source" test data as described here?
https://vak.readthedocs.io/en/latest/development/contributors.html#download-test-data

To do that you would run

nox -s test-data-download-source

Just to clarify, I should use my own "toy data" or does running vak prep tests/data_for_tests/configs/ConvEncoderUMAP_train_audio_cbin_annot_notmat.toml generate "toy data"?

You are right that these are basically "toy" datasets, that are as small as possible. I tried to define the two different types in that section on the development set-up page but just in case it's not clear: the "source" data is inputs to vak, like audio and annotation files. You create the other type, the "generated" test data, when you run nox -s test-data-generate. This "generated" test data consists of (small) prepared datasets and results, some of which are used by the unit tests.

You don't actually need to generate this test data to be able to develop. I just suggested it as a fairly painless way to check that you were able to set up the environment correctly. The script that generates the test data should be able to run to completion without any errors.

I am almost finished with that feature branch that will fix the unit tests, so you can run them to test what you are developing. That branch will also speed up the script that generates the test data considerably and reduce the size of the generated test data.
#693

Does that help?


NickleDave avatar NickleDave commented on June 21, 2024

Discussed this with @marisbasha and @yardencsGitHub today. Updating here with some thoughts I've had

  • We should add a new model family, VAEModel. I would suggest that class look something like the VAExperiment class here, https://github.com/AntixK/PyTorch-VAE/blob/master/experiment.py#L15 -- at least as far as the training/validation steps
    • We might want to additionally have the methods their base class has, encode, decode, and sample, for all VAE models: https://github.com/AntixK/PyTorch-VAE/blob/master/models/base.py
    • let's avoid directly adapting their code though, since a lot of the logic is specific to their framework (and also because the license is Apache)
  • Add a model AVA that uses as the model family VAEModel
  • Make it so that we can load weights from AVA
    • check if there are weights in the shared data, and test on those: https://research.repository.duke.edu/concern/datasets/9k41zf38g?locale=en -- my initial impression is that there are not model weights / checkpoints in the shared dataset, and that it's just the audio data / low-D features possibly?
    • if not, use the original repo to generate weights and test that we can load those
  • Add an initial walkthrough / how-to in the docs on using this model
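To make the proposed family concrete, here is a minimal sketch of what a VAEModel base class with the encode / decode / sample interface (as in PyTorch-VAE's base class) could look like. This is illustrative only: the real class would subclass vak.base.Model and add Lightning-style training/validation steps, and the encoder/decoder shapes here are made up.

```python
import torch
from torch import nn


class VAEModel(nn.Module):
    """Sketch of a VAE model-family base class.

    Illustrative only: vak's real family class would subclass
    vak.base.Model and define training_step / validation_step.
    Here we show only the encode / decode / sample interface
    suggested by PyTorch-VAE's base class.
    """

    def __init__(self, encoder: nn.Module, decoder: nn.Module, latent_dim: int):
        super().__init__()
        self.encoder = encoder  # maps input -> concatenated (mu, log_var)
        self.decoder = decoder  # maps latent z -> reconstruction
        self.latent_dim = latent_dim

    def encode(self, x):
        # split the encoder output into mean and log-variance
        h = self.encoder(x)
        mu, log_var = h.chunk(2, dim=-1)
        return mu, log_var

    def reparameterize(self, mu, log_var):
        # z = mu + sigma * eps, with eps ~ N(0, I)
        std = torch.exp(0.5 * log_var)
        return mu + std * torch.randn_like(std)

    def decode(self, z):
        return self.decoder(z)

    def sample(self, n: int, device="cpu"):
        # draw from the prior and decode, for interactive inspection
        z = torch.randn(n, self.latent_dim, device=device)
        return self.decode(z)

    def forward(self, x):
        mu, log_var = self.encode(x)
        z = self.reparameterize(mu, log_var)
        return self.decode(z), mu, log_var
```

Keeping encoder and decoder as separate attributes means encode / decode stay one-liners, which is the main point of the suggestion above.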


NickleDave avatar NickleDave commented on June 21, 2024

I don't think we need this for the initial implementation but noting for future work:

  • I think we can use WindowDataset for the Shotgun VAE models (which train on randomly drawn windows)
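The "randomly drawn windows" idea amounts to sampling fixed-width slices along the time axis of a spectrogram. A tiny sketch of that sampling step (function name and shapes are hypothetical; vak's WindowDataset does this bookkeeping across many files):

```python
import numpy as np


def draw_random_windows(spect, window_size, n_windows, rng=None):
    """Draw random fixed-width windows from a spectrogram.

    Hypothetical helper for illustration. ``spect`` is a 2-D array
    with shape (freq_bins, time_bins); windows are slices along the
    time axis, as in "shotgun VAE"-style training.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_freq, n_time = spect.shape
    # valid start indices leave room for a full window
    starts = rng.integers(0, n_time - window_size + 1, size=n_windows)
    return np.stack([spect[:, s:s + window_size] for s in starts])
```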


NickleDave avatar NickleDave commented on June 21, 2024

Tentative / rough to-do list for @marisbasha after our meeting today

  • initial implementation
    • set up the vak development environment as described here
    • to test the set-up and make sure you have the configs for generating test data (next step), run nox -s test-data-generate (this may crash because I introduced a bug in the parametric UMAP branch--I might have fixed it by the time you start--but it will run far enough for you to get the config files you need)
    • now that you have configs you need, generate a toy dataset to test the VAE model on: vak prep tests/data_for_tests/configs/ConvEncoderUMAP_train_audio_cbin_annot_notmat.toml
    • using that toy dataset, get initial implementation working in a notebook, as described in the next section
  • add implementation to vak (these steps work in a notebook too, but this part describes adding them to vak in separate modules)
    • declare the VAE specific loss function in vak.nn.loss.vae
      • it's not clear to me yet whether there's something specific to the loss they use; it's defined in the forward method instead of a factored-out loss function, here -- I would prefer to implement an AVALoss that repeats their logic verbatim, just to get closer to numerical reproducibility
    • add vak.models.vae_model.VAEModel that defines the model family, using the vak.models.model_family decorator and sub-classing vak.base.Model. For an example see https://github.com/vocalpy/vak/blob/a96ff976283ccdc34852fcf2ba5bb51808b6b25e/src/vak/models/frame_classification_model.py#L22
      • I think we will need a training_step and validation_step with logic specific to datasets prepared by vak
      • we will probably also want VAE specific methods that make it easier to inspect model behavior interactively, e.g. encode and decode
    • add vak.nets.ava with architecture here (slightly refactored, e.g. with for loops?). for an example see https://github.com/vocalpy/vak/blob/main/src/vak/nets/tweetynet.py
      • I would favor building the model inside __init__ instead of using a separate method, and I would favor using for loops + separate encoder + decoder attributes so that methods like encode can just do return self.encoder(x)
      • see for example the VanillaVAE implementation in Pytorch-VAE: https://github.com/AntixK/PyTorch-VAE/blob/master/models/vanilla_vae.py
    • add vak.models.AVA that defines the AVA model, using @model(family='VAEModel') -- for an example see https://github.com/vocalpy/vak/blob/main/src/vak/models/tweetynet.py
    • add docstrings
    • add unit tests
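On the loss-function step above: a generic VAE trains by minimizing a negative ELBO, i.e. a reconstruction term plus a KL divergence between the approximate posterior and a unit-Gaussian prior. The sketch below shows that standard form; it is not AVA's actual loss, whose reduction and weighting may differ, which is exactly why the to-do item suggests an AVALoss that copies their forward-method logic verbatim.

```python
import torch
import torch.nn.functional as F


def vae_elbo_loss(recon, target, mu, log_var, beta=1.0):
    """Negative ELBO: reconstruction + beta * KL(q(z|x) || N(0, I)).

    Generic textbook form for illustration; AVA's loss is defined
    inside its model's forward method and may differ in detail.
    """
    # reconstruction term (sum over all elements)
    recon_loss = F.mse_loss(recon, target, reduction="sum")
    # closed-form KL divergence for a diagonal Gaussian posterior
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon_loss + beta * kl
```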
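And on the vak.nets.ava step: here is one way the "build the network in __init__ with for loops" suggestion could look for the encoder half. Channel counts and layer choices are placeholders, not AVA's real architecture; the point is that a loop over a channels tuple replaces repeated hand-written layers, and the result is a single nn.Sequential an encode method can call directly.

```python
import torch
from torch import nn


def conv_encoder(in_channels=1, channels=(8, 16, 32), latent_dim=32):
    """Build a small convolutional encoder with a for loop.

    Illustrative only: layer sizes are placeholders, not AVA's
    architecture. The final linear layer outputs 2 * latent_dim
    values, to be split into (mu, log_var).
    """
    layers = []
    prev = in_channels
    for ch in channels:
        # each block halves the spatial dimensions (stride 2)
        layers += [
            nn.Conv2d(prev, ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(ch),
            nn.ReLU(),
        ]
        prev = ch
    # LazyLinear infers its input size on the first forward pass
    layers += [nn.Flatten(), nn.LazyLinear(2 * latent_dim)]
    return nn.Sequential(*layers)
```

With separate encoder and decoder attributes built this way, methods like encode reduce to return self.encoder(x), as suggested above.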


marisbasha avatar marisbasha commented on June 21, 2024

@NickleDave I am having trouble with nox -s test-data-generate. I receive the following error:
NotADirectoryError: Path specified for ``data_dir`` not found: tests/data_for_tests/source/audio_cbin_annot_notmat/gy6or6/032312
After inspecting, I see that tests/data_for_tests/source/ is an empty directory. I checked the code for gy6or6 and found a script to download it. I put the data inside the audio_cbin_annot_notmat folder, but then I get an error saying there's no .not.mat file in the directory, and I cannot find a link to download the data elsewhere.

Just to clarify, I should use my own "toy data" or does running vak prep tests/data_for_tests/configs/ConvEncoderUMAP_train_audio_cbin_annot_notmat.toml generate "toy data"?
If that's the case, where should I download the data from?


marisbasha avatar marisbasha commented on June 21, 2024

Everything fine now. Thanks!


NickleDave avatar NickleDave commented on June 21, 2024

🙌 awesome, glad to hear it!

Will ping you here as soon as I get that branch merged. It fixes a couple of minor bugs, so you'll probably want to pull them in along with the fixed tests.


marisbasha avatar marisbasha commented on June 21, 2024

@NickleDave I have pushed to my fork again, with the parts divided by file.
I am having trouble configuring the trainer. Could we have a brief discussion?


NickleDave avatar NickleDave commented on June 21, 2024

Ah whoops, sorry I missed this @marisbasha.

What you have so far looks great. I am reading through your code now to make sure I understand where you're at.

We can definitely discuss what to do with the trainer when we meet tomorrow.

