deep_audio_features's Introduction

deep_audio_features's People

Contributors

dkatsiros, nikosmichas, pakoromilas, sofiaele, tyiannak

deep_audio_features's Issues

Max sequence length computation error

The code below does not compute the max sequence length correctly: the subtracted term (config.HOP_LENGTH - config.HOP_LENGTH) is always zero. Please check the length formula.

with contextlib.closing(wave.open(f, 'r')) as fp:
    frames = fp.getnframes()
    fs = fp.getframerate()
    duration = frames / float(fs)
    length = int((duration -
                  (config.HOP_LENGTH - config.HOP_LENGTH)) / \
                 (config.HOP_LENGTH) + 1)
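
For comparison, the usual way to count analysis frames subtracts the window length, not the hop length, before dividing by the hop. The sketch below assumes that is what the formula was meant to do; the function name num_frames and the window_length parameter are hypothetical and do not come from the package's config, and all three arguments are assumed to be in seconds.

```python
import math


def num_frames(duration, window_length, hop_length):
    """Count the full analysis windows that fit in `duration`.

    All three arguments are assumed to share the same unit (seconds here).
    `window_length` is a hypothetical parameter, not a known config value.
    """
    if duration < window_length:
        # Not even one full window fits.
        return 0
    # The first window starts at 0; each further window starts one hop later.
    return int(math.floor((duration - window_length) / hop_length)) + 1


# A 1.0 s file with 0.5 s windows and a 0.25 s hop
print(num_frames(1.0, 0.5, 0.25))  # -> 3
```

With the term as written in the issue, the numerator is just the duration, so the result does not depend on the window length at all.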

Testing Script failure

Hello,
After a successful run of the training script, I got a model and applied it to the testing script. This is what I got after the execution:

Traceback (most recent call last):
  File "basic_test.py", line 94, in <module>
    test_model(modelpath=model, ifile=ifile, layers_dropped=layers_dropped)
  File "basic_test.py", line 54, in test_model
    fuse=fuse)
TypeError: __init__() got an unexpected keyword argument 'spec_size'

Thanks,

Bug in classification report?

There's a bug related to path depth in the classification report. To reproduce:

from deep_audio_features.bin import classification_report as cr
cr.test_report('/Users/tyiannak/Downloads/soundscape_8k_1s.pt', ['/Users/tyiannak/Downloads/soundscape_8k_1sec/test/1', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/2/', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/3', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/4', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/5'])

Loaded model class mapping: {0: '1', 1: '2', 2: '3', 3: '4', 4: '5'}
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-2-3e59e19efe16> in <module>
----> 1 cr.test_report('/Users/tyiannak/Downloads/soundscape_8k_1s.pt', ['/Users/tyiannak/Downloads/soundscape_8k_1sec/test/1', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/2/', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/3', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/4', '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/5'])

/usr/local/lib/python3.9/site-packages/deep_audio_features/bin/classification_report.py in test_report(model_path, folders)
     58 
     59     max_seq_length = model.max_sequence_length
---> 60     files_test, y_test, class_mapping = load_dataset.load(
     61         folders=folders, test=False,
     62         validation=False, class_mapping=class_mapping)

/usr/local/lib/python3.9/site-packages/deep_audio_features/utils/load_dataset.py in load(folders, test_val, test, validation, class_mapping)
     71         folder2idx = {v: k for k, v in idx2folder.items()}
     72 
---> 73     labels = list(map(lambda x: folder2idx[x], labels))
     74 
     75     class_mapping = {}

/usr/local/lib/python3.9/site-packages/deep_audio_features/utils/load_dataset.py in <lambda>(x)
     71         folder2idx = {v: k for k, v in idx2folder.items()}
     72 
---> 73     labels = list(map(lambda x: folder2idx[x], labels))
     74 
     75     class_mapping = {}

KeyError: '/Users/tyiannak/Downloads/soundscape_8k_1sec/test/1'

If I go to the soundscape_8k_1sec directory and then run

cr.test_report('../soundscape_8k_1s.pt', ['test/1', 'test/2/', 'test/3', 'test/4', 'test/5'])

everything runs OK.

Also, if I use the long path in the bin.basic_training script, it runs OK. So something is probably going wrong in load_dataset.load(), around the class mapping assignment, when classification_report is used.
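
The loaded class mapping holds bare folder names ('1', '2', ...), while the KeyError shows a full path being used as the lookup key. One possible remedy, sketched below under that assumption, is to reduce every folder path to its last component before the lookup. The helper folder_label and the example paths are hypothetical; this is not the package's actual code, and it does not explain why the short relative paths already work.

```python
import os


def folder_label(path):
    """Return the last path component, ignoring any trailing separator."""
    return os.path.basename(os.path.normpath(path))


# Mapping as printed in the issue: index -> folder name
class_mapping = {0: "1", 1: "2"}
folder2idx = {name: idx for idx, name in class_mapping.items()}

# Hypothetical absolute paths, one with a trailing slash
folders = ["/data/soundscape/test/1", "/data/soundscape/test/2/"]

# Looking up the normalized name avoids a KeyError on the full path.
labels = [folder2idx[folder_label(f)] for f in folders]
print(labels)  # -> [0, 1]
```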

CNN + TRL

Use a CNN with a Tensor Regression Layer (TRL) instead of the linear layer.

Int16 melgram

Check whether an int16 melgram gives performance comparable to a float32 melgram.

Error in histogram file name on Windows 10.

I got this error on Windows 10 when running:

C:\Python310\lib\site-packages\deep_audio_features\bin\basic_training.py -i "genres/blues" "genres/classical" "genres/country" "genres/disco" "genres/hiphop", "genres/jazz" "genres/metal" "genres/pop" "genres/reggae" "genres/rock" -o "energy"

...
--> Plotting histogram of spectrogram sizes.
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python310\lib\site-packages\deep_audio_features\bin\basic_training.py", line 64, in train_model
    train_set = FeatureExtractorDataset(X=files_train, y=y_train,
  File "C:\Python310\lib\site-packages\deep_audio_features\dataloading\dataloading.py", line 86, in __init__
    self.plot_hist(spec_sizes, y)
  File "C:\Python310\lib\site-packages\deep_audio_features\dataloading\dataloading.py", line 261, in plot_hist
    plt.savefig(ct.strftime("%m_%d_%Y, %H:%M:%S") + ".png")
  File "C:\Python310\Lib\site-packages\deep_audio_features\utils../..\matplotlib\pyplot.py", line 1023, in savefig
    res = fig.savefig(*args, **kwargs)
  File "C:\Python310\Lib\site-packages\deep_audio_features\utils../..\matplotlib\figure.py", line 3378, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "C:\Python310\Lib\site-packages\deep_audio_features\utils../..\matplotlib\backend_bases.py", line 2366, in print_figure
    result = print_method(
  File "C:\Python310\Lib\site-packages\deep_audio_features\utils../..\matplotlib\backend_bases.py", line 2232, in
    print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(
  File "C:\Python310\Lib\site-packages\deep_audio_features\utils../..\matplotlib\backends\backend_agg.py", line 509, in print_png
    self._print_pil(filename_or_obj, "png", pil_kwargs, metadata)
  File "C:\Python310\Lib\site-packages\deep_audio_features\utils../..\matplotlib\backends\backend_agg.py", line 458, in _print_pil
    mpl.image.imsave(
  File "C:\Python310\Lib\site-packages\deep_audio_features\utils../..\PIL\Image.py", line 2410, in save
    fp = builtins.open(filename, "w+b")
OSError: [Errno 22] Invalid argument: '01_09_2024, 08:51:43.png'
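
Windows does not allow colons in file names, which is why the timestamp format "%m_%d_%Y, %H:%M:%S" fails there. A minimal sketch of a portable alternative follows; the function name histogram_filename and the replacement format are assumptions for illustration, not the package's code.

```python
from datetime import datetime


def histogram_filename(now=None):
    """Build a timestamped PNG name without characters Windows rejects."""
    now = now or datetime.now()
    # Hyphens replace the colons; the comma and space are dropped as well.
    return now.strftime("%m_%d_%Y_%H-%M-%S") + ".png"


# The timestamp from the error message above
print(histogram_filename(datetime(2024, 1, 9, 8, 51, 43)))
# -> 01_09_2024_08-51-43.png
```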

"ValueError" in Transfer Learning script

Hi,
While running the transfer learning script in a terminal, I get a "ValueError":

Resetting model to epoch 14.
Traceback (most recent call last):
  File "bin/transfer_learning.py", line 179, in <module>
    transfer_learning(model=modelpath, folders=folders, strategy=strategy)
  File "bin/transfer_learning.py", line 122, in transfer_learning
    best_model, train_losses, valid_losses, train_accuracy, \
ValueError: too many values to unpack (expected 6)

What can I do?

Thanks!
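
This message means the call on the right-hand side returned more values than the six names on the left. The sketch below reproduces the error with a hypothetical stand-in function, train_stub, and shows star-unpacking as one way to tolerate extra return values. What the real training function returns is not shown in the issue, so the seventh value and the names valid_accuracy and epoch are assumptions.

```python
def train_stub():
    """Hypothetical stand-in that returns seven values instead of six."""
    return "model", [0.9], [1.0], [0.5], [0.4], 14, "extra"


# Unpacking seven values into six names raises the error from the issue.
try:
    a, b, c, d, e, f = train_stub()
except ValueError as exc:
    message = str(exc)
    print(message)

# Star-unpacking collects any surplus values into a list.
(best_model, train_losses, valid_losses,
 train_accuracy, valid_accuracy, epoch, *rest) = train_stub()
print(rest)  # -> ['extra']
```

A mismatch like this usually points to the caller and the called function coming from different versions of the code.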

"pop up windows" in scripts

Hi,

When I execute the training script (and I think the same happens with the other two scripts), it starts like this:
[Screenshot 1]

I see the terminal window and a pop-up window named "Figure 1". To proceed, I must close the pop-up window. If I don't, nothing happens. When I close it, the script continues like this:
[Screenshot 2]
Again, I have to close the window to continue the execution. When I close it, the script continues as expected, but:
[Screenshot 3]

This time I can't close the "Figure 1" pop-up window until the script has finished running.

Assuming you can reproduce this problem, is there a way to fix it?
It would be great if the histograms could be saved as images automatically, without user interaction.

Thank you very much,
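
One common way to get this behavior with matplotlib is to select the non-interactive Agg backend and write the figure to disk instead of showing it. The sketch below assumes that approach; save_histogram, the axis labels, and the output path are hypothetical, and this is not the package's plotting code.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt


def save_histogram(values, out_path, bins=10):
    """Plot a histogram of `values` and save it without any pop-up window."""
    fig, ax = plt.subplots()
    ax.hist(values, bins=bins)
    ax.set_xlabel("Spectrogram size")
    ax.set_ylabel("Count")
    fig.savefig(out_path)
    plt.close(fig)  # release the figure so repeated calls do not pile up
    return out_path


out_path = save_histogram(
    [50, 51, 51, 52, 60],
    os.path.join(tempfile.gettempdir(), "spec_sizes_hist.png"),
)
```

Because nothing calls plt.show(), the script never blocks waiting for a window to be closed.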

Refactor

  1. Configure architectures from config file
  2. Class for training and validation

Error in audioTrainTest.extract_features_and_train if a class folder contains only 1 sample

295     for feat in features:
296         temp = []
297         for i in range(feat.shape[0]):
298             temp_fv = feat[i, :]

If one of the class folders has only 1 sample, feat will be a 1-D array of shape (136,), and line 298 will raise an error because it tries to index a 1-D array with 2-D indices (feat[i, :]).

I propose the following fix:

    for feat in features:
        if feat.ndim == 1: # this class has only 1 sample
            feat = feat.reshape((1, feat.shape[0]))
        temp = []
        for i in range(feat.shape[0]):
            temp_fv = feat[i, :]
