Giter Site home page Giter Site logo

ucasiggcas / keras-audiodatagenerator Goto Github PK

View Code? Open in Web Editor NEW

This project forked from ahmedimtiazprio/keras-audiodatagenerator

1.0 1.0 0.0 149 KB

Augmented Audio Data Generator for 1D-Convolutional Neural Networks

License: MIT License

Python 100.00%

keras-audiodatagenerator's Introduction

Keras-AudioDataGenerator

Augmented Audio Data Generator for 1D-Convolutional Neural Networks

Note: For analogy with Image Data Generator, please read the ImageDataGenerator documentation. Also has a balanced data generator that creates balanced mini-batches.

The Audio Data Generator generates batches of audio data with real-time data augmentation. Data is looped over in batches. This method enables audio augmentation in CPU while training in parallel on the GPU. Code is tested on Linux computers. CPU usage should look something like this during GPU training:

CPU Usage during GPU training

Basic Usage:

from AudioDataGenerator import AudioDataGenerator

datagen = AudioDataGenerator(
                featurewise_center=True,
                featurewise_std_normalization=True,
                shift=.2,
                horizontal_flip=True,
                zca_whitening=True)

datagen.fit(x_train)

model.fit_generator(datagen.flow(x_train,y_train,batch_size=32),
                    steps_per_epoch = len(x_train)/32,
                    epochs = 100)

Arguments:

The following are the currently available augmentation parameters.

featurewise_center: Boolean. Set input mean to 0 over the dataset, feature-wise.

samplewise_center: Boolean. Set each sample mean to 0.

featurewise_std_normalization: Boolean. Divide inputs by std of the dataset, feature-wise.

samplewise_std_normalization: Boolean. Divide each input by its std.

zca_epsilon: epsilon for ZCA whitening. Default is 1e-6.

zca_whitening: Boolean. Apply ZCA whitening.

Note: Make sure to apply zca_whitening when using a separate validation set.

roll_range: Float (fraction of total sample length). Range for horizontal circular shifts.

horizontal_flip: Boolean. Randomly flip inputs horizontally.

zoom_range: Float (fraction of zoom) or range of [lower, upper] Uniform Distribution.

noise: [mean,std,'Normal'] or [lower,upper,'Uniform'] Add Random Additive noise. Noise is added to the data with a .5 probability.

noiseSNR: Float required SNR in dB. Noise is added to the data with a .5 probability(NotImplemented).

shift: Float (fraction of total sample length). Range for horizontal shifts.

fill_mode: One of {"constant", "nearest", "reflect" or "wrap"}. Default is 'nearest'. Points outside the boundaries of the input are filled according to the given mode:

  • 'constant': kkkkkkkk|abcd|kkkkkkkk (cval=k)
  • 'nearest': aaaaaaaa|abcd|dddddddd
  • 'reflect': abcddcba|abcd|dcbaabcd
  • 'wrap': abcdabcd|abcd|abcdabcd

cval: Float or Int. Value used for points outside the boundaries when fill_mode = "constant".

rescale: rescaling factor. Defaults to None. If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided (before applying any other transformation).

preprocessing_function: function that will be implied on each input.

  • The function will run after the audio is resized and augmented.
  • The function should take one argument:
    • one audio (Numpy tensor with rank 2),
    • should output a Numpy tensor with the same shape.

brightness_range: Tuple or list of Float. Range for adding random gain to the audio. Doesn't have any effect if normalization is performed.

data_format: One of {"channels_first", "channels_last"}. "channels_last" mode means that the audio should have shape (samples, width, channels), "channels_first" mode means that the data should have shape (samples, channels, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".

validation_split: Float. Fraction of data reserved for validation (strictly between 0 and 1).

Dependencies:

  • Keras
  • Scipy
  • Numpy

Methods:

.fit() Method

Compute the internal data stats related to the data-dependent transformations, based on an array of sample data. Only required if featurewise_center or featurewise_std_normalization or zca_whitening is True.

datagen.fit(x_train,seed=1)

.flow() Method

Creates an iterable with batches. The iterator iterates indefinitely, therefore loops need to be manually broken.

batch_size = 32
for epoch in range(epochs):
    batch = 0
    for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=batch_size, shuffle=True):
        model.fit(x_batch, y_batch)
        batch += 1
        if batch >= len(x_train) / batch_size :
            break

.flow_from_directory() Method

Not Implemented

keras-audiodatagenerator's People

Contributors

ahmedimtiazprio avatar

Stargazers

 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.