VAD - simple voice activity detection in Python

This package implements a simple voice activity detection (VAD) algorithm in Python. It uses energy-based thresholding and is intended as a lightweight way to detect speech in audio files when other methods cannot be used for privacy, performance, or other reasons.
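To make the idea of energy-based thresholding concrete, here is a minimal sketch using NumPy. It is not the package's actual implementation: the function name energy_vad_sketch, the use of mean squared amplitude as the frame energy, and the exact framing are assumptions made for illustration. Only the parameter names and default values mirror those shown in the Usage section.

```python
import numpy as np


def energy_vad_sketch(signal, sample_rate=16000, frame_length=25,
                      frame_shift=20, energy_threshold=0.05,
                      pre_emphasis=0.95):
    """Hypothetical energy-based VAD: one boolean decision per frame.

    frame_length and frame_shift are in milliseconds.
    """
    signal = np.asarray(signal, dtype=np.float64)

    # Pre-emphasis filter y[n] = x[n] - a * x[n-1]; boosts high frequencies.
    emphasized = np.append(signal[:1], signal[1:] - pre_emphasis * signal[:-1])

    # Convert milliseconds to a number of samples.
    frame_len = int(sample_rate * frame_length / 1000)
    shift = int(sample_rate * frame_shift / 1000)

    if len(emphasized) < frame_len:
        return np.zeros(0, dtype=bool)

    # Only full frames are kept (an assumption of this sketch).
    n_frames = 1 + (len(emphasized) - frame_len) // shift
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = emphasized[i * shift: i * shift + frame_len]
        energies[i] = np.mean(frame ** 2)  # mean squared amplitude

    # A frame counts as speech when its energy exceeds the threshold.
    return energies > energy_threshold
```

Because the decision depends only on frame energy, loud non-speech sounds are also flagged as activity, which is one reason the threshold usually needs tuning per recording setup.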

Installation

You can install the package using pip:

pip install vad

Usage

The package can be used directly from your Python code. The following example shows how to detect speech in an audio file:

from vad import EnergyVAD

# load audio file in "audio" variable

vad = EnergyVAD(
    sample_rate=16000,
    frame_length=25,  # in milliseconds
    frame_shift=20,  # in milliseconds
    energy_threshold=0.05,  # you may need to adjust this value
    pre_emphasis=0.95,
)  # default values are used here

voice_activity = vad(audio) # returns a boolean array indicating whether a frame is speech or not

# you can also use the following method to get the audio file with only speech
# speech_signal is a numpy array of the same shape as audio
speech_signal = vad.apply_vad(audio)
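
A frame-level boolean array is often easier to work with once it is converted to time segments. The helper below is a sketch and not part of the package: frames_to_segments is a hypothetical name, and it assumes that frame i starts at i * frame_shift milliseconds and lasts frame_length milliseconds, which may not match the package's internal framing exactly.

```python
import numpy as np


def frames_to_segments(voice_activity, frame_length=25, frame_shift=20):
    """Hypothetical helper: turn per-frame flags into (start, end) seconds.

    frame_length and frame_shift are in milliseconds.
    """
    flags = np.asarray(voice_activity, dtype=bool)

    # Pad with False on both sides so every speech run has a start and an end.
    padded = np.concatenate(([False], flags, [False])).astype(int)
    changes = np.diff(padded)

    starts = np.flatnonzero(changes == 1)   # first speech frame of each run
    ends = np.flatnonzero(changes == -1)    # one past the last speech frame

    segments = []
    for s, e in zip(starts, ends):
        start_s = int(s) * frame_shift / 1000
        # The last frame of the run still spans a full frame_length.
        end_s = ((int(e) - 1) * frame_shift + frame_length) / 1000
        segments.append((start_s, end_s))
    return segments
```

With the array from the example above, frames_to_segments(voice_activity) would give a list of (start, end) pairs in seconds, under the framing assumption stated here.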

Audio samples

  • example.wav is a sample audio file that can be used to test the package.
  • example_vad.wav is the audio file with only speech after applying the VAD algorithm.
  • example_vad_2.wav is the audio file with only speech directly extracted from the original audio file using the apply_vad method.
  • vad_output.png is a plot of the voice activity detected by the VAD algorithm.
  • test_vad.py is the script that was used to generate the above audio files and plot.

Known issues

  • The energy-based method is the only VAD algorithm implemented in this package at the moment. Others may be added in the future.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributors

  • morenolaquatra
