Giter Site home page Giter Site logo

robertshrestha / snoring-detection Goto Github PK

View Code? Open in Web Editor NEW

This project forked from adrianagaler/snoring-detection

0.0 0.0 0.0 491.02 MB

Tiny Machine Learning Snoring Detection Model for Embedded devices

C++ 95.71% Python 0.53% Jupyter Notebook 3.74% Starlark 0.02%

snoring-detection's Introduction

Data Preprocessing

Get Data at sample rate = 16000

  • Data is already preprocessed in Snoring_Dataset_@16000
  • if you want to regenerate the data, run the downgrade_sample_rate.py on the Snoring Dataset folder (make sure to update the save_dir accordingly)
  • the script will convert the wav files in the Snoring Dataset (at sample_rate=44100Hz) to a reduced sample_rate = 16kHz

Run

  • pre-requisites: install the software (see below)
  • run python3 preprocessing.py (make sure to update the names of the files and directories)

Khan:

A dataset of 1000 sound samples is developed in this project. The dataset contains 2 classes—snoring sounds and non-snoring sounds. Each class has 500 samples. The snoring sounds were collected from different online sources [27–31]. The non-snoring sounds were also collected from similar online sources. Then silences were trimmed from the sound files and the files were split to equal-sized one-second duration files using WavePad Sound Editor [32]. Thus, each sample has a duration of one second

Ten categories of non-snoring sounds are collected, and each category has 50 samples. The ten categories are baby crying, the clock ticking, the door opened and closed, total silence and the minor sound of the vibration motor of the gadget, toilet flashing, siren of emergency vehicle, rain and thunderstorm, streetcar sounds, people talking, and background television news.

Summary:

  1. FFT points: 512
  2. number of filters in the filterbank: 10 + 22 = 32
  3. number of cepstral coefficients: 32
  4. sound sample frame size: 30 ms
  5. processed sample is a 32x32 image

Feature Extraction

  1. the Mel frequency cepstral coefficients (MFCCs) are calculated for each sample
  • to compress information into a small number of coefficients based on an understanding of the human ear.
  • the time-domain audio signal is first divided into 32 frames of 30 ms
  1. Construct Mel Filter Banks:
  • create the powerspectrum, by converting from frequency to energy space, using the power_spectrum function in the speechpy lib
  • create 10 equally spread bands given the minimum frequency (100) and the max frequency (1000) in steps of 100
  • convert the points in the linearly interpolated space from Hz to Mels
  • create the other 22 equally spread bands (filterbanks) for lowest freq = 1000 up to max
  • apply the mel matrix of 32 merged bands to the powerspectrum
  • get the coefficients by calculating the energies and the DCT values

Software SpeechPy Library: https://speechpy.readthedocs.io/en/latest/intro/introductions.html ; https://speechpy.readthedocs.io/_/downloads/en/latest/pdf/ Install using: pip install speechpy OR Locally:git clone https://github.com/astorfi/speech_feature_extraction.git then python setup.py develop

More on Mel Frequency Cepstral Coefficients: https://www.youtube.com/watch?v=4_SH2nfbQZ8

Extensions

Pre-emphasis to amplify high frequencies, as described here: https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html

Deployment

  • train_snoring_model.ipynb has training script
  • To edit training model script see 'Snoring-Detection/deployment/tensorflow1/tensorflow/examples/speech_commands'
  • rename snoring dataset directories to 'snoring' and 'no_snoring'
  • change data pathways
  • include "_background_noise_" directory from wake words dataset in dataset directory
  • make sure to check all paths to make sure they are specific to your setup
  • the code in the train_snoring_model.ipynb file calls tensorflow1 (a copy of tensorflow). I only pushed the modified code in tensorflow1/tensorflow/examples/speech_commands. You will have to add in the other tensorflow folders to make the code run.

snoring-detection's People

Contributors

yaojiayu0826 avatar kellywzhang avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.