Bemused Client

Streaming TFLite-based keyword detector.

New models can be trained with https://github.com/google-research/google-research/tree/master/kws_streaming

Dependencies

  • Python 3.7 or higher
  • TensorFlow 2.4
  • sounddevice

Model Format

Models should be placed in a directory with:

  • model.tflite - the streaming TFLite model file
  • config.json - JSON configuration for bemused

Model Configuration

Below is a sample configuration for a "raspberry" keyword:

{
    "sample_rate": 16000,
    "num_channels": 1,
    "block_ms": 20,
    "keyword_labels": [
        "raspberry"
    ],
    "all_labels": [
        "_silence_",
        "_unknown_",
        "raspberry",
        "_not_kw_"
    ]
}
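
As a rough illustration of how these fields relate, the sketch below parses the same sample and derives two values from it. The helper names `block_samples` and `keyword_indices` are hypothetical and not part of bemused_client. It also assumes that `block_ms` is the length of one audio block and that `all_labels` is listed in the model's output order; the README does not state either.

```python
import json

# Same fields as the sample configuration above.
CONFIG_TEXT = """
{
    "sample_rate": 16000,
    "num_channels": 1,
    "block_ms": 20,
    "keyword_labels": ["raspberry"],
    "all_labels": ["_silence_", "_unknown_", "raspberry", "_not_kw_"]
}
"""


def block_samples(config):
    """Samples per channel in one block (assumed meaning of block_ms)."""
    return config["sample_rate"] * config["block_ms"] // 1000


def keyword_indices(config):
    """Positions of the keyword labels inside all_labels.

    Assumption: all_labels follows the order of the model's output classes.
    """
    return [config["all_labels"].index(label)
            for label in config["keyword_labels"]]


config = json.loads(CONFIG_TEXT)
print(block_samples(config))    # 16000 * 20 // 1000 = 320
print(keyword_indices(config))  # [2]
```

Under those assumptions, a 20 ms block at 16 kHz is 320 samples per channel, and "raspberry" sits at index 2 of the label list.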

Usage

$ python3 -m bemused_client /path/to/model/directory

See bemused_client --help for more options

bemused-client's Issues

I am bemused

I have just been thinking: the window is 40 ms and the stride is 20 ms, so why am I feeding it with len=320?

Or is it that the sample should be the stride length and not the window?
I thought it was the window, as that is the frame length, so yeah, bemused?
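
For what it's worth, the numbers can be checked with a small sketch. At 16 kHz a 40 ms window is 640 samples and a 20 ms stride is 320, so len=320 matches the stride. One possible explanation, which is an assumption about how a streaming model behaves and not something confirmed here, is that the model keeps the previous samples as internal state, so each call only needs the new stride's worth of audio. The `StreamingFramer` class below is hypothetical and only imitates that idea with a plain buffer.

```python
from collections import deque

SAMPLE_RATE = 16000
WINDOW_MS = 40
STRIDE_MS = 20

window_len = SAMPLE_RATE * WINDOW_MS // 1000  # 640 samples
stride_len = SAMPLE_RATE * STRIDE_MS // 1000  # 320 samples


class StreamingFramer:
    """Hypothetical stand-in for a model's internal state buffer.

    It holds the most recent window_len samples. Each push supplies
    stride_len new samples and returns the full, overlapping window.
    """

    def __init__(self, window_len, stride_len):
        self.stride_len = stride_len
        # Start with silence; maxlen makes the deque drop the oldest samples.
        self.buffer = deque([0.0] * window_len, maxlen=window_len)

    def push(self, block):
        if len(block) != self.stride_len:
            raise ValueError("expected exactly one stride of samples")
        self.buffer.extend(block)
        return list(self.buffer)


framer = StreamingFramer(window_len, stride_len)
audio = [float(i) for i in range(3 * stride_len)]  # dummy ramp signal
frames = [framer.push(audio[i:i + stride_len])
          for i in range(0, len(audio), stride_len)]

# Consecutive windows overlap by window_len - stride_len samples.
print(len(frames[1]), frames[1][0], frames[2][0])
```

Each call receives 320 samples but the framer still sees a full 640-sample window, with half of it carried over from the previous call.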

I am thinking the MFCC load is high. I don't know if you saw my post on the forum, where I just created a single with sfeat and then ran it every 20 ms via inference to measure the load, which is really low.
Sonopy is baloney as an MFCC generator: it's unnaturally fast for a Python script, and if you change the parameters it gets even worse, being able to beat the likes of librosa by 400%.
I never did work it out, as the math of MFCC is above my head, but it just makes me feel it isn't solid enough to be used.

With the Google stuff, is it me or is that also broken, as we are feeding stride lengths into the model rather than windows?

Also, if you do use the preprocess option:

parser.add_argument(
    '--preprocess',
    type=str,
    default='raw',
    help='Supports raw, mfcc, micro as input features for neural net'
    'raw - model is built end to end '
    'mfcc - model divided into mfcc feature extractor and neural net.'
    'micro - model divided into micro feature extractor and neural net.'
    'if mfcc/micro is selected user has to manage speech feature extractor '
    'and feed extracted features into neural net on device.'
)

The input still expects float32, which is sort of strange for MFCC.
