bemused-client's Introduction

Bemused Client

Streaming TFLite-based keyword detector.

New models can be trained with https://github.com/google-research/google-research/tree/master/kws_streaming

Dependencies

Python 3.7 or higher
TensorFlow 2.4
sounddevice

Model Format

Models should be placed in a directory with:

model.tflite - the streaming TFLite model file
config.json - JSON configuration for bemused
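
A minimal sketch of checking that a model directory has this layout before trying to load it. The helper name `check_model_dir` is hypothetical and is not part of bemused-client; only the two file names come from the list above.

```python
from pathlib import Path

# File names taken from the model format list above.
REQUIRED_FILES = ("model.tflite", "config.json")


def check_model_dir(model_dir):
    """Return the required file names that are missing from model_dir.

    Hypothetical helper for illustration; an empty list means the layout looks right.
    """
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `check_model_dir("/path/to/model/directory")` returns an empty list when both files are present.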

Model Configuration

Below is a sample configuration for a "raspberry" keyword:

{
    "sample_rate": 16000,
    "num_channels": 1,
    "block_ms": 20,
    "keyword_labels": [
        "raspberry"
    ],
    "all_labels": [
        "_silence_",
        "_unknown_",
        "raspberry",
        "_not_kw_"
    ]
}
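
As a rough sketch of how these fields might be used, the snippet below parses the sample configuration and derives the number of samples in one audio block. It assumes that `block_ms` is the duration of each audio block and that the order of `all_labels` matches the order of the model's outputs; neither is confirmed by the README, and the function names are hypothetical.

```python
import json

SAMPLE_CONFIG = """
{
    "sample_rate": 16000,
    "num_channels": 1,
    "block_ms": 20,
    "keyword_labels": ["raspberry"],
    "all_labels": ["_silence_", "_unknown_", "raspberry", "_not_kw_"]
}
"""


def samples_per_block(config):
    """Samples per channel in one block, assuming block_ms is the block duration."""
    return config["sample_rate"] * config["block_ms"] // 1000


def keyword_indices(config):
    """Map each keyword label to its position in all_labels.

    Assumption: all_labels is listed in the same order as the model outputs.
    """
    return {label: config["all_labels"].index(label) for label in config["keyword_labels"]}


config = json.loads(SAMPLE_CONFIG)
print(samples_per_block(config))  # 320
print(keyword_indices(config))    # {'raspberry': 2}
```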

Usage

$ python3 -m bemused_client /path/to/model/directory

See bemused_client --help for more options

bemused-client's Issues

I am bemused

I have just been thinking: the window is 40 ms and the stride is 20 ms, so why am I feeding it with len=320? At a 16 kHz sample rate, 320 samples is 20 ms (the stride), while a 40 ms window would be 640 samples.

Or is it that the sample should be the stride length and not the window?
I thought it was the window, as that is the frame length, so yeah, bemused.
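
One possible reading, offered as an assumption rather than a statement about how kws_streaming is implemented: a streaming model can keep the overlapping part of the window as internal state, so each call only needs the new stride of samples. The sketch below uses a hypothetical `StreamingFramer` class to show 320-sample blocks producing 640-sample windows at 16 kHz.

```python
from collections import deque

SAMPLE_RATE = 16000
WINDOW_MS = 40
STRIDE_MS = 20

WINDOW = SAMPLE_RATE * WINDOW_MS // 1000  # 640 samples
STRIDE = SAMPLE_RATE * STRIDE_MS // 1000  # 320 samples


class StreamingFramer:
    """Hypothetical helper: turns stride-sized blocks into overlapping windows."""

    def __init__(self, window=WINDOW, stride=STRIDE):
        self.window = window
        self.stride = stride
        # Ring buffer of the most recent `window` samples, zero-padded at start.
        self.buffer = deque([0.0] * window, maxlen=window)

    def push(self, block):
        """Accept exactly one stride of new samples and return the current window."""
        if len(block) != self.stride:
            raise ValueError(f"expected {self.stride} samples, got {len(block)}")
        self.buffer.extend(block)  # the oldest samples fall off the left side
        return list(self.buffer)


framer = StreamingFramer()
first = framer.push([1.0] * STRIDE)
second = framer.push([2.0] * STRIDE)
print(len(second))            # 640
print(second[0], second[-1])  # 1.0 2.0
```

Under this assumption the caller only ever supplies stride-length input, and the window is reassembled from the retained state.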

I think the MFCC load is high. I don't know if you saw my post on the forum, where I just created a single with sfeat and then ran inference every 20 ms to measure the load, which is really low.
Sonopy is baloney as an MFCC generator: it is unnaturally fast for a Python script, and if you change the parameters it gets even worse, beating the likes of librosa by 400%.
I never did work it out, as the math of MFCC is above my head, but it makes me feel it isn't solid enough to be used.

With the Google stuff, is it me, or is that also broken, as we are feeding stride lengths into the model rather than windows?

Also, if you do use

    parser.add_argument(
        '--preprocess',
        type=str,
        default='raw',
        help='Supports raw, mfcc, micro as input features for neural net'
        'raw - model is built end to end '
        'mfcc - model divided into mfcc feature extractor and neural net.'
        'micro - model divided into micro feature extractor and neural net.'
        'if mfcc/micro is selected user has to manage speech feature extractor '
        'and feed extracted features into neural net on device.'
    )

The model still expects float32 input, which is sort of strange for MFCC.
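
A minimal sketch for checking what a converted model actually expects, based on the `get_input_details()` method of `tf.lite.Interpreter`. The helper name `describe_inputs` is hypothetical; it accepts any object exposing that method, so it can be tried without a model file.

```python
def describe_inputs(interpreter):
    """Return (name, shape, dtype name) for each input tensor of a TFLite interpreter."""
    described = []
    for detail in interpreter.get_input_details():
        dtype = detail["dtype"]
        # dtype is usually a NumPy type such as numpy.float32; fall back to str().
        dtype_name = getattr(dtype, "__name__", str(dtype))
        shape = tuple(int(dim) for dim in detail["shape"])
        described.append((detail["name"], shape, dtype_name))
    return described


# Usage with a real model (requires TensorFlow; the path is a placeholder):
#   import tensorflow as tf
#   interpreter = tf.lite.Interpreter(model_path="/path/to/model/directory/model.tflite")
#   interpreter.allocate_tensors()
#   print(describe_inputs(interpreter))
```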
