linto-ai / webvoicesdk

Building blocks for voice-enabled applications in the browser

License: GNU Affero General Public License v3.0

JavaScript 91.74% HTML 8.26%

speech-recognition javascript machine-learning tensorflow speech-to-text vocal-assistant wake-word-detection wakeword

webvoicesdk's Introduction

WebVoice SDK

WebVoice SDK is a JavaScript library that provides lightweight and fairly well-optimized building blocks for always-listening, voice-enabled applications right in the browser. This library is the main technology behind LinTO's Web Client, as it deals with everything related to the user's voice input.

Functionalities

Hardware Microphone Handler : hook to hardware, record, playback, get file from buffer as wav... very handy
Downsampler : re-inject acquired audio at any given samplerate / frame size
Speech Preemphaser : Prepare acquired audio for machine learning tasks
Voice activity detection : Detect when someone's speaking (even at very low signal-to-noise ratio)
Feature extraction : Pure JavaScript MFCC (Mel-Frequency Cepstral Coefficients) implementation
Wake word / hot word / trigger word : Immediately trigger tasks whenever an associated chosen word has been pronounced
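
To make the "Speech Preemphaser" step more concrete, here is a minimal sketch of a first-order pre-emphasis filter, y[n] = x[n] - alpha * x[n-1], which is the usual way to boost high frequencies before feature extraction. The function name `preEmphasize` and the default `alpha = 0.97` are assumptions made for illustration; this is not the library's own implementation or API.

```javascript
// Hypothetical helper, not part of the WebVoice SDK API.
// Applies a first-order pre-emphasis filter: y[n] = x[n] - alpha * x[n - 1].
// `alpha` is an assumed, commonly used coefficient; tune it for your data.
function preEmphasize(samples, alpha = 0.97) {
  const out = new Float32Array(samples.length);
  if (samples.length === 0) return out;
  // The first sample has no predecessor, so it is copied unchanged.
  out[0] = samples[0];
  for (let n = 1; n < samples.length; n++) {
    out[n] = samples[n] - alpha * samples[n - 1];
  }
  return out;
}
```

A frame of microphone samples (a `Float32Array`) goes in and a filtered copy of the same length comes out, so the input buffer is left untouched.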

Online demo

You can find an online demo of the library on this static webpage : https://webvoicesdk.netlify.app/

It showcases the entire pipeline : microphone -> voice-activity-detection -> downsampling -> speech-preemphasis -> features-extraction -> wake-word-inference

Note : To start the tool, click the start button and allow the browser to access the default audio input. The Voice Activity Detection "LED" blinks while someone is speaking. Something magic will happen if someone says "LinTO" (something like "LeanToh" for English speakers, as the model was trained on our French dataset).

Note : You can select the model you want to use. The library comes prepackaged with two wake word models (one model for LinTO and a triple-headed model that triggers on LinTO, Snips or Firefox).

Highlights

Fully multithreaded JavaScript implementation using Workers for real-time processing on any machine
WebAssembly optimisations whenever possible
State-of-the-art Recurrent Neural Network that uses the WebAssembly portable runtime for voice activity detection. This is a modern and efficient alternative to the popular Hark voice activity detection tool
Supports single inline script that can get deployed in any webpage without mandatory bundlers
The built library embeds everything (wasm files, TensorFlow.js models for wake words, workers...) into a single static JavaScript file
The wake word engine relies on TensorFlow.js and the WebAssembly portable runtime to run inference against single or multiple wake word models with a lightweight footprint and very efficient performance
Portable machine-learning models : Use the same wake word models on embedded devices, mobile phones, desktop computers, web pages. See : LinTO Hotword Model Generator and Create your custom wake-word
Fully offline speech recognition in the browser, with no server behind it; all the magic happens in your webpage itself

Usage

Further documentation and information are in progress. For the moment, you can still build and test the library yourself :

npm run test

Or import it in your browser :

<script src="https://cdn.jsdelivr.net/gh/linto-ai/webVoiceSDK@master/dist/webVoiceSDK-linto.min.js"></script>

Copyright notice

This library includes modified bits from :

Meyda MIT Licence
FFTjs MIT Licence
node-dct MIT Licence
Jitsi Apache License 2.0

webvoicesdk's Issues

Hotword detection is not working with a fresh build

Description

The hotword service worker throws an error message ExitStatus {name: "ExitStatus", message: "Program terminated with exit(1)", status: 1} after a fresh build.

How to reproduce

Empty the node_modules directory
Run npm install
Run npm test
Go to http://localhost:1234, open the console and run the hotword pipeline

I'm afraid my knowledge and skills don't allow me to investigate further ...

Calling mic.stop() is not working once hotword has been spotted

Hi Damien,

I'm facing a rather weird issue :
When calling mic.stop() in the first place, everything works as expected, meaning the following icon disappears. But once the 'LinTO' hotword has been spotted, calling mic.stop() has no effect.

So I tried modifying WebVoiceSDK, since the track we got had an "ended" status :

this.stream.getTracks().forEach((track) => {
  if (track.kind === 'audio' && typeof track.stop === 'function') track.stop()
})

But still the mic icon persists.

Do you have any idea how to really stop listening to the user ?

Thank you
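
As a starting point for investigating this, the sketch below stops every track of a stream and then closes the audio context that consumed it, using only standard Web APIs (`MediaStream.getTracks()`, `MediaStreamTrack.stop()`, `AudioContext.close()`). The helper name `releaseMicrophone` and its two parameters are hypothetical and not part of WebVoiceSDK. It also assumes the persistent icon comes from capture resources that are still alive somewhere, for example a second or cloned stream held by another node of the pipeline, which this issue does not confirm.

```javascript
// Hypothetical helper: `stream` is assumed to be the MediaStream returned by
// getUserMedia and `audioContext` the AudioContext that consumes it.
function releaseMicrophone(stream, audioContext) {
  // Stop every track. Calling stop() on an already ended track is harmless.
  for (const track of stream.getTracks()) {
    track.stop();
  }
  // Close the context so the audio graph releases what it still holds.
  if (audioContext && audioContext.state !== "closed") {
    return audioContext.close(); // resolves once the context is closed
  }
  return Promise.resolve();
}
```

If the icon still persists after this, it may be worth checking whether getUserMedia was called more than once or whether a track was cloned, since each of those streams would need the same treatment.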

Recorder is not working on safari

See #1

How to use it in React JS with a custom wake word?

Hi, I am looking for a good hotword detection library purely in Node JS, and I found WebVoiceSDK. Its performance in the demo is really good.
Now can you tell me how to use it in React JS with a custom wake word?

Is there any reference? If so, this library has very good potential to scale up.

parcelRequire is not defined in Electron build

This happens when using Electron to build a desktop app,
after building it into an exe.

When served in a browser, it works fine.

How can I solve this?

Needs an AGPL warning at the top of the documentation

Your library is awesome, it solved all my problems, and I will never be able to use it due to your license choice.

I appreciate it's your code, and you can do what you like with it, but doesn't it seem a waste that another project doing exactly the same thing will need to be created and maintained by someone, just to remove the license restriction that AGPL imposes? If I include your library in my larger web application, releasing all the source code to our competitors is simply not going to fly.

To avoid wasting other people's time spent integrating this, please include a large warning at the top of the readme, indicating it's basically not suitable for most commercial use. If you want to be super-helpful, pointing us to competing projects with different licenses would be rather nice.

It's such a shame, as it works really well.

Dependency update broke the library

After a fresh build, the WASM file doesn't work. A month or two ago everything was working as expected.
Steps: npm install, then npm test.
Errors in Firefox:

Module.asm.c.apply is not a function

this.wasmInterface is undefined

Errors in Chrome:

TypeError: ___wasm_call_ctors.apply is not a function

Uncaught TypeError: Cannot read properties of undefined (reading 'HEAPF32')
    at Rnnoise.copyPCMSampleToWasmBuffer

Also, is it possible to prevent such issues by either adding a package-lock file or specifying the version?

Recorder on mobile produces a jerky wav output

Description

When using the Recorder on a mobile, the generated wav is damaged.
Mic options :

{
  frameSize: 4096,
  constraints: {
    echoCancellation: true,
    autoGainControl: true,
    noiseSuppression: true
  }
}

Attached to this issue you'll find a zip (GitHub does not accept audio files as attachments) containing 3 audio files :

desktop.wav recorded on a desktop with WebVoiceSDK.Recorder
mobile.wav recorded on an iPhoneX with WebVoiceSDK.Recorder
usermedia-mediarecorder.webm recorded with native MediaRecorder and navigator.mediaDevices.getUserMedia({ audio: true })

Archive.zip

Do you have any ideas on how to solve this problem ?
Maybe once I am able to start the project, I can add "Record mic" and "Stop record mic" buttons to the test page and consequently to https://webvoicesdk.netlify.app/
That way we would be able to test the recorder across many devices.

Thank you

EDIT: After further investigation, the output wav audio is cleaner if I disable other intensive tasks (gesture recognition) in my web app. That means transcoding the AudioContext stream to WAV is affected when available resources are low. Maybe we could add options to the Recorder constructor to let developers opt for a more native recording system.
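
One way to reduce work while capturing, sketched below under stated assumptions, is to only store the raw Float32 frames during recording and encode the WAV file once, after recording stops. The function `encodeWav` is a hypothetical helper, not the WebVoiceSDK.Recorder implementation. It assumes mono input with samples in the range [-1, 1] and writes a standard 44-byte RIFF header followed by 16-bit PCM data.

```javascript
// Hypothetical helper, not the WebVoiceSDK.Recorder implementation.
// Encodes mono Float32 samples in [-1, 1] as a 16-bit PCM WAV ArrayBuffer.
function encodeWav(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // one channel (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);
  for (let i = 0; i < samples.length; i++) {
    // Clamp, then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

In a browser, the returned buffer could be wrapped as `new Blob([buffer], { type: "audio/wav" })`. Whether deferring the encoding removes the jerkiness reported here is untested; frames dropped before they reach the recorder would not be recovered by it.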
