localcroft's Introduction

DEPRECATED.

See elsewhere for more reasons why, but this is no longer a useful repo for mycroft. Neon and/or OVOS are the places you should be looking into, both of which are well beyond the things discussed here. I'll keep this around for archival/amusement purposes, but you're better off looking for the ovos backend and such instead.

DEPRECATED.

Local mycroft things

This includes several file changes to help run a local instance of mycroft, and some how-i-did-it pages for running local resources.

Building my Precise custom wake word model

More on that here.

Mycroft client DeepSpeech STT adjustments

These changes try to improve local DeepSpeech audio handling. First, remove the start_listening noise*. Second, pad the wav file with 0.1 seconds of silence at the beginning and the end.

Uses pydub, numpy, scipy, and rnnoise-python. On picroft, sudo apt install ffmpeg; sudo pip3 install pydub (or whatever the equivalent is for your env) usually gets these installed.

The file itself replaces the one in mycroft-core/mycroft/stt/; restart services afterwards. Note that this file defaults to using rnnoise, which can add a significant amount of time to processing audio files. If you're capable of using this repo, you can figure out how to comment that line out if need be.

* I created a 0.05s silent wav file for my start_listening.wav.
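The padding step described above can be sketched as follows. This is a minimal illustration using only the standard-library wave module, not the pydub-based code in the repo's file; the function name pad_wav_with_silence is hypothetical.

```python
import io
import wave


def pad_wav_with_silence(wav_bytes: bytes, pad_seconds: float = 0.1) -> bytes:
    """Return WAV data with pad_seconds of silence added before and after.

    Hypothetical helper for illustration; the repo's own file uses pydub.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    pad_frames = int(round(params.framerate * pad_seconds))
    # Signed PCM (16-bit and wider) is silent at zero; 8-bit WAV is unsigned,
    # so its silence value is 0x80.
    silence_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    silence = silence_byte * (pad_frames * params.nchannels * params.sampwidth)

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(params.framerate)
        dst.writeframes(silence + frames + silence)
    return out.getvalue()
```

Working on bytes rather than file paths keeps the sketch easy to drop in wherever the recorded audio is already in memory.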

non-mycroft Deepspeech stuff

here

Moz/wav TTS connector

@domcross got the mozilla tts bits into core, so just use that. It should in theory work with most any URL submission that takes the text as url parameters and returns a wav file.

See the TTS config bits below for how to configure in your local conf.

Local Wikipedia

See here for more on that.

precise uploads

A recent PR has also added local saving of wake words! Use that instead if you prefer it to uploading.

Run the uploader.py in a screen session on a friendly host. Requires flask. May need to edit to adjust listen IP or save directory. This makes use of the listener.url config.

The Selene backend and the updated personal server should handle this more directly if you go that route.

config

bits I use to make things work locally...

  "listener": {
    "wake_word": "yourwordhere",
    "wake_word_upload": {
      "disable": false,
      "url": "http://127.0.0.1:4000/precise/upload"
    }
  },
  "hotwords": {
    "yourwordhere": {
        "module": "precise",
        "phonemes": "U R FO NE M Z HE R E",
        "threshold": "1e-30",
        "local_model_file": "/home/pi/.mycroft/precise/yourwordhere.pb"
        }
    },

This is used to set your wake word, whether to upload the detected wakewords to the upload server, and which wake word engine and options to use. Pocketsphinx uses the phonemes.
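A quick sanity check of these settings might look like the sketch below. It assumes the config has already been parsed into a dict (plain JSON, no comments) and that hotwords is a top-level key alongside listener; the function name check_wake_word_config is hypothetical and not part of mycroft.

```python
import json


def check_wake_word_config(conf: dict) -> list:
    """Return a list of human-readable problems; empty means the basics look fine.

    Hypothetical helper, not a mycroft API.
    """
    problems = []
    listener = conf.get("listener", {})

    # The wake word named in "listener" needs a matching "hotwords" entry.
    wake_word = listener.get("wake_word")
    if not wake_word:
        problems.append("listener.wake_word is not set")
    elif wake_word not in conf.get("hotwords", {}):
        problems.append(f"no hotwords entry for {wake_word!r}")

    # Uploading only makes sense if an upload URL is configured.
    upload = listener.get("wake_word_upload", {})
    if not upload.get("disable", True) and not upload.get("url"):
        problems.append("wake_word_upload is enabled but has no url")
    return problems


if __name__ == "__main__":
    sample = json.loads(
        '{"listener": {"wake_word": "yourwordhere"},'
        ' "hotwords": {"yourwordhere": {"module": "precise"}}}'
    )
    print(check_wake_word_config(sample))
```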

  "stt": {
    "module": "deepspeech_server",
    "deepspeech_server": {
      "uri": "http://127.0.0.1:2000/stt"
    }
  },

The default STT file lists the other choices that are available; this is just the one I end up using the most.

  "tts": {
    "module": "mimic2",
    "mimic2": {
      "lang": "en-us",
      "url": "http://127.0.0.1:3000"
    }
  },

TTS server configuration. The URL might be tricky if your endpoint requires odd page names, but this should work with the mimic2 connector I have here for anything that returns a .wav file.
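For an endpoint that takes the text as a URL parameter, the request URL could be built roughly like this. It is only a sketch: the parameter name text, the /api/tts path in the usage line, and the function name build_tts_request_url are assumptions for illustration, not something the connector is known to use.

```python
from urllib.parse import urlencode


def build_tts_request_url(base_url: str, text: str, param: str = "text") -> str:
    """Append the sentence to base_url as a URL-encoded query parameter.

    Hypothetical helper; the parameter name is an assumption.
    """
    # Reuse an existing query string if the endpoint URL already has one.
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{urlencode({param: text})}"


if __name__ == "__main__":
    # The /api/tts path is a made-up example of an "odd" endpoint name.
    print(build_tts_request_url("http://127.0.0.1:3000/api/tts", "hello world"))
```

Fetching that URL should then return the wav data, provided the server behaves as described above.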

localcroft's People

Contributors

el-tocino, domcross

localcroft's Issues

Create config-level entry to enable filtering items.

Within the mycroft conf, you should be able to add a local config entry to enable or disable the denoise/highpass/lowpass/normalize/other? filters. Right now they're all just on by default in the local deepspeech server config. They could be used with other engines as well, if one were so motivated to try*, though I have not tested those adjustments with them. Performance considerations might also change what people want to have filtered. The high/low pass frequencies would also be nice things to expose as config settings.

* This is left as an exercise for the reader.
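One possible shape for such an entry, purely as a sketch: a "filters" block under the STT module's config whose values override all-on defaults. The key names, the default cutoff frequencies, and the function resolve_filter_settings are all hypothetical; nothing like this exists in the repo or in mycroft-core.

```python
# Hypothetical defaults: every filter on, matching the current behaviour.
# The cutoff frequencies are placeholder values, not the ones the repo uses.
DEFAULT_FILTERS = {
    "denoise": True,
    "highpass": True,
    "lowpass": True,
    "normalize": True,
    "highpass_hz": 200,
    "lowpass_hz": 3000,
}


def resolve_filter_settings(stt_module_conf: dict) -> dict:
    """Merge an optional "filters" block over the defaults."""
    overrides = stt_module_conf.get("filters", {})
    unknown = set(overrides) - set(DEFAULT_FILTERS)
    if unknown:
        # Fail loudly on typos rather than silently ignoring them.
        raise ValueError(f"unknown filter settings: {sorted(unknown)}")
    settings = dict(DEFAULT_FILTERS)
    settings.update(overrides)
    return settings


if __name__ == "__main__":
    conf = {"uri": "http://127.0.0.1:2000/stt", "filters": {"denoise": False}}
    print(resolve_filter_settings(conf))
```

Keeping the defaults all-on means existing configs without a "filters" block would behave exactly as they do today.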

[precise] Wakeword improve activation speed

While I started writing a rather chunky issue over on the mycroft-precise repo, I re-read your documentation and thought I'd better ask here/you directly.

What I'm currently struggling with is activation speed. I've been very careful with my training data and ensured every clip starts immediately with the wake-word followed by 1 second of silence (silence meaning quiet room -> me not speaking).

Using my dataset combined with the following for not-wake-words:

reaches a val_acc 1 in about 120 epochs (super quick).
While it activates quite consistently, it does so rather slowly, as it requires the trailing 1 second to pass as well.

If I now duplicate the data-set and strip 500 ms from the end of every single wake-word clip, I'm suddenly unable to reach a val_acc higher than 0.5.
Stripping 800-1000 ms has me sitting on val_acc 0.
Training for more epochs (I tried up to 6000) did not help.

Is this to be expected? Is there a way to work around this?
Any help would be much appreciated and thanks for your current write-up. It already helped a lot :)

Question about wav content

Hi el-tocino,

I'm struggling a bit to find a German dataset to speed up the process of finding fake words.

There are some sets, but they are almost exclusively spoken sentences (or half-sentences). Some are short, but I'm not certain that this even qualifies as training material. Is precise-train-incremental restricted to spoken words?

Question about local wikipedia with ZIM and Kiwix-server

Hi El Tocino

Thank you very much for your guide on all things local mycroft.
:)

Re. Wikipedia, I followed the easy method with Invader ZIM.
All went well and I can access my local wikipedia via web browser.

I am having issues getting mycroft to access this wiki though.
I edited the file wikipedia.py and changed the API_URL to the local IP (http://192.168.1.43:7998/) of the kiwix-serve.
When querying the mycroft wikipedia skill, I get the error WikipediaSkill | Error: Expecting value: line 1 column 1 (char 0).

Would you mind please sharing your config of the API_URL in wikipedia.py?

Many thanks.