This project forked from zzmp/juliusjs

A speech recognition library for the web

Home Page: https://zzmp.github.io/juliusjs

License: Other

Perl 0.20% JavaScript 92.86% Shell 0.23% HTML 0.04% C 6.68%

JuliusJS

A speech recognition library for the web

Try the live demo.

JuliusJS is an opinionated port of Julius to JavaScript.
It actively listens to the user and passes a transcription of what they say to a callback.

// bootstrap JuliusJS
var julius = new Julius();

julius.onrecognition = function(sentence) {
    console.log(sentence);
};

// say "Hello, world!"
// console logs: `> HELLO WORLD`

Features:
  • Real-time transcription
  • Use the provided grammar, or write your own
  • 100% JavaScript implementation
  • All recognition is done in-browser through a Worker
  • Familiar event-inspired API
  • No external server calls

Quickstart

Using Express 4.0
  1. Grab the latest version with Bower
    • bower install juliusjs --save
  2. Include julius.js in your HTML
    • <script src="julius.js"></script>
  3. Make the scripts available to the client through your server
var express = require('express'),
    app     = express();

app.use(express.static('path/to/dist'));
  4. In your main script, bootstrap JuliusJS and register an event listener for recognition events
// bootstrap JuliusJS
var julius = new Julius();

// register listener
julius.onrecognition = function(sentence, score) {
    // ...
    console.log(sentence);
};
  • Your site now has real-time speech recognition baked in!
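
A recognition listener like the one above is often used to trigger actions. The following is a minimal sketch, not part of JuliusJS: createCommandRouter is a hypothetical helper, and it assumes sentences arrive as upper-case strings such as "HELLO WORLD", as in the console output shown earlier. A plain object stands in for the Julius instance so the sketch also runs outside a browser, and the score values are placeholders.

```javascript
// Hypothetical helper (not part of JuliusJS): maps recognized sentences
// to handler functions and falls back to onUnknown for anything else.
function createCommandRouter(commands, onUnknown) {
    return function (sentence, score) {
        // Normalize whitespace and case so lookups are predictable
        var key = String(sentence).trim().replace(/\s+/g, ' ').toUpperCase();
        var handler = Object.prototype.hasOwnProperty.call(commands, key)
            ? commands[key]
            : onUnknown;
        return handler ? handler(key, score) : undefined;
    };
}

var log = [];
var julius = {}; // stand-in; in a browser this would be `new Julius()`

julius.onrecognition = createCommandRouter({
    'HELLO WORLD': function () { log.push('greeted'); }
}, function (sentence) { log.push('unknown: ' + sentence); });

// Simulate two recognition events (placeholder scores)
julius.onrecognition('HELLO WORLD', 0.9);
julius.onrecognition('GOODBYE', 0.4);
console.log(log); // log is now ['greeted', 'unknown: GOODBYE']
```

Keeping the routing table separate from the listener makes it easy to keep the handlers in step with the phrases your grammar allows.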

Configure your own recognition grammar

In order for JuliusJS to use it, your grammar must follow the Julius grammar specification. The site includes a tutorial on writing grammars.
By default, phonemes are defined in voxforge/hmmdefs, though you might find other sites more useful as reference.

  • Building your own grammar requires the mkdfa.pl script and associated binaries, distributed with Julius.
    • On Mac OS X
      • Use ./bin/mkdfa.pl, included with this repo
    • On other OS
      • Run emscript.sh to populate bin with the necessary files
  1. Write a yourGrammar.voca file with words to be recognized
    • The .voca file defines "word candidates" and their pronunciations.
  2. Write a yourGrammar.grammar file with phrases composed of those words
    • The .grammar file defines "category-level syntax, i.e. allowed connection of words by their category name."
  3. Compile the grammar using ./bin/mkdfa.pl yourGrammar
    • The .voca and .grammar files must share the same name prefix
    • This will generate yourGrammar.dfa and yourGrammar.dict
  4. Give the new .dfa and .dict files to the Julius constructor
// when bootstrapping JuliusJS
var julius = new Julius('path/to/dfa', 'path/to/dict');
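
Because the compiled .dfa and .dict files share the prefix passed to mkdfa.pl, both constructor arguments can be derived from one string. The sketch below illustrates this; grammarPaths is a hypothetical helper, and the grammars/ directory is an assumed location, not something JuliusJS requires.

```javascript
// Hypothetical helper (not part of JuliusJS): derives both compiled
// grammar paths from the prefix that was passed to mkdfa.pl.
function grammarPaths(prefix) {
    if (typeof prefix !== 'string' || prefix.length === 0) {
        throw new TypeError('prefix must be a non-empty string');
    }
    // Drop a grammar-related extension if one was passed by mistake
    var base = prefix.replace(/\.(voca|grammar|dfa|dict)$/, '');
    return { dfa: base + '.dfa', dict: base + '.dict' };
}

// 'grammars/' is an assumed directory served by your web server
var paths = grammarPaths('grammars/yourGrammar');
console.log(paths.dfa, paths.dict);

// In a browser, after including julius.js:
// var julius = new Julius(paths.dfa, paths.dict);
```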

Advanced Use

Configuring the engine

The Julius constructor takes three arguments which can be used to tune the engine:

new Julius('path/to/dfa', 'path/to/dict', options)

Both 'path/to/dfa' and 'path/to/dict' must be set to use a custom grammar.

'path/to/dfa'
  • path to a valid .dfa file, generated as described above
  • if left null, the default grammar will be used
'path/to/dict'
  • path to a valid .dict file, generated as described above
  • if left null, the default grammar will be used
options
  • options.verbose - if true, JuliusJS will log to the console
  • options.stripSilence - if true, silence phonemes will not be included in callbacks
    • true by default
  • options.transfer - if true, captured microphone input will be piped to your speakers
    • this is mostly useful for debugging
  • options.*
    • Julius supports a wide range of options. Most of these are made available here by specifying the flag name as a key. For example, options.zc = 30 will lower the zero-crossing threshold to 30.
      Some of these options will break JuliusJS, so use with caution.
    • A reference to available options can be found in the JuliusBook.
    • Currently, the only supported hidden Markov model is from voxforge. The h and hlist options are unsupported.
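
The rules above can be checked before the engine is constructed. This is a rough sketch under stated assumptions: juliusArgs is a hypothetical helper, not part of JuliusJS. It rejects a lone grammar path, since both must be set together, and drops the unsupported h and hlist keys instead of passing them on.

```javascript
// Hypothetical helper (not part of JuliusJS): validates and assembles
// the three constructor arguments described above.
function juliusArgs(dfa, dict, options) {
    // A custom grammar needs both files; null for both means the default
    if ((dfa == null) !== (dict == null)) {
        throw new Error('set both the .dfa and .dict paths, or neither');
    }
    var opts = {};
    Object.keys(options || {}).forEach(function (key) {
        // h and hlist are documented as unsupported, so skip them
        if (key === 'h' || key === 'hlist') { return; }
        opts[key] = options[key];
    });
    return [dfa == null ? null : dfa, dict == null ? null : dict, opts];
}

// Default grammar, console logging on, zero-crossing threshold of 30
var args = juliusArgs(null, null, { verbose: true, zc: 30, hlist: 'ignored' });
console.log(args);

// In a browser, after including julius.js:
// var julius = new Julius(args[0], args[1], args[2]);
```

Dropping the unsupported keys silently is one design choice; throwing an error instead would surface mistakes earlier.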

Examples

Voice Command

Coming soon...

Keyword Spotting (e.g., API integration)

Coming soon...

In the wild

If you use JuliusJS, let me know and I'll add your project to this list (or open a pull request yourself).

  1. Coming soon...

Motivation

  • Implement speech recognition in...
    • 100% JavaScript - no external dependencies
    • A familiar and easy-to-use context
      • Follow standard eventing patterns (e.g., onrecognition)
  • As far as accessibility, allow...
    • Out-of-the-box use
      • Minimal barrier to use
        • This means committed sample files (e.g. committed emscripted library)
      • Minimal configuration
        • Real-time (opinionated) use only
          • Hide mfcc/wav/rawfile configurations
  • Useful examples (not so much motivation, as my motivating goals)
    • Voice command
    • Keyword spotting

Future goals

  • Better sample recognition grammar (improves out-of-the-box usability)
  • Examples

Developers

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Build from source

You'll need Emscripten, the LLVM-to-JavaScript compiler, to build this from the C source. Once you have that, run ./emscript.sh. If you are missing other tools, the script will let you know.

As emscript.sh reloads and recompiles static libraries, ./reemscript.sh is available once you've already run emscript.sh. reemscript.sh will only recompile to JavaScript based on your latest changes. This can also be run with npm make.

Additionally, tests will be set up to run using npm test.
In the meantime, a blank page with the JuliusJS library can be served using npm start.

Codemap

emscript.sh / reemscript.sh

These scripts will compile/recompile Julius C source to JavaScript, as well as copy all other necessary files, to the js folder.

emscript.sh will also compile binaries, which you can use to create recognition grammars or compile grammars to smaller binary files. These are copied to the bin folder.

src

This is where the source for Julius will go once emscript.sh is run. emscript.sh will replace certain files in src/julius4 with those in src/include in order to make src/emscripted, the files eventually compiled to JavaScript.

  • src/include/julius/app.h - the main application header
  • src/include/julius/main.c - the main application
  • src/include/julius/recogloop.c - a wrapper around the recognition loop
  • src/include/libjulius/src/adin_cut.c - interactions with a microphone
  • src/include/libjulius/src/m_adin.c - initialization to Web Audio
  • src/include/libjulius/src/recogmain.c - the main recognition loop
  • src/include/libsent/configure[.in] - configuration to add Web Audio
  • src/include/libsent/src/adin/adin_mic_webaudio.c - input on Web Audio

Files in bold were changed to replace a loop with eventing, to simulate multithreading in a Worker.
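
To illustrate the general idea of replacing a loop with eventing, here is a generic sketch. It is not the code from the files listed above: runAsEvents, step, done, and schedule are hypothetical names. Each unit of work runs as its own event, so the Worker can handle incoming messages between steps instead of blocking inside one long loop.

```javascript
// Generic illustration only; not taken from the JuliusJS sources.
// `step` does one unit of work and returns false when finished.
// `schedule` defers a function call; it defaults to setTimeout(fn, 0).
function runAsEvents(step, done, schedule) {
    schedule = schedule || function (fn) { setTimeout(fn, 0); };
    function tick() {
        if (step()) {
            schedule(tick); // yield to the event loop before the next step
        } else if (done) {
            done();
        }
    }
    schedule(tick);
}

// Usage: process three chunks, yielding between each one
var remaining = ['chunk-1', 'chunk-2', 'chunk-3'];
var processed = [];
runAsEvents(function () {
    processed.push(remaining.shift());
    return remaining.length > 0;
}, function () {
    console.log('processed', processed.length, 'chunks');
});
```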

js

The home to the testing server run with npm start. Files are copied to this folder from dist with emscript.sh and reemscript.sh. If they are modified, they should be committed back to the dist folder.

dist

The home for committed copies of the compiled library, as well as the wrappers that make them work: julius.js and worker.js. dist/listener/converter.js is the file that actually pipes Web Audio to Julius (the compiled C program).


JuliusJS is a port of the "Large Vocabulary Continuous Speech Recognition Engine Julius" to JavaScript

Contributors

zzmp, iffy
