
Comments (10)

Nitzahon commented on June 25, 2024

I found a solution that appears to work!
I connected my webcam stream to hark and had it set a Redux boolean on the speaking and stopped_speaking events. Then I added this useEffect:

  useEffect(() => {
    if (isRecog && !listening) {
      SpeechRecognition.startListening({
        language: language
      });
    } else if (!isRecog && listening) {
      SpeechRecognition.stopListening();
    }
  }, [isRecog, listening, language])

isRecog is a selector from Redux signaling a request to record.
All I need now is to make sure it doesn't interfere with the media recorder I have set up, but I think this is a great solution for anyone looking to activate speech recognition when the user speaks.
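The hark wiring described above can be reduced to a small piece of plain logic. This is a sketch under my own naming (createSpeakingTracker and onSpeakingChange are hypothetical; onSpeakingChange stands in for dispatching the Redux action that sets isRecog):

```javascript
// Sketch of the hark wiring: hark emits 'speaking' and 'stopped_speaking'
// events on the detector it returns, and these handlers flip a boolean
// through the supplied callback (e.g. a Redux dispatch).
function createSpeakingTracker(onSpeakingChange) {
  return {
    // Wire these to hark's events, roughly:
    //   const speech = hark(stream)
    //   speech.on('speaking', tracker.handleSpeaking)
    //   speech.on('stopped_speaking', tracker.handleStoppedSpeaking)
    handleSpeaking: () => onSpeakingChange(true),
    handleStoppedSpeaking: () => onSpeakingChange(false)
  }
}
```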

from react-speech-recognition.

JamesBrill commented on June 25, 2024

Hi @Nitzahon thanks for raising the issue!

The cause of this problem is the fact that the library does not actually support commands that are arrays. It was kind of a fluke that this appeared to be working in the first place. When fuzzy matching is used, the library converts the command into a string. JavaScript is very lenient and allows arrays to be converted into strings. As a result, your command is actually processed as "Everything is workingNothing is workingJust the audio worksJust the video worksשלום". With a low fuzzy matching threshold, this will match the speech "everything is working".
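The coercion itself is easy to demonstrate in plain JavaScript (the exact join, with or without separators, depends on how the library stringifies the command; the point is that the result is one long phrase, not five separate ones):

```javascript
// JavaScript silently coerces an array into a single string, so an array
// passed where a string command is expected becomes one long phrase that
// the fuzzy matcher treats as a single command.
const command = ['Everything is working', 'Nothing is working']
console.log(String(command))        // default coercion joins with commas
console.log(typeof String(command)) // "string": one phrase, not two
```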

If you break the commands down into separate objects, you'll find the Hebrew command can be matched correctly.

However, you do raise a valid point that, from an API point of view, it would be convenient to just pass in an array of commands for one callback. I can look into supporting this when I get some free time.

In the meantime, you can set your commands up like this:

  const videoCommandPhrases = ['Everything is working', 'Nothing is working', 'Just the audio works', 'Just the video works', 'שלום']
  const videoCommandCallback = (command, spokenPhrase) => {
    sendAns(spokenPhrase)
    handleReset()
  }
  const videoCommands = videoCommandPhrases.map(phrase => ({
    command: phrase,
    callback: videoCommandCallback,
    isFuzzyMatch: true,
    fuzzyMatchingThreshold: 0.8
  }))
  const commands = [
    {
      command: 'clear',
      callback: ({ resetTranscript }) => resetTranscript()
    },
    ...videoCommands
  ]

Hope that helps!


Nitzahon commented on June 25, 2024


You read my mind. I feel violated :P
I haven't been on my PC all week, but this exact code has been running through my brain since Thursday. And array support would be nice; I have indeed noticed all of your points regarding how the command is processed.


Nitzahon commented on June 25, 2024

One issue I'm encountering: while I can clear the transcript with a voice command using the code above, I am not able to make it work with a button press, as in this code:

import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition'

const Dictaphone = () => {
  const { transcript, resetTranscript } = useSpeechRecognition()

  if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
    return null
  }

  return (
    <div>
      <button onClick={SpeechRecognition.startListening}>Start</button>
      <button onClick={SpeechRecognition.stopListening}>Stop</button>
      <button onClick={resetTranscript}>Reset</button>
      <p>{transcript}</p>
    </div>
  )
}
export default Dictaphone

I can either reset the transcript using voice commands, or have it activated by a button, but not both. Ideally, I want a clear function that I can call from any part of the code, for when the user clicks reset or changes the language.

What I mean is I can either use this:

const { transcript, resetTranscript } = useSpeechRecognition()

Or this:

  const { transcript } = useSpeechRecognition({ commands })

And I'm not sure how to merge the two.


JamesBrill commented on June 25, 2024

The handleReset in your example above differs from the one in your biomarkerz repo. Let's examine each one:

  const handleReset = useCallback(() => {
    SpeechRecognition.stopListening()
    SpeechRecognition.startListening({
      continuous: true,
      language: 'he'
    })
  }, []);

This does not call resetTranscript so the transcript will remain the same. If your intent here was to stop the microphone and restart it with Hebrew detection, you would just need to call:

  SpeechRecognition.startListening({
    continuous: true,
    language: 'he'
  })

No need for the stopListening call. Though if you did need to stop the microphone, note that this function is asynchronous. This means you need to wait for it to complete before doing something else. This is why your original callback appeared to only stop the microphone - there was a race condition between the stopListening and startListening calls. The stop would finish after the start and leave the microphone turned off. To avoid that, you would need to make your handler async:

  const handleReset = useCallback(async () => {
    await SpeechRecognition.stopListening()
    SpeechRecognition.startListening({
      continuous: true,
      language: 'he'
    })
  }, []);

In your biomarkerz repo, you use a different handleReset:

  const handleReset = useCallback(() => {
    resetTranscript();
    dataTrans(transcript)
  }, [transcript, dataTrans, resetTranscript]);

This one does call resetTranscript, so it will clear the transcript when the button is clicked. Removing the dataTrans, I tried this locally and it worked fine.

I don't believe the useCallback is necessary here. handleReset is a relatively inexpensive function to create and is already being recreated frequently anyway due to transcript being a dependency. I suggest just making it a plain function without useCallback. This is a nice article on useCallback.

Note that the resetTranscript used in the clear command callback comes from the callback args, not useSpeechRecognition - you don't need to have called the hook before defining that command. See here: "The last argument that [the command callback] function receives will always be an object containing the following properties: resetTranscript, a function that sets the transcript to an empty string". To get reset on both the button click handler and the voice command, an example like this works:

import React from 'react'
import SpeechRecognition, { useSpeechRecognition } from '../SpeechRecognition'

const Dictaphone = () => {
  const commands = [
    {
      command: 'clear',
      callback: ({ resetTranscript }) => resetTranscript()
    }
  ]
  const { transcript, resetTranscript } = useSpeechRecognition({ commands })
  const handleReset = () => {
    resetTranscript()
    alert(`Call dataTrans with ${transcript}`)
  }
  const startListening = () => {
    SpeechRecognition.startListening({
      continuous: true,
      language: 'en-GB'
    })
  }

  if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
    return null
  }

  return (
    <div style={{ display: 'flex', flexDirection: 'column' }}>
      <button onClick={handleReset}>Reset</button>
      <button onClick={startListening}>Start listening</button>
      <span>{transcript}</span>
    </div>
  )
}

export default Dictaphone


Nitzahon commented on June 25, 2024

Yesterday this worked like a charm. Today the voice commands are taking an extra long time. The transcript displays properly, but the voice commands take at least a minute to fire the callback. Is there a way to check the API response time, or could something in my program be causing it to wait?


JamesBrill commented on June 25, 2024

You might just need to refresh the page; I've found the Web Speech API's performance to be inconsistent if the page is left open for a long time. I'm not sure how to profile the Web Speech API, as there are no visible network requests in the Network tab. You might also want to try matchInterim: true on any non-fuzzy commands to speed up the response.
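That option is just another property on the command object. A sketch, reusing the 'clear' command from the earlier example:

```javascript
// matchInterim: true lets a non-fuzzy command fire on interim (partial)
// speech results instead of waiting for the final transcript, which can
// make the callback feel noticeably faster.
const commands = [
  {
    command: 'clear',
    callback: ({ resetTranscript }) => resetTranscript(),
    matchInterim: true
  }
]
```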


Nitzahon commented on June 25, 2024

Okay, so I found out what change caused the slowdown. It was removing this code

  else {
    SpeechRecognition.startListening({
      continuous: true,
      language: 'he'
    });
  }

from:

  if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
    return null
  }

Without that conditional startListening call, it takes several seconds for voice commands to fire, as opposed to an almost instantaneous callback.

Of course, setting continuous and the language there is redundant at that stage thanks to the useEffect.

edit: correction, language and continuous are needed, because the else clause fires off before the useEffect, causing the first speech recognition to run with default values (English).

edit edit: scratch that. It appears the problem lies with it being a continuous listen. The problem is that without the else clause I have no way of restarting the listening if it's not continuous: without continuous, it stops listening after it detects something, and with continuous it waits around 30-60 seconds before activating commands on the transcript.


JamesBrill commented on June 25, 2024

Ah, I didn't notice you were calling startListening on every render - this should be avoided. During speech, the component will be re-rendering frequently. As a result, it will be hammering the startListening method and possibly overwhelming the Web Speech API. You only need to call startListening once for each time you want to collect speech.

To start listening on "mount", you need to replace your else with a call to useEffect before the browserSupportsSpeechRecognition check:

  useEffect(() => {
    SpeechRecognition.startListening({ continuous: true, language: 'he' })
  }, []);

  if (!SpeechRecognition.browserSupportsSpeechRecognition()) {
    return null
  }

Subsequent stops and starts (if these are actually needed) can be executed by event handlers on button clicks or voice commands.

If this still results in slow commands, perhaps you can share your latest code so I can diagnose the issue.


Nitzahon commented on June 25, 2024

I'll probably change my code, then, to manually start listening on a function call. But I will also need to recognize when listening has stopped.
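One way to recognize that listening has stopped is to watch the `listening` flag returned by useSpeechRecognition inside a useEffect and restart when it drops while recording is still wanted. The decision itself is plain logic; this is a sketch, where shouldRestartListening is my own hypothetical helper and wantListening stands in for the Redux isRecog flag:

```javascript
// Returns true when SpeechRecognition.startListening should be called
// again: the app still wants to record, but the Web Speech API has
// stopped listening (e.g. after detecting a phrase in non-continuous mode).
function shouldRestartListening(wantListening, isListening) {
  return wantListening && !isListening
}

// In a component this would drive a useEffect, roughly:
//   useEffect(() => {
//     if (shouldRestartListening(isRecog, listening)) {
//       SpeechRecognition.startListening({ language: 'he' })
//     }
//   }, [isRecog, listening])
```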

