Unable to record/transcribe (odas_web issue, open)

introlab commented on July 17, 2024
Unable to record/transcribe

from odas_web.

Comments (11)

GodCed commented on July 17, 2024

Hi, let's start with the recording issue.

It looks like the ODAS Web process cannot write the audio file fast enough. I have never used WSL, so I can't tell whether the issue comes from there. ODAS Web then hangs because the buffers get out of hand (the slow GUI part) and it stops receiving audio samples, which leads to ODAS freezing (that's why the process is hard to kill on the Pi). What is your ODAS sink config? Maybe the sample rate is just too high and it can't write fast enough (parsing raw audio in a JavaScript loop is a rather suboptimal process). 16000 is normally a safe choice.
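To put rough numbers on the sample rate point, here is a minimal sketch of how much raw PCM data has to be written per second at each rate. The 16-bit sample size and the four channels are assumptions for illustration (four recorders are mentioned later in this thread), and `pcmBytesPerSecond` is a hypothetical helper, not part of ODAS Web.

```javascript
// Bytes per second produced by an uncompressed PCM stream.
// bitsPerSample = 16 and channels = 4 below are assumptions, not values read from ODAS.
function pcmBytesPerSecond(sampleRate, bitsPerSample, channels) {
  return sampleRate * (bitsPerSample / 8) * channels;
}

const at44k = pcmBytesPerSecond(44100, 16, 4);
const at16k = pcmBytesPerSecond(16000, 16, 4);

console.log(`44100 Hz: ${at44k} bytes/s`); // 352800 bytes/s
console.log(`16000 Hz: ${at16k} bytes/s`); // 128000 bytes/s
console.log(`ratio: ${(at44k / at16k).toFixed(2)}x`); // 2.76x
```

Under these assumptions, dropping from 44100 to 16000 cuts the amount of data the JavaScript side has to parse and write by a factor of about 2.76.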

For the transcription part, I will refer you to the Google Cloud documentation for getting your key file. Your file looks like a proper key, though. However, if recording is not working I'm not sure speech to text will, so I would try to sort out one issue at a time.

chejmadi commented on July 17, 2024

I don't know what you mean exactly by "sink config", but the config file I run for ODAS on the RPi has the sampling rate for "raw" set to 16000. However, the sampling frequencies for "separated" and "postfiltered" are 44100.

GodCed commented on July 17, 2024

Yes, those are the ones. They "sink" data out of ODAS. Try setting them to 16000 like your RAW (and adjust the ODAS Web configuration accordingly).

chejmadi commented on July 17, 2024

Okay, that makes sense. I'll try!

chejmadi commented on July 17, 2024

Also, should they be the same as RAW even when I'm recording on the raspberry Pi itself (without ODAS Web, just saving it to a file on the Pi)?

GodCed commented on July 17, 2024

Unless your application really requires a specific sample rate, they should. There is no advantage to upsampling above the RAW sample rate that I know of, because you can't "create" new information that was not present in the source signal.
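A small sketch of why upsampling adds nothing: every extra sample is computed from the ones already there, so throwing the extra samples away gives back exactly the original. This uses an integer factor of 2 and linear interpolation to keep it short; a real 16000 to 44100 conversion needs a non-integer resampler, and `upsampleLinear` and `decimate` are hypothetical helpers, not ODAS functions.

```javascript
// Upsample by an integer factor using linear interpolation between neighbours.
function upsampleLinear(samples, factor) {
  const out = [];
  for (let i = 0; i < samples.length - 1; i++) {
    for (let k = 0; k < factor; k++) {
      const t = k / factor;
      out.push(samples[i] * (1 - t) + samples[i + 1] * t);
    }
  }
  out.push(samples[samples.length - 1]);
  return out;
}

// Keep every factor-th sample, dropping the interpolated ones.
function decimate(samples, factor) {
  return samples.filter((_, i) => i % factor === 0);
}

const original = [0, 4, 8, 2, -6];
const upsampled = upsampleLinear(original, 2);
const recovered = decimate(upsampled, 2);

console.log(upsampled); // [ 0, 2, 4, 6, 8, 5, 2, -2, -6 ]
console.log(recovered); // [ 0, 4, 8, 2, -6 ]
```

The upsampled array is almost twice as long, but it still only describes what the five original samples already contained. A 16000 Hz capture can represent frequencies up to 8000 Hz, and writing it out at a higher rate does not change that.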

chejmadi commented on July 17, 2024

I see.
So I changed the sampling frequency to 16000, but the problem persists. The audio files are recorded and I can now clearly hear my voice in them, but each one is still only 1.54 seconds long. I'm going to try to upload a screenshot of my laptop terminal.
I have no idea how "Recorder 23 was defined" turns up there. Then there's a bunch of (what I think are) JavaScript messages.
image
image

chejmadi commented on July 17, 2024

I've also noticed that it closes the connection by itself after showing that the Write Stream is full. I haven't touched anything on the Pi end. The connection got closed somehow, even though there was no error message on the Pi's terminal.

image

image

Once the connection closes, it starts outputting this on the terminal on my laptop
image

At this point, two 1.54s recordings have been processed and can be played back. Two are still processing, by the looks of things.
image

And at the moment it seems like they aren't going to be processed. There's a request timeout.
image

GodCed commented on July 17, 2024

The connection closing on the Pi is expected as ODAS simply stops without any message when it can't sink in real time.

The ODAS Studio output mixes request timeouts from Google Speech with recording-buffer-full messages. Can you disable the Google Speech transcription for now? It will isolate things and make the output cleaner. Also, if you could get a terminal output with proper line termination, it would certainly improve readability.
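For context on the buffer-full messages: a Node.js writable stream reports a full internal buffer by returning `false` from `write()`, and emits `drain` once it has caught up. The sketch below shows a writer that respects that signal. It does not reproduce how ODAS Web writes its files; `writeAll` and `makeSlowSink` are hypothetical names, and the slow sink only stands in for a disk that cannot keep up.

```javascript
const { Writable } = require('stream');
const { once } = require('events');

// Write every chunk, pausing whenever the stream's internal buffer is full.
// Returns how many times the writer had to wait for 'drain'.
async function writeAll(stream, chunks) {
  let pauses = 0;
  for (const chunk of chunks) {
    // write() returns false once the buffered bytes reach highWaterMark.
    if (!stream.write(chunk)) {
      pauses++;
      await once(stream, 'drain');
    }
  }
  stream.end();
  await once(stream, 'finish');
  return pauses;
}

// Hypothetical slow sink: counts bytes and delays each write by 1 ms.
function makeSlowSink(highWaterMark) {
  let bytes = 0;
  const sink = new Writable({
    highWaterMark,
    write(chunk, _encoding, callback) {
      bytes += chunk.length;
      setTimeout(callback, 1); // simulate a slow disk
    },
  });
  return { sink, getBytes: () => bytes };
}

const demo = makeSlowSink(1024);
const audioChunks = [Buffer.alloc(1024), Buffer.alloc(1024), Buffer.alloc(1024)];
writeAll(demo.sink, audioChunks).then((pauses) => {
  console.log(`wrote ${demo.getBytes()} bytes, paused ${pauses} times`);
});
```

A producer that ignores the `false` return value keeps queueing data in memory, which matches the kind of runaway buffer described earlier in this thread.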

As for recorders numbered up to 34, that is really strange, as recorders are created in a loop from 0 to 3.

chejmadi commented on July 17, 2024

Yeah the line termination drives me nuts too. Sorry. I didn't know if I can change that. Not sure if it's an electron thing or npm thing (it was a huge pain getting those two things set up on my laptop).
Anyway I disabled the Google Speech Transcription and hey presto, the recorder started working again! I guess there has to be something wrong with the API key or something, I'll go through the link you shared. Thanks!

GodCed commented on July 17, 2024

Glad to hear it's working. From what I've seen, your key file seems good. The error

Error 14: UNAVAILABLE

leads me to think there is a network connectivity problem between the laptop and the Google API.

A quick search turned up this thread.

I don't know how networking works through the Windows Subsystem for Linux, but the Google API seems to dislike proxies, so maybe (and this is a far-fetched maybe) it comes from there.
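One quick way to test the connectivity theory from inside WSL is to check whether a plain TCP connection to the API host opens at all. A minimal sketch, assuming Node.js is available there; `canConnect` is a hypothetical helper and is not part of ODAS Web or the Google client library.

```javascript
const net = require('net');

// Resolve to true if a TCP connection to host:port opens within timeoutMs,
// and to false on a timeout or any socket error (DNS failure, refusal, ...).
function canConnect(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('timeout', () => finish(false));
    socket.on('error', () => finish(false));
  });
}
```

Calling `canConnect('speech.googleapis.com', 443).then(console.log)` from the laptop should print `true` if the Cloud Speech endpoint is reachable. A `true` result only shows that the TCP handshake works. It does not prove that a proxy lets the gRPC traffic through, so treat it as a first check rather than a diagnosis.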
