Hey, I executed the mic_vad_streaming example with the following command:


Anaconda Environment:

BUG: mic_vad_streaming partially recognizing voice about deepspeech-examples HOT 4 CLOSED

burgil commented on June 11, 2024

BUG: mic_vad_streaming partially recognizing voice

from deepspeech-examples.

Comments (4)

burgil commented on June 11, 2024

I forgot to say: thanks in advance to anyone who helps! I would highly appreciate it.


burgil commented on June 11, 2024


burgil commented on June 11, 2024

Anaconda Environment:


burgil commented on June 11, 2024

In case this helps:
With the microphone quality setting at 44100 Hz, I recorded myself twice, once at 44.1 kHz and once at 16 kHz, since I recently found that other speech services work better when I record at 16 kHz. Here are the recordings:
(I had trouble attaching the files here, so after a long Google search for a file host I uploaded them as a 7z archive): https://usaupload.com/5Kvd/test-voices.7z

I recorded my voice using the following script I had:

import os
import wave

import pyaudio
from datetime import datetime

CHUNK = 8192
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 4

#current_time = str(datetime.now())  #"Date/Time for File Name"
#current_time = "_".join(current_time.split()).replace(":","-")
#current_time = current_time[:-7]
#file_name = 'Audio_'+current_time+'.wav'
WAVE_OUTPUT_FILENAME = 'test.wav'
# Remove a previous recording, if any, so the output file is always fresh.
try:
    os.remove(WAVE_OUTPUT_FILENAME)
except FileNotFoundError:
    print("Nothing to clean")
p = pyaudio.PyAudio()

stream = p.open(
    format=FORMAT,
    channels=CHANNELS,
    rate=RATE,
    input=True,
    input_device_index=0,  # device index 0; change it if your microphone is a different device
    frames_per_buffer=CHUNK,
)

print("* recording")

frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    print(i)
    data = stream.read(CHUNK)
    frames.append(data)

print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()

record.cmd:

@echo off
call C:\Users\%username%\anaconda3\Scripts\activate.bat "C:\Users\%username%\anaconda3" & python record.py
pause

For the second recording I simply changed RATE to 16000 and recorded again. 16 kHz recordings have always worked better for me than 44.1 kHz ones, so I tested both just to be sure.
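A side note on the script above: because the chunk count is computed with int(RATE / CHUNK * RECORD_SECONDS), the fractional last chunk is dropped, so neither recording is exactly 4 seconds long and the two recordings differ in length. The sketch below only redoes that arithmetic, with no audio involved. The helper names are made up for illustration, and rounding up is just one possible alternative, not something the original script does.

```python
import math

CHUNK = 8192        # frames per buffer, as in the script above
RECORD_SECONDS = 4  # requested duration, as in the script above


def chunks_truncated(rate, chunk=CHUNK, seconds=RECORD_SECONDS):
    """Chunk count as the script computes it: the fractional chunk is dropped."""
    return int(rate / chunk * seconds)


def chunks_rounded_up(rate, chunk=CHUNK, seconds=RECORD_SECONDS):
    """Hypothetical alternative: round up so at least `seconds` are captured."""
    return math.ceil(rate * seconds / chunk)


def duration(rate, n_chunks, chunk=CHUNK):
    """Seconds of audio contained in n_chunks buffers of `chunk` frames each."""
    return n_chunks * chunk / rate


for rate in (44100, 16000):
    n = chunks_truncated(rate)
    m = chunks_rounded_up(rate)
    print(f"{rate} Hz: {n} chunks = {duration(rate, n):.3f} s, "
          f"rounded up: {m} chunks = {duration(rate, m):.3f} s")
```

With these values the 44100 Hz recording comes out at roughly 3.90 s and the 16000 Hz one at roughly 3.58 s, so a word spoken near the end of the 4 seconds could be cut off, more so in the 16 kHz file.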

It is worth mentioning that I never actually tested any .wav files against DeepSpeech, only my microphone, which behaves as demonstrated above. What I meant is that I tested those .wav files against "competitors" and they worked there, and my microphone worked there too, just for reference.
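Before trying the .wav files against DeepSpeech, it may help to confirm what format they are actually in. The sketch below uses only the standard-library wave module to read the header. The target of 16 kHz, mono, 16-bit PCM is an assumption here, based on what the released English DeepSpeech models are generally documented to expect; check it against the model you are using. The function names are made up for illustration.

```python
import wave

# Assumption: the model expects 16 kHz, mono, 16-bit PCM input.
EXPECTED_RATE = 16000
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit


def describe_wav(path):
    """Return (sample_rate, channels, sample_width_bytes, duration_seconds)."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        width = wf.getsampwidth()
        duration = wf.getnframes() / rate
    return rate, channels, width, duration


def format_problems(path):
    """List the ways a WAV file differs from the assumed target format."""
    rate, channels, width, _ = describe_wav(path)
    problems = []
    if rate != EXPECTED_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels, expected {EXPECTED_CHANNELS}")
    if width != EXPECTED_SAMPLE_WIDTH:
        problems.append(f"{width * 8}-bit samples, expected {EXPECTED_SAMPLE_WIDTH * 8}-bit")
    return problems
```

For example, format_problems('test.wav') on a file written by the script above with RATE = 44100 should report only the sample-rate mismatch, while the RATE = 16000 recording should return an empty list.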

Screenshots:

