
Comments (4)

burgil commented on June 11, 2024

I forgot to say thanks in advance to anyone who helps! I would highly appreciate it!

from deepspeech-examples.

burgil commented on June 11, 2024

image
image


burgil commented on June 11, 2024

Anaconda Environment:
image


burgil commented on June 11, 2024

In case this helps:
I recorded myself twice with the microphone quality setting at 44100 Hz: one recording at 44.1 kHz and the other at 16 kHz, since I recently found that other speech services work better at 16 kHz. Here are the recordings
(I searched Google for a file host and uploaded them as a 7z archive, since I had trouble attaching them here): https://usaupload.com/5Kvd/test-voices.7z

I recorded my voice using the following script I had:

import os
import wave

import pyaudio
from datetime import datetime

CHUNK = 8192
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 4

#current_time = str(datetime.now())  #"Date/Time for File Name"
#current_time = "_".join(current_time.split()).replace(":","-")
#current_time = current_time[:-7]
#file_name = 'Audio_'+current_time+'.wav'
WAVE_OUTPUT_FILENAME = 'test.wav'
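# Remove any previous recording; only a missing file counts as "nothing to clean".
try:
    os.remove(WAVE_OUTPUT_FILENAME)
except FileNotFoundError:
    print("Nothing to clean")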
p = pyaudio.PyAudio()

# Open an input stream on device index 0.
stream = p.open(
    format=FORMAT,
    channels=CHANNELS,
    rate=RATE,
    input=True,
    input_device_index=0,
    frames_per_buffer=CHUNK,
)

print("* recording")

frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    print(i)
    data = stream.read(CHUNK)
    frames.append(data)

print("* done recording")

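stream.stop_stream()
stream.close()
# Query the sample width before PortAudio is released.
sample_width = p.get_sample_size(FORMAT)
p.terminate()

# Write the captured frames as a 16-bit PCM WAV file.
with wave.open(WAVE_OUTPUT_FILENAME, 'wb') as wf:
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(sample_width)
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))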
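
A side note on the recording loop: because `int(RATE / CHUNK * RECORD_SECONDS)` rounds down to whole chunks, the script captures slightly less than `RECORD_SECONDS` of audio, and the shortfall grows when `RATE` is lowered. A minimal sketch of that arithmetic (the helper name `recorded_duration` is made up for illustration and is not part of the script):

```python
def recorded_duration(rate, chunk, record_seconds):
    """Return (chunks read, seconds of audio actually captured)."""
    # Same expression as the range() bound in the recording loop.
    n_chunks = int(rate / chunk * record_seconds)
    return n_chunks, n_chunks * chunk / rate


for rate in (44100, 16000):
    n_chunks, seconds = recorded_duration(rate, 8192, 4)
    print(f"{rate} Hz: {n_chunks} chunks, {seconds:.3f} s")
# 44100 Hz: 21 chunks, 3.901 s
# 16000 Hz: 7 chunks, 3.584 s
```

So the 16 kHz recording is roughly 0.4 s shorter than the requested 4 s. A smaller `CHUNK` would likely reduce that gap, if it matters.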

record.cmd:

@echo off
call C:\Users\%username%\anaconda3\Scripts\activate.bat "C:\Users\%username%\anaconda3" & python record.py
pause

For the second recording I simply changed RATE to 16000 and recorded again. 16 kHz recordings have always worked better for me than 44.1 kHz ones, so I tested both just to be sure.

It is worth mentioning that I never actually tested any .wav files against DeepSpeech, only my microphone, which works as demonstrated above. What I meant is that I tested those .wav files against "competitors" and they worked there; my microphone worked there as well, just for reference.
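
Before trying the .wav files against DeepSpeech, it may help to confirm what format each file really has. The sketch below reads the WAV header with the standard `wave` module and compares it to 16 kHz, mono, 16-bit PCM. That target format is an assumption about what the released DeepSpeech models expect, and the helper names (`describe_wav`, `matches_expected_format`, `write_silence`) are made up for illustration:

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Read the WAV header and return its basic format parameters."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": wf.getnframes() / rate,
        }


def matches_expected_format(info, rate=16000, channels=1, sample_width=2):
    """Check against the ASSUMED target format (16 kHz, mono, 16-bit)."""
    return (
        info["sample_rate_hz"] == rate
        and info["channels"] == channels
        and info["sample_width_bytes"] == sample_width
    )


def write_silence(path, rate, seconds=1):
    """Write a silent mono 16-bit WAV file, used here only as demo input."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * (rate * seconds))


# Demo on generated files; replace the paths with the real recordings.
with tempfile.TemporaryDirectory() as tmp:
    for rate in (44100, 16000):
        path = os.path.join(tmp, f"test_{rate}.wav")
        write_silence(path, rate)
        info = describe_wav(path)
        print(rate, info, "OK" if matches_expected_format(info) else "mismatch")
```

Under that assumption, the 44.1 kHz file would be reported as a mismatch and the 16 kHz file as OK, which would be consistent with the 16 kHz recordings working better elsewhere.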

Screenshots:
image

image

