learnedvector / a-hackers-ai-voice-assistant

A hackers AI voice assistant, built using Python and PyTorch.

Home Page: https://www.youtube.com/playlist?list=PL5rWfvZIL-NpFXM9nFr15RmEEh4F4ePZW

License: MIT License

Dockerfile 0.63% Python 94.84% HTML 4.22% Shell 0.30%

a-hackers-ai-voice-assistant's People

Contributors

brillianttyagi, dependabot[bot], happyslice, knifeofdunwall, learnedvector, pager07, tdrinker

a-hackers-ai-voice-assistant's Issues

Where to get the kenlm model from?

I installed and built KenLM following the directions on its GitHub page, but there is no model file, only a bunch of folders. Is path_to_nlm the path to that folder? If not, how can I get the KenLM model file?

Thanks.

ModuleNotFoundError: No module named 'dataset'

Hi, when I run python train.py --train_file /path/to/train/json --valid_file /path/to/valid/json --load_model_from /path/to/pretrain/speechrecognition.ckpt, this error occurs. Where does the dataset module come from? From PyTorch? From PyTorch Lightning?

Getting no output in engine.py and demo.py

Hey guys,
I am getting no output from engine.py or demo.py.
Is someone able to help me out with this?

This is how the terminal looks when starting the engine.py script:
[Screenshot 2021-04-18 13_27_54]

This is how the terminal looks when starting the demo.py script:
[Screenshot 2021-04-18 13_40_04]

After hitting the start button:

[Screenshot 2021-04-18 13_40_25]

As you can see, the speech recognition bar just disappears.
Hopefully some of you can help me out with that :)

AttributeError: 'dict' object has no attribute 'step'

Hi, I was running train.py on my own dataset and got the error: AttributeError: 'dict' object has no attribute 'step'. I found that it comes from the line "self.scheduler.step(avg_loss)", so I changed it to "self.step(avg_loss)". But then another error occurred: TypeError: iteration over a 0-d tensor, at the line "spectrograms, labels, input_lengths, label_lengths = batch". Has anyone solved this?
Many thanks.

cannot reshape tensor of 0 elements

I'm getting this error in dataset.py and train.py under the wakeword folder while resampling the waveform:
cannot reshape tensor of 0 elements into shape [-1, 40, 0] because the unspecified dimension size -1 can be any value and is ambiguous

try:
    file_path = self.data.key.iloc[idx]
    waveform, sr = torchaudio.load(file_path, normalization=True)
    if sr > self.sr:
        waveform = torchaudio.transforms.Resample(sr, self.sr)(waveform)
    mfcc = self.audio_transform(waveform)
    label = self.data.label.iloc[idx]
except Exception as e:
    print(str(e))

Any help would be appreciated.

ModuleNotFoundError: No module named 'pydub'

Hello,

Every time I try to run commonvoice_create_jsons.py I get this error:

from pydub import AudioSegment
ModuleNotFoundError: No module named 'pydub'

I don't know why it occurs because I've selected the right interpreter and I've downloaded the right version of pydub.
This error just freaks me out.
Can someone please help me?
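
One common cause of this error is that pip installed pydub into a different interpreter than the one that runs the script. A minimal diagnostic sketch using only the standard library (the helper name is_importable is made up for this illustration):

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the running interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The interpreter that is actually executing this script.
    print("interpreter:", sys.executable)
    print("pydub importable:", is_importable("pydub"))
```

If the second line prints False, running "python -m pip install pydub" with the interpreter path printed on the first line installs the package into the environment that is really in use.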

Decode without beam search

Hi, what an amazing repository.

After getting the output logits from the model, how do I decode them without beam search?
I have been running it like this with my model
(and I have this id2label mapping:

{0: "'", 1: ' ', 2: 'a', 3: 'b', 4: 'c', 5: 'd', 6: 'e', 7: 'f', 8: 'g', 9: 'h', 10: 'i', 11: 'j', 12: 'k', 13: 'l', 14: 'm', 15: 'n', 16: 'o', 17: 'p', 18: 'q', 19: 'r', 20: 's', 21: 't', 22: 'u', 23: 'v', 24: 'w', 25: 'x', 26: 'y', 27: 'z', 28: '_'}

)

with torch.no_grad():
        log_mel = featurizer(waveform).unsqueeze(1)
        out, hidden = model(log_mel, hidden)
        out = torch.nn.functional.softmax(out, dim=2)
        out = out.transpose(0, 1)
        pred_ids = torch.argmax(out, -1)
        preds = [id2label[i] for i in pred_ids[0].tolist() if id2label[i]!='_']

is that right?
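
Greedy (best-path) CTC decoding usually does more than drop the blank: it takes the argmax per time step, merges consecutive repeats, and only then removes blanks. The snippet above skips the merge step, so a sound that spans several frames would come out as repeated letters. A minimal pure-Python sketch, assuming index 28 ('_') is the CTC blank as in the mapping above and that pred_ids is the list of per-frame argmax ids (the function names are made up here):

```python
from itertools import groupby
from typing import Dict, List


def build_id2label() -> Dict[int, str]:
    """Rebuild the mapping from the issue: "'", space, a-z, then '_' as blank."""
    chars = ["'", " "] + [chr(c) for c in range(ord("a"), ord("z") + 1)] + ["_"]
    return dict(enumerate(chars))


def greedy_ctc_decode(pred_ids: List[int], id2label: Dict[int, str],
                      blank_id: int = 28) -> str:
    # Step 1: merge runs of identical consecutive ids (one entry per run).
    collapsed = [key for key, _ in groupby(pred_ids)]
    # Step 2: drop the blank symbol, then map the remaining ids to characters.
    return "".join(id2label[i] for i in collapsed if i != blank_id)


if __name__ == "__main__":
    # Frames: h h _ e l l _ l o
    frames = [9, 9, 28, 6, 13, 13, 28, 13, 16]
    print(greedy_ctc_decode(frames, build_id2label()))
```

The blank between the two runs of 'l' is what keeps the double letter in the output. The softmax before the argmax is harmless but not required, since it does not change which index is largest.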

Is there any alternative to torchaudio?

I am using Windows 10, and running the Python script gives an assertion error. As torchaudio is not supported on Windows, is there an alternative I can use?
Thank you in advance.

Variable queue not defined in Wakeword engine

In the Listener class implemented in the wakeword engine.py file, the queue variable used inside the listen method is not globally defined.

class Listener:

    def __init__(self, sample_rate=8000, record_seconds=2):
        self.chunk = 1024
        self.sample_rate = sample_rate
        self.record_seconds = record_seconds
        self.p = pyaudio.PyAudio()
        self.stream = self.p.open(format=pyaudio.paInt16,
                        channels=1,
                        rate=self.sample_rate,
                        input=True,
                        output=True,
                        frames_per_buffer=self.chunk)

    def listen(self, queue):
        while True:
            data = self.stream.read(self.chunk , exception_on_overflow=False)
            queue.append(data)
            time.sleep(0.01)

    def run(self, queue):
        thread = threading.Thread(target=self.listen, args=(queue,), daemon=True)
        thread.start()
        print("\nWake Word Engine is now listening... \n")

How to run this on CPU

I want to train this model using the CPU instead of a GPU. In train.py under the speech recognition folder, what should I change in the code to make it work properly?

No default output device

ALSA lib conf.c:5220:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM dmix
Traceback (most recent call last):
  File "engine.py", line 124, in <module>
    asr_engine = SpeechRecognitionEngine(args.model_file, args.ken_lm_file)
  File "engine.py", line 44, in __init__
    self.listener = Listener(sample_rate=8000)
  File "engine.py", line 22, in __init__
    self.stream = self.p.open(format=pyaudio.paInt16,
  File "/usr/local/lib/python3.8/dist-packages/pyaudio.py", line 750, in open
    stream = Stream(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pyaudio.py", line 441, in __init__
    self._stream = pa.open(**arguments)
OSError: [Errno -9996] Invalid output device (no default output device)
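
The traceback shows the stream being opened in the Listener constructor. The wake word Listener quoted in another issue on this page passes output=True to that call; if the Listener here does the same, PortAudio will look for a playback device as well as a microphone. For a stream that only records, one possible workaround is to open it input-only. A sketch under that assumption; the helper names are made up, and pyaudio is imported lazily so the argument builder can be inspected on a machine without audio support:

```python
def input_only_stream_kwargs(sample_format, sample_rate=8000, chunk=1024):
    """Keyword arguments for PyAudio.open() that request a microphone only."""
    return {
        "format": sample_format,
        "channels": 1,
        "rate": sample_rate,
        "input": True,  # record from the default input device
        "frames_per_buffer": chunk,
        # "output" is deliberately left out, so no playback device is needed
    }


def open_input_stream(sample_rate=8000, chunk=1024):
    """Open a recording-only stream; requires the third-party pyaudio package."""
    import pyaudio

    audio = pyaudio.PyAudio()
    kwargs = input_only_stream_kwargs(pyaudio.paInt16, sample_rate, chunk)
    return audio, audio.open(**kwargs)
```

This does not help if the machine or container has no input device either; in that case the same call would fail for the microphone instead.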

Getting an AssertionError, I guess there is no function torchaudio.load()

(venv) C:\Users\Zeus\PycharmProjects\SenSix\A-Hackers-AI-Voice-Assistant\VoiceAssistant\wakeword>engine.py --model_file speechrecognition.zip
C:\Users\Zeus\PycharmProjects\SenSix\venv\lib\site-packages\torchaudio\extension\extension.py:14: UserWarning: torchaudio C++ extension is not available.
  warnings.warn('torchaudio C++ extension is not available.')
C:\Users\Zeus\PycharmProjects\SenSix\venv\lib\site-packages\torchaudio\backend\utils.py:63: UserWarning: The interface of "soundfile" backend is planned to change in 0.8.0 to match that of "sox_io" backend and the current interface will be removed in 0.9.0. To use the new interface, do torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False before setting the backend to "soundfile". Please refer to pytorch/audio#903 for the detail.
  warnings.warn(

*** Make sure you have sox installed on your system for the demo to work!!!
If you don't want to use sox, change the play function in the DemoAction class
in engine.py module to something that works with your system.

Wake Word Engine is now listening...

Exception in thread Thread-2:
Traceback (most recent call last):
  File "C:\Users\Zeus\AppData\Local\Programs\Python\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\Zeus\AppData\Local\Programs\Python\Python38\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Zeus\PycharmProjects\SenSix\A-Hackers-AI-Voice-Assistant\VoiceAssistant\wakeword\engine.py", line 86, in inference_loop
    action(self.predict(self.audio_q))
  File "C:\Users\Zeus\PycharmProjects\SenSix\A-Hackers-AI-Voice-Assistant\VoiceAssistant\wakeword\engine.py", line 67, in predict
    waveform, _ = torchaudio.load(fname, normalization=False) # don't normalize on train
  File "C:\Users\Zeus\PycharmProjects\SenSix\venv\lib\site-packages\torchaudio\backend\soundfile_backend.py", line 41, in load
    assert normalization
AssertionError

Running `demo.py` in Docker in ubuntu 20.04

Hi,

When running demo.py, I am getting this error below.

[2021-09-06 04:53:34,145] ERROR in app: Exception on /get_audio [GET]
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "demo.py", line 28, in get_audio
    with open('transcript.txt', 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'transcript.txt'
127.0.0.1 - - [06/Sep/2021 04:53:34] "GET /get_audio HTTP/1.1" 500 -

This is the command I ran.

root@elf-desktop:/home/jun1.oh/workspace/A-Hackers-AI-Voice-Assistant/VoiceAssistant/speechrecognition/demo# python demo.py --model_file ./speechrecognition.zip

Do I need to prepare a transcript.txt file for the demo?

Or is this file generated automatically from voice input during the demo? In that case, the problem is probably that my Linux computer doesn't have a microphone installed.

Thanks,
Daniel
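
The traceback shows get_audio in demo.py opening transcript.txt unconditionally, so the request fails with a 500 whenever the file has not been written yet. Whether the engine creates the file only after it has transcribed some audio is an assumption here, not something the traceback confirms. A defensive sketch of the read (the helper name read_transcript is hypothetical):

```python
from pathlib import Path


def read_transcript(path="transcript.txt") -> str:
    """Return the transcript text, or an empty string if no file exists yet."""
    transcript_file = Path(path)
    if not transcript_file.exists():
        # Nothing has been written so far; report an empty transcript
        # instead of letting FileNotFoundError bubble up to the web handler.
        return ""
    return transcript_file.read_text(encoding="utf-8")
```

This only hides the missing file. If no microphone is available, the transcript would presumably stay empty.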

integrate with Lightning ecosystem CI

Hello, and so happy to see you use PyTorch Lightning!
Just wondering whether you have already heard about the quite new PyTorch Lightning (PL) ecosystem CI, which we would like to invite you to join. You can check out our blog post about it: Stay Ahead of Breaking Changes with the New Lightning Ecosystem CI.
As you use the PL framework for your cool project, we would like to enhance your experience and offer you safe updates to our future releases. At the moment you run tests with a particular PL version, but it may accidentally happen that the next version is incompatible with your project. We do not intend to change anything on our project side, but we still have a solution: with the ecosystem CI testing both your latest development head and ours, we can find such problems very early and prevent a bad version from being released.

What is needed to do?

What will you get?

  • scheduled nightly testing configured for development/stable versions
  • slack notification if something went wrong to investigate
  • testing also on a multi-GPU machine as our gift to you

cc: @Borda

ReduceLROnPlateau problem.

File "/home/User/anaconda3/envs/User/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/optimizer_connector.py", line 47, in update_learning_rates
monitor_key = lr_scheduler['monitor']
KeyError: 'monitor'

pytorch_lightning.utilities.exceptions.MisconfigurationException: ReduceLROnPlateau requires returning a dict from configure_optimizers with the keyword monitor=. For example:return {'optimizer': optimizer, 'lr_scheduler': scheduler, 'monitor': 'your_loss'}

Any suggestion on solving this?
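
The exception message itself describes the expected shape: configure_optimizers should return a dict carrying the optimizer, the scheduler, and the name of the metric the scheduler watches. A dependency-free sketch of building that dict; the helper name and the metric name val_loss are assumptions, and the metric must be one the module actually logs:

```python
def plateau_optimizer_config(optimizer, scheduler, monitor="val_loss"):
    """Build the dict form that the error message asks configure_optimizers to return."""
    return {
        "optimizer": optimizer,
        "lr_scheduler": scheduler,
        "monitor": monitor,  # metric name is an assumption; use the one you log
    }
```

Inside the LightningModule, configure_optimizers would then end with something like "return plateau_optimizer_config(self.optimizer, self.scheduler)" in place of returning two lists.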

KeyError: 'D'

I keep getting this error:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/pietmuller/Dokumente/code/sr.venv/speechrecognition/neuralnet/dataset.py", line 110, in __getitem__
    label = self.text_process.text_to_int_sequence(self.data['text'].iloc[idx])
  File "/Users/pietmuller/Dokumente/code/sr.venv/speechrecognition/neuralnet/utils.py", line 51, in text_to_int_sequence
    ch = self.char_map[c]
KeyError: 'D'

I didn't change anything in the utils.py file.
I'm also wondering why it reports a KeyError for the character "D": the variable char_map_str (in utils.py) contains no capital D. Also, I want to train it on a Common Voice dataset (German, to be exact).

It's not only 'D'; it's 'D' and 'E' alternately, but no other characters...
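
The traceback shows text_to_int_sequence indexing char_map directly with each character, so any character missing from the map raises a KeyError. If the map holds only lowercase letters, as the issue suggests, uppercase letters in the transcripts would fail in exactly this way, and German text would additionally need entries for ä, ö, ü and ß. A sketch of normalising text before the lookup, using a small hypothetical character set rather than the repository's real char_map_str:

```python
import string

# Hypothetical allowed set: apostrophe, space and a-z.
ALLOWED = set("' " + string.ascii_lowercase)


def normalize_transcript(text, allowed=ALLOWED):
    """Lowercase the text and drop every character the map cannot encode."""
    lowered = text.lower()
    return "".join(ch for ch in lowered if ch in allowed)


if __name__ == "__main__":
    print(normalize_transcript("Das Ende"))
    # For German, extend the set (and the real character map) with umlauts.
    print(normalize_transcript("Grüße", ALLOWED | set("äöüß")))
```

Silently dropping characters loses information, so for a German dataset extending the character map (and the model's output size to match) is likely the better route than filtering.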

`configure_optimizers` must include a monitor when a `ReduceLROnPlateau` scheduler is used.

First of all, I know that this issue already exists among the closed issues. But no answer was given there on how to fix it, so I don't know what to do.
I'm getting the following error:
pytorch_lightning.utilities.exceptions.MisconfigurationException: configure_optimizers must include a monitor when a ReduceLROnPlateau scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}

I don't know where I would have to put this monitor or whether I really need it.
I read somewhere that it could have something to do with pytorch_lightning, so I tried to downgrade it to version 1.1.1.
This gives the following error:
ImportError: cannot import name 'Batch' from 'torchtext.data' (/usr/local/lib/python3.7/dist-packages/torchtext/data/__init__.py)
Here I found out that I have to use the git version (the other option didn't help me).

So I would appreciate your help.

ModuleNotFoundError: No module named 'textprocess'

Guys, I imported all the references in Python. I use Linux (Debian) and I get this error:

2021-02-11 12:04:52.873401: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-02-11 12:04:52.873459: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/dani/anaconda3/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "demo.py", line 7, in <module>
    from engine import SpeechRecognitionEngine
  File "/home/dani/Downloads/A-Hackers-AI-Voice-Assistant-master/VoiceAssistant/speechrecognition/engine.py", line 10, in <module>
    from neuralnet.dataset import get_featurizer
  File "/home/dani/Downloads/A-Hackers-AI-Voice-Assistant-master/VoiceAssistant/speechrecognition/neuralnet/dataset.py", line 6, in <module>
    from utils import TextProcess
  File "/home/dani/Downloads/A-Hackers-AI-Voice-Assistant-master/VoiceAssistant/speechrecognition/demo/utils.py", line 13, in <module>
    import textprocess
ModuleNotFoundError: No module named 'textprocess'

help!

pip packages

A few of the packages (not all) cannot be installed at the versions cited next to them. I was wondering whether that changes anything, since I'm a beginner in Python and looking forward to making this work. Furthermore, the venv folder is red in PyCharm; I suppose this has something to do with the pip packages, right? Anyway, if anyone can help me I would be more than grateful.

torchaudio C++ extension is not available

I'm getting this error in dataset.py and train.py under the wakeword folder.
Warning (from warnings module):
  File "C:\Users\A\AppData\Local\Programs\Python\Python36\lib\site-packages\torchaudio\extension\extension.py", line 14
    warnings.warn('torchaudio C++ extension is not available.')
UserWarning: torchaudio C++ extension is not available.
It throws the error when SpecAugment() is called in the dataset.py file. I think it's related to torchaudio. I'm using its latest version. Any help would be appreciated.

IDE

Hi ...,
Which IDE are you using?
Best regards,
PeterPham

requirements.txt

The requirements.txt file is broken: the versions pinned in it conflict with each other.

ModuleNotFoundError: No module named 'neuralnet.dataset'

Hello,
When I run engine.py from the wakeword folder I get this error.
I have installed neuralnet, but I still get the error. Have I done something wrong?
I wouldn't think there is much that can go wrong when installing a module with pip.
If you need more information, please write. I don't know what would help.

Niklas

How many epochs do we need to run?

Can you please share your loss? I tested a sample on my model and on your pretrained model from Google Drive, and neither output looks right.

The output is either all 28 or all 1, interesting...

Does anyone have any idea?

ReduceLROnPlateau scheduler

/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/optimizers.py in configure_schedulers(self, schedulers, monitor)
113 if monitor is None:
114 raise MisconfigurationException(
--> 115 'configure_optimizers must include a monitor when a ReduceLROnPlateau scheduler is used.'
116 ' For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}'
117 )

MisconfigurationException: configure_optimizers must include a monitor when a ReduceLROnPlateau scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}
Any suggestions?

Error from pytorch_lightning.core.lightning import LightningModule

runfile('/home/waleed/Desktop/myCode/VoiceAssistant/speechrecognition/neuralnet/train.py', wdir='/home/waleed/Desktop/myCode/VoiceAssistant/speechrecognition/neuralnet')
Traceback (most recent call last):
  File "/home/waleed/Desktop/myCode/VoiceAssistant/speechrecognition/neuralnet/train.py", line 8, in <module>
    from pytorch_lightning.core.lightning import LightningModule
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/__init__.py", line 56, in <module>
    from pytorch_lightning.core import LightningDataModule, LightningModule
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/core/__init__.py", line 15, in <module>
    from pytorch_lightning.core.datamodule import LightningDataModule
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/core/datamodule.py", line 22, in <module>
    from pytorch_lightning.core.hooks import CheckpointHooks, DataHooks
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/core/hooks.py", line 18, in <module>
    from pytorch_lightning.utilities import AMPType, move_data_to_device, rank_zero_warn
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/utilities/__init__.py", line 20, in <module>
    from pytorch_lightning.utilities.apply_func import move_data_to_device
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>
    from torchtext.data import Batch
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/torchtext/__init__.py", line 40, in <module>
    _init_extension()
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/torchtext/__init__.py", line 36, in _init_extension
    torch.ops.load_library(ext_specs.origin)
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/torch/_ops.py", line 105, in load_library
    ctypes.CDLL(path)
  File "/home/waleed/anaconda3/envs/test/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)

OSError: /home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/torchtext/_torchtext.so: undefined symbol: ZN3c1023_fastEqualsForContainerERKNS_6IValueES2

Pretrained wake-word

Hello, thanks for your work.

Could you please publish your pretrained wake-word model?
And could you cite the paper that inspired the architecture of the model?

Raspberry Pi Support

Hello,

It's unclear which versions of the Raspberry Pi this will work on. I don't have a Raspberry Pi 4, only a 3B+. Will that be able to handle the models and process them in real time?

Thanks and great project!

Data json?

Might be a dumb question, but how are you supposed to format the data json?

create a train and test json in this format...
// make sure each sample is on a separate line
{"key": "/path/to/audio/speech.wav", "text": "this is your text"}
{"key": "/path/to/audio/speech.wav", "text": "another text example"}

This isn't valid JSON, since the top level doesn't have any keys, so both Python and I are confused lol
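
The layout described there is usually called JSON Lines: the file as a whole is not one JSON document, but every line is. Such a file would be parsed line by line rather than with a single json.load call. A minimal sketch, assuming the key and text fields from the example above (the function name read_json_lines is made up):

```python
import json


def read_json_lines(path):
    """Parse a file that holds one JSON object per line."""
    samples = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                samples.append(json.loads(line))
    return samples
```

pandas offers pd.read_json(path, lines=True) for the same layout.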
