learnedvector / a-hackers-ai-voice-assistant

A hackers AI voice assistant, built using Python and PyTorch.

Home Page: https://www.youtube.com/playlist?list=PL5rWfvZIL-NpFXM9nFr15RmEEh4F4ePZW

License: MIT License

Dockerfile 0.63% Python 94.84% HTML 4.22% Shell 0.30%

a-hackers-ai-voice-assistant's Issues

How do I use split_commonvoice.py correctly?

Most of it is clear to me, but I don't know which file I should pass as the --file_name argument. The rest is explained very clearly.

Model is not being saved after first epoch

Hey guys,
is it normal that my model is not being saved after the first epoch?

Where to get the kenlm model from?

I installed and built the kenlm using the github page directions. But there is no model file, it is a bunch of folders. Is the path_to_nlm the path to that folder? If not, how can I get the kenlm model file?

Thanks.

Getting error while running wakeword engine.py

Please help, I don't understand why I am getting this error when running the wakeword engine.py Python script.
I am new to Ubuntu, so please help me.

Thanks...

ModuleNotFoundError: No module named 'dataset'

Hi, when I run python train.py --train_file /path/to/train/json --valid_file /path/to/valid/json --load_model_from /path/to/pretrain/speechrecognition.ckpt, this error occurs. Where does the dataset library come from? From PyTorch? From PyTorch Lightning?

Getting no output in engine.py and demo.py

Hey guys,
I am getting no output in engine.py and demo.py.
Is someone able to help me out with this?

This is how it looks in the terminal when starting the engine.py script:

This is how it looks in the terminal when starting the demo.py script:

After hitting the start button:

As you can see, the speech recognition bar just disappears.
Hopefully some of you guys can help me out with that :)

AttributeError: 'dict' object has no attribute 'step'

Hi, I was running train.py on my own dataset and got the error: AttributeError: 'dict' object has no attribute 'step'. I found that it comes from the line "self.scheduler.step(avg_loss)", so I changed it to "self.step(avg_loss)". But then another error appeared: TypeError: iteration over a 0-d tensor, at the line "spectrograms, labels, input_lengths, label_lengths = batch". Has anyone solved this?
Many thanks.

cannot reshape tensor of 0 elements

I'm getting this error in dataset.py and train.py under the wakeword folder while resampling the waveform:
cannot reshape tensor of 0 elements into shape [-1, 40, 0] because the unspecified dimension size -1 can be any value and is ambiguous

try:
    file_path = self.data.key.iloc[idx]
    waveform, sr = torchaudio.load(file_path, normalization=True)
    if sr > self.sr:
        waveform = torchaudio.transforms.Resample(sr, self.sr)(waveform)
    mfcc = self.audio_transform(waveform)
    label = self.data.label.iloc[idx]
except Exception as e:
    print(str(e))

Any help would be appreciated.

ModuleNotFoundError: No module named 'pydub'

Hello,

Every time I try to start commonvoice_create_jsons.py I get this error:

from pydub import AudioSegment
ModuleNotFoundError: No module named 'pydub'

I don't know why it occurs because I've selected the right interpreter and I've downloaded the right version of pydub.
This error just freaks me out.
Can someone please help me?

Decode without beam search

Hi, what an amazing repository.

After getting the output logits of the model, how do I decode them without beam search?
I've been running it like this with my model
(and I have this id2label map:

{0: "'", 1: ' ', 2: 'a', 3: 'b', 4: 'c', 5: 'd', 6: 'e', 7: 'f', 8: 'g', 9: 'h', 10: 'i', 11: 'j', 12: 'k', 13: 'l', 14: 'm', 15: 'n', 16: 'o', 17: 'p', 18: 'q', 19: 'r', 20: 's', 21: 't', 22: 'u', 23: 'v', 24: 'w', 25: 'x', 26: 'y', 27: 'z', 28: '_'}

)

with torch.no_grad():
        log_mel = featurizer(waveform).unsqueeze(1)
        out, hidden = model(log_mel, hidden)
        out = torch.nn.functional.softmax(out, dim=2)
        out = out.transpose(0, 1)
        pred_ids = torch.argmax(out, -1)
        preds = [id2label[i] for i in pred_ids[0].tolist() if id2label[i]!='_']

is that right?
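One thing worth checking in the snippet above: greedy (best-path) CTC decoding normally collapses consecutive repeated ids before dropping the blank, while the list comprehension in the question only drops blanks, so repeated frames would show up as repeated letters. Below is a minimal pure-Python sketch, assuming `_` (id 28) is the CTC blank, as the label map suggests, and that the input is the per-frame argmax for a single utterance. `ctc_greedy_decode` is a hypothetical helper name, not a function from the repository.

```python
from itertools import groupby

# Label map as given in the question; id 28 ('_') is assumed to be the CTC blank.
ID2LABEL = {0: "'", 1: " ", **{i + 2: chr(ord("a") + i) for i in range(26)}, 28: "_"}


def ctc_greedy_decode(pred_ids, id2label=ID2LABEL, blank_id=28):
    """Best-path decoding: collapse repeated frame ids, then drop blanks."""
    collapsed = [idx for idx, _ in groupby(pred_ids)]
    return "".join(id2label[idx] for idx in collapsed if idx != blank_id)


# Frames spelling: h h _ e l l _ l o o
frames = [9, 9, 28, 6, 13, 13, 28, 13, 16, 16]
print(ctc_greedy_decode(frames))
```

With the tensors from the question this would be called as `ctc_greedy_decode(pred_ids[0].tolist())`. The softmax step is not needed for the argmax itself, since softmax does not change which class has the largest score.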

Is there any alternative to torchaudio

I am using Windows 10 and I get an assertion error when running the Python script. As torchaudio is not supported on Windows, is there any alternative I can use?
Thank you in advance.

pytorch_lightning.utilities.exceptions.MisconfigurationException: `configure_optimizers` must include a monitor when a `ReduceLROnPlateau` scheduler is used.

When I try to train on my own dataset with train.py I get this error...

Any suggestion, please...

Variable queue not defined in Wakeword engine

In the Listener class implemented in the wakeword engine.py file, the queue variable inside the listen method is not globally defined.

class Listener:

    def __init__(self, sample_rate=8000, record_seconds=2):
        self.chunk = 1024
        self.sample_rate = sample_rate
        self.record_seconds = record_seconds
        self.p = pyaudio.PyAudio()
        self.stream = self.p.open(format=pyaudio.paInt16,
                        channels=1,
                        rate=self.sample_rate,
                        input=True,
                        output=True,
                        frames_per_buffer=self.chunk)

    def listen(self, queue):
        while True:
            data = self.stream.read(self.chunk , exception_on_overflow=False)
            queue.append(data)
            time.sleep(0.01)

    def run(self, queue):
        thread = threading.Thread(target=self.listen, args=(queue,), daemon=True)
        thread.start()
        print("\nWake Word Engine is now listening... \n")
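For what it's worth, in the class above `queue` is a parameter of `listen` and `run`, so it does not need to exist as a global: the caller creates the list and it reaches the thread through `args=(queue,)`. Here is a minimal sketch of that pattern, using a stand-in class that appends fixed chunks instead of reading from a PyAudio stream. `FakeListener` and `audio_q` are illustrative names, and unlike the original `run`, this one returns the thread so the example can wait for it.

```python
import threading


class FakeListener:
    """Stand-in for Listener: appends fixed chunks instead of reading audio."""

    def __init__(self, chunks):
        self.chunks = chunks

    def listen(self, queue):
        # `queue` is whatever list the caller passed in, not a global name.
        for data in self.chunks:
            queue.append(data)

    def run(self, queue):
        thread = threading.Thread(target=self.listen, args=(queue,), daemon=True)
        thread.start()
        return thread


audio_q = []  # the caller owns the list and shares it with the thread
worker = FakeListener([b"a", b"b", b"c"]).run(audio_q)
worker.join()
print(audio_q)
```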

How to run this on cpu

I want to train this model on the CPU instead of the GPU. In train.py under the speechrecognition folder, what should I change in the code to make it work properly?

Any solution for "dual channel, skipping audio file"? I got this audio from Mimic Recording Studio and suddenly it stopped working

No Default Output Device

ALSA lib conf.c:5220:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM dmix
Traceback (most recent call last):
  File "engine.py", line 124, in <module>
    asr_engine = SpeechRecognitionEngine(args.model_file, args.ken_lm_file)
  File "engine.py", line 44, in __init__
    self.listener = Listener(sample_rate=8000)
  File "engine.py", line 22, in __init__
    self.stream = self.p.open(format=pyaudio.paInt16,
  File "/usr/local/lib/python3.8/dist-packages/pyaudio.py", line 750, in open
    stream = Stream(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pyaudio.py", line 441, in __init__
    self._stream = pa.open(**arguments)
OSError: [Errno -9996] Invalid output device (no default output device)

wakeword engine.py always gives a prediction value of 1.0 regardless of the voice input

I use around 300 .wav files for each label, so class imbalance is not the issue. Also, when I trained the model the accuracy was almost perfect (~0.99).

Getting an AssertionError; I guess there is no function torchaudio.load()

(venv) C:\Users\Zeus\PycharmProjects\SenSix\A-Hackers-AI-Voice-Assistant\VoiceAssistant\wakeword>engine.py --model_file speechrecognition.zip
C:\Users\Zeus\PycharmProjects\SenSix\venv\lib\site-packages\torchaudio\extension\extension.py:14: UserWarning: torchaudio C++ extension is not available.
warnings.warn('torchaudio C++ extension is not available.')
C:\Users\Zeus\PycharmProjects\SenSix\venv\lib\site-packages\torchaudio\backend\utils.py:63: UserWarning: The interface of "soundfile" backend is planned to change in 0.8.0 to match that of "sox_io" backend and the current interface will be removed in 0.9.0. To use the new interface, do torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False before setting the backend to "soundfile". Please refer to pytorch/audio#903 for the detail.
warnings.warn(

*** Make sure you have sox installed on your system for the demo to work!!!
If you don't want to use sox, change the play function in the DemoAction class
in engine.py module to something that works with your system.

Wake Word Engine is now listening...

Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\Users\Zeus\AppData\Local\Programs\Python\Python38\lib\threading.py", line 932, in _bootstrap_inner
self.run()
File "C:\Users\Zeus\AppData\Local\Programs\Python\Python38\lib\threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\Zeus\PycharmProjects\SenSix\A-Hackers-AI-Voice-Assistant\VoiceAssistant\wakeword\engine.py", line 86, in inference_loop
action(self.predict(self.audio_q))
File "C:\Users\Zeus\PycharmProjects\SenSix\A-Hackers-AI-Voice-Assistant\VoiceAssistant\wakeword\engine.py", line 67, in predict
waveform, _ = torchaudio.load(fname, normalization=False) # don't normalize on train
File "C:\Users\Zeus\PycharmProjects\SenSix\venv\lib\site-packages\torchaudio\backend\soundfile_backend.py", line 41, in load
assert normalization
AssertionError

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2929: character maps to <undefined>
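
A possible cause, offered as a guess since the issue has no body: on Windows, `open()` without an `encoding` argument typically falls back to a legacy code page such as cp1252, in which byte 0x9d is undefined, so reading a UTF-8 file can fail exactly like this. The sketch below reproduces the failure on a throwaway file and shows the explicit-encoding fix. The file name and contents are made up, and which file in the repository triggers the error is not known from the report.

```python
import os
import tempfile

# Write a small UTF-8 file; the right curly quote (U+201D) encodes to E2 80 9D.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("He said \u201chello\u201d\n")

# Reading with the cp1252 codec fails on byte 0x9d ...
try:
    with open(path, encoding="cp1252") as f:
        f.read()
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# ... while stating the encoding explicitly works on any platform.
with open(path, encoding="utf-8") as f:
    text = f.read()

print(cp1252_failed, ascii(text))
```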

Running `demo.py` in Docker on Ubuntu 20.04

Hi,

When running demo.py, I am getting this error below.

[2021-09-06 04:53:34,145] ERROR in app: Exception on /get_audio [GET]
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "demo.py", line 28, in get_audio
    with open('transcript.txt', 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'transcript.txt'
127.0.0.1 - - [06/Sep/2021 04:53:34] "GET /get_audio HTTP/1.1" 500 -

This is the command I ran.

root@elf-desktop:/home/jun1.oh/workspace/A-Hackers-AI-Voice-Assistant/VoiceAssistant/speechrecognition/demo# python demo.py --model_file ./speechrecognition.zip

Do I need to prepare a transcript.txt file for the demo?

Or is this file auto-generated from voice input during the demo? In that case, the problem is probably that my Linux computer doesn't have a microphone installed.

Thanks,
Daniel

New model training fails, maybe something is wrong in data loading. Please help.

python stopped working

integrate with Lightning ecosystem CI

Hello and so happy to see you use Pytorch-Lightning! 🎉
Just wondering if you have already heard about the fairly new PyTorch Lightning (PL) ecosystem CI, which we would like to invite you to join. You can check out our blog post about it: Stay Ahead of Breaking Changes with the New Lightning Ecosystem CI ⚡
As you use the PL framework for your cool project, we would like to enhance your experience and offer you safe updates to our future releases. At the moment you run tests with a particular PL version, but the next version may accidentally turn out to be incompatible with your project... 😕 We do not intend to change anything on our project side, but we still have a solution: the ecosystem CI tests both your project and our latest development head together, so we can find problems very early and prevent releasing a bad version... 👍

What is needed to do?

have some tests, including PL integration
add config to ecosystem CI - https://github.com/PyTorchLightning/ecosystem-ci

What will you get?

scheduled nightly testing configured for development/stable versions
slack notification if something went wrong to investigate
testing also on multi-GPU machine as our gift to you 🐰

cc: @Borda

ReduceLROnPlateau problem.

File "/home/User/anaconda3/envs/User/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/optimizer_connector.py", line 47, in update_learning_rates
monitor_key = lr_scheduler['monitor']
KeyError: 'monitor'

pytorch_lightning.utilities.exceptions.MisconfigurationException: ReduceLROnPlateau requires returning a dict from configure_optimizers with the keyword monitor=. For example:return {'optimizer': optimizer, 'lr_scheduler': scheduler, 'monitor': 'your_loss'}

Any suggestion on solving this?

KeyError: 'D'

I'm constantly getting this error:
`During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/pietmuller/Dokumente/code/sr.venv/speechrecognition/neuralnet/dataset.py", line 110, in __getitem__
    label = self.text_process.text_to_int_sequence(self.data['text'].iloc[idx])
  File "/Users/pietmuller/Dokumente/code/sr.venv/speechrecognition/neuralnet/utils.py", line 51, in text_to_int_sequence
    ch = self.char_map[c]
KeyError: 'D'`
I didn't change anything in the utils.py file.
I'm also wondering why there is a KeyError for the character "D": the variable char_map_str (in utils.py) doesn't contain a capital D. Also, I want to train it on a Common Voice dataset (German, to be exact).

It's not only 'D'; it's 'D' and 'E' alternately, but no other characters...
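
The traceback shows the lookup `self.char_map[c]` failing on an uppercase letter, which fits transcripts that contain capitals while the character map only lists lowercase characters. One way around it is to normalize each transcript before the lookup. The sketch below assumes a map of apostrophe, space and a-z; `CHAR_MAP`, `GERMAN_FALLBACK` and `normalize` are hypothetical stand-ins, not the repository's `utils.py`. For German, the umlauts and ß also need either their own entries or a transliteration like the one assumed here.

```python
import string

# Hypothetical stand-in for the repository's char_map: apostrophe, space, a-z.
CHAR_MAP = {ch: i for i, ch in enumerate("' " + string.ascii_lowercase)}

# Assumed transliteration for German characters that are missing from the map.
GERMAN_FALLBACK = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def normalize(text):
    """Lowercase, transliterate umlauts, and drop characters not in CHAR_MAP."""
    text = text.lower()
    for src, dst in GERMAN_FALLBACK.items():
        text = text.replace(src, dst)
    return "".join(ch for ch in text if ch in CHAR_MAP)


def text_to_int_sequence(text):
    """Map a transcript to integer ids after normalization."""
    return [CHAR_MAP[ch] for ch in normalize(text)]


print(normalize("Die Tür"), text_to_int_sequence("Die Tür"))
```

Silently dropping unknown characters keeps training running but hides data problems, so logging the offending transcript instead may be the better choice while debugging.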

`configure_optimizers` must include a monitor when a `ReduceLROnPlateau` scheduler is used.

First of all, I know that this issue already exists among the closed issues. The problem is that no answer was given on how to fix it, so I don't know what to do.
I'm getting the following error:
pytorch_lightning.utilities.exceptions.MisconfigurationException: configure_optimizers must include a monitor when a ReduceLROnPlateau scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}

I don't know where I would have to put this monitor or if I really need it.
I read somewhere that it could have something to do with pytorch_lightning, so I tried to downgrade it to version 1.1.1.
This gives the following error:
ImportError: cannot import name 'Batch' from 'torchtext.data' (/usr/local/lib/python3.7/dist-packages/torchtext/data/__init__.py)
Here I found out that I have to use the git version (the other option didn't help me).

So I would appreciate your help.
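
As to where the monitor goes: the error message itself shows the expected shape, a dict returned from `configure_optimizers` that names the metric the scheduler should watch. Below is a minimal sketch of that return value, kept free of PyTorch imports so it runs anywhere. `optimizer_config` is a hypothetical helper, `"val_loss"` is an assumed metric name, and the commented part shows how it would presumably be wired into the LightningModule.

```python
def optimizer_config(optimizer, scheduler, monitor="val_loss"):
    """Build the dict that the error message asks configure_optimizers to return."""
    return {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": monitor}


# Intended use inside the LightningModule (sketch only, not the repository's code):
#
#     def configure_optimizers(self):
#         optimizer = torch.optim.AdamW(self.parameters(), lr=1e-3)
#         scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
#         return optimizer_config(optimizer, scheduler, monitor="val_loss")
#
#     def validation_step(self, batch, batch_idx):
#         ...
#         self.log("val_loss", loss)  # the monitored name has to be logged

# Placeholder strings stand in for the real optimizer and scheduler here.
config = optimizer_config(optimizer="<optimizer>", scheduler="<scheduler>")
print(sorted(config))
```

With this return shape Lightning should step the scheduler on its own, so a manual call such as `self.scheduler.step(avg_loss)` would no longer be needed. Treat that as something to verify against the Lightning version you have installed.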

ModuleNotFoundError: No module named 'textprocess'

Guys, I imported all the references in Python. I use Linux (Debian) and I get this error ...

2021-02-11 12:04:52.873401: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-02-11 12:04:52.873459: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
/home/dani/anaconda3/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "demo.py", line 7, in <module>
    from engine import SpeechRecognitionEngine
  File "/home/dani/Downloads/A-Hackers-AI-Voice-Assistant-master/VoiceAssistant/speechrecognition/engine.py", line 10, in <module>
    from neuralnet.dataset import get_featurizer
  File "/home/dani/Downloads/A-Hackers-AI-Voice-Assistant-master/VoiceAssistant/speechrecognition/neuralnet/dataset.py", line 6, in <module>
    from utils import TextProcess
  File "/home/dani/Downloads/A-Hackers-AI-Voice-Assistant-master/VoiceAssistant/speechrecognition/demo/utils.py", line 13, in <module>
    import textprocess
ModuleNotFoundError: No module named 'textprocess'

help!

Ai

pip packages

A few of the packages cannot be installed at the versions cited next to them. Not all, but a few. I was wondering if that changes anything, since I'm also a beginner in Python and looking forward to making this work. Furthermore, the venv folder is red in my PyCharm; I suppose this has something to do with the pip packages, right? Anyway, if anyone can help me I would be more than grateful.

torchaudio C++ extension is not available

I'm getting this error in dataset.py and train.py under wakeword folder.
Warning (from warnings module):
File "C:\Users\A\AppData\Local\Programs\Python\Python36\lib\site-packages\torchaudio\extension\extension.py", line 14
warnings.warn('torchaudio C++ extension is not available.')
UserWarning: torchaudio C++ extension is not available.
When calling SpecAugment() in the dataset.py file, it throws an error. I think it's related to torchaudio. I'm using its latest version. Any help would be appreciated.

IDE

Hi ...,
Which IDE are you using?
Best regards,
PeterPham

requirements.txt

requirements.txt file is corrupted. The versions in it were in conflict.

ModuleNotFoundError: No module named 'neuralnet.dataset'

Hello,
When I run engine.py from the wakeword folder I get this error.
I have installed neuralnet, but I still get this error. Have I done something wrong?
I don't think you can do much wrong when installing a module with pip.
If you need more information, please write. I don't know what would help.

Niklas

ModuleNotFoundError: No module named 'utils' in dataset.py when trying to run demo.py and engine.py

No folder named utils exists in the neuralnet folder.
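
A note on this one: a module does not have to be a folder. Another traceback on this page points at a file `neuralnet/utils.py`, and a plain `from utils import ...` only finds it when the directory containing that file is on Python's import path. That holds when a script is started from inside that folder, but not necessarily when `demo.py` or `engine.py` is launched from elsewhere. A minimal sketch of putting a directory on the path follows. `add_to_import_path` and `demo_helper_mod` are made-up names used for a self-contained demonstration, and the repository path in the comment is an assumption.

```python
import importlib
import os
import sys
import tempfile


def add_to_import_path(directory):
    """Put `directory` at the front of sys.path so top-level imports resolve there."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# Demo with a throwaway module. In the repository, the directory would be the
# folder that contains utils.py (assumed: VoiceAssistant/speechrecognition/neuralnet).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo_helper_mod.py"), "w", encoding="utf-8") as f:
    f.write("VALUE = 42\n")

add_to_import_path(demo_dir)
demo_helper_mod = importlib.import_module("demo_helper_mod")
print(demo_helper_mod.VALUE)
```

Setting the `PYTHONPATH` environment variable to the same directory before starting the script has the same effect without touching the code.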

Language model alpha and beta

How do you determine the alpha and beta values for the language model?

Get timestamp

How do I get a timestamp for each word?

RuntimeError: The traced function didn't return any values! Side-effects are not captured in traces, so it would be a no-op.

I get this error when tracing NLU model with optimize-graph.py

installing CTCDECODE

Any solution on installing missing dependencies?

How could I train my own KenLM language model, and which one did Michael use?

From https://github.com/kpu/kenlm. And which one did Michael use?

Error using argparse in IPython / Google Colab
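
A likely explanation, since the issue has no body: in a notebook, `parse_args()` with no arguments reads `sys.argv`, which holds the kernel's own arguments, so argparse usually exits with an "unrecognized arguments" error. Passing an explicit list avoids that. In the sketch below, `--train_file` and `--valid_file` are the flags mentioned elsewhere on this page, while `--epochs` and the file names are purely illustrative.

```python
import argparse

parser = argparse.ArgumentParser(description="training options (illustrative)")
parser.add_argument("--train_file", default=None)
parser.add_argument("--valid_file", default=None)
parser.add_argument("--epochs", type=int, default=10)  # hypothetical flag

# Explicit list instead of sys.argv, which holds the notebook kernel's arguments.
args = parser.parse_args(["--train_file", "train.json", "--valid_file", "valid.json"])
print(args.train_file, args.valid_file, args.epochs)
```

`parser.parse_known_args()` is an alternative that keeps reading `sys.argv` but returns unknown arguments in a separate list instead of failing.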

How many epochs do we need to run?

Can you please share your loss? I tested a sample on my model and on your pretrained model from Google Drive, and neither looks right.

The output is either all 28 or 1, interesting...

Does anyone have any idea?

Can I also use mp3 files for training instead of wav files?

Hello, first of all, nice work!
I wanted to ask if I could use mp3 files instead of wav files and, if that works, which lines I would have to change.

`ReduceLROnPlateau` scheduler

/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/optimizers.py in configure_schedulers(self, schedulers, monitor)
113 if monitor is None:
114 raise MisconfigurationException(
--> 115 'configure_optimizers must include a monitor when a ReduceLROnPlateau scheduler is used.'
116 ' For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}'
117 )

MisconfigurationException: configure_optimizers must include a monitor when a ReduceLROnPlateau scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}
Any suggestions?

Error from pytorch_lightning.core.lightning import LightningModule

runfile('/home/waleed/Desktop/myCode/VoiceAssistant/speechrecognition/neuralnet/train.py', wdir='/home/waleed/Desktop/myCode/VoiceAssistant/speechrecognition/neuralnet')
Traceback (most recent call last):

  File "/home/waleed/Desktop/myCode/VoiceAssistant/speechrecognition/neuralnet/train.py", line 8, in <module>
    from pytorch_lightning.core.lightning import LightningModule

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/__init__.py", line 56, in <module>
    from pytorch_lightning.core import LightningDataModule, LightningModule

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/core/__init__.py", line 15, in <module>
    from pytorch_lightning.core.datamodule import LightningDataModule

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/core/datamodule.py", line 22, in <module>
    from pytorch_lightning.core.hooks import CheckpointHooks, DataHooks

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/core/hooks.py", line 18, in <module>
    from pytorch_lightning.utilities import AMPType, move_data_to_device, rank_zero_warn

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/utilities/__init__.py", line 20, in <module>
    from pytorch_lightning.utilities.apply_func import move_data_to_device

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>
    from torchtext.data import Batch

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/torchtext/__init__.py", line 40, in <module>
    _init_extension()

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/torchtext/__init__.py", line 36, in _init_extension
    torch.ops.load_library(ext_specs.origin)

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/torch/_ops.py", line 105, in load_library
    ctypes.CDLL(path)

  File "/home/waleed/anaconda3/envs/test/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)

OSError: /home/waleed/anaconda3/envs/test/lib/python3.7/site-packages/torchtext/_torchtext.so: undefined symbol: ZN3c1023_fastEqualsForContainerERKNS_6IValueES2

Pretrained wake-word

Hello, thanks for your work,

Can you push your pretrained wake-word model, please?
And can you cite the paper that inspired the architecture of the model?

Raspberry Pi Support

Hello,

It's unclear which version of the Raspberry Pi this will work on. I don't have a Raspberry Pi 4, only a 3B+. Will that be able to handle the models and process them in real time?

Thanks and great project!

Data json?

Might be a dumb question but how are you supposed to format the data json?

create a train and test json in this format...
// make sure each sample is on a separate line
{"key": "/path/to/audio/speech.wav", "text": "this is your text"}
{"key": "/path/to/audio/speech.wav", "text": "another text example"}

This isn't valid JSON since the top level doesn't have any keys, so both Python and I are confused lol
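
For context, the layout quoted above is one JSON object per line (often called JSON Lines). The file as a whole is indeed not a single JSON document, so `json.load` on the entire file fails, but each line parses on its own. A minimal sketch of writing and reading that layout with the standard library is below. The file name and helper names are illustrative, and the paths are the placeholders from the quoted README text.

```python
import json
import os
import tempfile

samples = [
    {"key": "/path/to/audio/speech.wav", "text": "this is your text"},
    {"key": "/path/to/audio/speech.wav", "text": "another text example"},
]


def write_json_lines(path, rows):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_json_lines(path):
    """Parse each non-empty line as its own JSON object."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


path = os.path.join(tempfile.mkdtemp(), "train.json")
write_json_lines(path, samples)
print(read_json_lines(path))
```

pandas can load the same layout directly with `pd.read_json(path, lines=True)`.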

Speech Recognition

When will the documentation for Speech Recognition be available?

Want to add some unique commands: Search, Finding Location, Playing National Anthem

I want to add these features to the voice assistant:

Search
Finding Location
Playing National Anthem/ Song

Attaching my sample voice assistant here
https://github.com/dharani211/Voice-Assitant/blob/main/speech.py