henryxrl / listening-to-sound-of-silence-for-speech-denoising
[NeurIPS 2020] Official repository for the project "Listening to Sound of Silence for Speech Denoising"
I am trying to run your code. I installed all of the requirements, and then, in the data-processing phase, I defined the directories in the preprocessor_audioonly.py file as follows (I edited only the part below):
SNR = [-10, -7, -3, 0, 3, 7, 10]
for snr in tqdm(SNR):
    # DIR = '/proj/vondrick/rx2132/test_noise_robust_embedding/data/TIMIT/TEST_noisy_snr' + str(int(snr))
    DIR = '/home/mehri/data/PycharmProjects/denoise/data/sounds_of_silence_audioonly' + str(int(snr))
    CSV = '/home/mehri/data/PycharmProjects/denoise/data/sounds_of_silence_audioonly' + str(int(snr)) + '/sounds_of_silence' + str(int(snr)) + '.csv'
    build_csv(DIR, CSV, ext='.WAV')
    JSON = '/home/mehri/data/PycharmProjects/denoise/data/sounds_of_silence_audioonly' + str(int(snr)) + '.json'
    build_json_better(DIR, CSV, JSON, ext='.WAV')
But I get the error
Traceback (most recent call last):
File "<input>", line 2, in <module>
NameError: name 'tqdm' is not defined
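The NameError above is unrelated to the edited paths: the script as quoted never imports tqdm. Adding the import at the top should resolve it (assuming the tqdm package is installed):

```python
from tqdm import tqdm  # progress-bar wrapper used around the SNR loop

# sanity check: tqdm is now defined and wraps any iterable
for snr in tqdm([-10, 0, 10]):
    pass
```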
For labeling, is the threshold 0.08 applied to the average energy of one frame? I cannot find the label-processing code, and using 0.08 does not work for me.
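Since the label-processing code isn't in the repository, here is a minimal sketch of the kind of energy-thresholding pass the question describes. The function name, the mean-squared-energy definition, and the 0.08 threshold are all assumptions about the paper's procedure, not the authors' confirmed implementation:

```python
import numpy as np

def label_frames(audio, sr=16000, fps=30, threshold=0.08):
    """Label each 1/fps-second frame as speech (1) or silence (0) by
    thresholding its average energy. NOTE: the mean-squared-energy
    definition and the 0.08 threshold are guesses at the paper's
    procedure, not the authors' confirmed code."""
    frame_len = sr // fps
    n_frames = len(audio) // frame_len
    labels = np.zeros(n_frames, dtype=int)
    for i in range(n_frames):
        frame = audio[i * frame_len:(i + 1) * frame_len]
        labels[i] = int(np.mean(frame ** 2) > threshold)
    return labels

# e.g. 1 s of silence followed by 1 s of a loud constant signal
sig = np.concatenate([np.zeros(16000), 0.5 * np.ones(16000)])
print(label_frames(sig))  # first 30 labels are 0, remaining labels are 1
```

If 0.08 "does not work", the mismatch may simply be a different energy definition (RMS vs. mean-squared, dB vs. linear) rather than the threshold value itself.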
I'm trying to find the pretrained weights in the source code.
However, the link to the pretrained weights doesn't work.
So where can I find them?
I have an issue with the PESQ metric. I see that you use the pypesq package to compute PESQ, but that package can only compute the narrow-band PESQ version. The DEMAND dataset has a 16 kHz sampling rate, and the baseline models report the wide-band PESQ version; the two values will differ, and the three other metrics derived from PESQ will differ as well. Did the authors take this into account?
I carefully read the paper and loved it. I got most of the way toward making the code work for my use case, but I can't figure out what the purpose of the ckpt parameter is.
Following the steps and fixing things along the way, I inevitably get to
FileNotFoundError: [Errno 2] No such file or directory: '../model_output/audioonly_model/model/ckpt_epoch87.pth'
which should come as no surprise, but I'm clueless as to what the workaround should be.
Any guidance here please?
Hi, do you have any guidelines for setting up the training procedure?
I ask because the pretrained model is not available yet.
Thanks!
Hello,
Could you please attach a link to the final version of the data that you used (I mean the WAV files)?
Thank you in advance,
Sincerely yours,
Aleksandra
Hi! First of all, thank you for sharing your code and your paper (it has been a pleasure to read). I have gone through your code looking for the place where you create the labels for your audio, without success. Could you briefly explain how you generate the labels for training? I understand you divide the audio into segments of 1/30 s, but my question is more specific: you set the output of the network to 100, which means you need to label 100 segments. Am I correct? If you split a second into 30 segments, then 2 seconds gives 60 segments, i.e. a vector like [1, 0, 0, 0, 1, 1, 1, 1, 1, 0, ..., 1] marking which segments are speech (1) and which are silent (0). Hopefully I am understanding this correctly, so: how do you arrive at 100 segments?
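The arithmetic in the question can be checked directly. The 30-segments-per-second rate comes from the question itself; the clip duration implied by a 100-way output is merely derived from it and is not confirmed by the authors:

```python
FPS = 30  # segments per second (1/30 s each), as stated in the question

def n_segments(duration_s: float, fps: int = FPS) -> int:
    """Number of 1/fps-second segments in a clip of the given duration."""
    return int(duration_s * fps)

print(n_segments(2))   # 2 s -> 60 segments, matching the question
print(100 / FPS)       # a 100-way output would span 100/30 ≈ 3.33 s
```

So a 100-dimensional output only lines up if the training clips are about 3.33 s long (or the segment rate differs from 30 per second), which is exactly the point the question asks the authors to clarify.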
Hi! Is it possible to get the models you used to produce your results? They are not in the model_output directory. I would like to use them in my academic research at the university (I'm a student). Thanks!
Hi, are the pretrained model checkpoints you used to produce the results in the paper available for download? Running the code snippets you provided in the inference section fails because ckpt 87 and ckpt 24 are not in the model_output directories. Thanks!
While running step 2 of inference, I encountered an error where the script tried to load a nonexistent file:
model_1_silent_interval_detection/model_output/audioonly_model/outputs/sounds_of_silence/recovered/sos_1_0000001_mixed.wav
At this point the script loads that file; I believe it is supposed to be generated in step 1, but no such file was produced. Could you please look into this? Thanks.
I find your README unclear: when I click the dataset links in "We use two datasets, DEMAND and Google's AudioSet, as background noise," I don't see any installation process. Could you please explain in a bit more detail how to obtain those datasets and set them up?
Could you provide more information on how to run the inference models? For example, how do you modify the code to point to the dataset? Any extra details or examples would be greatly appreciated!