listening-to-sound-of-silence-for-speech-denoising's Issues

Error in data processing step

I am trying to run your code. I installed all of the requirements, and then, in the data processing phase, I defined the directories in the preprocessor_audioonly.py file as follows (I edited only the part below):

SNR = [-10, -7, -3, 0, 3, 7, 10]
for snr in tqdm(SNR):
    #DIR = '/proj/vondrick/rx2132/test_noise_robust_embedding/data/TIMIT/TEST_noisy_snr' + str(int(snr))
    DIR = '/home/mehri/data/PycharmProjects/denoise/data/sounds_of_silence_audioonly' + str(int(snr))
    CSV = '/home/mehri/data/PycharmProjects/denoise/data/sounds_of_silence_audioonly' + str(int(snr)) + '/sounds_of_silence' + str(int(snr)) + '.csv'
    build_csv(DIR, CSV, ext='.WAV')
    JSON = '/home/mehri/data/PycharmProjects/denoise/data/sounds_of_silence_audioonly' + str(int(snr)) + '.json'
    build_json_better(DIR, CSV, JSON, ext='.WAV')

But I get the following error:

Traceback (most recent call last):
  File "<input>", line 2, in <module>
NameError: name 'tqdm' is not defined
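
For reference, the NameError simply means tqdm was never imported in this snippet; a likely minimal fix (assuming the tqdm package is installed, as the requirements suggest) is to add the import before the loop:

from tqdm import tqdm  # progress-bar wrapper used around the SNR loop

SNR = [-10, -7, -3, 0, 3, 7, 10]
for snr in tqdm(SNR):
    ...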

label for silent frame detection

For labeling, is the threshold 0.08 applied to the average energy of one frame? I cannot find the label-processing code, and using 0.08 does not work for me.
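
For context, here is a minimal sketch of the energy-threshold labeling the question describes (the 1/30 s frame length and the 0.08 threshold are taken from the question; everything else is an assumption, not the repository's actual label-processing code):

import numpy as np

def label_silent_frames(audio, sr, frame_dur=1/30, threshold=0.08):
    """Label each 1/30 s frame by average energy: 0 = silent, 1 = speech."""
    frame_len = int(sr * frame_dur)            # samples per frame
    n_frames = len(audio) // frame_len
    labels = np.zeros(n_frames, dtype=int)
    for i in range(n_frames):
        frame = audio[i * frame_len:(i + 1) * frame_len]
        avg_energy = np.mean(frame ** 2)       # mean squared amplitude
        labels[i] = int(avg_energy >= threshold)
    return labels

Whether the threshold should apply to mean squared amplitude, RMS, or dB energy is exactly the ambiguity the question raises.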

pretrained model

I'm trying to find the pretrained weights in the source code.

However, the link to the pretrained weights doesn't work.

So where can I find them?

metrics

I have an issue with the PESQ metric. I see that you use the pypesq package to compute PESQ, but that package can only compute the narrow-band PESQ. The DEMAND dataset has a 16 kHz sampling rate, and the baseline models also report the wide-band PESQ; the two values will differ, and so will the other three PESQ-related metrics. Did the authors notice this?
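
To illustrate the gap the question points at, a small sketch using the pesq package (ludlows' python-pesq, which supports both modes; the file names are placeholders, and soundfile is assumed for I/O):

import soundfile as sf
from pesq import pesq  # pip install pesq; offers 'nb' and 'wb' modes

ref, sr = sf.read('clean.wav')      # placeholder 16 kHz reference
deg, _ = sf.read('denoised.wav')    # placeholder enhanced signal

nb = pesq(sr, ref, deg, 'nb')       # narrow-band, what pypesq reports
wb = pesq(sr, ref, deg, 'wb')       # wide-band, requires 16 kHz audio
print(f'NB-PESQ: {nb:.3f}  WB-PESQ: {wb:.3f}')

On 16 kHz speech the two scores are generally not comparable, which is the mismatch raised above.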

What to do about pre-trained models and the ckpt argument?

I read the paper carefully and loved it. I got most of the way toward making the code work for my use case, but I can't figure out the purpose of the ckpt parameter.

Following the steps and fixing things along the way, I inevitably get to
FileNotFoundError: [Errno 2] No such file or directory: '../model_output/audioonly_model/model/ckpt_epoch87.pth', which comes as no surprise, but I'm clueless about what the workaround should be.

Any guidance here please?
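
For context, a ckpt argument in PyTorch projects conventionally names a saved state_dict that inference restores before running; a minimal sketch of that convention (the nn.Linear stand-in is purely hypothetical, and some projects wrap the state_dict in a larger dict):

import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # hypothetical stand-in for the actual network

# The ckpt path must point at a file that already exists, i.e. one that
# was either downloaded or written out by your own training run.
ckpt_path = '../model_output/audioonly_model/model/ckpt_epoch87.pth'
state = torch.load(ckpt_path, map_location='cpu')
model.load_state_dict(state)
model.eval()

So absent a downloadable checkpoint, the workaround is presumably to train first and point ckpt at the epoch file that training saves.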

training procedure

Hi, do you have any guidelines for setting up the training procedure?
I ask because the pretrained model is not available yet.

Thanks !

data preprocessing

Hello,

Could you please attach a link to the final version of the data that you used (I mean the WAV files)?

Thank you in advance,
Sincerely yours,
Aleksandra

Labels for silent detector

Hi! First of all, thank you for sharing your code and your paper (it was a pleasure to read). I have gone through your code looking for the place where you create the labels for the audio, without success. Could you briefly explain how you generate the training labels? I understand you divide the audio into segments of 1/30 s, but my question is this: you set the network's output size to 100, which means you need to label 100 segments, correct? If you split one second into 30 segments, then 2 seconds gives 60 segments, i.e., a vector like [1, 0, 0, 0, 1, 1, 1, 1, 1, 0, ..., 1] marking which segments are speech (1) and which are silent (0). Hopefully I am understanding this correctly; so how do you end up with 100 segments?
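
To make the arithmetic above concrete, a tiny sketch (the 30 segments-per-second rate and the 100-unit output come from the question; the clip lengths are illustrative):

# Segments produced by 1/30 s framing vs. a 100-unit network output.
frame_rate = 30  # segments per second, one per 1/30 s
for clip_seconds in (1.0, 2.0, 100 / frame_rate):
    n_segments = round(clip_seconds * frame_rate)
    print(f'{clip_seconds:.2f} s -> {n_segments} segments')
# 100 segments would correspond to a clip of 100/30 ≈ 3.33 s.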

Pretrained models

Hi! Is it possible to get the models you used to produce your results? They are not in the model_output directory. I would like to use them in my academic research at university (I'm a student). Thanks!

Pretrained checkpoints availability

Hi, are the pretrained model checkpoints you used to produce the results in the paper available for download? Running the code snippets you provided in the inference section fails because ckpt 87 and ckpt 24 are not in the model_output directories. Thanks!

Error in step 2 of inference

While running step 2 of inference, I encountered an error where the script tries to load a nonexistent file:
model_1_silent_interval_detection/model_output/audioonly_model/outputs/sounds_of_silence/recovered/sos_1_0000001_mixed.wav
At this point the script loads that file, which I think is supposed to be generated by step 1, but no such file was generated. Can you please look into this? Thanks.
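
As a debugging aid, a minimal sketch that verifies the step-1 output exists before step 2 tries to load it (the path is copied from the error above; the check is generic Python, not the repository's code):

from pathlib import Path

expected = Path('model_1_silent_interval_detection/model_output/'
                'audioonly_model/outputs/sounds_of_silence/recovered/'
                'sos_1_0000001_mixed.wav')
if not expected.exists():
    raise FileNotFoundError(
        f'{expected} is missing; re-run step 1 and confirm it writes to '
        'the recovered/ directory before starting step 2.')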

Dataset installation documentation is unclear

I find your README unclear: when I click the dataset links in the sentence "We use two datasets, DEMAND and Google’s AudioSet, as background noise," I don't see any installation process. Could you please explain it a little more specifically? I don't understand how to obtain these datasets or how to install them.

Inference Question

Could you provide more information on how to run the inference models? For example, how do you modify the code to point to the dataset? Any extra details or examples would be greatly appreciated!
