listening-to-sound-of-silence-for-speech-denoising's Issues

Error in data processing step

I am trying to run your code. I installed all of the requirements, and then, in the data processing phase, I defined the directories in the preprocessor_audioonly.py file as follows (I edited only the part below):

SNR = [-10, -7, -3, 0, 3, 7, 10]
for snr in tqdm(SNR):
    #DIR = '/proj/vondrick/rx2132/test_noise_robust_embedding/data/TIMIT/TEST_noisy_snr' + str(int(snr))
    DIR = '/home/mehri/data/PycharmProjects/denoise/data/sounds_of_silence_audioonly' + str(int(snr))
    CSV = '/home/mehri/data/PycharmProjects/denoise/data/sounds_of_silence_audioonly' + str(int(snr)) + '/sounds_of_silence' + str(int(snr)) + '.csv'
    build_csv(DIR, CSV, ext='.WAV')
    JSON = '/home/mehri/data/PycharmProjects/denoise/data/sounds_of_silence_audioonly' + str(int(snr)) + '.json'
    build_json_better(DIR, CSV, JSON, ext='.WAV')

But I get the following error:

Traceback (most recent call last):
  File "<input>", line 2, in <module>
NameError: name 'tqdm' is not defined
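
For reference, the NameError simply means tqdm was never imported in this snippet; a likely minimal fix (assuming the tqdm package is installed, as the requirements suggest) is to add the import before the loop:

from tqdm import tqdm  # progress-bar wrapper used around the SNR loop

SNR = [-10, -7, -3, 0, 3, 7, 10]
for snr in tqdm(SNR):
    ...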

label for silent frame detection

For labeling, is the threshold 0.08 applied to the average energy of one frame? I cannot find the label-processing code, and using 0.08 does not work for me.
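
For context, here is a minimal sketch of the energy-threshold labeling the question describes (the 1/30 s frame length and the 0.08 threshold are taken from the question; everything else is an assumption, not the repository's actual label-processing code):

import numpy as np

def label_silent_frames(audio, sr, frame_dur=1/30, threshold=0.08):
    """Label each 1/30 s frame by average energy: 0 = silent, 1 = speech."""
    frame_len = int(sr * frame_dur)            # samples per frame
    n_frames = len(audio) // frame_len
    labels = np.zeros(n_frames, dtype=int)
    for i in range(n_frames):
        frame = audio[i * frame_len:(i + 1) * frame_len]
        avg_energy = np.mean(frame ** 2)       # mean squared amplitude
        labels[i] = int(avg_energy >= threshold)
    return labels

Whether the threshold should apply to mean squared amplitude, RMS, or dB energy is exactly the ambiguity the question raises.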

pretrained model

I'm trying to find the pretrained weights in the source code.

However, the link to the pretrained weights doesn't work.

So where can I find them?

metrics

I have an issue with the PESQ metric. I see that you use the pypesq package to compute PESQ, but that package can only compute the narrow-band PESQ. The DEMAND dataset has a 16 kHz sampling rate, and the baseline models also report the wide-band PESQ; the two values will differ, and so will the other three PESQ-related metrics. Did the authors notice this?
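
To illustrate the gap the question points at, a small sketch using the pesq package (ludlows' python-pesq, which supports both modes; the file names are placeholders, and soundfile is assumed for I/O):

import soundfile as sf
from pesq import pesq  # pip install pesq; offers 'nb' and 'wb' modes

ref, sr = sf.read('clean.wav')      # placeholder 16 kHz reference
deg, _ = sf.read('denoised.wav')    # placeholder enhanced signal

nb = pesq(sr, ref, deg, 'nb')       # narrow-band, what pypesq reports
wb = pesq(sr, ref, deg, 'wb')       # wide-band, requires 16 kHz audio
print(f'NB-PESQ: {nb:.3f}  WB-PESQ: {wb:.3f}')

On 16 kHz speech the two scores are generally not comparable, which is the mismatch raised above.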

What to do about pre-trained models and the ckpt argument?

I read the paper carefully and loved it. I got most of the way toward making the code work for my use case, but I can't figure out the purpose of the ckpt parameter.

Following the steps and fixing things along the way, I inevitably get to
FileNotFoundError: [Errno 2] No such file or directory: '../model_output/audioonly_model/model/ckpt_epoch87.pth', which comes as no surprise, but I'm clueless about what the workaround should be.

Any guidance here please?
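
For context, a ckpt argument in PyTorch projects conventionally names a saved state_dict that inference restores before running; a minimal sketch of that convention (the nn.Linear stand-in is purely hypothetical, and some projects wrap the state_dict in a larger dict):

import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # hypothetical stand-in for the actual network

# The ckpt path must point at a file that already exists, i.e. one that
# was either downloaded or written out by your own training run.
ckpt_path = '../model_output/audioonly_model/model/ckpt_epoch87.pth'
state = torch.load(ckpt_path, map_location='cpu')
model.load_state_dict(state)
model.eval()

So absent a downloadable checkpoint, the workaround is presumably to train first and point ckpt at the epoch file that training saves.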

training procedure

Hi, do you have any guidelines for setting up the training procedure?
I ask because the pretrained model is not available yet.

Thanks !

data preprocessing

Hello,

Could you please attach a link to the final version of the data that you used (I mean the WAV files)?

Thank you in advance,
Sincerely yours,
Aleksandra

Labels for silent detector

Hi! First of all, thank you for sharing your code and your paper (it was a pleasure to read). I have gone through your code looking for the place where you create the labels for the audio, without success. Could you briefly explain how you generate the training labels? I understand you divide the audio into segments of 1/30 s, but my question is this: you set the network's output size to 100, which means you need to label 100 segments, correct? If you split one second into 30 segments, then 2 seconds gives 60 segments, i.e., a vector like [1, 0, 0, 0, 1, 1, 1, 1, 1, 0, ..., 1] marking which segments are speech (1) and which are silent (0). Hopefully I am understanding this correctly; so how do you end up with 100 segments?
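
To make the arithmetic above concrete, a tiny sketch (the 30 segments-per-second rate and the 100-unit output come from the question; the clip lengths are illustrative):

# Segments produced by 1/30 s framing vs. a 100-unit network output.
frame_rate = 30  # segments per second, one per 1/30 s
for clip_seconds in (1.0, 2.0, 100 / frame_rate):
    n_segments = round(clip_seconds * frame_rate)
    print(f'{clip_seconds:.2f} s -> {n_segments} segments')
# 100 segments would correspond to a clip of 100/30 ≈ 3.33 s.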

Pretrained models

Hi! Is it possible to get the models you used to produce your results? They are not in the model_output directory. I would like to use them in my academic research at university (I'm a student). Thanks!

Pretrained checkpoints availability

Hi, are the pretrained model checkpoints you used to produce the results in the paper available for download? Running the code snippets you provided in the inference section fails because ckpt 87 and ckpt 24 are not in the model_output directories. Thanks!

Error in step 2 of inference

While running step 2 of inference, I encountered an error where the script tries to load a nonexistent file:
model_1_silent_interval_detection/model_output/audioonly_model/outputs/sounds_of_silence/recovered/sos_1_0000001_mixed.wav
At this point the script loads that file, which I think is supposed to be generated by step 1, but no such file was generated. Can you please look into this? Thanks.
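
As a debugging aid, a minimal sketch that verifies the step-1 output exists before step 2 tries to load it (the path is copied from the error above; the check is generic Python, not the repository's code):

from pathlib import Path

expected = Path('model_1_silent_interval_detection/model_output/'
                'audioonly_model/outputs/sounds_of_silence/recovered/'
                'sos_1_0000001_mixed.wav')
if not expected.exists():
    raise FileNotFoundError(
        f'{expected} is missing; re-run step 1 and confirm it writes to '
        'the recovered/ directory before starting step 2.')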

Dataset installation documentation is unclear

I find your README unclear: when I click the dataset links in the sentence "We use two datasets, DEMAND and Google’s AudioSet, as background noise," I don't see any installation process. Could you please explain it a little more specifically? I don't understand how to obtain these datasets or how to install them.

Inference Question

Could you provide more information on how to run the inference models? For example, how do you modify the code to point to the dataset? Any extra details or examples would be greatly appreciated!
