Comments (4)

Beninmiao commented on August 10, 2024

I also have this problem.

Hi,

I tried preparing the DNS sample data for 50 hours and got clean, noise, and noisy folders containing audio data by running noisyspeech_synthesizer_singleprocess.py.

According to config/common/fullsubnet_train.toml, training requires the text files below. Is there any code available to generate these files from the audio data?

[train_dataset]
path = "dataset.DNS_INTERSPEECH_train.Dataset"
clean_dataset = "/Datasets/DNS-Challenge-INTERSPEECH/datasets/clean_0.6.txt"
noise_dataset = "/Datasets/DNS-Challenge-INTERSPEECH/datasets/noise.txt"
rir_dataset = "~/Datasets/DNS-Challenge-INTERSPEECH/datasets/rir.txt"

Regards
Yugesh

from fullsubnet.

gooran commented on August 10, 2024

Hi,
Did you find those text files?

danielemirabilii commented on August 10, 2024

Hi, I may be late, but this could still be useful since I had the same problem.

The .txt files are simply lists of the files in the dataset folders (clean, noise, and RIR), so they should be created from the contents of your own dataset. You can build each list with a bash script or with the following commands in a Linux/macOS terminal. Here is my suggestion, assuming that your clean, noise, and RIR datasets are in folders like ~/Datasets/DNS-Challenge-INTERSPEECH/datasets/clean, /noise, and /rir:

cd ~/Datasets/DNS-Challenge-INTERSPEECH/datasets/clean
find "$PWD" -type f -name "*.wav" > clean_0.6.txt

and same for the noise

cd ~/Datasets/DNS-Challenge-INTERSPEECH/datasets/noise
find "$PWD" -type f -name "*.wav" > noise.txt

and RIRs

cd ~/Datasets/DNS-Challenge-INTERSPEECH/datasets/rir
find "$PWD" -type f -name "*.wav" > rir.txt

Be careful: some datasets contain hidden files, e.g., ._clean_fileid_0675.wav. Be sure to not include them in the txt files or the training will stop during the first epoch.
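
One way to handle the hidden-file warning is to filter dot-files out when the list is built. The snippet below is only a sketch: make_file_list is a hypothetical helper name, not part of the FullSubNet repository, and it assumes that every audio file ends in .wav and that every hidden file starts with a dot.

```shell
# Hypothetical helper (not from the FullSubNet repo): write the absolute path
# of every .wav file under a folder to a list file, one path per line,
# skipping hidden files such as ._clean_fileid_0675.wav.
make_file_list() {
    src_dir=$1    # folder that holds the audio files
    out_file=$2   # text file to create or overwrite

    # Resolve the folder to an absolute path so the list stays valid
    # no matter which directory training is started from.
    abs_dir=$(cd "$src_dir" && pwd) || return 1

    # -name "*.wav" keeps the audio files; ! -name ".*" drops dot-files.
    find "$abs_dir" -type f -name "*.wav" ! -name ".*" | sort > "$out_file"
}

# Example call (paths are the ones assumed earlier in this thread):
# make_file_list ~/Datasets/DNS-Challenge-INTERSPEECH/datasets/clean \
#                ~/Datasets/DNS-Challenge-INTERSPEECH/datasets/clean_0.6.txt
```

Writing the list next to the clean, noise, and rir folders, as in the example call, matches the locations given in the config from the question, so the file does not need to be moved afterwards. The same call with the noise and rir folders would produce noise.txt and rir.txt.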

gooran commented on August 10, 2024

Thank you. Very helpful.
