yoonsanghyu / fasnet-tac-pytorch
Full implementation of "End-to-end microphone permutation and number invariant multi-channel speech separation" (Interspeech 2020)
Hello,
I am trying to use your model as a front end for speaker separation. The data was generated without problems, but training fails because train.py requests files that do not exist. The training data for 6 mics contains the directories "sample16001" through "sample20000", but train.py looks for files under paths such as "6mic/sample12628" and "6mic/sample5779", which do not exist. Could you please give a hint as to where the directory ranges for the training data can be fixed? The complete error messages are below.
/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
warnings.warn("PySoundFile failed. Trying audioread instead.")
Traceback (most recent call last):
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/librosa/core/audio.py", line 149, in load
with sf.SoundFile(path) as sf_desc:
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/soundfile.py", line 629, in __init__
self._file = self._open(file, mode_int, closefd)
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/soundfile.py", line 1184, in _open
"Error opening {0!r}: ".format(self.name))
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/soundfile.py", line 1357, in _error_check
raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening '/home/stanislav_kruchinin/data/NoisySpeech/MC_Libri_adhoc/train/6mic/sample12628/mixture_mic1.wav': System error.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 120, in <module>
main(args)
File "train.py", line 114, in main
solver.train()
File "/home/stanislav_kruchinin/src/speech_separation/fasnet_tac_pytorch/solver.py", line 77, in train
tr_avg_loss = self._run_one_epoch(epoch)
File "/home/stanislav_kruchinin/src/speech_separation/fasnet_tac_pytorch/solver.py", line 150, in _run_one_epoch
for i, (data) in enumerate(data_loader):
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
return self.collate_fn(data)
File "/home/stanislav_kruchinin/src/speech_separation/fasnet_tac_pytorch/data.py", line 108, in _collate_fn
mix, _ = librosa.load(mix_path, sr)
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/librosa/core/audio.py", line 166, in load
y, sr_native = __audioread_load(path, offset, duration, dtype)
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/librosa/core/audio.py", line 190, in __audioread_load
with audioread.audio_open(path) as input_file:
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/audioread/__init__.py", line 111, in audio_open
return BackendClass(path)
File "/home/stanislav_kruchinin/venv/pytorch/lib64/python3.6/site-packages/audioread/rawread.py", line 62, in __init__
self._fh = open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/home/stanislav_kruchinin/data/NoisySpeech/MC_Libri_adhoc/train/6mic/sample12628/mixture_mic1.wav'
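To narrow down a mismatch like this, it can help to compare the sample directories that actually exist on disk with the ones the loader asks for. The following is only a minimal sketch, not part of the repository: the helper names `sample_indices` and `missing_samples` are hypothetical, and it assumes the layout seen in the error message, where each mic folder contains sub-directories named `sample<N>`.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming scheme, taken from the paths in the traceback: sample<N>
SAMPLE_DIR = re.compile(r"^sample(\d+)$")


def sample_indices(mic_dir):
    """Return the sorted sample numbers found as sub-directories of mic_dir."""
    found = []
    for entry in Path(mic_dir).iterdir():
        match = SAMPLE_DIR.match(entry.name)
        if entry.is_dir() and match:
            found.append(int(match.group(1)))
    return sorted(found)


def missing_samples(mic_dir, requested):
    """Return the requested sample numbers that have no directory on disk."""
    present = set(sample_indices(mic_dir))
    return [n for n in requested if n not in present]


if __name__ == "__main__":
    # Demo on a throwaway directory; point mic_dir at the real
    # .../train/6mic folder to inspect actual data.
    with tempfile.TemporaryDirectory() as tmp:
        mic_dir = Path(tmp) / "6mic"
        for n in (16001, 16002, 20000):
            (mic_dir / f"sample{n}").mkdir(parents=True)
        indices = sample_indices(mic_dir)
        print("on disk:", indices[0], "to", indices[-1], f"({len(indices)} dirs)")
        print("missing:", missing_samples(mic_dir, [12628, 5779, 16001]))
```

If the numbers reported as missing fall outside the range on disk, the file list that the training script reads was probably built for a different numbering than the generated data. Where that list is built is not shown in the traceback.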
When I try to run this repo, an assertion fails because the dimensions of estimate_source and the source signal are not equal. I also printed the dimensions: the estimate is [1,2,64000] while the source is [3,2,64000]. The output is below.
/home/xwang/FasNet/data.py:108: FutureWarning: Pass sr=16000 as keyword args. From version 0.10 passing these as positional arguments will result in an error
mix, _ = librosa.load(mix_path, sr)
/home/xwang/FasNet/data.py:116: FutureWarning: Pass sr=16000 as keyword args. From version 0.10 passing these as positional arguments will result in an error
s1, _ = librosa.load(s1_path, sr)
/home/xwang/FasNet/data.py:117: FutureWarning: Pass sr=16000 as keyword args. From version 0.10 passing these as positional arguments will result in an error
s2, _ = librosa.load(s2_path, sr)
torch.Size([1, 2, 64000])
torch.Size([3, 2, 64000])
torch.Size([1, 2, 64000])
Traceback (most recent call last):
File "train.py", line 119, in <module>
main(args)
File "train.py", line 113, in main
solver.train()
File "/home/xwang/FasNet/solver.py", line 77, in train
tr_avg_loss = self._run_one_epoch(epoch)
File "/home/xwang/FasNet/solver.py", line 164, in _run_one_epoch
cal_loss(padded_source, estimate_source, mixture_lengths)
File "/home/xwang/FasNet/pit_criterion.py", line 25, in cal_loss
max_snr, perms, max_snr_idx = cal_si_snr_with_pit(source,
File "/home/xwang/FasNet/pit_criterion.py", line 42, in cal_si_snr_with_pit
assert source.size() == estimate_source.size()
AssertionError
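The bare AssertionError does not say which axis disagrees. Going by the printed sizes, only the first axis differs (3 versus 1). A small check before the loss call can make that explicit. This is a sketch under assumptions: `shape_mismatch` is a hypothetical helper, not a function from the repository, and it works on plain tuples, so the result of `tensor.size()` can be passed in directly because `torch.Size` is a tuple subclass.

```python
def shape_mismatch(source_shape, estimate_shape):
    """Describe how two shapes differ; return None when they are identical."""
    source_shape = tuple(source_shape)
    estimate_shape = tuple(estimate_shape)
    if source_shape == estimate_shape:
        return None
    if len(source_shape) != len(estimate_shape):
        return "rank differs: {} vs {}".format(
            len(source_shape), len(estimate_shape))
    # Collect every axis whose sizes disagree.
    axes = [i for i, (a, b) in enumerate(zip(source_shape, estimate_shape))
            if a != b]
    return "axes {} differ: source {} vs estimate {}".format(
        axes, source_shape, estimate_shape)


if __name__ == "__main__":
    # Sizes reported in the issue: source [3, 2, 64000], estimate [1, 2, 64000]
    print(shape_mismatch((3, 2, 64000), (1, 2, 64000)))
```

If the first axis is the batch dimension (an assumption, the issue does not state the layout), the message would point to the model output covering one item while the padded source holds three.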
Do you need to input microphone coordinates for this project? Thank you.