audiocontextencoder's Introduction

Hi there 👋

audiocontextencoder's People

Contributors

andimarafioti, nperraud


audiocontextencoder's Issues

Network that works with the STFT of the signal without the gap and targets mag(STFT) of gap + sides

I secretly trained this network overnight, since we all wanted to (didn't we?). It takes the STFT of the sides, without gap information, and produces the magnitude STFT for the gap together with the sides. Some curves of the L2 loss of the trainable variables and the reconstruction loss (gray is this net, orange is the one we had going from magnitude to magnitude; please take into account that the orange one produced only 7 frames, so its reconstruction loss is not on the same scale):

[image: L2 loss and reconstruction loss curves]

And for the SNRs (remember we had some weird SNRs on that one; I think they were still being computed on the 1D signal):

[image: SNR curves]

Using the original phase we get an SNR of 21.3 dB. I'm trying to see what I can do with Griffin-Lim, but the word around the institute is that it produces bad SNRs.
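For context, a minimal Griffin-Lim sketch using scipy (a generic implementation, not code from this repo), assuming the same frame parameters as our TF graph (frame_length=512, frame_step=128):

```python
import numpy as np
from scipy.signal import stft, istft

def griffin_lim(mag, n_fft=512, hop=128, n_iter=32, seed=0):
    """Estimate a time signal whose STFT magnitude matches `mag`
    (freq_bins, frames) by alternating projections between domains."""
    rng = np.random.default_rng(seed)
    # start from the target magnitude with random phase
    spec = mag * np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        _, x = istft(spec, nperseg=n_fft, noverlap=n_fft - hop)
        _, _, rebuilt = stft(x, nperseg=n_fft, noverlap=n_fft - hop)
        ang = np.angle(rebuilt)
        if ang.shape[1] < mag.shape[1]:  # guard against frame-count drift
            ang = np.pad(ang, ((0, 0), (0, mag.shape[1] - ang.shape[1])))
        # keep the target magnitude, update only the phase
        spec = mag * np.exp(1j * ang[:, :mag.shape[1]])
    _, x = istft(spec, nperseg=n_fft, noverlap=n_fft - hop)
    return x
```

Whether 32 iterations is enough for decent SNRs here is exactly the open question.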

Dataset pipeline

The full dataset pipeline should be organised in a user-friendly way.

The user should only have to run

./make_nynthdataset.py /path/fileout

and everything should happen automatically (download, extract, create the tfRecord, ...).
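A hedged skeleton of what such a script could look like (all step bodies are placeholders, not the actual implementation; only the orchestration is the point):

```python
from pathlib import Path

def download(out_dir: Path) -> Path:
    # placeholder: would fetch the dataset archive here
    return out_dir / "dataset.tar.gz"

def extract(archive: Path) -> Path:
    # placeholder: would unpack the archive here
    return archive.parent / "raw"

def to_tfrecord(raw_dir: Path) -> Path:
    # placeholder: would serialize the examples into a tfRecord file here
    return raw_dir.parent / "dataset.tfrecord"

def make_dataset(out_path: str) -> Path:
    """Run every preparation step in order; the user supplies one path."""
    out_dir = Path(out_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    return to_tfrecord(extract(download(out_dir)))
```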

SNR on spectrograms

Instead of using the SNR of the reconstructed 1D signal, take the SNR of the spectrograms. This should be pretty straightforward: take the reduce_sum of the squared Euclidean norm over the frequency and time axes.
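A minimal numpy sketch of that computation (function name is hypothetical):

```python
import numpy as np

def spectrogram_snr(ref_mag, est_mag):
    """SNR in dB between reference and estimated magnitude spectrograms,
    summing squared Euclidean norms over the frequency and time axes."""
    signal_energy = np.sum(ref_mag ** 2, axis=(-2, -1))
    noise_energy = np.sum((ref_mag - est_mag) ** 2, axis=(-2, -1))
    return 10.0 * np.log10(signal_energy / noise_energy)
```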

Re-organisation of the code

I'm starting to have trouble understanding your code. It is very hard to follow how the data flows through it. Do you see what I mean?

Also, this cannot be more confusing:

signal = aModel.output()

with tf.name_scope('Energy_Spectogram'):
    fft_frame_length = 512
    fft_frame_step = 128
    # STFT of the model output: (batch, frames, freq_bins)
    stft = tf.contrib.signal.stft(signals=signal, frame_length=fft_frame_length, frame_step=fft_frame_step)

    # keep only the 15 context frames on each side of the 7-frame gap,
    # stacked along a new last axis
    sides_stft = tf.stack((stft[:, :15, :], stft[:, 15+7:, :]), axis=3)

    # magnitude spectrogram of the sides, shape (256, 15, 257, 2)
    mag_stft = tf.abs(sides_stft)
    aModel.setOutputTo(mag_stft)

It would be cool if you could re-organise your code in a way that I (or other people) can more easily grasp what you are doing.
I do not think it is a major change, but maybe a clever re-organisation.

What do you think?
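As one hedged illustration of the kind of re-organisation meant here (numpy instead of TF, function name hypothetical), the spectrogram step above could become a single named, documented function:

```python
import numpy as np

SIDE_FRAMES, GAP_FRAMES = 15, 7  # context and gap sizes from the snippet above

def side_magnitudes(stft):
    """Magnitude STFT of the context frames around the gap.

    stft: complex array (batch, frames, freq_bins)
    returns: (batch, SIDE_FRAMES, freq_bins, 2), left/right stacked last.
    """
    left = stft[:, :SIDE_FRAMES, :]
    right = stft[:, SIDE_FRAMES + GAP_FRAMES:2 * SIDE_FRAMES + GAP_FRAMES, :]
    return np.abs(np.stack((left, right), axis=3))
```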

update readme.md

Excuse me, I found many files in your project, but the readme.md is not clear. You said: "To train the network, execute in the parent folder python paperArchitecture.py", but I cannot even find that file. Please tell me more about your training process. Thank you so much.

Nicki's hand crafted skip connection

Save the mean and std of the individual signals, correct for them before inputting, and re-apply them before computing the loss. It's a hand-crafted skip connection. For STFTs it should be applied on both dimensions simultaneously.
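A minimal numpy sketch of the idea (function names hypothetical): the statistics are computed per example, stored, and re-applied to the network output so the model only has to learn the normalized residual.

```python
import numpy as np

def normalize(signals):
    """Remove per-example mean/std; return the stats so they can be undone."""
    mean = signals.mean(axis=-1, keepdims=True)
    std = signals.std(axis=-1, keepdims=True) + 1e-8  # avoid divide-by-zero
    return (signals - mean) / std, (mean, std)

def denormalize(outputs, stats):
    """Re-apply the saved statistics before computing the loss."""
    mean, std = stats
    return outputs * std + mean
```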

Network that works with the STFT of the signal without the gap and targets the STFT of gap + sides

I ran this over the weekend. Because the target of this network has 11 windows instead of the 7 from the last experiments, the reconstruction losses are not directly comparable. But here they are anyway (red is this new one):

[image: reconstruction loss curves]

The SNRs should be comparable:

[image: SNR curves]

After reconstruction, though, we find that, exclusively for the gap, the SNR is ...

Use dropout correctly

I added a dropout feature to the sequential model. Preliminary tests on it are a bit hard to assess.

I trained two equivalent networks for 800k steps with a learning rate of 1e-3. In orange there's a network with dropout = 0.3 for the linear layer and 0.1 for all conv and deconv layers except the last deconv. In blue is the same network without any dropout.
I think the sudden change in the orange one in the training SNR comes from when I restarted the training with dropout = 0.3 for the linear layer (before it was 0.5, I'm not really sure).

[image: SNR curves with and without dropout]

[image: SNR curves with and without dropout]

It seems to work well, since the performance on the validation set is better with dropout and worse on the training set.

What do you think? Should I run more tests? Are these parameters good for you? (30% on the linear layer and 10% on the convs)

I also tried the same net with only dropout = 50% on the convs (blue):

[image: SNR curves with 50% dropout on convs]
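For reference, a minimal numpy sketch of inverted dropout (generic, not the repo's TF implementation): survivors are scaled by 1/keep at train time so the expected activation is unchanged and eval needs no rescaling — getting this scaling wrong is the usual way to use dropout incorrectly.

```python
import numpy as np

def dropout(x, rate, training, rng=None):
    """Inverted dropout: zero a `rate` fraction of units at train time and
    scale the rest by 1/(1-rate); return x untouched at eval time."""
    if not training or rate == 0.0:
        return x
    if rng is None:
        rng = np.random.default_rng(0)
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep
    return x * mask / keep
```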

No module named 'architecture'

I get the following error when trying to unpickle "magnitude_network_parameter.pkl":

Traceback (most recent call last):
  File "/home/akclee/miniconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3319, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 2, in
    data = pickle.load(f)
  File "/home/akclee/.pycharm_helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
ModuleNotFoundError: No module named 'architecture'
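A likely cause, as a hedged guess: pickle stores classes by module path, so the repo's `architecture` module must be importable (the repo root on sys.path) when unpickling. A self-contained demonstration of both the error and the fix (class name and attributes are illustrative; only the module name `architecture` comes from the traceback):

```python
import os
import pickle
import sys
import tempfile

# simulate the repo layout: a module 'architecture' defining a class,
# and a pickle created from an instance of it
repo = tempfile.mkdtemp()
with open(os.path.join(repo, "architecture.py"), "w") as f:
    f.write("class Network:\n    def __init__(self):\n        self.params = [1, 2, 3]\n")

sys.path.insert(0, repo)
import architecture
blob = pickle.dumps(architecture.Network())

# unpickling without 'architecture' importable reproduces the error
sys.path.remove(repo)
del sys.modules["architecture"]
try:
    pickle.loads(blob)
except ModuleNotFoundError as e:
    print("without sys.path entry:", e)

# putting the repo root back on sys.path fixes it
sys.path.insert(0, repo)
obj = pickle.loads(blob)
print("with sys.path entry:", obj.params)
```

So running the unpickling script from the repo root (or inserting the repo root into sys.path first) should resolve it.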
