
twostagevae's Issues

A problem with the 'unpickle' function

When I try to run preprocessing on the CIFAR-10 dataset, the following error occurs:
Traceback (most recent call last):
File "preprocess.py", line 190, in
preporcess_cifar10()
File "preprocess.py", line 149, in preporcess_cifar10
x_train = load_cifar10_data('training')
images_array = np.concatenate(images_array, 0)
File "preprocess.py", line 51, in load_cifar10_data
img_dict = unpickle(filename)
NameError: name 'unpickle' is not defined

Am I missing something?
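For reference, the CIFAR-10 website describes an unpickle helper along these lines (the encoding='bytes' argument is required on Python 3); defining it near the top of preprocess.py should resolve the NameError:

```python
import pickle

def unpickle(filename):
    # CIFAR-10 batch files are pickled dicts of numpy arrays; on Python 3
    # they must be loaded with encoding='bytes' (dict keys come back as bytes).
    with open(filename, 'rb') as f:
        return pickle.load(f, encoding='bytes')
```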

Default settings for reproducing the results in your paper

Hi. I really appreciate your work. I have implemented your paper in PyTorch and am now trying to reproduce the results in Table 1. Could you tell me the default settings for CelebA and CIFAR-10 that you used to train the TwoStageVAE? In your paper you refer us to another paper for some of the hyperparameter settings, but not all of them, and I think the description is incomplete.

pre-processing CIFAR-10

I read the discussion in OpenReview on the sensitivity of the FID score to the min-max-normalization of CIFAR-10 images.

The authors thought it was the min-max normalization carried out by imsave that changed the pixel values of the images. However, I find that this is not the case.

In the reference guide of the old version of scipy (https://docs.scipy.org/doc/scipy-1.0.0/reference/generated/scipy.misc.imsave.html), there is a warning message:
"This function uses bytescale under the hood to rescale images to use the full (0, 255) range if mode is one of None, 'L', 'P', 'l'. It will also cast data for 2-D images to uint32 for mode=None (which is the default)."

But bytescale() only min-max normalizes the input array if the dtype of the array is not uint8. The CIFAR-10 images for Python are in fact stored as np.uint8 arrays, so the min-max normalization was never executed.

It seems instead to be the lossy JPEG compression (when imsave() saves the PIL.Image object to a .jpg file) that causes the changes in pixel values. See https://stackoverflow.com/questions/21949014/python-image-save-changes-the-data
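A small sketch illustrating the point: uint8 image data round-trips exactly through PNG (lossless), whereas JPEG compression generally alters pixel values.

```python
import os
import tempfile

import numpy as np
from PIL import Image

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# PNG is lossless: the saved-and-reloaded array is bit-identical.
path = os.path.join(tempfile.mkdtemp(), 'sample.png')
Image.fromarray(img).save(path)
restored = np.asarray(Image.open(path))
assert np.array_equal(img, restored)  # pixel values unchanged
```

Saving the same array to a .jpg file and reloading it would, in general, not satisfy this equality.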

Optimizing gamma to zero and mode collapse

Thank you for your very nice work. It was a very good read.
If you don't mind, I have a few questions:

  1. You argued for the importance of optimizing gamma, and showed that as gamma goes to zero, the VAE reconstructs the same x for any z ~ q(z|x). But do we really want this scenario? Isn't this mode collapse?
  2. If I understood correctly, the above happens because the injected noise is rapidly scaled down by the encoder variance, which goes to zero as gamma goes to zero. I think this means that q(z|x) converges to a delta on some of its dimensions (specifically, the non-superfluous ones). Then how does this retain nonzero measure?
  3. Some works use pixel-wise gammas instead of a scalar one. Does your work generalize easily to this case?
  4. Given the interesting insights from your paper, I now wonder what we should optimize for. What metric should we track, e.g. for early stopping and model selection?
    a) The VAE loss
    b) The expectation (under z ~ q(z|x)) of the reconstruction loss (since you argue that perfect reconstruction happens at the optimum)
    c) The deterministic reconstruction loss (i.e. using the mean of q(z|x))
    d) Waiting until gamma falls below a certain small threshold
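As a small illustration of option (d), one could monitor the learned log-gamma and stop first-stage training once gamma is sufficiently small. This is only a sketch with hypothetical names (loggamma_x follows the repository's naming), not a recommendation from the paper:

```python
import math

def should_stop(loggamma_x, threshold=1e-3):
    # Option (d): stop once the learned gamma = exp(loggamma_x)
    # has fallen below a small fixed threshold.
    return math.exp(loggamma_x) < threshold
```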

FID score calculation and its difference from the TF version

I am trying to understand how to calculate the FID scores reported in the paper using the fid_score.py file. I understand that the score can be computed with the evaluate_fid_score method, but it seems to operate on .npy files.

How different would the FID score be if I instead used the method in fid.py from https://github.com/bioinf-jku/TTUR, which can compute the score from two folders of images?
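One way to sidestep the comparison problem is to load the image folders into arrays yourself, so both FID implementations see identical pixels. A hypothetical helper (the (N, H, W, 3) uint8 layout is an assumption about what evaluate_fid_score expects):

```python
import os

import numpy as np
from PIL import Image

def folder_to_array(folder):
    # Load every image in a folder into one (N, H, W, 3) uint8 array,
    # in sorted filename order, so downstream FID code sees fixed pixels.
    files = sorted(os.listdir(folder))
    return np.stack([
        np.asarray(Image.open(os.path.join(folder, f)).convert('RGB'))
        for f in files
    ])
```

Note that if the folders contain JPEGs, the compression itself can already shift the FID relative to scores computed on raw arrays.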

pre-processing CelebA

Hi, it seems you use a 128 by 128 center crop for CelebA, while Google uses a 160 by 160 center crop in their paper "Are GANs Created Equal?". See the following code from their repository:

image = tf.image.resize_image_with_crop_or_pad(image, 160, 160)
image = tf.image.resize_images(image, [64, 64])

Just wondering if it is a fair comparison? Thanks!
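For concreteness, the two pipelines being compared differ only in the crop size before the 64x64 resize; a PIL stand-in for those TF ops (function name illustrative):

```python
from PIL import Image

def center_crop_resize(img, crop, size=64):
    # Center-crop a PIL image to crop x crop, then resize to size x size.
    # A 128 crop keeps a tighter face region than Google's 160 crop,
    # so the downstream 64x64 inputs cover different fields of view.
    w, h = img.size
    left, top = (w - crop) // 2, (h - crop) // 2
    return img.crop((left, top, left + crop, top + crop)).resize((size, size))
```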

Does reconstruction loss dominate in the 2nd stage VAE?

Hello, I recently read this paper and found it fascinating and very relevant to my work. However, I am a little confused by one aspect of your approach. I understand the intuition that in a standard VAE the reconstruction term dominates, so the model learns a useful latent representation of the encoded data but fails to structure the latent space in a way that allows sampling novel, high-quality data.

My concern is: what prevents the second-stage VAE from falling into this same trap? Isn't it possible that the reconstruction loss on z will dominate in the second stage, so that it can encode and decode samples from the first-stage posterior but fails to shape the second latent space q(u) into a normal distribution, in which case we cannot generate samples of z from the "empirical prior"?
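For context, the two-stage sampling procedure under discussion can be sketched with toy stand-ins for the trained networks (the linear maps and names here are purely illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def second_stage_decoder(u):
    # Stand-in for the learned map u -> z of the second-stage VAE.
    return 2.0 * u + 1.0

def first_stage_decoder(z):
    # Stand-in for the learned map z -> x of the first-stage VAE.
    return 0.5 * z

u = rng.standard_normal(16)   # u ~ N(0, I)
z = second_stage_decoder(u)   # approximate sample from the aggregate posterior q(z)
x = first_stage_decoder(z)    # final generated sample
```

The question above is whether the second stage really delivers z ~ q(z) when u ~ N(0, I), or whether its own prior mismatch recurs.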

Error when building Resnet and Wae models

I get errors when I build the Resnet and Wae models.
For Resnet the error is: assert(scales[-1] == desired_scale)
For Wae the error is: ValueError: Dimensions must be equal, but are 28 and 64 for 'sub_2' (op: 'Sub') with input shapes: [64,28,28,1], [64,64,64,3].

The Infogan model builds without errors.

About finding a sequence of encoders

Hi, thank you for this interesting work! I was trying to read your proof in Appendix E2 and got confused about the design of the encoder networks. Ideally, if the decoder network is linear, i.e. f_\mu_x(z) = Az + b, the true posterior is also Gaussian, with mean (\gamma I + A^T A)^{-1} A^T (x - b), which depends on \gamma. However, the mean of the variational posterior in this paper is f_\mu_z(x), which is independent of \gamma. Is anything wrong here? I was trying to figure this out in the proof that follows, but could not understand why equation (30) holds. Since the variable transformation is z' = (z - z*) / \sqrt{\gamma}, shouldn't z' somehow depend on \gamma? If so, why can z' be cancelled out in the second term of the second equality when taking the limit \gamma \to \infty?
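For reference, the standard linear-Gaussian posterior that the question appeals to, for a decoder x = Az + b + \epsilon with \epsilon ~ N(0, \gamma I) and prior z ~ N(0, I), is:

```latex
% True posterior p(z|x) for a linear decoder x = Az + b + \epsilon,
% \epsilon \sim \mathcal{N}(0, \gamma I), with prior z \sim \mathcal{N}(0, I):
\Sigma_{z|x} = \Bigl(I + \tfrac{1}{\gamma} A^{\top} A\Bigr)^{-1}, \qquad
\mu_{z|x} = \tfrac{1}{\gamma}\,\Sigma_{z|x}\, A^{\top} (x - b)
          = \bigl(\gamma I + A^{\top} A\bigr)^{-1} A^{\top} (x - b)
```

so the dependence of the exact posterior mean on \gamma that the question notes is indeed real.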

Could you please elaborate on your loss function?

I am trying to reimplement your code in PyTorch and would like to understand how your loss function differs from that of a vanilla VAE. In my experience, your KL-divergence formula is not the same as the one in the usual VAE implementation; there seems to be a subtle difference between them. Could you explain this in a little more detail? I also have a question about self.gen_loss1: could you explain it as well?

Another thing worth mentioning: when I optimize the stage-1 network, by the end of training I get negative values for loss_gen1, which I think are related to self.loggamma_x. When I disabled it and kept it constant at 0, the loss values did not become negative.
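For what it's worth, a negative value is expected here once gamma becomes small, because a continuous Gaussian density can exceed 1, making its negative log-likelihood negative. A minimal sketch of such a reconstruction term with a learnable scalar log-gamma (names illustrative, not the repository's exact code):

```python
import numpy as np

def gaussian_nll(x, x_hat, loggamma):
    # -log N(x | x_hat, gamma^2 I), summed over pixels.  With near-perfect
    # reconstructions and small gamma, the density exceeds 1 per pixel,
    # so this sum can legitimately go negative during training.
    gamma_sq = np.exp(2.0 * loggamma)
    return 0.5 * np.sum((x - x_hat) ** 2 / gamma_sq
                        + 2.0 * loggamma
                        + np.log(2.0 * np.pi))
```

Fixing loggamma at 0 pins the per-pixel term at 0.5 * log(2*pi) + 0.5 * (x - x_hat)^2 >= 0, which matches the observation that the loss stays non-negative in that case.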
