Comments (7)
-
KL loss. What do you mean by a regular implementation? The KL loss is exactly Equation 6 in "Tutorial on Variational Autoencoders". Note that the approximate posterior for the j-th dimension is N(\mu_j, \sigma_j^2). Some implementations may use N(\mu_j, \sigma_j), i.e. the network outputs the standard deviation rather than the variance. That is a potential difference, but it makes no difference in generative performance. (A short sketch of the closed-form KL term follows this comment.)
-
self.gen_loss1 = - log p_\theta(x|z), where p_\theta(x|z) is a Gaussian distribution, i.e. N(x | \hat{x}, \gamma I). Then we have
self.gen_loss1 = \sum ( (x- \hat{x})^2 / \gamma / 2 - log \gamma ) + constant,
which is our implementation. Yes, self.gen_loss1 can be negative. As we argued in our paper "Diagnosing and Enhancing VAE Models", \gamma converges to 0 and self.gen_loss1 goes to negative infinity when the objective is globally optimized. Of course you can fix self.loggamma_x to 0, but that makes the reconstructions blurry. Intuitively, once \hat{x} becomes exactly the same as x, meaning the model produces perfect reconstructions, the only term in the objective that involves \gamma is log \gamma, which pushes \gamma to 0 and the objective to negative infinity.
from twostagevae.
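For concreteness, here is a minimal NumPy sketch of that closed-form KL term (Eq. 6 of the tutorial), assuming the encoder outputs \mu and log \sigma^2 per latent dimension; the function and variable names are illustrative, not taken from the repo:

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions.

    Closed form: 0.5 * sum_j ( mu_j^2 + sigma_j^2 - log sigma_j^2 - 1 ).
    If a network outputs log sigma instead of log sigma^2, the same formula
    applies with logvar = 2 * log_sigma, which is why the two
    parameterizations make no difference in generative performance.
    """
    return 0.5 * np.sum(np.square(mu) + np.exp(logvar) - logvar - 1.0, axis=-1)
```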
Ok. Thanks for the paper you referred to, and thanks again for your great explanation.
Could you please tell me how you handled the negative-loss case? Did you follow some kind of policy?
As another question, I saw in your code that you use the Adam optimizer and that at the beginning of each epoch you change the learning rate (some kind of decay strategy). As far as I know, Adam on its own already adapts the learning rate per parameter through its update rule. Could you please tell me why you change the learning rate manually?
from twostagevae.
-
About the negative loss. We just leave it negative. There is no need to force the loss to be positive.
-
About the Adam optimizer and the learning rate. I picked the optimization strategy somewhat arbitrarily; I don't know much about optimization. Maybe there is no need to change the learning rate manually, as you said. I am not sure which way is better. (A sketch of one such manual-decay setup follows this comment.)
from twostagevae.
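For readers following along, a minimal TF1-style sketch of the pattern being discussed: Adam with a learning rate that is fed manually each epoch. The toy objective and the values of base_lr and decay are assumptions for illustration, not the repo's actual schedule:

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

w = tf.Variable(1.0)
loss = tf.square(w - 3.0)            # toy objective, stands in for the VAE loss

lr = tf.placeholder(tf.float32, [])  # fed a new value at each epoch
train_op = tf.train.AdamOptimizer(learning_rate=lr).minimize(loss)

base_lr, decay = 1e-3, 0.5           # assumed values, for illustration only
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(10):
        # Adam still adapts per-parameter step sizes internally; the fed
        # value only rescales its global step size.
        sess.run(train_op, feed_dict={lr: base_lr * decay ** epoch})
```

Whether the extra manual decay helps on top of Adam's own adaptation is, as noted above, left open here.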
self.gen_loss1 = \sum ( (x- \hat{x})^2 / \gamma / 2 - log \gamma ) + constant,
I think the terms in the summation should be added, not subtracted:
\sum ( (x- \hat{x})^2 / \gamma / 2 + log \gamma )
from twostagevae.
@chanshing Yes, you are correct. I made a typo in the response. Thanks for pointing this out.
from twostagevae.
I'm having trouble with \gamma:
In your code it seems to be N(x | \hat{x}, \gamma^2 I), so then
self.gen_loss1 = \sum ( (x- \hat{x})^2 / \gamma^2 / 2 + log \gamma ) + constant,
Is that right?
from twostagevae.
@mago876 Yeah, in the code we use N(x | \hat{x}, \gamma^2 I), while in the paper and this discussion we use N(x | \hat{x}, \gamma I). Sorry for the confusion. In our original paper draft we used the same formulation as in the code, but we later changed it to the current version for convenience and didn't update the code accordingly. (A short sketch of the code's \gamma^2 convention follows.)
from twostagevae.
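To pin down the \gamma^2 convention used in the code, here is a hedged NumPy sketch of the reconstruction term; the repo's learned variable is self.loggamma_x, and everything else here is an illustrative name:

```python
import numpy as np

def gen_loss1(x, x_hat, loggamma):
    """-log N(x | x_hat, gamma^2 I), dropping the constant 0.5*log(2*pi) per pixel.

    Per pixel: (x - x_hat)^2 / (2 * gamma^2) + log gamma.
    With a perfect reconstruction (x_hat == x), only the log gamma term
    remains, so minimization drives gamma -> 0 and the loss -> -inf;
    this is why the loss is allowed to go negative during training.
    """
    gamma_sq = np.exp(2.0 * loggamma)
    return np.sum(np.square(x - x_hat) / (2.0 * gamma_sq) + loggamma)
```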
Related Issues (15)
- a problem about 'unpickle' function HOT 3
- About finding a sequence of encoders
- How to train on a custom dataset? HOT 1
- The loss becomes negative, and its absolute value keeps growing? HOT 1
- Does reconstruction loss dominate in the 2nd stage VAE?
- Values of reported KID Score
- Can you please provide a "requirements.txt" for the python packages and their versions used in this repository
- FID score calculation and its difference from the tf version HOT 1
- pre-processing CelebA HOT 2
- pre-processing CIFAR-10
- preprocess.py line 164 typo: 'preporcess'
- Error when building Resnet and Wae models HOT 1
- Optimizing gamma to zero and mode collapse HOT 4
- Default setting for reproducing the result in your paper HOT 5