duanenielsen / deepinfomaxpytorch

Learning deep representations by mutual information estimation and maximization

Home Page: https://arxiv.org/abs/1808.06670

Python 100.00%
autoencoder compression deep-learning pytorch

deepinfomaxpytorch's Introduction

Welcome to Duane's Github

I like making cool reinforcement learning projects.

[Demo GIFs: Breakout, SpaceInvaders]

deepinfomaxpytorch's People

Contributors

duanenielsen


deepinfomaxpytorch's Issues

The loss value is negative

Excuse me, author. I ran into a problem when using the mutual information objective: the loss value is negative at the beginning of training. Is this normal?

How did the model perform with the full loss?

Thanks for your generous contribution. I'd like to know how the model performed with the entire loss (I noticed that the result you provided in the README uses only the local term).
By the way, which dataset did you evaluate on?

Why are the experimental results inferior to those reported in the paper?

Hi, I have checked the code and I think everything is OK, but the results are inferior to those reported in the paper. For example, the accuracy on CIFAR-10 with this code is about 60% (DeepInfoMax-Local), while the paper reports about 70%.
Do you have any idea why there is such a big difference?

Is matching representations to a prior distribution implemented wrongly?

@DuaneNielsen I think that you have not implemented adversarial matching of the distributions. You calculate the PRIOR loss here with the y that comes from here, so when you call loss.backward() here the encoder weights are updated to minimize the PRIOR loss, which means the PRIOR term from the paper is maximized, since you take the negative of the original loss. In other words, you are updating the encoder weights to maximize the paper's PRIOR term rather than minimize it. Have I understood this wrong?

Great Code!

Amazing! Were you able to get up to 69% accuracy? Just asking.

Some potential bugs w.r.t. BN layers

Thanks for sharing! I notice that you use batch-norm layers in the encoder and do not call model.eval() in the testing stage. This may cause performance degradation.
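For reference, a minimal sketch of the suggested fix, using hypothetical names (`encoder`, `classifier`, `loader`): switch to eval mode so the BatchNorm layers use their running statistics during evaluation, then switch back.

```python
import torch

@torch.no_grad()
def evaluate(encoder, classifier, loader, device="cpu"):
    # eval() makes BatchNorm use running statistics instead of per-batch statistics
    encoder.eval()
    classifier.eval()
    correct, total = 0, 0
    for x, target in loader:
        x, target = x.to(device), target.to(device)
        y, _ = encoder(x)                      # assumes the encoder returns (global, local) features
        pred = classifier(y).argmax(dim=1)
        correct += (pred == target).sum().item()
        total += target.numel()
    # restore training behaviour before the next training epoch
    encoder.train()
    classifier.train()
    return correct / total
```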

Why does the prior distribution have no encoder loss?

The following code:

term_a = torch.log(self.prior_d(prior)).mean()
term_b = torch.log(1.0 - self.prior_d(y)).mean()
PRIOR = - (term_a + term_b) * self.gamma

"-(term_a + term_b)" is the loss of Discriminator, and “term_b” is the loss of encoder( similar as generator of gan )

In the code you only backpropagate the discriminator's loss (the prior-distribution part); there is no backward pass for the loss that belongs to the encoder in the prior matching.

loss.backward()     # loss = global + local + prior, prior = -(term_a + term_b)
optim.step()
loss_optim.step()

I think the process could be the following:

term_a = torch.log(self.prior_d(prior)).mean()
term_b = torch.log(1.0 - self.prior_d(y.detach())).mean()  # y should be detached
PRIOR = - (term_a + term_b) * self.gamma
encoder_loss_for_p = term_b
.............

loss.backward()    # loss = global + local + prior, prior = -(term_a + term_b)
optim.step()       # updates the encoder with the gradients from global + local, but not prior
loss_optim.step()

encoder_loss_for_p.backward()   # optimise the encoder adversarially against the prior discriminator
optim.step()

Is my understanding wrong?
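For comparison, here is a minimal sketch (hypothetical names `optim_enc`, `optim_d`, `gamma`, not the author's implementation) of GAN-style prior matching with separate discriminator and encoder steps. Note that a term_b computed from the detached y carries no gradient back to the encoder, so the encoder term below is recomputed on the non-detached y.

```python
import torch

def prior_matching_step(encoder, prior_d, x, optim_enc, optim_d, gamma=0.1):
    # one hypothetical GAN-style update matching the encoding y to a uniform prior
    y, _ = encoder(x)                          # assumes the encoder returns (global, local) features
    prior = torch.rand_like(y)                 # samples from the target prior

    # discriminator step: real = prior samples, fake = detached encodings
    d_loss = -(torch.log(prior_d(prior)).mean()
               + torch.log(1.0 - prior_d(y.detach())).mean()) * gamma
    optim_d.zero_grad()
    d_loss.backward()
    optim_d.step()

    # encoder step: fool the discriminator, so gradients must flow through y
    enc_loss = torch.log(1.0 - prior_d(y)).mean() * gamma
    optim_enc.zero_grad()
    enc_loss.backward()
    optim_enc.step()
    return d_loss.item(), enc_loss.item()
```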

How would I apply this to non-image (1-dimensional) data?

First off, thanks for the implementation of this code, it's great!

I'm interested in applying DIM to non-image data, i.e., I just have a collection of feature vectors (not images) that I'd like to encode and maximise information between the original feature vectors and their new embeddings. I'm trying to translate the problem from 2D inputs to 1D inputs.

I have three questions:

  1. Does doing this even make sense? I can't see why the principle of maximising information between the original representation and the embedding wouldn't apply to 1D inputs.
  2. How can I implement this? As far as I understand it, the local embeddings are 2D feature maps and the global embeddings are 1D vectors. Obviously in the 1D setting the 2D feature maps disappear, but the 1D global embedding stays the same. Could the local embeddings be replaced with 1D embeddings of some sort (rather than 2D maps)? The discriminator models that use 2D convolutions would then need to be updated (one possible 1D sketch follows below).
  3. Why does the GlobalDiscriminator model have 2D convolutional layers? It was my understanding that for the global discriminator the local feature maps should be flattened and concatenated with the global embedding, but based on the code it seems the local feature maps are further processed before being concatenated with the global embedding. Could you clarify this, please?

Thanks in advance!
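Regarding question 2, one possible direction (purely a sketch with hypothetical dimensions and layer names, not the author's method): treat each feature vector as a length-L signal with one channel, let Conv1d blocks produce a 1D local feature map, and keep the global embedding as a flat vector. The local discriminator can then concatenate the global vector to every position of the 1D map, mirroring what the 2D version does at every spatial location.

```python
import torch
import torch.nn as nn

class Encoder1D(nn.Module):
    """Hypothetical 1D analogue of the image encoder: local = Conv1d feature map, global = vector."""
    def __init__(self, in_len=128, local_ch=64, global_dim=32):
        super().__init__()
        self.local_net = nn.Sequential(
            nn.Conv1d(1, local_ch, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(local_ch, local_ch, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        )
        self.global_net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(local_ch * (in_len // 4), 256), nn.ReLU(),
            nn.Linear(256, global_dim),
        )

    def forward(self, x):                     # x: (batch, in_len)
        m = self.local_net(x.unsqueeze(1))    # local map: (batch, local_ch, in_len // 4)
        y = self.global_net(m)                # global vector: (batch, global_dim)
        return y, m

# A 1D local discriminator would broadcast y along the length axis and concatenate
# it with m, mirroring the concat-and-convolve trick used for 2D feature maps.
```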

epoch restart?

Hi
After downloading your project, I tried to run train.py with the default settings, but I found that there is no weight file in the directory. Should I run rcalland's deep-INFOMAX first to get the network weight file?

loss becomes negative infinity and NaN

I'm using this model on a vehicle dataset to create image embeddings. During training, at some point the loss suddenly becomes negative infinity and eventually NaN. Have you encountered this issue? What do you think could cause it?
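One possible cause (a guess, not confirmed against this repo) is torch.log being applied to a discriminator output that saturates at exactly 0 or 1, which yields -inf and then NaN gradients. A hedged sketch of a clamped version of the prior terms:

```python
import torch

EPS = 1e-6  # hypothetical small constant to keep log() finite

def stable_prior_loss(prior_d, prior, y, gamma):
    """Same style of prior loss, but with clamped discriminator outputs."""
    d_prior = prior_d(prior).clamp(EPS, 1.0 - EPS)
    d_y = prior_d(y).clamp(EPS, 1.0 - EPS)
    term_a = torch.log(d_prior).mean()
    term_b = torch.log(1.0 - d_y).mean()
    return -(term_a + term_b) * gamma
```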

The prior is wrong

The objective function for matching the prior is a min-max objective, but in your code I cannot see the min-max procedure. I think there is a bug in your code.

Questions about loss functions

Thanks for the amazing code.
I was wondering why there is no log term in LOCAL = (Em - Ej) * self.beta and GLOBAL = (Em - Ej) * self.alpha. I think there should be a log term with Em.

The bound is E_J[T_ω(x, y)] − log E_M[e^{T_ω(x, y)}]. And for E_J, why is the direct expectation taken rather than the expectation of the exponent?

Thanks in advance!
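One possible explanation (hedged; worth verifying against the paper and the repo) is that the code trains with the paper's Jensen-Shannon estimator rather than the Donsker-Varadhan bound, and the JSD form has no log on the expectations. A small sketch of both estimators over hypothetical score tensors:

```python
import math
import torch
import torch.nn.functional as F

def jsd_mi(scores_joint, scores_marginal):
    """Jensen-Shannon estimator: Ej = E_J[-softplus(-T)], Em = E_M[softplus(T)]; no log on the means."""
    ej = -F.softplus(-scores_joint).mean()
    em = F.softplus(scores_marginal).mean()
    return ej - em            # maximise this; the training loss is (em - ej)

def dv_mi(scores_joint, scores_marginal):
    """Donsker-Varadhan bound: E_J[T] - log E_M[exp(T)], the form quoted in the question."""
    n = scores_marginal.numel()
    log_mean_exp = torch.logsumexp(scores_marginal.flatten(), dim=0) - math.log(n)
    return scores_joint.mean() - log_mean_exp
```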
