
multi-view-information-bottleneck's People

Contributors

anonymous003784, mfederici


multi-view-information-bottleneck's Issues

Calculation process of the KL divergence

Hello, nice work!
I have a few questions about how to calculate the KL divergence.

kl_1_2 = p_z1_given_v1.log_prob(z1) - p_z2_given_v2.log_prob(z1)
kl_2_1 = p_z2_given_v2.log_prob(z2) - p_z1_given_v1.log_prob(z2)
skl = (kl_1_2 + kl_2_1).mean() / 2.

These are the calculation steps of the SKL divergence in mib.py. Is it correct to estimate the KL divergence from the log-probability of a single sample (kl_1_2 and kl_2_1)? The KL divergence is defined as an expectation over the distribution, so there might be a small mistake here: the code above seems to drop the expectation over p(z|v).
Looking forward to your reply.
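To illustrate the point, here is a small numerical sketch (not from the repo; it uses made-up diagonal Gaussians in place of the real encoder posteriors) comparing the single-sample estimator used above with a many-sample Monte Carlo average and the closed-form symmetrized KL:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical diagonal-Gaussian posteriors standing in for p(z|v1) and p(z|v2)
mu1, s1 = np.array([0.0, 1.0]), np.array([1.0, 0.5])
mu2, s2 = np.array([0.5, 0.8]), np.array([1.2, 0.7])

def gauss_logpdf(z, mu, s):
    # log-density of a diagonal Gaussian, summed over dimensions
    return np.sum(-0.5 * np.log(2 * np.pi * s**2) - (z - mu)**2 / (2 * s**2), axis=-1)

def skl_mc(n):
    # Monte Carlo estimate of the symmetrized KL from n samples per distribution,
    # mirroring the mib.py snippet (which averages the same quantity over a batch)
    z1 = mu1 + s1 * rng.standard_normal((n, 2))
    z2 = mu2 + s2 * rng.standard_normal((n, 2))
    kl_1_2 = gauss_logpdf(z1, mu1, s1) - gauss_logpdf(z1, mu2, s2)
    kl_2_1 = gauss_logpdf(z2, mu2, s2) - gauss_logpdf(z2, mu1, s1)
    return (kl_1_2 + kl_2_1).mean() / 2.0

def skl_exact():
    # Closed-form KL between diagonal Gaussians, symmetrized
    def kl(ma, sa, mb, sb):
        return np.sum(np.log(sb / sa) + (sa**2 + (ma - mb)**2) / (2 * sb**2) - 0.5)
    return (kl(mu1, s1, mu2, s2) + kl(mu2, s2, mu1, s1)) / 2.0

print(skl_mc(1), skl_mc(100_000), skl_exact())
```

A single sample gives an unbiased but noisy estimate; averaging over many samples (or, during training, over the mini-batch) converges to the exact value, which may be why the code gets away with one z per input.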

Derivation of loss MIB

Hello. Nice work!
I'm interested in the derivation of the loss function in Appendix F. Just two small questions:
(1) How do you get the second term of equation (6)? Does it use the chain rule (P3)?
(2) I failed to follow the derivation of the first term, particularly the first step. Could you give more details or point to related work?
Looking forward to your reply.
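For reference (this is my guess at what P3 states, not a quote from the paper), the chain rule of mutual information that such derivations typically rely on is:

```latex
% Chain rule of mutual information
I(x; y, z) = I(x; z) + I(x; y \mid z)
```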

About other codes of the full experiments

Hello, I am working on the multi-view information bottleneck and was attracted by your paper. Could you please open-source the code for all of your experiments, e.g. SKETCH-BASED IMAGE RETRIEVAL and MIR-FLICKR? Thank you.

Shared parameters between encoder_1 and encoder_2

Dear MIB authors

Thank you for releasing this great and clear implementation. I have a question regarding the definition of encoder_1 and encoder_2: in the code these two encoders share the same weights, but in the paper they are parameterized separately by theta and psi. Maybe I am misreading something; could you clarify?

Best,
Feng
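A minimal sketch of the distinction being asked about (toy classes, not the repo's actual modules): sharing means binding both names to one parameterized object, while the paper's theta/psi notation suggests two independently initialized ones.

```python
import numpy as np

rng = np.random.default_rng(0)

class Encoder:
    """Toy linear encoder standing in for the repo's network."""
    def __init__(self):
        self.W = rng.standard_normal((4, 2))

    def __call__(self, v):
        return v @ self.W

# Weight sharing: one module bound to both names, so a gradient update
# to encoder_1 also changes encoder_2.
shared = Encoder()
encoder_1 = encoder_2 = shared
print(encoder_1 is encoder_2)                 # True

# Separate parameters (theta and psi): two independently initialized modules.
encoder_1, encoder_2 = Encoder(), Encoder()
print(np.allclose(encoder_1.W, encoder_2.W))  # False
```

If the sharing is intentional, it is presumably viable because both views come from the same domain here, but that is speculation on my part.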

loss does not drop

Hi,

Thanks for sharing the code. I just ran it on MNIST, but the loss does not seem to drop.
This is the training log:

(One row per logged epoch, numbered in order of appearance; the repeated "Storing model checkpoint" / "Updating the model backup" lines printed after each epoch are omitted.)

Epoch  Train Loss          Train Acc  Test Acc
    1  16.65132975578308   0.44       0.3586
    2  16.20352667570114   0.52       0.4201
    3  18.444831013679504  0.48       0.4034
    4  24.42823952436447   0.50       0.4111
    5  23.091638684272766  0.60       0.4662
    6  17.668635308742523  0.64       0.5149
    7  26.7216295003891    0.65       0.5209
    8  24.259143024683     0.64       0.5549
    9  24.502391815185547  0.65       0.5396
   10  28.630192399024963  0.69       0.5791
   11  26.80162337422371   0.71       0.5840
   12  18.396559059619904  0.74       0.5901
   13  21.14952301979065   0.72       0.5976
   14  28.22284960746765   0.71       0.6099
   15  19.738620832562447  0.74       0.5987
   16  26.586383253335953  0.72       0.6101
   17  27.010501861572266  0.71       0.6156
   18  24.147621154785156  0.73       0.6173
   19  16.216432973742485  0.72       0.6263
   20  29.65166562795639   0.69       0.6237
   21  19.953528456389904  0.74       0.6305
   22  24.09559604525566   0.73       0.6141
   23  27.629364281892776  0.70       0.6343
   24  24.558568745851517  0.74       0.6420
   25  23.52346968650818   0.73       0.6491
   26  28.106444120407104  0.75       0.6541
   27  16.491002649068832  0.74       0.6533
   28  24.8224079310894    0.76       0.6569
   29  22.03815546631813   0.74       0.6603
   30  21.805272683501244  0.79       0.6781
   31  18.99772535264492   0.79       0.6892
   32  23.89051579684019   0.79       0.6954
   33  19.56764091551304   0.81       0.7138
   34  18.89282363653183   0.81       0.7127
   35  22.087528944015503  0.83       0.7254
   36  20.261658288538456  0.83       0.7290
   37  27.495856314897537  0.87       0.7408
   38  21.786702036857605  0.82       0.7355

The loss even increases compared to earlier epochs, while the accuracy keeps improving. Is anything wrong?

Thanks.
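Since the per-epoch loss is this noisy, one way to check whether it is genuinely flat (the values below are copied from the log above, rounded; the window size is arbitrary) is a simple moving average:

```python
def moving_average(xs, k=5):
    # Mean over a sliding window of k consecutive values
    return [sum(xs[i:i + k]) / k for i in range(len(xs) - k + 1)]

# First and last few train losses from the log above (rounded)
early = [16.65, 16.20, 18.44, 24.43, 23.09, 17.67]
late = [22.09, 20.26, 27.50, 21.79]

print(moving_average(early, 3))
print(moving_average(early + late, 5))
```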

Classification accuracy evaluation

Great work! Just a small question. I noticed that when evaluating the classification accuracy of the learned features, you use only the mean of the encoder output to construct the dataset, rather than resampling features from the encoder distribution. Is this common practice for information bottleneck encoders, or do you have a specific reason for doing it this way? Thanks!
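For concreteness, the two options look like this (made-up posterior parameters, NumPy stand-in for the real encoder):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical diagonal-Gaussian posterior produced by the encoder for one input
mu = np.array([0.3, -1.2])
sigma = np.array([0.1, 0.2])

# Deterministic feature: the posterior mean (what the evaluation appears to use)
feature_mean = mu

# Stochastic feature: one draw from the posterior
feature_sample = mu + sigma * rng.standard_normal(mu.shape)

print(feature_mean)
print(feature_sample)
```

Using the mean gives every downstream run identical features, while sampling injects noise on the order of sigma; the deterministic choice seems common when evaluating stochastic encoders, though I would also be curious about the authors' reasoning.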

The relationship with information bottleneck

The information bottleneck is usually formulated as minimizing I(x; z) - beta * I(z; y), but I do not find this formula anywhere in the paper.
The loss function of MIB seems to be derived from definitions proposed by the authors, and I don't understand how it relates to the information bottleneck.
Could you please give me some advice?
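For context, my understanding (please correct me if wrong): the classical IB Lagrangian and the shape of the loss as it appears in mib.py are

```latex
% Classical information bottleneck (Tishby et al.): minimize over p(z|x)
\mathcal{L}_{\mathrm{IB}} = I(x; z) - \beta\, I(z; y)

% MIB loss as I read the code: trade cross-view MI against the symmetrized KL
\mathcal{L}_{\mathrm{MIB}} = -\,I(z_1; z_2)
  + \beta\, D_{\mathrm{SKL}}\!\big(p(z_1 \mid v_1) \,\|\, p(z_2 \mid v_2)\big)
```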
