
snip-pruning's Introduction

SNIP-pruning

Report from the ICLR Reproducibility Challenge. The reproduced paper is SNIP: Single-shot Network Pruning based on Connection Sensitivity (https://openreview.net/forum?id=B1VZqjAcYX).

A .pdf report and the source code of the reproduction are available. Link to the issue: reproducibility-challenge/iclr_2019#130

SNIP: Single-shot pruning

One can describe a neural network's sensitivity to zeroing a weight by the absolute change of the loss; we will call this the exact salience from now on. Intuitively, if this sensitivity is small, then removing the parameter should not prevent the network from learning, since it barely affected the loss function anyway. However, one usually wants to remove more than one parameter, and since parameters depend on each other, finding the set of η parameters whose removal affects the loss the least would require checking every possible combination of η out of m parameters, i.e. C(m, η) forward passes through the network. Since this is computationally infeasible, we assume that the influence of each weight on ∆L is independent of every other weight. This is the first of two assumptions we must make.

Computing the exact salience for every weight separately is also computationally expensive – it requires m forward passes (where m is the number of parameters in the network) – so we additionally assume that the exact salience is well approximated by the loss change for an infinitesimally small perturbation ε instead of the full change from 1 to 0. This second assumption is the one that characterizes the SNIP method best. With it we can define the salience of a weight as the derivative of the loss with respect to its connection indicator, evaluated at the indicator value 1. Such a salience computation can be done very efficiently in modern frameworks, for all weights at once. The saliences can then be normalized to express each parameter's importance as a percentage. After evaluating the saliences, one sets the indicators of the least salient connections to 0 and obtains the pruned network.
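
A minimal sketch of this computation, assuming PyTorch (function and attribute names such as snip_saliences and weight_mask are illustrative, not the repository's actual API): attach an all-ones indicator to every prunable weight, run a single forward/backward pass on one batch, and take the absolute gradients with respect to the indicators.

import types
import torch
import torch.nn as nn
import torch.nn.functional as F

def _masked_linear_forward(self, x):
    # Forward pass with the weight multiplied by its indicator (mask).
    return F.linear(x, self.weight * self.weight_mask, self.bias)

def _masked_conv2d_forward(self, x):
    return F.conv2d(x, self.weight * self.weight_mask, self.bias,
                    self.stride, self.padding, self.dilation, self.groups)

def snip_saliences(model, loss_fn, inputs, targets):
    masks = []
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            # Indicator c initialized to 1; gradients flow into c, not into w.
            m.weight_mask = torch.ones_like(m.weight, requires_grad=True)
            m.weight.requires_grad_(False)
            forward = _masked_linear_forward if isinstance(m, nn.Linear) else _masked_conv2d_forward
            m.forward = types.MethodType(forward, m)
            masks.append(m.weight_mask)
    loss = loss_fn(model(inputs), targets)
    grads = torch.autograd.grad(loss, masks)
    # Salience of connection j is |dL/dc_j|, normalized over all connections.
    saliences = torch.cat([g.abs().flatten() for g in grads])
    return saliences / saliences.sum()

Connections whose normalized salience falls below the threshold implied by the desired sparsity would then have their indicators set to 0.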

snip-pruning's People

Contributors

sjmikler

snip-pruning's Issues

Minor bug (?) in your evaluation code

Hi,
Thanks for sharing your implementation! I like how the weight is pruned via the method in the Prune class. However, I notice that during the training phase, your code calls op() to zero out the pruned weights, but this is not done in NN.evaluation and NN.predict. I think this makes the reported accuracy misleading, since the data is forwarded through a pruned network with one extra optimization step (i.e. weights that are supposed to be zero are no longer zero due to the gradient update).
Thanks for any further clarification!
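
A minimal sketch of the fix suggested in this issue, assuming PyTorch (apply_masks and the mask layout are illustrative, not the repository's actual classes): re-apply the pruning masks before evaluation so that weights which drifted away from zero during the optimizer step are zeroed out again.

import torch

@torch.no_grad()
def apply_masks(model, masks):
    # masks: dict mapping parameter name -> 0/1 tensor of the same shape.
    for name, param in model.named_parameters():
        if name in masks:
            param.mul_(masks[name])

@torch.no_grad()
def evaluate(model, masks, loader, device="cpu"):
    apply_masks(model, masks)  # keep pruned weights at exactly zero
    model.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total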

SNIP should calculate gradients on indicators/mask, not on weights

Hello,

Thanks for implementing this in PyTorch, but I believe there is something wrong in the code.

In the original paper and implementation, the loss is differentiated with respect to the connection indicators, not the weights.

[equation from the paper: the salience s_j is the normalized magnitude of ∂L(c ⊙ w; D)/∂c_j]

From Lee's original code in line 67:
grads = tf.gradients(loss, [mask_init[k] for k in prn_keys])

I understand you have weight = indicator * weight before computing gradients, but I can't see where you extract the gradients for the indicators only. I see you've posted on the PyTorch forum about this, but nobody has answered properly.
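
A small self-contained PyTorch example of the distinction raised here (toy tensors, not the repository's code): even with weight = indicator * weight in the graph, the gradient with respect to the indicator differs from the gradient with respect to the weight, so the saliences must be extracted from the indicator's gradient.

import torch

torch.manual_seed(0)
w = torch.randn(5, requires_grad=True)      # weights (fixed at initialization)
c = torch.ones_like(w, requires_grad=True)  # connection indicators, all ones
x = torch.randn(5)

loss = ((c * w) @ x) ** 2                   # toy loss built from masked weights

grad_c, grad_w = torch.autograd.grad(loss, [c, w])
print(grad_c)  # dL/dc_j = dL/d(c_j*w_j) * w_j  -> SNIP uses |grad_c|
print(grad_w)  # dL/dw_j = dL/d(c_j*w_j) * c_j  -> ordinary weight gradient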
