chirag126 / vog

Estimating Example Difficulty using Variance of Gradients

Home Page: https://varianceofgradients.github.io/

Languages: Python 97.9%, Shell 2.1%

Topics: interpretability, atypical-examples, human-in-the-loop-auditing, deep-learning, explainability

vog's Introduction

Estimating Example Difficulty using Variance of Gradients

This repository contains source code necessary to reproduce some of the main results in the paper:

If you use this software, please consider citing:

@inproceedings{agarwal2022estimating, 
title={Estimating example difficulty using variance of gradients},
author={Agarwal, Chirag and D'souza, Daniel and Hooker, Sara},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={10368--10378},
year={2022}
}

1. Setup

Installing software

This repository is built using a combination of TensorFlow and PyTorch. You can install the necessary libraries with:

pip install -r ./requirements_tf.txt
pip install -r ./requirements_pytorch.txt

2. Usage

Toy experiment

toy_script.py is the script for running the toy dataset experiment. You can analyze the training/testing data at different stages of training, viz. early, middle, and late, using the --split and --mode flags. The --vog_cal flag selects which version of the VOG score to visualize: the raw score, the class-normalized score, or the absolute class-normalized score.

Examples

Running python3 toy_script.py --split test --mode early --vog_cal normalize generates the toy dataset decision-boundary figure along with the relation between each point's perpendicular distance from the decision boundary and its VOG score. The respective figures are:

Left: Visualization of the toy dataset decision boundary with the testing data points. The multilayer perceptron model achieves 100% training accuracy. Right: The scatter plot of the Variance of Gradients (VOG) for each testing data point against its perpendicular distance shows that higher scores pertain to the most challenging examples (those closest to the decision boundary).
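The perpendicular distance used in the right panel is the standard point-to-hyperplane distance; for a linear boundary w·x + b = 0 it can be computed as below (an illustrative sketch with made-up weights; the toy script derives its own boundary from the trained model):

```python
import numpy as np

def perp_distance(points, w, b):
    """Perpendicular distance of each point from the hyperplane w.x + b = 0."""
    return np.abs(points @ w + b) / np.linalg.norm(w)

# illustrative boundary (these weights are made up for this sketch)
w = np.array([1.0, -1.0])
b = 0.5
points = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])
dist = perp_distance(points, w, b)
```

Plotting dist against the VOG score of each point reproduces the right-panel relation in outline.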

ImageNet

The main scripts for the ImageNet experiments are in the ./imagenet/ folder.

  1. Before calculating the VOG scores, you need to store the gradients of the respective images listed in ./scripts/train.txt using model snapshots. For demonstration purposes, we have shared the model weights of the late stage, i.e. steps 30024, 31275, and 32000. For example, to store the gradients for the ImageNet dataset (stored as /imagenet_dir/train) at snapshot 32000, run the shell script train_get_gradients.sh like:

source train_get_gradients.sh 32000 ./imagenet/train_results/ 9 ./scripts/train.txt/

  2. For this repo, we have generated the gradients for 100 random images from the late-stage training process and stored the results in ./imagenet/train_results/. To generate the error-rate performance at different VOG deciles, run train_visualize_grad.py using the following command:

python train_visualize_grad.py
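The decile analysis can be sketched as follows (an illustrative outline, not the repository's code: given a VOG score and a correct/incorrect flag per example, bucket examples by score and report the error rate per bucket):

```python
import numpy as np

def error_rate_by_decile(vog, correct):
    """Error rate within each VOG decile (decile 1 = lowest scores).

    vog:     (N,) array of VOG scores
    correct: (N,) boolean array, True where the model's prediction is right
    """
    order = np.argsort(vog)                 # examples from lowest to highest VOG
    buckets = np.array_split(order, 10)     # ten roughly equal-sized groups
    return [1.0 - correct[idx].mean() for idx in buckets]
```

In the late training stage, this curve should rise with the decile index: higher-VOG examples are misclassified more often.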

On analyzing the VOG scores for a particular class (e.g. magpie and pop bottle below) in the late training stage, we found two distinct groups of images. In this work, we hypothesize that examples a model has difficulty learning (images on the right) will exhibit higher variance in gradient updates over the course of training. On the other hand, the gradient updates for the relatively easier examples are expected to stabilize early in training and converge to a narrow range of values.

Each 5×5 grid shows the top-25 ImageNet training-set images with the lowest (left column) and highest (right column) VOG scores for the classes magpie and pop bottle, with their predicted labels below the images. Training-set images with higher VOG scores (b) tend to feature zoomed-in views with atypical color schemes and vantage points.
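The VOG computation described above can be sketched as follows (an illustrative NumPy outline based on the paper's definition, not the repository's exact code; the array layout is an assumption of this sketch):

```python
import numpy as np

def vog_scores(grads):
    """VOG per example from gradients stored at K checkpoints.

    grads: array of shape (K, N, H, W, C) holding, for each of K model
    snapshots, the gradient of the true-class pre-softmax output w.r.t.
    each of N input images (this layout is an assumption for the sketch).
    """
    mean_grad = grads.mean(axis=0)                    # (N, H, W, C)
    var = ((grads - mean_grad) ** 2).mean(axis=0)     # per-pixel variance over checkpoints
    return np.sqrt(var).mean(axis=(1, 2, 3))          # average over pixels -> (N,)

def class_normalize(vog, labels):
    """Normalize VOG scores within each class (cf. the --vog_cal normalize option)."""
    out = np.empty_like(vog)
    for c in np.unique(labels):
        m = labels == c
        out[m] = (vog[m] - vog[m].mean()) / (vog[m].std() + 1e-12)
    return out
```

An example whose gradients barely change across snapshots gets a VOG near zero; one whose gradients keep fluctuating gets a high score.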

3. Licenses

Note that the code in this repository is licensed under the MIT License, but the pre-trained models used by the code have their own licenses. Please check them carefully before use.

4. Questions?

If you have questions or suggestions, please feel free to email us or create GitHub issues.


vog's Issues

Image super-resolution tasks

Hi there, I came across your work at the last CVPR and have been thinking for quite some time about how to apply it to image super-resolution. Currently, the script is for classification networks with a labelled training set; how can this be adapted for tasks which do not have predicted scores or labels?
Thanks

Question about VoG on early training stage

While reading the paper, I got confused by an apparent conflict between a figure and the text.

[figure omitted]

Section 3.5, VoG understands early and late training dynamics part

  • In the early training stage, samples having higher VoG scores have a lower average error rate as the gradient updates hinge on easy examples.
  • This phenomenon reverses during the late-stage of the training, where, across all datasets, high VoG scores in the late-stage have the highest error rates as updates to the challenging examples dominate the computation of variance.

My understanding is as follows:

  • In early-stage training, easy examples have high VoG score
  • In late-stage training, difficult examples have high VoG score

But Figure 2 seems different from what I expected.

In early-stage training, the examples with the highest VoG scores don't look easy, while those with the lowest VoG scores look quite easy.

Is there something I'm missing?

An unexpected result when using the pre-softmax layer instead of the softmax layer

Thank you for sharing the amazing work!

When I run the toy example, it runs perfectly fine and shows the exact same result you uploaded in this repo.
Below are the command and the result.

  • command
    $ python3 toy_script.py --split test --mode early --vog_cal normalize
    
  • result
    [figure omitted]

However, I found that the code uses the gradient of the softmax output w.r.t. the input, which differs from the paper, where the pre-softmax output is used. So I changed a single line of toy_script.py as below and got a somewhat strange result when I ran the code again.

  • change
    [screenshot omitted]
  • result
    [figure omitted]

What did I miss here?

CUDA runs out of memory

Hi,
While running this code, because the gradients w.r.t. all images form a tensor of dimension [50000, 32, 32, 3], my CUDA device runs out of memory very quickly.
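One common workaround (a sketch, not this repository's code; the model and data loader here are stand-ins) is to compute input gradients batch by batch and move each batch's result to CPU before processing the next, so only one batch of gradients ever lives on the GPU:

```python
import torch

def input_gradients_batched(model, loader, device="cuda"):
    """Collect per-image input gradients without holding them all on the GPU."""
    model.eval()
    grads = []
    for images, labels in loader:
        images = images.to(device).requires_grad_(True)
        logits = model(images)
        # gradient of the true-class (pre-softmax) output w.r.t. the input
        selected = logits.gather(1, labels.to(device).view(-1, 1)).sum()
        (grad,) = torch.autograd.grad(selected, images)
        grads.append(grad.detach().cpu())   # free GPU memory after each batch
    return torch.cat(grads)                 # shape (N, C, H, W), held on CPU
```

With, say, a batch size of 128, peak GPU memory only ever holds gradients for 128 images instead of all 50,000.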
