
simsso / nips-2018-adversarial-vision-challenge


Code, documents, and deployment configuration files, related to our participation in the 2018 NIPS Adversarial Vision Challenge "Robust Model Track"

License: MIT License

Dockerfile 0.30% Python 66.07% Shell 1.50% Jupyter Notebook 30.19% HTML 0.82% HCL 0.25% MATLAB 0.86%
adversarial-attacks classifier nips-2018 robustness tensorflow

nips-2018-adversarial-vision-challenge's People

Contributors

florianpfisterer, samedguener, simsso


nips-2018-adversarial-vision-challenge's Issues

11. Working Group Meeting

11. Working Group Meeting (11. September 2018)

(aka. "is the breakthrough coming?!")

Assignments

@Doktorgibson

  • enablement tools #11
  • input pipeline folder mounting
  • docs #33

@FlorianPfisterer

@Simsso

Model Deployment Pipeline

Let's use this issue as a thread to communicate different possible deployment pipelines for ML models. Once we have decided on something, we can go ahead and create a wiki page.

Gradient Accumulation

Since our batch size is very limited right now (due to the inefficient memory usage of the VQ layer, as described in #58), we should implement a "virtual batch size", i.e. accumulate the gradients of multiple batches and apply them at once. That way we could specify a compute_batch_size and an update_batch_size, where the former depends on the GPU memory and the latter on our preference.

This SO answer contains relevant information.

Development in the gradient-accumulation branch
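A minimal, framework-agnostic sketch of the idea (plain NumPy on a toy linear model; in the real implementation we would accumulate TensorFlow gradient tensors, but the accumulate-then-apply logic is the same):

```python
import numpy as np

# Gradient accumulation sketch: the update batch is split into chunks of
# compute_batch_size, the chunk gradients are accumulated, and the
# parameters are updated once with their average.
def grad(w, X, y):
    return 2 * X.T @ (X @ w - y) / len(y)  # MSE gradient of y ~ X @ w

def accumulated_step(w, X, y, compute_batch_size, lr=0.1):
    acc = np.zeros_like(w)
    n_chunks = 0
    for start in range(0, len(y), compute_batch_size):
        Xb = X[start:start + compute_batch_size]
        yb = y[start:start + compute_batch_size]
        acc += grad(w, Xb, yb)             # accumulate, don't apply yet
        n_chunks += 1
    return w - lr * acc / n_chunks         # one parameter update

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w = np.zeros(3)
w_acc = accumulated_step(w, X, y, compute_batch_size=2)
w_full = w - 0.1 * grad(w, X, y)           # full-batch step for comparison
```

With equally sized chunks, the averaged accumulated gradient equals the full-batch gradient exactly, so `w_acc` matches `w_full`.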

6. Working Group Meeting

6. Working Group Meeting (22. July 2018)

(aka. "Earlybird")

Work Items

@doktorgibson

  • make local deployment work
  • take the necessary steps following from our decision to use VMs (GCP and AWS)
  • make TensorBoard work
  • create documentation on Wiki

@Simsso

  • train a Tiny ImageNet classifier and submit it to the data set webpage #23
  • upload a dumb model to the challenge repository to appear in the scoreboard #20

@FlorianPfisterer (suggestions)

  • participate in either #20 or #23
  • anything else that you consider relevant

VQ-Layer Cosine Distance

Since the L1, L2, and Linf norms seem to be problematic in high-dimensional space, it might be worth adding an additional norm order, namely 'cos'. It would

  1. normalize all input vectors to unit norm
  2. initialize the embedding space with unit norm vectors
  3. replace inputs with the vector from the embedding space with which the dot product is greatest
  4. define the loss as the negative dot product
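The four steps above can be sketched in NumPy (a toy lookup, not the actual VQ-layer implementation; `cos_vq_lookup` and its interface are made up for illustration):

```python
import numpy as np

def cos_vq_lookup(x, emb):
    """Replace each row of x with its nearest embedding vector under
    cosine similarity. x: (batch, d), emb: (k, d)."""
    # Steps 1 & 2: inputs and embedding vectors are unit-normalized.
    x_n = x / np.linalg.norm(x, axis=1, keepdims=True)
    e_n = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    # Step 3: for unit vectors the dot product is the cosine similarity;
    # pick the embedding vector with the greatest dot product.
    sims = x_n @ e_n.T                      # (batch, k)
    nearest = np.argmax(sims, axis=1)
    # Step 4: the loss is the negative dot product (here averaged).
    loss = -np.mean(sims[np.arange(len(x)), nearest])
    return e_n[nearest], loss

# An input close to the first axis snaps to the first basis vector.
out, loss = cos_vq_lookup(np.array([[2.0, 0.1, 0.0]]), np.eye(3))
```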

The tasks are

  • unit testing
  • implementation of the described cosine distance
  • empirical evaluation

13. Working Group Meeting

Date: 3. Oct 2018
(aka. honeymoon)

@doktorgibson

@FlorianPfisterer

  • read wikipage of training pipeline
  • develop #51 further (implement direct embedding value assignment + tests)
  • make ResNet submittable #61

@Simsso

  • read wikipage of training pipeline
  • develop #51 further (in particular profiling #58)
  • logger class #60
  • virtual batch sizes #64
  • dotp norm order #63

Vector Quantization Layer

Development of a production-ready vector quantization (VQ) layer in TensorFlow, based on the prototype developed in #25 and merged with #52 (+prototype 2).

Development branch: vq-layer

Documentation

Sub-tasks

  • Test batch size effect on gradient (f390595)
  • Assignment of least used embedding space vectors with values from input x that were furthest away from embedding space vectors
  • Consider developing a custom C++ op
  • Research on current memory efficiency (which parts consume a lot of RAM, how to profile)
  • is_training parameter updates only during training (not needed anymore)
  • Add scatter_update call to tf.GraphKeys.UPDATE_OPS
  • dotp norm order #63

Paper Discussion: Adversarial Logit Pairing

Discussion of the paper "Adversarial Logit Pairing" by Harini Kannan, Alexey Kurakin, Ian Goodfellow (16 Mar 2018).

Let's see whether having an issue for a discussion event is helpful or unnecessary overhead. At least it's a good way of documenting it.

Abstract:

In this paper, we develop improved techniques for defending against adversarial examples at scale. First, we implement the state of the art version of adversarial training at unprecedented scale on ImageNet and investigate whether it remains effective in this setting - an important open scientific question (Athalye et al., 2018). Next, we introduce enhanced defenses using a technique we call logit pairing, a method that encourages logits for pairs of examples to be similar. When applied to clean examples and their adversarial counterparts, logit pairing improves accuracy on adversarial examples over vanilla adversarial training; we also find that logit pairing on clean examples only is competitive with adversarial training in terms of accuracy on two datasets. Finally, we show that adversarial logit pairing achieves the state of the art defense on ImageNet against PGD white box attacks, with an accuracy improvement from 1.5% to 27.9%. Adversarial logit pairing also successfully damages the current state of the art defense against black box attacks on ImageNet (Tramer et al., 2018), dropping its accuracy from 66.6% to 47.1%. With this new accuracy drop, adversarial logit pairing ties with Tramer et al.(2018) for the state of the art on black box attacks on ImageNet.

2. Working Group Meeting

2. Working Group Meeting (18. June 2018)

(aka. "The Birthday Meeting")

Topics

  • Linear combinations experiments #4
  • Cloud analysis (partly) #2
  • Reading matter (papers, articles, code) #6
  • CNN knowledge #5
  • Pipeline #9

4. Working Group Meeting

4. Working Group Meeting (9. July 2018)

(aka. "Distributed development: Berlin, Stuttgart, Karlsruhe")

Work Items

@doktorgibson in #11 and #9

  • make cloud testing CNN ML-Engine compatible
  • define interfaces for Google Cloud Storage (GCS)
  • logging of images / text
  • make TensorBoard work
  • local execution with debugger

@Simsso

@FlorianPfisterer

Enablement Tools

  • restore pre-trained TensorFlow CNN
  • apply to given dataset
  • create Dockerfile
  • coordinate deployment with @doktorgibson

1. Working Group Meeting

1. Working Group Meeting (9. June 2018)

Attendees: Timo Denk, Samed Güner

Challenge

Tracks

The research focus is on the defense track. However, state-of-the-art knowledge about attacks is required in order to validate new defenses. We need to do research on both topics and have to come up with something new for the defense track.

  • Defense: Find a function X -> W, where X is the set of all images and W is the set of classes. The function is parametrized by theta.
  • Attack: Find a function (x1, w, theta) -> x2, which takes an image x1 of class w and, using the weights theta of the given model, finds another image x2 such that |x2 - x1| is minimal and x2 is misclassified.

Evaluation Criterion

  1. Let M be the model and S be the set of samples.
  2. We apply the five best untargeted attacks on M for each sample in S. Sample from training or test data? Do targeted attacks come into play as well?
  3. For each sample we record the minimum adversarial L2 distance (MAD) across the attacks. L2 can behave in a weird way (curse of dimensionality). Our test should also be validated using L2 distance.
  4. If a model misclassifies a sample then the minimum adversarial distance is registered as zero for this sample.
  5. The final model score is the median MAD across all samples.
  6. The higher the score, the better.

Our deployment pipeline should perform validation in a very similar manner, in particular using the L2 distance and the median.

Example:

import numpy as np

distances = []
for i_s in samples:                                  # each image in the dataset S
    perturbed = [attack(i_s) for attack in attacks]  # the five best untargeted attacks
    l2 = [np.linalg.norm((i_p - i_s).ravel()) for i_p in perturbed]
    distances.append(min(l2))                        # minimum adversarial distance (MAD)
score = np.median(distances)                         # final score: median MAD, higher is better

Deadlines

  • June 25th, 2018: Challenge begins
  • November 1st: Final submission date
  • November 15th: Winners Announced

Research

We need to do research on both topics. Relevant papers need to be determined as soon as possible.

Papers

Ideas

Linear combinations of inputs (for evaluation). Determine the distance from an image (when linearly approaching an image of another class) at which the first misclassified input occurs. Analyze how noisy the classifications along the line are.

Derivative penalties: regularize training with penalties for high first- and second-order derivatives w.r.t. input changes.
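As a rough illustration of a first-order derivative penalty (hypothetical helper; finite differences stand in for the autodiff computation a real training loop would use):

```python
import numpy as np

def derivative_penalty(f, x, eps=1e-3):
    """Finite-difference estimate of ||df/dx||^2 for a scalar-valued
    model f at input x. A training loop would add this term (computed
    via autodiff) to the loss to discourage steep input sensitivity."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d.flat[i] = eps
        # central difference approximation of the partial derivative
        grad.flat[i] = (f(x + d) - f(x - d)) / (2 * eps)
    return float(np.sum(grad ** 2))

# For a linear model f(x) = x . [1, 2], the gradient is [1, 2],
# so the penalty is 1^2 + 2^2 = 5.
f = lambda x: float(x @ np.array([1.0, 2.0]))
penalty = derivative_penalty(f, np.zeros(2))
```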

Growing filters of the CNN, similar to the progressive growing of GANs. I have not seen any research that goes in this direction, but it might work. New filters would be faded in slowly.

Fisher information matrix for network size reduction, see "Overcoming Catastrophic Forgetting in Neural Networks". The matrix contains information on how relevant certain weights are for classification.

Dropout on kernel level / additive noise to kernels of higher layers. This might be common already, we have to do research on that.

Deployment and Infrastructure

Deployment on AWS or GCP (Azure is not an option, and never was).
Funding through free-tier budgets and our own money; in the long run, sponsoring by the SAP Machine Learning Foundation. Also, there might be sponsored computing power available.

Miscellaneous

12. Working Group Meeting

Date: 21. September 2018
(aka. no aka this time)

Assignments

@doktorgibson

@Simsso & @FlorianPfisterer

  • first: step (1) of #42
  • then: step (3) of #42 (see below)

@Simsso

  • complete #36
  • finish setup.py (1) #42
  • input pipeline #42
  • vector quantization experiments #25

@FlorianPfisterer

  • README.md and type issues (1) #42
  • rewrite trainer class & add appropriate flags #42
  • (optionally) vector quantization experiments #25

TensorFlow Wiki Page

  • while practicing & familiarizing with TensorFlow, simultaneously build up a knowledge base

ResNet Fine-Tuning

(sub-task of #23)

After training a ResNet model from scratch (#31) and retraining Inception (#29) have not yielded satisfying results, we will try to fine-tune a ResNet with the weights provided here.

Wiki docs

The overall goal is the extraction of the ResNet model from the provided code. Then it can be used for experiments such as the insertion of layers, the addition of VQ (#25), etc., all of that with accuracies in the range of 60 to 70%.

Linear Combination Experiments

Conduct experiments on the classification behavior with linear combinations as network input. Here, a linear combination is a mixture of two images of different (or perhaps even same) classes.
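One way such an experiment could be sketched (the `predict` interface and the `first_misclassification` helper are hypothetical):

```python
import numpy as np

def first_misclassification(img_a, img_b, predict, true_class, steps=100):
    """Sweep the linear combination (1 - a) * img_a + a * img_b and
    return the first mixing coefficient at which img_a's class is lost,
    together with the L2 distance from img_a at that point.
    `predict` maps an image to a class label."""
    for a in np.linspace(0.0, 1.0, steps):
        mixed = (1.0 - a) * img_a + a * img_b
        if predict(mixed) != true_class:
            return a, np.linalg.norm((mixed - img_a).ravel())
    return None, None

# Toy classifier: class 1 iff the mean pixel value exceeds 0.5.
predict = lambda img: int(img.mean() > 0.5)
a, dist = first_misclassification(np.zeros(4), np.ones(4), predict, true_class=0)
```

Repeating the sweep for many image pairs and recording `a` and `dist` would give the distribution of decision-boundary distances the idea in #4 asks about.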

Marriage of VQ, ResNet, and Pipeline

Let's bring everything together 🎉

  • Extend ResNet by additional VQ layers
  • Train the embedding space with our pipeline on GCP
  • Submit the trained model to the challenge website

Location: KIT CS library
Date: 24. Sep 2018
Time: 17:30

Branch Clean-up

I think we've got more branches than needed. Maybe it's time to clean things up a bit.

No high priority, just opening the issue for now.

Infrastructure of Adversarial Attacks

In order to validate our models, we want to set up an automated pipeline of adversarial attacks which runs on every submitted model. This thread is about

  • defining an interface that our attacks use, and our models provide
  • understanding the model format which we have to submit in order to participate in the challenge

Related links:

5. Working Group Meeting

5. Working Group Meeting (16. July 2018)

(aka. "Only two were left...")

Work Items

@doktorgibson

  • take the necessary steps following from our decision to use VMs (GCP and AWS) [rather than the poorly documented ML Engine]
  • deployment of a containerized model
  • logging of images / text
  • make TensorBoard work

@Simsso

  • read "One pixel attacks" paper
  • train a Tiny ImageNet classifier and submit it to the data set webpage #23
  • upload a dumb model to the challenge repository to appear in the scoreboard #20

@FlorianPfisterer

3. Working Group Meeting

3. Working Group Meeting (1. July 2018)

(aka. "Exams actually suck")

Tasks

@doktorgibson

  • Complete GCP analysis
  • Complete AWS analysis
  • GCP GCE vs GKE
  • Deployment hands-on

@FlorianPfisterer

  • Papers
  • Reference model #11
  • Build up TensorFlow knowledge in general

@Simsso

  • Papers
  • Layerwise perturbations analysis #14
  • Update attack and defense wiki pages

ResNet Base Code Clean-up and Enhancement

Clean-up of the code in the resnet-base branch.

We'll proceed as follows

  1. Superficial clean-up of the present code (resnet-base)
  2. Merge into master
  3. Feature branches with individual PRs for improvements and features

(1) Superficial Clean-up

  • Create README.md
  • Finish setup.py
  • Resolve type issues with BaseModel / ResNet

(3) Features

  • Rewrite the entire trainer class (+ base class)
  • Refactor the input pipeline to use tf.data.
  • Split-up def build_model(self) -> None: in ResNet into multiple separate function calls (with the aim of having simpler method replications in inheriting experiment-classes)
  • Flags
  • Refactor the model graph construction code (and documentation)

We'll extend list (3) when conducting step (2).

ResNet Base Submittable

  • Change the resnet-base model such that it can be easily submitted.
  • Including usage of trained embedding weights of a VQ layer.
  • Submit a model without gradient skipping (copying the gradient from VQ-output to VQ-input) and see how Foolbox behaves if gradients cannot be computed (w.r.t. whitebox-attack).

VQ Profiling

Memory and performance profiling of the VQ layer code (branch vq-layer).

  • Investigate how the memory consumption changes depending on (batch size, vector size, embedding space size)
  • Familiarize with profiling tools
  • Assess development of a C++ op that's more efficient

9. Working Group Meeting

9. Working Group Meeting (26. August 2018)

(aka. "Half-time")

Assignments

Still, it remains mission critical to find a good classifier architecture and parametrization #23.

Jupyter Notebook for Experiments

Create a Jupyter notebook that can be used to quickly try attacks, compute gradients, and play around in general.

Maybe we should have (1) one model for MNIST (for super fast training convergence and experiment iteration; here is an inspiration) and (2) one for Tiny ImageNet, just to have a more realistic testing platform as well. The latter could be based on the TF cifar10 network.

Attack Implementation

In order to fully understand different adversarial attacks, we implement some of them ourselves.

Vector Quantization Prototype

Implementation of our defense mechanism "CNN Filter Compression" / "Vector Quantization" (described in the wiki).

Validation implies verifying whether

  • training a CNN with encoded filters is still possible
  • a trained CNN is more robust w.r.t. adversarial attacks
  • a trained CNN is more robust to black-box attacks (transferability)

ResNet Model Function Clean-up

Clean-up of the file that defines the ResNet model graph. File in the PR: here

The code contains many comments that have been copied and the model construction is not sequential. Right now it's hard to add new elements to the graph. That should be simplified.

Logger Class

Write a generic logger class that can be used to accumulate scalar values and histograms, with logging to TensorFlow.
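A minimal sketch of what such a class could look like (the actual TensorFlow summary writing is only hinted at in a comment; `ScalarLogger` and its method names are made up):

```python
import numpy as np
from collections import defaultdict

class ScalarLogger:
    """Accumulates scalar values per tag and reduces them to a mean on
    flush. The real class would additionally write the means (and
    histograms) to a TensorFlow summary writer inside flush()."""

    def __init__(self):
        self._values = defaultdict(list)

    def log_scalar(self, tag, value):
        self._values[tag].append(float(value))

    def flush(self):
        # reduce every accumulated tag to its mean, then reset
        means = {tag: float(np.mean(vals)) for tag, vals in self._values.items()}
        self._values.clear()
        return means

logger = ScalarLogger()
logger.log_scalar("loss", 1.0)
logger.log_scalar("loss", 3.0)
means = logger.flush()
```

Accumulating and flushing once per N steps keeps the summary writer from being hit on every batch.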
