hadisalman / smoothing-adversarial

Code for our NeurIPS 2019 *spotlight* "Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers"

Home Page: https://arxiv.org/abs/1906.04584

License: MIT License

Python 100.00%
adversarial-machine-learning deep-neural-networks adversarial-defense

smoothing-adversarial's Introduction

Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers

This repository contains the code and models necessary to replicate the results of our recent paper:

Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
Hadi Salman, Greg Yang, Jerry Li, Huan Zhang, Pengchuan Zhang, Ilya Razenshteyn, Sebastien Bubeck
Paper: https://arxiv.org/abs/1906.04584
Blog post: https://decentdescent.org/smoothadv.html

Our method outperforms all existing provably L2-robust classifiers by a significant margin on ImageNet and CIFAR-10, establishing the state of the art for provable L2-defenses.

Note: the models corresponding to the best certified accuracy for each of the radii in the above tables can be found here. Also, for the clean accuracies corresponding to these models, see Tables 16 and 17 in the paper.

Overview of the Repository

Our code is based on the open source code of Cohen et al (2019). The main contents of our repo are:

  • code/ contains the code for our experiments.
  • data/ contains the log data from our experiments.
  • analysis/ contains the plots and tables, based on the contents of data/, that are shown in our paper.

Let us dive into the files in code/:

  1. train_pgd.py: the main code to adversarially train smoothed classifiers.
  2. train.py: the original training code of Cohen et al (2019) using Gaussian noise data augmentation.
  3. certify.py: Given a pretrained smoothed classifier, returns a certified L2-radius for each data point in a given dataset using the algorithm of Cohen et al (2019).
  4. predict.py: Given a pretrained smoothed classifier, predicts the class of each data point in a given dataset.
  5. architectures.py: an entry point for specifying which model architecture to use per dataset (Resnet-50 for ImageNet, Resnet-110 for CIFAR-10).
  6. attacks.py: contains our PGD and DDN attacks for smoothed classifiers (referred to as SmoothAdvPGD and SmoothAdvDDN in the paper).

Getting started

  1. git clone https://github.com/Hadisalman/smoothing-adversarial.git

  2. Install dependencies:

conda create -n smoothing-adversarial python=3.6
conda activate smoothing-adversarial
conda install numpy matplotlib pandas seaborn
pip install setGPU
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch # for Linux
  3. Download our trained models from here. Then move the downloaded models.tar.gz into the root directory of this repo. Run tar -xzvf models.tar.gz to extract the models.

  4. If you want to run ImageNet experiments, obtain a copy of ImageNet and preprocess the val directory to look like the train directory by running this script. Finally, set the environment variable IMAGENET_DIR to the directory where ImageNet is located.

  5. Let us try to certify the robustness of one of our adversarially trained CIFAR-10 models.

model="pretrained_models/cifar10/finetune_cifar_from_imagenetPGD2steps/PGD_10steps_30epochs_multinoise/2-multitrain/eps_64/cifar10/resnet110/noise_0.12/checkpoint.pth.tar"
output="certification_output"
python code/certify.py cifar10 $model 0.12 $output --skip 20 --batch 400

Check the results in certification_output. You should get results similar to these.

Example

Let us train a smoothed resnet110 CIFAR-10 classifier using SmoothAdv-ersarial training (see our paper), certify its robustness, and attack it.

Adversarial training

  • Train the model via 10 step (smooth) PGD adversarial training with ε=64/255, σ=0.12, and m_train=2
python code/train_pgd.py cifar10 cifar_resnet110 model_output_dir --batch 256 --noise 0.12 --gpu 0 --lr_step_size 50 --epochs 150 --adv-training --attack PGD --num-steps 10 --epsilon 64 --train-multi-noise --num-noise-vec 2 --warmup 10

For a faster result, start from an ImageNet pretrained model and fine-tune for only 30 epochs! We are open-sourcing 16 ImageNet pretrained models (see the imagenet32 folder here) that are trained using our code. Use the one with the desired ε and σ as shown below:

python code/train_pgd.py cifar10 cifar_resnet110 model_output_dir --batch 256 --noise 0.12 --gpu 0 --lr 0.001 --epochs 30 --adv-training --attack PGD --num-steps 10 --epsilon 64 --train-multi-noise --num-noise-vec 2 --resume --pretrained-model pretrained_models/imagenet32/PGD_2steps/eps_64/imagenet32/resnet110/noise_0.12/checkpoint.pth.tar

If you cannot even wait for fine-tuning to finish, don't worry! We have a fine-tuned model ready for you. Simply set

model_output_dir=pretrained_models/cifar10/finetune_cifar_from_imagenetPGD2steps/PGD_10steps_30epochs_multinoise/2-multitrain/eps_64/cifar10/resnet110/noise_0.12

and continue with the example!

Certification

  • Certify the trained model on CIFAR-10 test set using σ=0.12
python code/certify.py cifar10 $model_output_dir/checkpoint.pth.tar 0.12 certification_output --batch 400 --alpha 0.001 --N0 100 --N 100000

will load the base classifier saved at $model_output_dir/checkpoint.pth.tar, smooth it using noise level σ=0.12, and certify every image from the cifar10 test set with parameters N0=100, N=100000 and alpha=0.001.
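To make the roles of N, alpha, and σ concrete, here is a minimal sketch of how a certified L2 radius can be computed from Monte Carlo counts in the style of Cohen et al (2019): take a one-sided lower confidence bound on the top-class probability, then scale its Gaussian quantile by σ. The function names are hypothetical, the Clopper-Pearson bound is found by plain bisection, and nothing here is taken from certify.py itself.

```python
import math
from statistics import NormalDist


def binom_tail_ge(k, n, p):
    """P[X >= k] for X ~ Binomial(n, p), with each term computed in log space."""
    if k <= 0 or p >= 1.0:
        return 1.0
    if p <= 0.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for j in range(k, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                    + j * log_p + (n - j) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def lower_conf_bound(k, n, alpha):
    """One-sided Clopper-Pearson lower bound on p after k successes in n trials."""
    if k == 0:
        return 0.0
    lo, hi = 0.0, k / n
    for _ in range(60):
        mid = (lo + hi) / 2
        # The tail probability grows with p, so move up while it is below alpha.
        if binom_tail_ge(k, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo  # the conservative end of the final bracket


def certified_radius(count_top, n, sigma, alpha):
    """Return sigma * Phi^{-1}(p_lower), or 0.0 when the bound is below 1/2 (abstain)."""
    p_lower = lower_conf_bound(count_top, n, alpha)
    if p_lower < 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_lower)


# Example: all 100 noisy samples voted for the top class, sigma = 0.12.
radius = certified_radius(100, 100, 0.12, 0.001)
```

With N=100000 this pure-Python loop becomes slow; a library routine for the Clopper-Pearson interval would be the practical choice there.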

Visualize robustness plots

Repeating the above two steps (Training and Certification) for σ=0.12, 0.25, 0.5, and 1.0 allows you to generate the below plot. Simply run:

python code/generate_github_result.py

This generates the plot below using our certification results. Modify the paths inside generate_github_result.py to point to your certification_output in order to plot your results.
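The quantity behind such robustness plots is certified accuracy at radius r: the fraction of certified points that are classified correctly with a certified radius of at least r. The sketch below computes it from a certification log, assuming the log is tab-separated with the columns idx, label, predict, radius, correct, time. The function name certified_accuracy is hypothetical and the rows are sample values for illustration only.

```python
import csv
import io


def certified_accuracy(log_text, radius):
    """Fraction of logged points that are correct with certified radius >= radius."""
    rows = list(csv.DictReader(io.StringIO(log_text), delimiter="\t"))
    if not rows:
        raise ValueError("empty certification log")
    hits = sum(
        1 for row in rows
        if int(row["correct"]) == 1 and float(row["radius"]) >= radius
    )
    return hits / len(rows)


# Sample rows in the assumed column layout (illustration only).
example_log = "\n".join([
    "idx\tlabel\tpredict\tradius\tcorrect\ttime",
    "0\t0\t-1\t0.0\t0\t0:01:05.891589",
    "100\t2\t2\t1.06\t1\t0:00:49.151814",
    "200\t4\t-1\t0.0\t0\t0:00:49.389205",
    "300\t6\t6\t0.145\t1\t0:00:49.527328",
    "400\t8\t8\t1.55\t1\t0:00:49.616975",
    "500\t10\t10\t1.8\t1\t0:00:49.704927",
])

acc_at_half = certified_accuracy(example_log, 0.5)
```

To use it on a real log, read the file contents into a string first and pass that string as log_text.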

Prediction

  • Predict the classes of CIFAR-10 test set using σ=0.12
python code/predict.py cifar10 $model_output_dir/checkpoint.pth.tar 0.12 prediction_output_dir --batch 400 --N 1000 --alpha 0.001

will load the base classifier saved at $model_output_dir/checkpoint.pth.tar, smooth it using noise level σ=0.12, and classify every image from the cifar10 test set with parameters N=1000 and alpha=0.001.
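As a rough illustration of what N and alpha control at prediction time, the sketch below follows the prediction rule described by Cohen et al (2019): count the votes of the base classifier over N noisy copies, compare the two most frequent classes with a two-sided binomial test at p = 0.5, and abstain (return -1) when the test is not significant at level alpha. The function names are hypothetical, and the vote counts are passed in directly so the example needs no model.

```python
import math


def two_sided_pvalue_fair_coin(k, n):
    """Two-sided binomial test p-value for k successes in n trials under p = 0.5."""
    m = max(k, n - k)
    if 2 * m == n:
        return 1.0
    # By symmetry at p = 0.5, the two-sided p-value is twice the upper tail.
    tail = sum(math.comb(n, j) for j in range(m, n + 1))
    return min(1.0, 2 * tail / 2 ** n)


def smoothed_predict(counts, alpha):
    """Return the majority class, or -1 if the top-two vote is not significant."""
    order = sorted(range(len(counts)), key=lambda c: counts[c], reverse=True)
    top, runner_up = order[0], order[1]
    n_a, n_b = counts[top], counts[runner_up]
    if two_sided_pvalue_fair_coin(n_a, n_a + n_b) > alpha:
        return -1  # abstain
    return top


# Votes over N = 1000 noisy copies of one input (made-up counts).
clear_vote = smoothed_predict([990, 10, 0], alpha=0.001)
close_vote = smoothed_predict([505, 495, 0], alpha=0.001)
```

A smaller alpha makes the classifier abstain more often on close votes, while a larger N makes a given vote margin more likely to be significant.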

Empirical attack

  • Attack the trained model using SmoothAdvPGD with ε=64/255, 127/255, or 255/255, T=20 steps, m_test=32, and σ=0.12, then predict the classes of the resulting adversarial examples. The flag --visualize-examples saves the first 1000 adversarial examples in prediction_output_dir.
python code/predict.py cifar10 $model_output_dir/checkpoint.pth.tar 0.12 prediction_output_dir --batch 400 --N 1000 --attack PGD --epsilon 64 --num-steps 20 --num-noise-vec 32 --visualize-examples 

python code/predict.py cifar10 $model_output_dir/checkpoint.pth.tar 0.12 prediction_output_dir --batch 400 --N 1000 --attack PGD --epsilon 127 --num-steps 20 --num-noise-vec 32 --visualize-examples 

python code/predict.py cifar10 $model_output_dir/checkpoint.pth.tar 0.12 prediction_output_dir --batch 400 --N 1000 --attack PGD --epsilon 255 --num-steps 20 --num-noise-vec 32 --visualize-examples 

Replicate our tables and figures

We provide code to generate all the tables and results of our paper. Simply run

python code/analyze.py

This code reads from data/, i.e. the logs that were generated when we certified our trained models, and automatically generates the tables and figures that we present in the paper.

Below are example plots from our paper which you will be able to replicate by running the above code.

Download our pretrained models

You can download our trained models here. These contain all our provably robust models (that achieve SOTA for provably L2-robust image classification on CIFAR-10 and ImageNet) that we present in our paper.

The downloaded folder contains three subfolders: imagenet, imagenet32, and cifar10. Each of these contains subfolders with different hyperparameters for training ImageNet, downscaled ImageNet (32x32), and CIFAR-10 classifiers, respectively.

For example:

  • pretrained_models/cifar10/ contains PGD_2steps/, PGD_4steps/, DDN_2steps/ ..... corresponding to different attacks.
    • PGD_2steps/ contains:
      • eps_64/, eps_127/, eps_255/, eps_512/, corresponding to models trained with various ε (maximum allowed L2-perturbation).
      • jobs.yaml file that has the exact commands we used to train the models in PGD_2steps/ e.g.
#"jobs.yaml"
jobs:
- name: eps_64/cifar10/resnet110/noise_0.12
  sku: G1
  sku_count: 1
  command:
  - python code/train_pgd.py cifar10 cifar_resnet110 ./ --batch 256 --noise 0.12 --gpu
    0 --lr_step_size 50 --epochs 150 --adv-training --attack PGD --epsilon 64 --num-steps
    2 --resume --warmup 10
  id: application_1556605998994_2048
  results_dir: /mnt/_output/pt-results/2019-05-02/application_1556605998994_2048
  submit_args: {}
  tags: []
  type: bash
  .
  .
  .

You should focus on the - name: and command: lines of every jobs.yaml, as they reflect our experiments and their corresponding commands. (Ignore the rest of the details, which are specific to our job scheduling system.)
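If you want to collect the name and command of every job programmatically, a minimal sketch could look like the following. It assumes the layout shown in the excerpt above (a "- name:" line per job and a "command:" key whose single list item may be folded over several lines). The function extract_commands is hypothetical and uses only simple line matching; a real YAML parser would be more robust.

```python
import re


def extract_commands(yaml_text):
    """Map each job name to its command, joining commands folded over several lines."""
    jobs = {}
    name, parts, in_command = None, [], False

    def flush():
        # Store the job collected so far, if it has a command.
        if name is not None and parts:
            jobs[name] = " ".join(parts)

    for raw in yaml_text.splitlines():
        line = raw.rstrip()
        name_match = re.match(r"^- name:\s*(.+)$", line)
        if name_match:
            flush()
            name, parts, in_command = name_match.group(1), [], False
        elif re.match(r"^\s+command:\s*$", line):
            in_command = True
        elif in_command and re.match(r"^\s+[A-Za-z_]+:(\s|$)", line):
            in_command = False  # the next key (id, results_dir, ...) ends the command
        elif in_command and line.strip():
            parts.append(re.sub(r"^- ", "", line.strip()))
    flush()
    return jobs


sample = """jobs:
- name: eps_64/cifar10/resnet110/noise_0.12
  sku: G1
  sku_count: 1
  command:
  - python code/train_pgd.py cifar10 cifar_resnet110 ./ --batch 256 --noise 0.12 --gpu
    0 --lr_step_size 50 --epochs 150 --adv-training --attack PGD --epsilon 64 --num-steps
    2 --resume --warmup 10
  id: application_1556605998994_2048
  type: bash
"""

commands = extract_commands(sample)
```

Each value in commands is a single-line command that can be pasted into a shell after adjusting the output directory.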

Very Important: Make sure to use the right data normalization layer if you want to use our trained models.

Below is the mapping between our trained models and the corresponding normalization layer that we used during training.

imagenet32/ --> NormalizeLayer
cifar10/finetune_cifar_from_imagenetPGD2steps/ --> NormalizeLayer
cifar10/self_training/ --> NormalizeLayer

imagenet/--> InputCenterLayer
cifar10/"everythingelse"/ --> InputCenterLayer

For NormalizeLayer, uncomment get_normalize_layer and comment out get_input_center_layer

For InputCenterLayer, uncomment get_input_center_layer and comment out get_normalize_layer

Note that if you want to train your own models, it doesn't matter which layer you use for training (both give very similar results) as long as you use the same layer for prediction and certification. Check this issue for more details.
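To see why the layer must match, here is a small numeric sketch of the two preprocessing styles on a single RGB pixel. It assumes that InputCenterLayer only subtracts the per-channel mean while NormalizeLayer also divides by the per-channel standard deviation. The statistics below are commonly used CIFAR-10 values and are an assumption here; check datasets.py for the values the repository actually uses. The helper names are hypothetical.

```python
# Assumed per-channel CIFAR-10 statistics (not read from the repository).
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2023, 0.1994, 0.2010)


def center_only(pixel):
    """InputCenterLayer-style preprocessing: subtract the channel mean."""
    return tuple(x - m for x, m in zip(pixel, CIFAR10_MEAN))


def normalize(pixel):
    """NormalizeLayer-style preprocessing: subtract the mean, then divide by the std."""
    return tuple((x - m) / s for x, m, s in zip(pixel, CIFAR10_MEAN, CIFAR10_STD))


pixel = (0.9, 0.5, 0.1)  # one RGB pixel with values in [0, 1]
centered = center_only(pixel)
normalized = normalize(pixel)

# The two outputs differ by a factor of 1/std per channel.
ratios = [n / c for n, c in zip(normalized, centered)]
```

Under these assumed statistics the first layer of the network sees inputs scaled by roughly a factor of five if the wrong layer is used, which is a plausible reason for a checkpoint to perform poorly when loaded with the other layer.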

Acknowledgement

We would like to thank Zico Kolter, Jeremy Cohen, Elan Rosenfeld, Aleksander Madry, Andrew Ilyas, Dimitris Tsipras, Shibani Santurkar, Jacob Steinhardt for comments and discussions.

Contact

If you have any questions, or if anything above is not working, don't hesitate to contact us! We are more than happy to help!

  • Hadi Salman (hadi dot salman at microsoft dot com)
  • Greg Yang (gregyang at microsoft dot com)
  • Jerry Li (jerrl at microsoft dot com)
  • Ilya Razenshteyn (ilyaraz at microsoft dot com)

smoothing-adversarial's Issues

Shuffling in ImageNet dataloader

Hi,

When attempting to replicate, I've noticed that the dataloader loads ImageNet validation samples in order, so the true label is correlated to the index:

idx	label	predict	radius	correct	time
0	0	-1	0.0	0	0:01:05.891589
100	2	2	1.06	1	0:00:49.151814
200	4	-1	0.0	0	0:00:49.389205
300	6	6	0.145	1	0:00:49.527328
400	8	8	1.55	1	0:00:49.616975
500	10	10	1.8	1	0:00:49.704927

But in the provided data, this isn't the case:

smoothing-adversarial/data/certify/imagenet/replication/resnet50/noise_0.50/test/sigma_0.50:
idx	label	predict	radius	correct	time
0	65	-1	0.0000	0	0:02:21.932991
100	473	700	0.0642	0	0:02:17.267692
200	704	704	0.1656	1	0:02:17.283181
300	329	116	0.2634	0	0:02:17.284284
400	359	359	1.1224	1	0:02:17.282202
500	270	270	0.4027	1	0:02:17.290370

I'm running the command:

export model="pretrained_models/imagenet/replication/resnet50/noise_0.50/checkpoint.pth.tar"
export output="certification_output_standard_0.50" 
python code/certify.py imagenet $model 0.50 $output --skip 100 --batch 400

For what it's worth, in data_cohen, the samples are in fact in order...

/data_cohen/certify/imagenet/resnet50/noise_0.50/test/sigma_0.50:
idx	label	predict	radius	correct	time
0	0	394	0.0125	0	0:02:32.717239
100	2	2	1.86	1	0:02:30.318316
200	4	-1	0.0	0	0:02:31.564715
300	6	6	0.709	1	0:02:30.939915
400	8	8	1.53	1	0:02:31.558977
500	10	10	1.96	1	0:02:31.548283

(I'm assuming that the values differ because the replication model was trained independently, so that isn't the issue: I'm just wondering why the selection of samples is different in the reported data from when I try to replicate it.)

InputCenterLayer?

The difference between InputCenterLayer and NormalizeLayer is simply that InputCenterLayer is not divided by the sds. What are the benefits of doing this?

Issue with replication yml/training file

Hi,
I downloaded the models from the provided link and am specifically looking at the imagenet models. Are the resnet50 models in the replication directory similar to the ones from https://arxiv.org/pdf/1902.02918.pdf?
If so, is there a reason why the yml file says the 0 noise model was trained with 0.12 noise:

jobs:
- name: imagenet/resnet50/noise_0.00
  sku: G4
  sku_count: 1
  command:
  - python code/train.py imagenet resnet50 ./ --batch 256 --workers 16 --noise 0.12
  id: application_1554838020656_3420

If the model was trained with 0.12 noise, could one with 0.00 noise be provided?

unexpected results from pretrained models (CIFAR-10)

Hi, I ran into some trouble with the pretrained models on CIFAR-10. I cloned your repo, downloaded the pretrained models, and evaluated some of them on the CIFAR-10 test set. I got both expected and unexpected results, even though I load the same dataset with the same transformation and build the smoothed and base classifiers the same way.

When loading pretrained models as below, I got unexpected results, e.g. clean accuracy of roughly 30%:

path = "../smoothing-adversarial/pretrained_models/cifar10/PGD_10steps/eps_64/cifar10/resnet110/noise_0.12/checkpoint.pth.tar"

or

path = "../smoothing-adversarial/pretrained_models/cifar10/PGD_10steps_multiNoiseSamples/2-multitrain/eps_64/cifar10/resnet110/noise_0.12/checkpoint.pth.tar"

BUT, when loading pretrained models as below, I got expected results e.g. clean accuracy is from 85% to 90%:

path = "../smoothing-adversarial/pretrained_models/cifar10/finetune_cifar_from_imagenetPGD2steps/PGD_10steps_30epochs_multinoise/2-multitrain/eps_64/cifar10/resnet110/noise_0.12/checkpoint.pth.tar"

or

path = "../smoothing-adversarial/pretrained_models/cifar10/finetune_cifar_from_imagenetPGD2steps/PGD_10steps_30epochs_multinoise/8-multitrain/eps_64/cifar10/resnet110/noise_0.12/checkpoint.pth.tar"

or

path = "../smoothing-adversarial/pretrained_models/cifar10/self_training/PGD_10steps/weight_0.1/eps_64/cifar10/resnet110/noise_0.12/checkpoint.pth.tar"

I do not know what's wrong. My code is as below:

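import torch

# Helpers from code/ in this repo (assumes the script runs with code/ on the import path).
from architectures import get_architecture
from core import Smooth
from datasets import get_num_classes

sigma = 0.12
N = 1000
batch = 200
dataset = "cifar10"
alpha = 0.001

# `path` is one of the checkpoint paths listed above.
checkpoint = torch.load(path)
base_classifier = get_architecture(checkpoint["arch"], dataset)
base_classifier.load_state_dict(checkpoint['state_dict'])
base_classifier.eval()
smoothed_classifier = Smooth(base_classifier, get_num_classes(dataset), sigma)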

Thanks

train_pgd.py noise parameter

The command for running train_pgd.py is given as:
python code/train_pgd.py cifar10 cifar_resnet110 model_output_dir --batch 256 --noise 0.12 --gpu 0 --lr_step_size 50 --epochs 150 --adv-training --attack PGD --num-steps 10 --epsilon 64 --train-multi-noise --num-noise-vec 2 --warmup 10

One of the parameters in the above command is noise. However, I could not find noise in the argument list of train_pgd.py. There is a noise_sd. Is it the same as noise in the above command?

Download the archive of pretrained model failed

The archive is too big, and it is hard for me to download: Google blocks the download of the large file when it is almost complete.
So I would like to get archives of the individual models. That would help me a lot.
Of course, only if you are not too busy. I apologize for the request.

Without noise, PGD fails to start from different initial point

As you can see, the attribute self.random_start has not been used by the PGD_L2 class. Thus, if there is no noise, say that noise_sd=0, then PGD will always start from the original x, rather than intended different initial points. This seems to weaken the effectiveness of PGD.

Model predictions incorrect -> possible dataloader issue?

Hi,

I ran code/predict.py with the PGD_1step/eps_512/noise_0.25 noise model and the predictions seem to be always wrong (the "correct" column in the output is always 0). Upon further inspection, it seems that the predictions are agreeing, just that the label index is wrong (for example instead of prediction index 0, it predicts 828).
To confirm this, I ran the baseline noise_0.25 model from https://github.com/locuslab/smoothing, but with the code in this repo. The predictions are correct, i.e. the "correct" column is almost always 1.

I think probably the way your models were trained did not use the standard imagenet directories, and so the sort order was different, causing the labels to be different as well.
If possible, could you investigate this and let me know which standard imagenet indices correspond to the indices which the model outputs?

Thanks,
Rohan
