Measure and Boost Backdoor Robustness

License: MIT License

Topics: ai-safety, backdoor-attacks, data-poisoning, robustness, ml-defenses, hyperparameter-tuning, ml-security, backdoor-resistance


Mithridates

Paper: "Mithridates: Boosting Natural Resistance to Backdoor Learning" (https://arxiv.org/abs/2302.04977)

This repo is designed to help engineers measure backdoor vulnerabilities of their pipelines without modifying the training code. We try to answer two questions:

  1. How robust is my machine learning pipeline to unknown backdoors?
  2. How can existing hyperparameters be modified to boost backdoor resistance?

Current functionality:

  • Outputs a backdoor-resistance metric for an existing PyTorch pipeline

This is work in progress: email me ([email protected]) and contribute!

TODO:

  • Staged hyperparameter search for robust hyperparameters
  • Additional backdoor attack baselines
  • Publish the package to PyPI

Photo of Mithridates VI Eupator, the ruler of Pontus from 120 to 63 BC. He was rumored to have included minuscule amounts of poison in his diet to build up immunity to poisoning.

Background

Data poisoning is an emerging threat to pipelines that rely on data gathered from multiple sources [1]. Backdoor attacks, a subset of poisoning, are especially powerful: they can inject hidden behavior into a model by compromising only a small fraction of the training inputs. However, because backdoor attacks are diverse and defenses are expensive to integrate and maintain, practitioners still lack a practical way to address the threat.

Main goal: This repo takes a developer-centric view of the problem and provides an easy-to-use tool that does not require modifying the training pipeline, yet helps measure and mitigate the pipeline's backdoor threats.

Installation

pip install -r requirements.txt

TODO: make the package installable via pip.

Measure resistance of an existing pipeline

To measure resistance, we test how well the model can learn the primitive sub-task: a simple task that covers a large portion of the input and is easier to learn than other backdoors, thus providing a strong baseline.

Our main metric is the fraction of the dataset an attacker must compromise to make the backdoor effective. Engineers can then apply quotas or add more trusted data to reduce the threat. We build a poisoning curve that tracks how the accuracy of the primitive sub-task (backdoor accuracy) changes as the training-dataset poisoning ratio increases.

However, it is not obvious how to measure backdoor effectiveness, since a backdoor may be effective well before it reaches 100% accuracy. We propose the resistance point of the poisoning curve as a practical metric: it marks where the growth of backdoor accuracy slows down as the compromised percentage increases, i.e. where further injection becomes more expensive for the attacker. Please see the paper for a more detailed discussion.
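
The paper defines the resistance point precisely; purely as an illustration, here is a minimal Python sketch of locating a slowdown in a toy poisoning curve (the curve values and the slope-drop heuristic are assumptions, not the paper's exact definition):

import numpy as np

# Toy poisoning curve: poisoning ratio (% of the dataset) vs. backdoor accuracy (%).
ratios = np.array([0.01, 0.05, 0.1, 0.2, 0.4, 0.8, 1.6])
backdoor_acc = np.array([11.0, 11.5, 13.0, 45.0, 80.0, 95.0, 99.0])

# Heuristic (assumed for illustration): pick the ratio after which the growth of
# backdoor accuracy per unit of poisoning drops the most, i.e. where injecting
# more poison starts paying off the least for the attacker.
slopes = np.diff(backdoor_acc) / np.diff(ratios)
resistance_idx = int(np.argmax(slopes[:-1] - slopes[1:])) + 1
print(f"approximate resistance point: {ratios[resistance_idx]:.2f}% of the dataset")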

There are three steps to measure resistance:

  1. Integrate wrapper that poisons the training dataset
  2. Iterate over poisoning ratio
  3. Build poisoning curve and compute resistance point

We now provide an example of modifying MNIST training to measure resistance. You can adapt this code to your use case (feel free to reach out or open an Issue).

MNIST Example

We demonstrate the steps on the PyTorch MNIST training example (mnist_example.py):

1. Integrate wrapper:

from mithridates import DatasetWrapper

# Wrap the training dataset: a POISON_RATIO fraction of its inputs gets poisoned
train_dataset = YOUR_DATASET_LOAD()
train_dataset = DatasetWrapper(train_dataset, percentage_or_count=POISON_RATIO)

# Wrap the test dataset with the trigger applied to every input
# to measure backdoor accuracy
test_dataset = YOUR_TEST_DATASET_LOAD()
test_attack_dataset = DatasetWrapper(test_dataset, percentage_or_count='ALL')

...
# TRAIN
...
test(test_dataset)         # main-task accuracy
test(test_attack_dataset)  # backdoor accuracy

See mnist_example.py#L104 for a sample integration of the wrapper into the PyTorch MNIST training example. We added a --poison_ratio argument so the poisoning ratio can be set from the command line.
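
For reference, the change amounts to one extra command-line flag plus the wrapper call. A minimal sketch, assuming standard torchvision MNIST loading (the actual code in mnist_example.py may differ in details):

import argparse

from torchvision import datasets, transforms

from mithridates import DatasetWrapper

parser = argparse.ArgumentParser(description="PyTorch MNIST example with poisoning")
# Amount of training data to poison; passed to DatasetWrapper's percentage_or_count.
parser.add_argument("--poison_ratio", type=float, default=0.0)
args = parser.parse_args()

transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.1307,), (0.3081,))])
train_dataset = datasets.MNIST("../data", train=True, download=True,
                               transform=transform)

# Poison the requested amount of training data.
train_dataset = DatasetWrapper(train_dataset, percentage_or_count=args.poison_ratio)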

2. Iterate over poisoning ratio

We recommend using Ray Tune (fast, uses multiple processes and GPUs) to iterate over training runs. For pipelines that are run through shell scripts, we also provide a bash script below.

Ray

A fast way to run multiple training iterations: we only vary the poison ratio and measure backdoor accuracy. Run python ray_training.py --build_curve --run_name curve_experiment:

Number of trials: 40/40 (40 TERMINATED)
+---------------------+----------+--------------------+----------------+
| Trial name          | accuracy |  backdoor_accuracy |   poison_ratio |
|---------------------+----------+--------------------+----------------|
| tune_run_41558_00000|    98.52 |              11.31 |     0.0520072  |
| tune_run_41558_00001|    98.4  |              19.92 |     0.187919   |
| tune_run_41558_00002|    98.57 |              11.48 |     0.00602384 |
| tune_run_41558_00003|    98.27 |              17.33 |     0.201368   |
| tune_run_41558_00004|    98.7  |              99.15 |     1.11094    |
| tune_run_41558_00005|    98.08 |              11.33 |     0.0642598  |
| tune_run_41558_00006|    98.33 |              11.27 |     0.0171205  |
| tune_run_41558_00007|    98.49 |              11.55 |     0.105115   |
| tune_run_41558_00008|    98.35 |              11.37 |     0.0189157  |
| tune_run_41558_00009|    98.48 |              11.11 |     0.0580809  |
...
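
For reference, the core of such a sweep might look roughly like the sketch below, using Ray Tune's legacy tune.run API (this is not the actual ray_training.py; the train_and_eval helper and the sampling range are placeholders):

from ray import tune


def trainable(config):
    # train_and_eval is a hypothetical helper: it trains one model with the given
    # poisoning ratio and returns (main-task accuracy, backdoor accuracy).
    accuracy, backdoor_accuracy = train_and_eval(poison_ratio=config["poison_ratio"])
    tune.report(accuracy=accuracy, backdoor_accuracy=backdoor_accuracy)


analysis = tune.run(
    trainable,
    config={"poison_ratio": tune.loguniform(0.005, 2.0)},  # percent of the dataset
    num_samples=40,                                         # 40 trials, as in the table above
    resources_per_trial={"cpu": 2, "gpu": 0.5},
)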

Bash

If your training is launched from shell scripts, add the --poison_ratio argument; here is an example script.sh (run it with ./script.sh):

export PYTHONPATH="."

for (( k = 1; k < 20; ++k ));
  do
    iteration=$((20*$k))
    echo "Running poisoning of $iteration images from MNIST dataset."
    python mnist_example.py --epochs 1 --poison_ratio $iteration

  done

The results will be saved to /tmp/results.txt (columns: poisoning percentage, main-task accuracy, backdoor accuracy):

0.033 98.11000 11.15000
0.067 98.03000 11.35000
0.100 98.07000 13.30000
0.133 98.48000 66.28000
0.167 98.07000 49.02000
0.200 98.43000 17.78000
0.233 98.31000 67.98000
0.267 98.30000 95.61000
0.300 98.36000 82.53000
0.333 98.54000 95.04000
0.367 98.53000 86.56000
0.400 98.28000 92.34000

3. Build a curve:

If you used Ray, you can open a Jupyter notebook and call get_resistance_point(analysis) from utils.py; see build_curve.ipynb.
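
If you used the bash script instead, here is a minimal sketch of plotting the poisoning curve from /tmp/results.txt (matplotlib and numpy assumed; columns as described above):

import matplotlib.pyplot as plt
import numpy as np

# Columns: poisoning percentage, main-task accuracy, backdoor accuracy.
results = np.loadtxt("/tmp/results.txt")
results = results[results[:, 0].argsort()]  # sort by poisoning percentage

plt.plot(results[:, 0], results[:, 2], marker="o", label="backdoor accuracy")
plt.plot(results[:, 0], results[:, 1], marker="s", label="main-task accuracy")
plt.xlabel("poisoned fraction of the training data (%)")
plt.ylabel("accuracy (%)")
plt.legend()
plt.savefig("poisoning_curve.png")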

In this example, the model resists poisoning of up to roughly 0.2% of the dataset. However, modifying hyperparameters might boost resistance further.

Boosting resistance with hyperparameter search

We can modify the existing ray_training.py: fix the poisoning ratio, add a search over different hyperparameters, and modify the objective.

TODO: will be added later.
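
Until then, here is a minimal sketch of what such a search might look like (assumptions: a hypothetical train_and_eval helper, a fixed poisoning ratio, and a simple combined objective; this is not the final implementation):

from ray import tune

FIXED_POISON_RATIO = 0.5  # percent of the dataset; kept fixed during the search


def trainable(config):
    # train_and_eval is a hypothetical helper: it trains with the given
    # hyperparameters and poisoning ratio and returns
    # (main-task accuracy, backdoor accuracy).
    accuracy, backdoor_accuracy = train_and_eval(
        poison_ratio=FIXED_POISON_RATIO,
        lr=config["lr"],
        batch_size=config["batch_size"],
    )
    # Reward high main-task accuracy and penalize backdoor accuracy.
    tune.report(objective=accuracy - backdoor_accuracy,
                accuracy=accuracy, backdoor_accuracy=backdoor_accuracy)


analysis = tune.run(
    trainable,
    config={
        "lr": tune.loguniform(1e-4, 1e-1),
        "batch_size": tune.choice([32, 64, 128, 256]),
    },
    metric="objective",
    mode="max",
    num_samples=40,
)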

Citation

@article{bagdasaryan_hyperparameters_2023,
    author = {Bagdasaryan, Eugene and Shmatikov, Vitaly},
    journal = {arXiv preprint arXiv:2302.04977},
    title = {Mithridates: Boosting Natural Resistance to Backdoor Learning},
    url = {https://arxiv.org/abs/2302.04977},
    year = {2023}
}
