[ Have a look at the presentation slides: slides-OFFZONE.pdf / slides-ODS.pdf ]
[ Related demonstration (Jupyter notebook): demo.ipynb ]
Overview | Attacks | Tools | More on the topic
An overview of black-box attacks on AI and tools that might be useful during security testing of machine learning models.
demo.ipynb:
A demonstration of using multifunctional tools in security testing of the machine learning models digits_blackbox and digits_keras, which are trained on the MNIST dataset and provided in Counterfit as example targets.
Slides:
├── Machine Learning in products
├── Threats to Machine Learning models
├── Example model overview
├── Evasion attacks
├── Model inversion attacks
├── Model extraction attacks
├── Defences
├── Adversarial Robustness Toolbox
└── Counterfit
- Model inversion attack: MIFace → code / docs / DOI:10.1145/2810103.2813677
- Model extraction attack: Copycat CNN → code / docs / arXiv:1806.05476
- Evasion attack: Fast Gradient Method (FGM) → code / docs / arXiv:1412.6572
- Evasion attack: HopSkipJump → code / docs / arXiv:1904.02144
├── [ Trusted AI, IBM ] Adversarial Robustness Toolbox (ART): Trusted-AI/adversarial-robustness-toolbox
└── [ Microsoft Azure ] Counterfit: Azure/counterfit
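The extraction attacks these tools automate (Copycat CNN above) boil down to querying the victim and training a substitute on the stolen labels. A hedged, self-contained NumPy sketch, with a made-up linear "black box" standing in for the victim model and a least-squares fit standing in for real substitute training:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical black-box victim: the attacker can only query predicted labels.
W_victim = rng.normal(size=(3, 5))
def victim_predict(X):
    return np.argmax(X @ W_victim.T, axis=1)

# Step 1: query the victim on attacker-chosen inputs.
X_query = rng.normal(size=(2000, 5))
y_stolen = victim_predict(X_query)

# Step 2: train a "copycat" substitute on the stolen labels
# (least-squares fit to one-hot targets, a stand-in for real training).
Y_onehot = np.eye(3)[y_stolen]
W_copy, *_ = np.linalg.lstsq(X_query, Y_onehot, rcond=None)

# Step 3: measure label agreement with the victim on fresh inputs.
X_test = rng.normal(size=(500, 5))
agreement = np.mean(np.argmax(X_test @ W_copy, axis=1) == victim_predict(X_test))
```

Agreement on held-out queries is the usual success metric for extraction; the substitute can then be used offline, e.g. to craft transferable evasion examples.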
- adversarial examples, evasion attacks:
  How MIT researchers made Google's AI think a tabby cat is guacamole: → overview / arXiv:1707.07397 + arXiv:1804.08598
- model inversion attacks:
  Apple's take on model inversion: → overview / arXiv:2111.03702
- model inversion attacks:
  Google's demonstration of extracting training data that the GPT-2 model has memorized: → overview / arXiv:2012.07805
- attacks on AI, adversarial attacks, poisoning attacks, model inference attacks:
  → Posts on PortSwigger's "The Daily Swig" by Ben Dickson
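The model-inversion work referenced above (MIFace and the articles in this list) shares one mechanism: gradient ascent on the input to maximize the model's confidence in a chosen class, recovering a class-representative input. A hedged NumPy sketch on a toy linear softmax model — the model, step size, and iteration count are invented for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
W, b = rng.normal(size=(3, 6)), rng.normal(size=3)

def invert_class(target, steps=500, lr=0.05):
    # Gradient ascent on log p(target | x) over the input x, starting from zeros.
    x = np.zeros(6)
    onehot = np.zeros(3)
    onehot[target] = 1.0
    for _ in range(steps):
        p = softmax(W @ x + b)
        x += lr * W.T @ (onehot - p)   # gradient of the target log-probability
    return x

p_start = softmax(b)               # class probabilities at the starting point
x_rec = invert_class(0)            # "reconstructed" input for class 0
p_rec = softmax(W @ x_rec + b)
```

Against a real model, the same loop runs on images, usually with a prior or regularizer so the reconstruction stays plausible — that is what distinguishes practical attacks like MIFace from this bare sketch.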