The trojan attack on deep neural network (DNN) models is an insidious variant of the data-poisoning attack. Trojan attacks exploit a backdoor implanted in a DNN model, leveraging the poor interpretability of the learned model, to misclassify any input stamped with the attacker's chosen trojan trigger. Since the trigger is a secret guarded and exploited by the attacker, detecting trojaned inputs is challenging, especially at run time when the model is in active operation.
This work builds a STRong Intentional Perturbation (STRIP) based run-time trojan attack detection system, focusing on vision systems, and identifies its strengths and weaknesses.
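The core STRIP idea is that a trojaned input keeps being classified as the attacker's target label even under strong perturbation, so its predictions show abnormally low entropy. The sketch below illustrates this, assuming a hypothetical `predict` function standing in for the deployed classifier; the superposition weighting and sample count are illustrative, not the repository's exact configuration.

```python
import numpy as np

def predict(x):
    # Hypothetical stand-in classifier: softmax over simple pixel statistics.
    # In practice this would be the deployed DNN's forward pass.
    logits = np.array([x.mean(), 1.0 - x.mean(), x.std()])
    e = np.exp(logits - logits.max())
    return e / e.sum()

def strip_entropy(x, clean_images, n=8, seed=0):
    """Superimpose the input with n held-out clean images and return the
    average Shannon entropy of the resulting predictions. Abnormally low
    entropy suggests the input carries a trojan trigger."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(clean_images), size=n, replace=False)
    entropies = []
    for i in idx:
        # Strong intentional perturbation: blend the input with a clean image.
        blended = 0.5 * x + 0.5 * clean_images[i]
        p = predict(blended)
        entropies.append(-np.sum(p * np.log2(p + 1e-12)))
    return float(np.mean(entropies))
```

At run time, an input whose averaged entropy falls below a threshold (calibrated on clean held-out data) would be flagged as trojaned.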
This project is a fork of garrisongys/strip.
Defence Against Trojan Attacks on Deep Neural Networks