[en] Recent advancements in the field of deep learning have substantially increased the adoption rate of automated systems in everyday life. However, since their inception, these systems have been criticized for their lack of interpretability: it is often difficult or impossible to know precisely why a deep learning model produces a specific response for a given input. One manifestation of this shortcoming is the phenomenon of adversarial examples. Adversarial examples are data points specifically crafted by an adversary in order to force deep learning models into making mistakes. Often, these artificial examples are indistinguishable from natural data points, making it almost impossible for humans to detect them and calling into question the generalization ability of deep neural networks.
It has become clear that deep learning models are vulnerable to gradient-based attacks that create such adversarial examples. To counter these attacks, a number of defense techniques have been proposed, only to be broken by follow-up studies. Newly introduced methods are usually shown empirically to be superior to their predecessors, but a mathematical explanation of why the proposed attacks are better remains lacking. In this presentation, we will examine a selection of commonly used gradient-based attacks (e.g., L-BFGS, the Iterative Fast Gradient Sign Method, Carlini & Wagner).
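To give a concrete flavor of the gradient-based attacks named above, the following is a minimal sketch of the single-step Fast Gradient Sign Method. It assumes a differentiable PyTorch classifier; the names model, x, y and the budget epsilon are illustrative placeholders, not choices made in the presentation.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """One gradient-sign step: perturb x in the direction that increases the loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step by epsilon along the sign of the loss gradient.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    # Clip back to the valid input range (assumed here to be [0, 1]).
    return x_adv.clamp(0.0, 1.0).detach()
```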
For the selected attacks, we will investigate their computational efficiency (why do some attacks generate adversarial examples faster than others?), their robustness against state-of-the-art defenses (how strong are the created adversarial examples?), and the distinction between global and local attacks (can we reduce the amount of perturbation while keeping a similar level of effectiveness?).
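As a rough illustration of the trade-off between computational cost and perturbation budget touched on above, the sketch below shows an iterative variant of the Fast Gradient Sign Method that takes several small steps and projects back into an L-infinity ball around the original input. The hyperparameters (epsilon, alpha, num_steps) are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

def iterative_fgsm(model, x, y, epsilon=0.03, alpha=0.005, num_steps=10):
    """Several small gradient-sign steps, each projected back into the
    L-infinity ball of radius epsilon around the original input x."""
    x_adv = x.clone().detach()
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Small gradient-sign step.
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project onto the epsilon-ball around x, then clip to the valid range.
        x_adv = torch.max(torch.min(x_adv, x + epsilon), x - epsilon)
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```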
Disciplines :
Physical, chemical, mathematical & earth Sciences: Multidisciplinary, general & others
Author, co-author :
Van Messem, Arnout ; Université de Liège - ULiège > Département de mathématique > Statistique appliquée aux sciences
Language :
English
Title :
Adversarial examples - some insights
Publication date :
26 March 2021
Event name :
UCL Applied Statistics Workshop
Event organizer :
Université catholique de Louvain - Institute of Statistics, Biostatistics and Actuarial Sciences