Adversarial Examples that Fool both Computer Vision and Time-Limited Humans


ABSTRACT

Machine learning models are vulnerable to adversarial examples: small changes to images can cause computer vision models to make mistakes such as identifying a school bus as an ostrich. However, it is still an open question whether humans are prone to similar mistakes. Here, we address this question by leveraging recent techniques that transfer adversarial examples from computer vision models with known parameters and architecture to other models with unknown parameters and architecture, and by matching the initial processing of the human visual system. We find that adversarial examples that strongly transfer across computer vision models influence the classifications made by time-limited human observers.
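To make the idea of a "small change" that flips a classification concrete, here is a minimal sketch. It is not the method used in this work, which transfers adversarial examples across computer vision models. It is a toy fast-gradient-sign-style step against a hypothetical two-class linear classifier. The weights `w`, the bias `b`, the input `x`, and the budget `eps` are all made-up values chosen for illustration.

```python
# Toy illustration only: a hypothetical linear classifier, not a vision model.

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def score(w, b, x):
    """Linear score w.x + b; its gradient with respect to x is simply w."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Class 1 if the score is positive, otherwise class 0."""
    return 1 if score(w, b, x) > 0 else 0

def sign_gradient_perturb(w, b, x, eps):
    """Move each input coordinate by at most eps in the direction that
    pushes the score away from the currently predicted class."""
    direction = -1 if predict(w, b, x) == 1 else 1
    return [xi + direction * eps * sign(wi) for xi, wi in zip(x, w)]

# Assumed example values (not from the paper).
w = [0.5, -0.25, 0.75, -0.5]
b = 0.0
x = [0.5, 0.25, 0.5, 0.5]
eps = 0.25

x_adv = sign_gradient_perturb(w, b, x, eps)
print(predict(w, b, x), predict(w, b, x_adv))
```

With these assumed values, no coordinate moves by more than `eps`, yet the predicted class changes. Real attacks on image models follow the same principle but use gradients of a deep network's loss rather than fixed linear weights.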


Gamaleldin Elsayed is an AI Resident at Google Brain. In 2017, he completed his PhD at the Columbia University Center for Theoretical Neuroscience in the lab of John Cunningham. Dr. Elsayed's current research focuses on properties of artificial neural networks and their relationships to neural systems. He has also made significant contributions to the field of neuroscience through the application of machine learning methods to identify and validate structures from complex neural data.

License: CC-2.5
Submitted by Katie Dey on