Adversarial Label Tampering in Machine Learning
ABSTRACT
Supervised machine learning depends on its "supervision": the labeled ground truth used to build the model. We demonstrate that it is possible to dramatically undermine the utility of such models by tampering with the supposedly accurate labels in the training data. That is, if some of the ground "truth" is actually lying, the resulting model may incorrectly appear to be uselessly inaccurate. Or, worse, it may appear accurate when trained, yet be crafted to fail miserably in practice.
We have invented, implemented, and characterized (qualitatively and quantitatively) many such label tampering attacks. Our central result is to show that standard cross-validation assessment is not merely unreliable, but actually deceptive, as an estimate of performance in the context of adversarial tampering. We will illustrate the design of two effective label tampering attacks, briefly discuss their extensions, and close with a short note on possible defenses.
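As a minimal illustration of the general setting (not the specific attacks discussed in the talk), the following hypothetical Python sketch flips a fraction of training labels at random and compares cross-validation, which scores the model against the tampered labels, with accuracy on clean held-out data. The divergence between the two numbers is the point: under tampering, cross-validation no longer estimates real-world performance.

```python
# Hypothetical sketch, not the authors' code: a naive random label-flip attack.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)

# Clean data, split into a training pool (which the adversary may tamper with)
# and a clean test set (which stands in for deployment conditions).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

def flip_labels(labels, fraction, rng):
    """Return a copy of `labels` with `fraction` of them flipped (binary labels assumed)."""
    tampered = labels.copy()
    idx = rng.choice(len(labels), size=int(fraction * len(labels)), replace=False)
    tampered[idx] = 1 - tampered[idx]
    return tampered

y_tampered = flip_labels(y_train, fraction=0.3, rng=rng)

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# Cross-validation sees only the tampered labels, so it scores the model
# against the same corrupted "truth" it was trained on.
cv_score = cross_val_score(clf, X_train, y_tampered, cv=5).mean()

# Accuracy on clean, untampered data is what actually matters in practice.
clf.fit(X_train, y_tampered)
test_score = clf.score(X_test, y_test)

print(f"cross-validated accuracy on tampered labels: {cv_score:.3f}")
print(f"accuracy on clean held-out data:             {test_score:.3f}")
```

With random flipping the cross-validated estimate is typically depressed well below clean-data accuracy (the "seems uselessly inaccurate" failure mode); more carefully crafted attacks can instead leave cross-validation looking healthy while deployment performance collapses.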
-
Philip Kegelmeyer (E.E. Ph.D., Stanford) is a Senior Scientist at Sandia National Laboratories in Livermore, CA. His current interests are machine learning and graph algorithms, especially as applied to ugly, obdurate, real-world data which is, perhaps deliberately, actively resistant to analysis.
At present he leads a research effort in "Counter Adversarial Data Analytics". The core idea is to take a vulnerability assessment approach to quantitatively assessing, and perhaps countering, the consequences of an adversary knowing and adapting to *exactly* whatever specific data analysis method might be in use.
Dr. Kegelmeyer has twenty years of experience inventing, tinkering with, quantitatively improving, and now subverting supervised machine learning algorithms (particularly ensemble methods), including investigations into how to compare such algorithms accurately and with statistical significance. His work has resulted in over eighty refereed publications, two patents, and commercial software licenses.