"Protecting AI Models from 'Data Poisoning'"
Training data sets for deep-learning models include billions of data samples crawled from the Internet. This arrangement rests on trust, which is increasingly threatened by a type of cyberattack known as "data poisoning," in which data gathered for deep-learning training is seeded with malicious information. A team of computer scientists from ETH Zurich, Google, Nvidia, and Robust Intelligence has demonstrated two data poisoning attacks. So far, they have found no evidence that these attacks have been carried out, but they recommend protections that could make data sets more difficult to manipulate. According to the authors, the attacks are simple and practical today, requiring minimal technical knowledge. For only $60, they could have poisoned 0.01 percent of the LAION-400M or COYO-700M data sets. One of the paper's coauthors, Florian Tramèr, an assistant professor at ETH Zurich, explains that such poisoning attacks would enable malicious actors to manipulate data sets to, for example, exacerbate racist, sexist, or other biases, or to embed a backdoor in a model to control its behavior after training. The large machine learning (ML) models being trained today, such as ChatGPT, Stable Diffusion, and Midjourney, require so much data that the current method of obtaining it consists solely of scraping a large portion of the Internet, according to Tramèr. This makes maintaining any level of quality control extremely difficult. This article continues to discuss the team's demonstration of two possible poisoning attacks on 10 popular data sets, including LAION, FaceScrub, and COYO.
IEEE Spectrum reports "Protecting AI Models from 'Data Poisoning'"