"In Neural Networks, Unbreakable Locks Can Hide Invisible Doors"
Image generators such as DALL·E 2 and large language models (LLMs) such as ChatGPT are making headlines, yet experts still do not understand why they work so well, which makes it difficult to know how they could be manipulated. For example, machine learning (ML) models are susceptible to a software vulnerability known as a backdoor: a hidden piece of code that lets anyone with a secret key collect information or gain capabilities they should not have. A company hired to develop an ML system for a client could install a backdoor and sell the secret activation key to the highest bidder. To better understand such vulnerabilities, researchers have explored various techniques for hiding backdoors within ML models. That work, however, has consisted mostly of trial and error, without formal mathematical analysis of how well the backdoors are hidden. Researchers are now analyzing the security of ML models with greater rigor. In a paper presented at the Foundations of Computer Science conference last year, a team of computer scientists showed how to plant backdoors whose invisibility is as certain as the security of state-of-the-art encryption methods. The new work's mathematical rigor comes with trade-offs, such as a focus on relatively simple models. Nonetheless, the results establish a new theoretical link between cryptographic security and ML vulnerabilities, suggesting new directions for research at the intersection of the two fields. This article continues to discuss the researchers' work showing how perfect security can undermine ML models.
Quanta Magazine reports "In Neural Networks, Unbreakable Locks Can Hide Invisible Doors"