WIP: Steerability of Autonomous Cyber-Defense Agents by Meta-Attackers

ABSTRACT

AI agents are automating several traditionally manual tasks. One area where they show promise is computer incident response, a slow and sometimes tedious process managed by operators who are overwhelmed with alarms. Before these agents are deployed in security-sensitive systems, we need to understand their resilience to attacks. This paper formally analyzes an attacker who has partially compromised an autonomous cyber-defense agent. Our goal is to understand how such an attacker can steer the defenses and manipulate the system to achieve its objectives. Our results can help defenders identify the most critical components of their defenses, those that (if compromised) may give an attacker a large advantage, and allocate resources to harden them.

Luis Burbano is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of California, Santa Cruz, advised by Professor Alvaro Cardenas. He received bachelor's and master's degrees in electronics engineering from Universidad de los Andes, Colombia, in 2017 and 2019, respectively. He was selected as a cyber-physical systems Rising Star in 2023 and as a visiting scholar for the 2024 ELLIIT focus period in Sweden. His research interests are in the security of cyber-physical systems, integrating control theory and formal methods from computer science.

License: CC-3.0
Submitted by Regan Williams on