TIE Grammar: Stratified Generative Pipeline for Training Generalizable Deep Reinforcement Learning Cyber Agents

ABSTRACT

Developing robust Deep Reinforcement Learning (DRL) agents for cyber security is fundamentally hampered by their inability to generalize, which stems from the non-Independent and Identically Distributed (non-I.I.D.) nature of network topologies, a scarcity of diverse training environments, and the challenge of modeling complex attack physics. To address this, we introduce the Parametric Graph Grammar for Procedural Content Generation (PGG-PCG), a novel stochastic generative model designed to reconstruct realistic network structures. Our approach leverages a parametric L-System formalism, G = (?, Σ, ?, ?), where the device type alphabet (Σ) and probabilistic production rules ? are empirically derived from real-world network data (the LANL dataset). This architecture is implemented within a Stratified Generative Pipeline (SGP) with three phases. Phase I ensures topological validity by synthesizing realistic, structurally valid enterprise topologies (e.g., Fat-Tree, Spine-Leaf) from device adjacency probabilities, thereby avoiding the structurally invalid samples common in pure generative models. Phase II ensures semantic integrity by injecting device capabilities via Stratified Sampling. Critically, Phase III guarantees adversarial realism by integrating the Technique Inference Engine (TIE) as the state transition oracle. This delegates the complex, non-Markovian attack physics, a major limitation for standard DRL approaches, to a CTI-validated probabilistic recommender (NDCG@20 = 0.18±0.02). The PGG-PCG framework produces an inexhaustible supply of topologically and semantically valid training environments, a necessary foundation for developing generalizable and robust DRL agents capable of autonomous network defense.
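To make the Phase I idea concrete, the following is a minimal Python sketch of stochastic grammar expansion, not the authors' implementation. It assumes a hypothetical device alphabet (core, spine, leaf, server, workstation), made-up production probabilities, and a fixed fan-out per device type; none of these values come from the LANL dataset. It also builds a plain tree hierarchy rather than a full Spine-Leaf mesh. Starting from an axiom, each non-terminal device is rewritten into child devices drawn from its production rule, so every generated sample is a connected topology by construction.

```python
import random
from collections import Counter

# Hypothetical production rules: each non-terminal device type maps to
# (child type, probability) pairs. Values are illustrative assumptions,
# not probabilities derived from the LANL dataset.
RULES = {
    "core": [("spine", 1.0)],
    "spine": [("leaf", 1.0)],
    "leaf": [("server", 0.6), ("workstation", 0.4)],
}

# Assumed fixed number of children produced by each non-terminal device.
FANOUT = {"core": 2, "spine": 2, "leaf": 3}


def expand(axiom="core", seed=0):
    """Rewrite from the axiom until only terminal device types remain.

    Returns (nodes, edges), where nodes is a list of (id, device type)
    and edges is a list of (parent id, child id).
    """
    rng = random.Random(seed)
    nodes = [(0, axiom)]
    edges = []
    frontier = [(0, axiom)]
    while frontier:
        node_id, dev = frontier.pop()
        if dev not in RULES:  # terminal symbol: nothing left to rewrite
            continue
        types, weights = zip(*RULES[dev])
        for _ in range(FANOUT[dev]):
            # Draw one child type according to the production probabilities.
            child_type = rng.choices(types, weights=weights, k=1)[0]
            child_id = len(nodes)
            nodes.append((child_id, child_type))
            edges.append((node_id, child_id))
            frontier.append((child_id, child_type))
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = expand(seed=1)
    print(Counter(dev for _, dev in nodes))
    print(len(nodes), "devices,", len(edges), "links")
```

Because every child is attached to the device that produced it, the output is always a tree with exactly one link fewer than devices, which is one simple way a grammar can rule out structurally invalid samples. Changing the seed yields a different mix of endpoint devices under the same structural constraints.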

Bahirah Adewunmi is an AI Lead for Full Spectrum Offensive Cyber at Booz Allen Hamilton and a Ph.D. candidate in Information Systems at the University of Maryland, Baltimore County. Her doctoral research applies multi-agent and deep reinforcement learning (MARL/DRL) to cybersecurity to discover attack paths and vulnerabilities in complex systems, including firmware and networked devices. With over seven years of experience at Booz Allen, she has led high-impact projects ranging from deploying AI-enabled anti-malware for DISA to developing adversarial training methods against ImageNet algorithms. Bahirah is a 2022 Black Engineer of the Year Award (BEYA) Modern Day Technology Leader and previously served as President of Booz Allen Hamilton’s Black Analytics Group. She holds an M.S. in Public Policy and Management from Carnegie Mellon University and a B.S. from Cornell University.

Submitted by Katie Dey on Fri, 04/10/2026 - 06:13

Hot Topics in the Science of Security Symposium (HotSoS)
