Toward World Models for Network Defense

Dr. Andres Molina-Markham, Nicholas Potteiger, Lauren Brandt, Aidan Reid, Dr. Ahmad Ridley¹

Corresponding author: Dr. Andres Molina-Markham, amolinamarkham@mitre.org
¹Dr. Ahmad Ridley is a researcher at the National Security Agency’s (NSA’s) Laboratory for Advanced Cybersecurity Research.

AI as an Enabler for Novel Training Environments
Research continues to find that the status quo for applying Artificial Intelligence (AI) to network defense problems fails to generalize (Jacobs et al., 2022; Beltiukov et al., 2023; Willinger et al., 2023). Recent approaches partly address this problem by training AI agents under a broader set of simulated computer network conditions and tasks (Molina-Markham et al., 2025). The next step toward improving the generalization of AI agents for network defense is to improve the fidelity of network simulations.

Our work explores replacing the simplistic network traffic generation models used to develop cyber defenses with sophisticated AI models that generate user behavior (actions) as well as the other artifacts (e.g., security logs and monitor metrics) necessary for realistic network simulations. Our AI models simulate network dynamics conditioned on specific cyber scenarios, resembling how frontier World Models such as Google DeepMind’s Genie 3 (Parker-Holder et al., 2025) generate diverse, interactive environments to train AI agents, or how SceneDiffuser++ simulates city-scale vehicle traffic (Tan et al., 2025).

Methods
Our FARLAND Network Models approach consists of post-training Foundation Models (such as Meta’s Llama 3.2, Google’s Gemma 3, or Microsoft’s Phi 4 mini) with network emulation data collected using MITRE’s Framework for Advanced Reinforcement Learning for Autonomous Network Defense (FARLAND). Unlike World Models that generate observables representing a future version of a simulated “world,” our approach generates both cyber actions and their corresponding observables (i.e., effects) within the next version of our cyber “world.” In addition, these AI-generated cyber actions and effects are validated by coherence constraints (i.e., logical rules) embedded within our underlying FARLAND cyber simulation environment. The resulting collection of observable cyber data can then be flexibly queried using open-source monitoring tools to complete various cyber defense tasks.
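
The generate-then-validate idea described above can be illustrated with a minimal Python sketch. It is not the FARLAND implementation: the Event and WorldState types, the session-based coherence rule, and the fixed list of proposals (standing in for the output of a post-trained Foundation Model) are all hypothetical names and assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """One proposed step: a user action plus the observables it should cause."""
    user: str
    host: str
    action: str          # e.g. "login", "open_file", "logout"
    observables: tuple   # log lines the action is expected to produce


@dataclass
class WorldState:
    """Tiny stand-in for simulated network state (hypothetical)."""
    hosts: set
    sessions: set = field(default_factory=set)  # (user, host) pairs logged in


def is_coherent(state: WorldState, event: Event) -> bool:
    """Assumed coherence rule: actions need a known host and a valid session."""
    if event.host not in state.hosts:
        return False
    logged_in = (event.user, event.host) in state.sessions
    if event.action == "login":
        return not logged_in   # cannot log in twice on the same host
    return logged_in           # every other action requires an open session


def apply_event(state: WorldState, event: Event) -> None:
    """Advance the world state after an accepted event."""
    key = (event.user, event.host)
    if event.action == "login":
        state.sessions.add(key)
    elif event.action == "logout":
        state.sessions.discard(key)


def rollout(state: WorldState, proposals) -> list:
    """Keep only proposals that satisfy the coherence rule, in order."""
    accepted = []
    for event in proposals:
        if is_coherent(state, event):
            apply_event(state, event)
            accepted.append(event)
    return accepted


# Stand-in for model output: in practice these would be sampled from a model.
proposals = [
    Event("alice", "ws1", "open_file", ("file: alice opened report.docx",)),
    Event("alice", "ws1", "login", ("auth: alice logged in on ws1",)),
    Event("alice", "ws1", "open_file", ("file: alice opened report.docx",)),
    Event("bob", "ws9", "login", ("auth: bob logged in on ws9",)),
    Event("alice", "ws1", "login", ("auth: alice logged in on ws1",)),
    Event("alice", "ws1", "logout", ("auth: alice logged out of ws1",)),
]

state = WorldState(hosts={"ws1"})
accepted = rollout(state, proposals)
logs = [line for event in accepted for line in event.observables]
print(logs)
```

In this sketch, incoherent proposals (a file access before login, a login on an unknown host, a duplicate login) are simply dropped; a real system might instead resample or repair them. The surviving log lines play the role of the observable cyber data that downstream monitoring tools would query.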

Results
Our results show that FARLAND Network Models can generate computer-user behavior that exhibits core similarities with previously observed, “realistic” computer behavior, but under novel network configurations. Crucially, we show that FARLAND Network Models produce not only sequences of behavior actions but also logically consistent cyber observables, i.e., “worlds,” which are necessary for training AI-based network defense agents. To illustrate the promise of these kinds of network models, we are integrating them into the next version of FARLAND, where we compare the effectiveness of autonomous network defenders trained using the prior simplistic network models with that of autonomous network defenders trained with our new FARLAND Network Models, which leverage Foundation Models from Meta, Google, and Microsoft.
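
This abstract does not state how similarity between generated and previously observed behavior is measured. As one possible illustration, and purely as an assumption, the Python sketch below compares the frequency of action types in two traces using the Jensen-Shannon divergence (base 2, so values range from 0 for identical distributions to 1 for disjoint ones). The function names and the example traces are hypothetical.

```python
import math
from collections import Counter


def action_distribution(actions, vocabulary):
    """Relative frequency of each action type, ordered by the vocabulary."""
    counts = Counter(actions)
    total = sum(counts[a] for a in vocabulary)
    if total == 0:
        raise ValueError("trace contains no actions from the vocabulary")
    return [counts[a] / total for a in vocabulary]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical traces: action types from observed and generated sessions.
observed = ["login", "open_file", "open_file", "browse", "logout"]
generated = ["login", "open_file", "browse", "browse", "logout"]

vocabulary = sorted(set(observed) | set(generated))
p = action_distribution(observed, vocabulary)
q = action_distribution(generated, vocabulary)
print(f"JS divergence: {js_divergence(p, q):.4f}")
```

A frequency-based measure like this ignores the ordering of actions, so it would capture at most one aspect of behavioral similarity; sequence-aware comparisons would be needed to assess the temporal structure of generated behavior.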

Impact
Researchers at AI frontier labs have postulated (Silver and Sutton, 2025) that to acquire superhuman capabilities, agents will need to learn predominantly from their own experiences, which will include information that is not already captured by existing human data. Analogously, our team postulates that to train superhuman cyber defenders, these defenders will need to experience large volumes of “what if” situations, based on mitigating cyber-attacks from increasingly sophisticated AI-driven adversaries. Such experiences will be too limited without realistic simulations, or “World Models,” that also account for network behavior beyond the behavior produced by adversaries. Our techniques and tools help to close this gap by generating large volumes of realistic cyber “world” experiences, and we expect that these will help to train and evaluate both human and AI-enabled cyber defenders. Furthermore, our approach may enable the development of a new kind of network digital twin. Namely, FARLAND Network Models could be trained from real data to generate high-fidelity cyber simulations that pit next-generation AI attackers against AI defenders, while minimizing risks and disruptions when these well-trained AI defenders are deployed to real networks.

References
A. S. Jacobs, R. Beltiukov, W. Willinger, R. A. Ferreira, A. Gupta, and L. Z. Granville, “AI/ML and Network Security: The Emperor has no Clothes,” in Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, in CCS ’22. New York, NY, USA: Association for Computing Machinery, 2022. 

R. Beltiukov, W. Guo, A. Gupta, and W. Willinger, “In Search of NetUnicorn: A Data-Collection Platform to Develop Generalizable ML Models for Network Security Problems,” in Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, in CCS ’23. New York, NY, USA: Association for Computing Machinery, 2023, pp. 2217–2231. doi: 10.1145/3576915.3623075. 

W. Willinger, A. Gupta, A. S. Jacobs, R. Beltiukov, R. A. Ferreira, and L. Granville, “A NetAI Manifesto (Part I): Less Explorimentation, More Science,” SIGMETRICS Perform. Eval. Rev., vol. 51, no. 2, pp. 106–108, Oct. 2023, doi: 10.1145/3626570.3626609. 

A. Molina-Markham, L. Robaina, S. Steinle, A. Trivedi, D. Tsui, N. Potteiger, L. Brandt, R. Winder, and A. Ridley, “Training RL Agents for Multi-Objective Network Defense Tasks,” 2025.

D. Silver and R. S. Sutton, “Welcome to the Era of Experience,” DeepMind, 2025. Available at: https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf

J. Parker-Holder and S. Fruchter, “Genie 3: A new frontier for world models,” DeepMind, 2025. Available at: https://deepmind.google/discover/blog/genie-3-a-new-frontier-for-world-models/

S. Tan, J. Lambert, H. Jeon, S. Kulshrestha, Y. Bai, J. Luo, D. Anguelov, M. Tan, and C. M. Jiang, “SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model,” in CVPR, 2025.

Approved for Public Release; Distribution Unlimited. Public Release Case Number 25-3265. This technical data deliverable was developed using contract funds under Basic Contract No. W56KGU-18-D-0004. The view, opinions, and/or findings contained in this report are those of The MITRE Corporation and should not be construed as an official Government position, policy, or decision, unless designated by other documentation. ©2026 The MITRE Corporation. ALL RIGHTS RESERVED.

BIOS

Andres Molina-Markham is a Principal Cybersecurity Researcher at The MITRE Corporation, where he develops AI-enabled cybersecurity capabilities and approaches for securing AI-enabled autonomous systems. His research focuses on autonomous network defense, AI assistants for security workflows, and rigorous evaluation of AI-based defenses against adaptive adversaries.

Previously, Andres was a Postdoctoral Scholar at Dartmouth College. He earned a PhD in Computer Science from the University of Massachusetts Amherst and holds Master’s degrees in Mathematics and in Computer and Information Science from the University of Pennsylvania.
