Toward Neuro-Symbolic Natural Language Requirements to Verified Implementations for Model-Based Systems Engineering in SysML v2

Amer Tahat, David Hardin, Isaac Amundson, Darren Cofer
Collins Aerospace

ABSTRACT

The development of high-confidence software systems requires rigorous specification and
high-assurance code generation. While generative AI offers promise, adopting Large
Language Models (LLMs) in safety-critical workflows remains hindered by hallucinations,
token costs, and limited domain data. We report progress toward integrating state-of-the-art Codex software engineering agents into the DARPA PROVERS INSPECTA (Industrial Scale
Proof Engineering for Critical Trustworthy Applications) toolchain, supporting Model-Based
Systems Engineering (MBSE) workflows in SysML v2, utilizing a novel Meta-Rule
methodology. We define Meta-Rules as reusable assets that transfer ephemeral In-Context
Learning (ICL) knowledge into persistent, verified formalization logic; importantly, Meta-Rules are expert-curated to ensure transferability.

We evaluate this methodology on a number of MBSE examples. From natural language
requirements, our copilot generates GUMBO contracts and HAMR skeletal code, and
employs a closed-loop, self-healing generate–verify–repair cycle, where HAMR/Logika
feedback triggers largely automated, Meta-Rule guided revisions, with minimal user
intervention until system model integration and code-level verification succeed. On the
Isolette benchmark, our approach achieved 100% code-level formal verification for the
predefined Scala application logic in 25 minutes, utilizing approximately 4 million tokens. In
contrast, our baselines failed to achieve end-to-end, formally verifiable code-level artifacts
within 33 minutes, despite consuming nearly 19 million input tokens (including a > 40-page
system requirements PDF). The Meta-Rule–guided approach demonstrated an
approximately 58× increase in task throughput. These results suggest that neuro-symbolic
guardrails can reconcile the generative capability of LLMs with formal rigor in the
development of realistic high-assurance systems. We conclude by discussing current
limitations and outlining future directions for scaling this neuro-symbolic architecture to
broader cyber-physical system applications.
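The closed-loop, self-healing generate–verify–repair cycle described above can be sketched in pseudocode form. This is a minimal, hypothetical sketch: the function names (`generate_contracts`, `run_logika`, `apply_meta_rules`) are illustrative placeholders and do not correspond to the actual INSPECTA toolchain API; the stub bodies merely stand in for an LLM generation pass, a HAMR/Logika verification run, and a Meta-Rule–guided revision step.

```python
from dataclasses import dataclass, field

@dataclass
class Report:
    """Stand-in for a verifier report (real runs would come from HAMR/Logika)."""
    verified: bool
    errors: list = field(default_factory=list)

def generate_contracts(requirements, meta_rules):
    # Placeholder: an LLM agent would emit GUMBO contracts and HAMR
    # skeletal code from the natural language requirements here.
    return {"contracts": requirements, "revision": 0}

def run_logika(artifacts):
    # Placeholder: real verification would invoke HAMR/Logika on the
    # generated artifacts; here we pretend it succeeds after two repairs.
    return Report(verified=artifacts["revision"] >= 2,
                  errors=["proof obligation failed"])

def apply_meta_rules(artifacts, errors, meta_rules):
    # Meta-Rules turn verifier feedback into a largely automated,
    # guided revision of the artifacts.
    return {**artifacts, "revision": artifacts["revision"] + 1}

def generate_verify_repair(requirements, meta_rules, max_iters=10):
    """Closed loop: generate, verify, and repair until verification succeeds."""
    artifacts = generate_contracts(requirements, meta_rules)
    for _ in range(max_iters):
        report = run_logika(artifacts)
        if report.verified:
            return artifacts
        artifacts = apply_meta_rules(artifacts, report.errors, meta_rules)
    # Beyond the iteration budget, the cycle escalates to the user.
    raise RuntimeError("verification did not converge; user intervention required")
```

The key design point is that verifier feedback, not human review, drives each revision; the user intervenes only when the loop fails to converge or at final system model integration.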

Dr. Amer Tahat is a Senior Engineer at Collins Aerospace/RTX with over 13 years of experience across industry and academia in neuro-symbolic AI, formal methods, and GenAI safety and security for critical systems. He specializes in building trustworthy MBSE and coding copilots that support critical systems development from requirements to implementation. Previously, he served as an Assistant Research Professor at Penn State and Virginia Tech. He earned his Ph.D. from Michigan Technological University, with research co-supervised by NASA Langley Research Center, and has led and contributed to multiple DARPA-, ONR-, and NSF-supported research projects.

 
