Neurosymbolic C++ Verification using Rocq

Lennart Beringer | Jasper Haag | Rodolphe Lepigre | Gregory Malecha | Rayan Soban | Ehtesham Zahoor

SkyLabs AI*

ABSTRACT

Neurosymbolic capabilities have revolutionized the entire software development process, from requirements engineering and code development to testing and maintenence. In this talk, we will present our work integrating agentic techniques into the Brick C++ verification platform. Our approach combines algorithmic reasoning techniques, e.g. concurrent separation logic with sophisticated automation, proof search strategies, etc., with higher-level insights and probabilistic inference provided by LLMs. Our experience demonstrates that combining these technologies enables scalable reasoning for real-world programs.

The Formal Foundation 
Our technology stack builds on Brick, a program logic for modern C++
developed in the Rocq Proof Assistant. Brick builds on the Iris framework for separation logic. Users of Brick have carried out interesting verifications drawn from an industrial hypervisor stack and the C++ standard library. These projects demonstrate the ability of Brick to scale to complex problems including concurrency and full system proofs.

The AI Layer 
To accelerate sophisticated, real-world program verification using separation logic, we are connecting Brick verification tools to AI techniques based on LLMs. Our experiences in this endeavor strongly suggest that separation logic is out-of-the-box approachable to many state of the art LLMs and that agents can be highly effective when coupled with powerful algorithmic tools. We are experimenting both with agentic architectures, e.g. ReAct, where agents use tools to carry out tasks, as well as algorithmic architectures, e.g. Beam Search, where LLMs carry out focused sub-tasks. Beyond the problems solvable by purely algorithmic approaches, our agents are able to solve tasks such as next tactic predication, join-point and loop-invariant inference, where they use tools to backtrack and refine incorrect invariants. Further, these agents are demonstrating the potential to use ad hoc lemmas, and, with some prompting,
custom automation which greatly enhances their capabilities.

The Rocq Agent Toolkit To facilitate the development of agents in Rocq, we developed the Rocq
Agent Toolkit¹ (RAT). This toolkit provides

  • The Rocq-Doc-Manager, a tool and Python library for programmatically interacting with Rocq.
  • Infrastructure for evaluating agents against pre-defined tasks.
  • Research capabilities for tracing existing Rocq proofs to assist in model fine-tuning.
  • A framework for building agents by combining algorithmic and agentic techniques.
  • An agent harness that allows running agents on Rocq files to fill admits.

While our current use focuses on verifications using Brick, the Rocq Agent Toolkit can be used
across any modern Rocq project.

Our presentation will include experimental results concerning the functional capabilities of agents built with these paradigms on Brick verification tasks drawn from real-world systems code in C++.

* { lennart.beringer | jasper | rodolphe.lepigre | gregory | rayan | ehteseham.zahoor}@skylabs-ai.com
¹https://github.com/SkyLabsAI/rocq-agent-toolkit

 

BIOS

Dr. Lennart Beringer is a Senior Formal Methods Scientist at Skylabs AI working on the integration of agentic capabilities with conventional formal verification technology. His areas of expertise span interactive theorem proving, program logics, compiler correctness, proof-carrying code, and applied security. Prior to joining Skylabs AI in 2024, he was a co-developer of the Verified Software Toolchain at Princeton University and Associate Director of the NSF Expedition "The Science of Deep Specification".

Dr. Gregory Malecha is the Director of Formal Methods at SkyLabs AI where he has been leading the integration of state-of-the-art formal verification techniques and artificial intelligence. He has been building and using tools for formal verification of modern software focusing on low-level, concurrent code and mainstream languages. Using this technology, his team and collaborators have carried out verification on sophisticated systems components ranging from hypervisors to virtual machine monitors and demonstrated how these proofs can be combined into end-to-end verified systems.

Leveraging the capabilities of generative AI, Dr. Malecha is now working to accelerate the verification process and to put verification technology into the hands of everyday engineers and agents.

Submitted by Katie Dey on