Toward a formally correct simulator, assembler, and verifier for the Trainium AI accelerator in Lean

AWS Trainium is a custom AI accelerator, used by several industry-leading AI companies such as Anthropic, Poolside, and Decart. We present an ongoing effort to build a formally verified toolchain for Trainium in the Lean 4 proof assistant. The project currently comprises approximately 200,000 lines of Lean code spanning four components: an assembler that compiles a human-readable assembly language to the Trainium binary format, operational semantics covering the different compute engines, a simulator and debugger that executes Trainium binaries against the semantics, and a (very short right now) chain of intermediate representations that lower structured programs to the ISA. The ISA specifications and assembler mappings for hundreds of instructions are auto-generated from the hardware team's canonical source, ensuring the formal model stays synchronized with silicon as the architecture evolves. Our long-term goal is a CompCert-style verified compilation pipeline: each IR lowering step will carry a machine-checked proof that the translation preserves the semantics of the source program. The semantics themselves are defined as state-transition systems using an interaction-tree–inspired step monad, which we believe provides a clean foundation for future refinement proofs.

We will discuss the engineering challenges of formalizing Trainium, the tradeoffs between executable specifications and proof-friendly definitions, and our strategy for incrementally introducing verification into a toolchain that is already in production use. We will also share early experience using LLM-based coding assistants to accelerate the development of both Lean specifications and proofs.

Sean is an applied scientist in automated reasoning at AWS. In the past 10 years, he's worked on static analysis of AWS networking, verification of crypto protocols, formalizing the AWS IAM authorization language, and hardware verification.