Proof, but at What Cost?
Abstract: Dafny is a verification-aware programming language that aims to make formal verification more accessible to programmers and students. The Dafny toolchain includes a verifier that translates Dafny programs to Boogie, an intermediate verification language that supports verifying specifications by invoking an SMT solver such as Z3. While Dafny has been heavily used for research and education for years, it is now starting to be used for production industrial projects, and multiple recent projects within Amazon Web Services (AWS) with high assurance requirements have been written in Dafny.
Through this increase in usage, especially by software engineers that are not necessarily well-versed in formal verification, the Dafny core team at AWS has observed that the biggest threat to Dafny adoption by far is “verification instability.” This is the phenomenon where small, seemingly |
unrelated changes to Dafny code cause previously-verified specifications to no longer verify. This is a known consequence of the “butterfly effect” of SMT solvers in the field, but greatly impacts engineers’ willingness to adopt Dafny. Verification instability can turn a small bug fix task into a week-long ordeal to fix verification of code the engineer has no familiarity with, and makes predictable software development extremely difficult. It also impacts engineers’ trust in the Dafny platform, as they begin to doubt whether the broken verification was actually sound in the first place.
In this talk, we will describe recent advances in preventing and mitigating verification instability on the Dafny platform. Our key insight based on empirical data from multiple Dafny projects is that the cost of verification correlates strongly with future verification instability. We describe new features in Dafny and Boogie to provide more stable verification, and to support a continuous integration mechanism that rejects Dafny code that successfully verifies but takes longer than a configured timeout, and hence is likely to exhibit future instability. We show how these improvements helped to provide a more efficient and predictable development velocity for at least one new Dafny project. We conclude with a call to action for SMT solvers to provide the deterministic behavior and metrics needed to support this workflow.
Robin Salkeld is a Senior Software Development Engineer leading the Dafny core team at Amazon Web Services (AWS). He enjoys working in the intersection of computer science theory and practice, with both a PhD from the University of British Columbia and over 20 years of experience as an engineer in industry. He is passionate about empowering engineers and scientists to write better software, and his work focusses on programming languages, data structures and algorithms, development frameworks, and testing and verification methodologies. Outside of work, he enjoys hiking, writing and performing music, board gaming with his family, and referring to himself in the third person.