Techniques for Scalable Symbolic Simulation
Presented as part of the 2013 HCSS conference.
ABSTRACT
Symbolic simulation is a powerful technique for building mathematical models describing a program’s results by simulating execution while interpreting inputs as symbolic variables. Such models can then be used to prove properties about the corresponding source program, including equivalence against a reference implementation.
Straightforward approaches to symbolic simulation can be used to build models of simple programs with modest implementation effort. When programs become larger and more sophisticated, however, the straightforward techniques are insufficient. For example, programs with branches can cause a symbolic simulator to explore an exponential number of execution paths. Additionally, loops without obvious termination conditions can lead to symbolic simulation that never terminates. Furthermore, even for terminating simulation runs, the models representing the execution of an entire program can become prohibitively large.
In this talk, we describe how we have tackled these problems in practice, focusing on a case study of verifying the equivalence of two implementations of the SHA-384 message digest algorithm. We successively show how:
- Merging symbolic states at post-dominator nodes in the program’s control flow graph can avoid exponential explosion in programs with branches.
- Compositional approaches enable the simulation of individual functions in isolation, reducing the overall size of program models.
- Symbolic simplification and the use of automated proof tools can prune branches, allowing a larger variety of looping programs to terminate.