Z3str3: A DPLL(T) Solver for a Theory of Strings and Integers
Presented as part of the 2017 HCSS conference.
ABSTRACT
Theories over strings are essential for security analysis, formal verification, and automated test generation tools aimed at string-intensive programs. Such tools typically need support for a rich theory of string equations, string length, string-integer conversion, and regular expression membership, provided by back-end string SMT solvers. The satisfiability problem for this highly expressive theory is undecidable, and even for the simplest fragment, quantifier-free string equations, the problem is in PSPACE. Consequently, developing efficient solvers for this theory (or fragments thereof) is one of the most challenging problems an SMT researcher faces today. In order to satisfy this important need of the security analysis community, SMT researchers have developed several string solvers, namely Z3str2, CVC4, S3, and Norn.
Following previous work, we present the Z3str3 string solver, a successor to the highly efficient Z3tr2 solver. Like its predecessor, Z3str3 supports word equations, integer arithmetic over the length function, regular language membership predicate, and string-integer conversion predicate. Unlike Z3str2, the Z3str3 string solver is a first-class theory solver in the latest open-source version of the Z3 SMT solver. This allowed us to build several new heuristics and features in Z3str3, enabling it to be significantly faster than Z3str2. We implemented two techniques which obtain better performance in Z3str3 than other solvers. We do this by modifying the Z3 core solver, which handles the Boolean structure of the formula. The first technique, which we call “theory-aware branching”, allows the string solver to inform the branching heuristic of the relative value of different search options. The second technique, which we call “theory-aware case-split optimization”, enables efficient representation of clauses which encode mutually exclusive Boolean literals. Both heuristics are significantly more efficient than traditional approaches used in DPLL(T) solvers. We report on a series of experiments over three sets of real-world benchmarks, where we compared Z3str3 against five other string solvers: CVC4, Z3str2, Norn, S3, and S3P. The results point to the effectiveness of our proposed techniques, which yield significant performance improvement. The techniques we present here are broadly applicable in the context of program analysis for security, software verification, automated test generation, and vulnerability detection.
--
Murphy Berzish is a Ph.D student at the University of Waterloo in the Faculty of Engineering. He received a BSE from UWaterloo in 2015, followed by an MASc in 2016. He and his supervisor, Dr. Vijay Ganesh, develop SMT solvers for program analysis, with an emphasis on reasoning about strings. They also work on new heuristics and optimizations for string solvers, including the use of extra information from other parts of the SMT solver and targeted enhancements for symbolic execution.