Certifying SAT Proofs
Presented as part of the 2017 HCSS conference.
ABSTRACT
SAT solver users want confidence in any claim that a given set of Boolean formulas is unsatisfiable. Contemporary SAT solvers have been engineered to emit proofs that can be subsequently validated by SAT proof-checkers. Such a proof is given as a sequence of steps, where each step is interpreted as transforming a formula to a new formula. Checking such a proof requires that for each proof step, the checker performs a validation intended to guarantee that if the current formula (initially the input formula) is satisfiable, then so is the new formula produced by that step. When the final formula is shown to be unsatisfiable, then the procedure guarantees that the input formula is unsatisfiable.
Although much simpler than SAT solvers, proof checkers are not trivial; hence they could have bugs. Heule, Hunt, and Wetzler therefore developed such a proof-checker and formally verified its soundness [1] using the ACL2 system, which provides both the programming language and the theorem prover.
That initial work did not provide an efficient checker, however, which is critical for validating large SAT proofs. For example, a specific proof that was validated in about 1.5 seconds by one of Heule’s C-language-based checkers took about a week to validate with Wetzler’s ACL2-based checker. Several reasons explain the slowdown. Wetzler’s checker used lists for data structures, providing linear-time accesses, while the C version used arrays and various low-level optimizations. Perhaps more important, SAT proofs typically contain many deletion steps that shrink the evolving formula, but Wetzler’s checker did not include deletion. The size of the formula is important because a key procedure, the RAT check, may consider every clause in the formula. Finally, and probably most important, RAT checking is based on a procedure, unit propagation, that can require time-consuming search.
We know of at least two parallel efforts to produce more efficient, formally verified SAT checkers [2]. A key idea is to avoid all search (that results from unit propagation) by adding certain “hints” to each proof step. That idea has been implemented in a new proof format, LRAT (Linear RAT), which used in both of these parallel efforts. We discuss the soundness proof for an LRAT checker developed in ACL2 (the other being a Coq-based development). The SAT proof mentioned above that took a week to check now takes under 3 seconds to check with our verified ACL2-based checker.
Experiments suggest that our formally verified, ACL2-based LRAT proof-checker runs sufficiently fast so that it adds relatively little overhead beyond using a fast C-based proof checker. We expect our checker to be used to confirm unsatisfiable results in the 2017 SAT competition.
References:
[1] Wetzler, N., Heule, M., Hunt, W.: Mechanical verification of SAT refutations with extended resolution. In: ITP 2013. LNCS, vol. 7998, pp. 229–244. Springer (2013)
[2] Cruz-Filipe, L., Heule, M., Hunt, W., Kaufmann, M., Schneider-Kamp, P.: Efficient certified RAT verification. CoRR abs/1612.02353 (2016), https://arxiv.org/abs/1612. 02353
--
Warren A. Hunt, Jr. is Professor in the Department of Computer Sciences of the University of Texas at Austin. Dr. Hunt holds a BSEE from Rice University and a PhD in computer science from the University of Texas. His research interests include hardware verification, circuit design, SAT solving, and mechanized theorem proving. Hunt's dissertation, published in 1985, was the seminal example of a mechanized microprocessor correctness proof. He extended this verification work, to produce in 1991, the first and only fully verified microprocessor to be physically realized in hardware, the FM9001. Hunt's continuing refinement of verification technology and methodology continued with his participation on the FM9801 microprocessor verification effort, as well as industrial scale systems such as Motorola's Complex Arithmetic Processor DSP and his work on the VIA Nano family of X86 compatible processors. From 1997 to 2002, Hunt worked for IBM Research, where he initiated the PERCS project; this project culminated in a computer family of large-memory computer systems that are still being used. From 1986 to 1997, Hunt was Vice President of Computational Logic, Inc., in Austin, Texas, where his research was the cornerstone of the "Short Stack," a fully verified tower of systems from the high-level language level down to hardware. This work stands as the most extensive and complete example of a formally correct computing platform. His current research involves the use of formal mathematics to write specifications for computer hardware and software and to use proof techniques to determine the validity of such specifications. He is also interested in computer architecture, low-power computing, garbage collection, SAT solving, and parallel computing. Dr. Hunt is the Chair of the FMCAD Steering Committee, and he is an ACM Distinguished Engineer.