Scalable Sound Static Analysis of Real-world C Applications using Abstract Interpretation.pdf

pdf

Presented as part of the 2020 HCSS conference

Much of critical software systems continues to be written in C, especially embedded and cyberphysical systems. Despite decades of research in formal verication and increases in the sophistication of bug-nding tools, C applications continue to be plagued by exploitable vulnerabilities. The problem: academic tools often don't scale; bug-nding tools are mostly based on heuristics and therefore inherently incapable of providing guarantees.

Kestrel Technology has been developing an approach that addresses both concerns: it scales and has a solid mathematical foundation based on abstract interpretation and therefore oers the possibility of full assurance. The approach has been implemented in the tool KT Advance, a sound static analysis tool for undened behavior of C programs, including memory safety violations, integer oveflow, etc, covering more than 50 CWEs.

The approach consists of two stages: (1) automatic generation of proof obligations for all constructs in the C language that may have undened behavior, and (2) semi-automatic discharge of proof obligations using invariants generated by abstract interpretation. The challenge is the level of automation that can be achieved in the latter stage. The reward is full assurance accompanied by complete evidence that can be inspected and audited, with additional benets of automatically generated program documentation on API constraints and guarantees.

Scalability is achieved by compositionality: invariant generation, the computationally most expensive step in the analysis, is performed at the function level and its results are percolated up using assumeguarantee reasoning. This allows massive parallelization of the invariant generation. Much of the other reasoning can be done in parallel as well. The tool has been applied to applications with hundreds of thousands of lines of code, completing the analysis in an hour on a 16-processor server. We will present a case study of the analysis of an open-source drone navigation application of more than 100,000 lines of code, for which we were able to achieve 93% automatic discharge of all of the proof obligations.

KT Advance has been successfully exercised at NIST where it has been valuable in the SATE VI Ockham evaluation and clarication of the Juliet C test suites. The fact that KT Advance starts from a fundamental model of (valid) memory accesses and type conversions was especially helpful: it means that we can be sure that all possible kinds of, say buer overflows are examined, and that, if KT Advance does not report a violation, we can see exactly what it did consider. At NIST KT Advance was extended to produce OASIS Static Analysis Results Interchange Format (SARIF) v2.1 standard output.

This allows analysis results to be integrated in a larger tool chain or to be compared with results from other tools. We give our views on SARIF.

KT Advance is open-source (with MIT License) and available on GitHub and as such provides a low-cost path for government agencies and industry to gain the benet of the application of sound analysis and formal methods for real-world applications.

Henny Sipma has been working in formal methods and static analysis for more than 25 years, both in academia, at Stanford, where she did her PhD with Zohar Manna, and in industry, at Kestrel Technology, where she was one of the main developers of the CodeHawk tool suite.

Tags:
License: CC-2.5
Submitted by Anonymous on