BinSub: The Simple Essence of Polymorphic Type Inference for Machine Code

ABSTRACT

Recovering high-level type information from binaries is a key task in reverse engineering and binary analysis. Binaries contain very little explicit type information, and the structure of binary code is highly flexible, allowing for ad hoc subtyping and polymorphism. Prior work has shown that precise type inference on binary code requires expressive subtyping and polymorphism. Implementations of these type system features in binary type inference algorithms have thus far been too inefficient to achieve widespread adoption. Recent advances in traditional type inference have achieved simple and efficient principal type inference in an ML-like language with subtyping and polymorphism through the framework of algebraic subtyping. BinSub, a new binary type inference algorithm, recognizes the connection between algebraic subtyping and the type system features required to analyze binaries effectively. Using this connection, BinSub achieves simple, precise, and efficient binary type inference. We show that BinSub maintains precision similar to prior work while achieving a 63x improvement in average runtime across 1568 functions. We also present a formalization of BinSub and show that BinSub’s type system maintains the expressiveness of prior work.

Ian Smith is a researcher at Trail of Bits working on static analysis, reverse engineering, vulnerability discovery, and compiler problems. His current interests include compositional static analyses such as biabductive analysis, linkable program representations for post-hoc vulnerability triage, binary patching, and automated software verification.
License: CC-3.0
Submitted by Regan Williams on