Secure Virtualization with Formal Methods
Presented as part of the 2013 HCSS conference.
ABSTRACT
Virtualization software is increasingly a part of the infrastructure behind our online activities. The companies and institutions that produce online content are taking advantage of the "infrastructure as a service" cloud computing model to obtain cheap and reliable computing power. These content producers -- news media organizations, social networking sites, online shopping companies, universities, and government agencies -- rent compute time from the cloud provider during which they can run their entire software stack on the cloud provider's server. The cloud providers are able to provide cheap computing power by letting multiple client operating systems share a single physical machine, and they use virtualization technology to do that. The virtualization layer also provides isolation between guests, protecting each from unwanted access by the co-tenants.
In this talk I will discuss our research toward verifying the security of this virtualization layer, and in particular, the isolation properties. I will discuss some of the challenges and opportunities in using traditional formal methods to verify security properties at this level of the software stack, and I will present a new methodology, based on model checking, for handling one of the biggest challenges: verification in the face of very large data structures. One weakness of using model checking is that the verification result is only as good as the model; I will also present work we've done to give stronger confidence in the validity of the model.
Virtualization software has a variety of security-critical applications, from malware analysis, to sandboxing potentially dangerous applications, to hosting virtual machines. Ensuring they are correct is important. However, they often use large data structures, such as page tables and caches, to keep track of emulated state, and these can have a large state space, making automated verification based on straightforward state-space exploration infeasible. I, along with my co-authors, developed a methodology we call S2W to facilitate automated verification of such systems. The methodology uses a combination of abstraction and bounded model checking and allows for the verification of safety properties of large or unbounded arrays. We evaluate this methodology using six case studies, including verification of the address translation logic in the Bochs x86 emulator, and verification of security properties of several hypervisor models. This work was published in the proceedings of Formal Methods in Computer-Aided Design (FMCAD) 2012.
Often, formal verification proceeds by first creating a model of the system under consideration and then verifying properties about the model. This is useful when direct verification of the code is not feasible; however, the resulting verification guarantees are only as good as the model. If the model does not accurately represent the system under consideration, properties proven true of the model may or may not be true of the system. We encountered this limitation in the above S2W work; as diligent as we may be, it is likely that a manually constructed model of complex software will have bugs. In the second half of this talk I will discuss a technique for validating a manually-built model against the code in order to gain confidence the model is correct. We use symbolic execution to learn the execution behaviors of the code, and we use a decision procedure to check whether every behavior in the code exists in the model. We evaluate this methodology using several case studies, including a validation of the model used in the S2W project. In spite of many person-hours spent building, studying, and using that model by at least two different researchers during the work on S2W, the model validation technique uncovered five different errors in the model. Modeling a system can make verification more user-friendly and is often a good way to make verification practical; model validation is important for making the results of that verification more meaningful.