Spotlight on Lablet Research #23 - Secure Native Binary Execution
Lablet: University of Kansas
Sub-Lablet: University of Tennessee
The goal of this research is to build tools and techniques that will allow users to know the security level of their packaged binary software and enable them to add security to it. The overall project aim is to provide greater control to the end-user to actively assess and secure the software they use.
Typically, securing software is the responsibility of the software developer. The customer or end-user of the software does not control or direct the steps taken by the developer to employ best practice coding styles or mechanisms to ensure software security and robustness. Current systems and tools also do not provide the end-user with the ability to determine the level of security in the software they use. At the same time, any flaws or security vulnerabilities ultimately affect the end-user of the software.
The research team, led by Principal Investigator (PI) Prasad Kulkarni, aims to develop a high-performance framework for client-side security assessment and enforcement for binary software. They are developing new tools and techniques to a) assess the security level of binary executables, and b) enhance the security level of binary software, when and as the user desires, to protect the binary against various classes of security issues. The approach combines static and dynamic techniques to achieve efficiency, effectiveness, and accuracy.
Developers have many avenues for hardening their software against security threats, including a) using secure programming languages or constructs, b) using recommended best coding practices and mechanisms, c) manually inserting programmer-identified security checks in the source code, and d) inserting compiler-provided security checks during code generation. The team's current research focus is to identify the presence of compiler-added security checks in any given binary code.
The approach rests on two insights: first, that compiler checks are added uniformly to the code, and second, that compiler-added security checks perform tasks that are orthogonal to the primary function of the program. For instance, the StackGuard technique adds code at the start of most or all functions to place a canary on the stack and then, before the “return” instruction, to test its integrity. Likewise, Control-Flow Integrity (CFI) checks are added before most or all indirect calls or jumps. These code sequences appear disconnected, in terms of control flow and data flow, from the other parts of the code that compute the main algorithm of the program.
The researchers' technique takes the following steps to identify compiler-inserted security checks in the binary: a) employ Ghidra to detect interesting instructions, such as returns, indirect calls/jumps, and loads/stores; b) fetch and dump disassembled code from the predecessor and successor blocks around the interesting instructions; c) process the instruction traces to normalize constants, register numbers, labels, etc.; and d) find common instruction patterns across the collected traces. The presence of common instruction sequences or patterns across traces suggests that compiler checks are likely present to protect against attacks related to that code construct.
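Steps c) and d) can be illustrated with a minimal sketch. It assumes the traces have already been dumped as lists of Intel-syntax x86-64 instruction strings; the normalization rules, the n-gram matching, the function names, and the example traces are all hypothetical choices made for illustration and are not taken from the team's tool.

```python
import re
from collections import Counter

# Hypothetical normalization rules, applied in order to lowercased text.
# Replacement tokens are uppercase so later rules never re-match them.
_RULES = [
    (re.compile(r"\b(?:lab|fun|loc|sub)_[0-9a-f]+\b"), "LBL"),   # code labels
    (re.compile(r"\b0x[0-9a-f]+\b"), "IMM"),                     # hex constants
    (re.compile(r"\b\d+\b"), "IMM"),                             # decimal constants
    (re.compile(r"\b(?:[re]?[a-d]x|[re]?[sb]p|[re]?[sd]i|"
                r"r(?:[89]|1[0-5])[dwb]?|[a-d][lh])\b"), "REG"),  # general registers
]

def normalize(instr: str) -> str:
    """Replace concrete labels, constants, and registers with placeholders."""
    text = " ".join(instr.lower().split())
    for pattern, token in _RULES:
        text = pattern.sub(token, text)
    return text

def common_patterns(traces, n=3, min_fraction=0.8):
    """Return n-grams of normalized instructions seen in enough traces.

    Each trace contributes at most once per n-gram, so the count is the
    number of traces that contain the pattern.
    """
    counts = Counter()
    for trace in traces:
        norm = [normalize(i) for i in trace]
        grams = {tuple(norm[k:k + n]) for k in range(len(norm) - n + 1)}
        counts.update(grams)
    threshold = min_fraction * len(traces)
    return {gram: c for gram, c in counts.items() if c >= threshold}

# Made-up traces of code preceding three different "ret" instructions.
TRACES = [
    ["mov rax, qword ptr [rbp-0x8]", "xor rax, qword ptr fs:[0x28]",
     "je LAB_00401196", "call __stack_chk_fail", "leave", "ret"],
    ["mov eax, dword ptr [rbp-0x14]", "mov rdx, qword ptr [rbp-0x8]",
     "xor rdx, qword ptr fs:[0x28]", "je LAB_00401250",
     "call __stack_chk_fail", "leave", "ret"],
    ["lea rax, [rbp-0x30]", "mov rcx, qword ptr [rbp-0x8]",
     "xor rcx, qword ptr fs:[0x28]", "je LAB_004013f0",
     "call __stack_chk_fail", "leave", "ret"],
]

for gram, count in common_patterns(TRACES).items():
    print(count, " ; ".join(gram))
```

Fixed-length n-grams are only one way to find shared patterns; a longest-common-subsequence match would tolerate instructions that the compiler schedules in between the check's own instructions, at a higher cost.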
The immediate next steps include developing logic to confirm that the security check code is disconnected from the rest of the program in terms of data flow and control flow, and to build a stronger case that the common subsequences indicate the presence of explicit security mechanisms in the binary code.
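One possible form of the data-flow part of such a check is sketched below. It is an assumption about how the disconnect might be tested, not the team's design: instructions are hand-annotated with the locations they write (`defs`) and read (`uses`), the code is treated as straight-line, and a candidate sequence counts as disconnected if nothing it writes is read by later code before being overwritten. The `Instr` type, the helper names, and the example epilogue are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Set

class Instr(NamedTuple):
    text: str
    defs: FrozenSet[str]   # locations written by the instruction
    uses: FrozenSet[str]   # locations read by the instruction

def ins(text, defs=(), uses=()):
    """Convenience constructor; defs and uses are lists of location names."""
    return Instr(text, frozenset(defs), frozenset(uses))

def leaked_values(block: List[Instr], start: int, end: int) -> Set[str]:
    """Locations written in block[start:end] that later code reads
    before overwriting them (straight-line code only)."""
    pending: Set[str] = set()
    for instr in block[start:end]:
        pending |= instr.defs
    leaked: Set[str] = set()
    for instr in block[end:]:
        leaked |= pending & instr.uses   # an instruction reads before it writes
        pending -= instr.defs            # overwritten values can no longer leak
    return leaked

# Hand-annotated, made-up function epilogue with a stack canary test.
EPILOGUE = [
    ins("mov eax, dword ptr [rbp-0x14]", defs=["rax"], uses=["rbp"]),
    ins("mov rdx, qword ptr [rbp-0x8]", defs=["rdx"], uses=["rbp"]),
    ins("xor rdx, qword ptr fs:[0x28]", defs=["rdx", "flags"], uses=["rdx"]),
    ins("je .ok", uses=["flags"]),
    ins("call __stack_chk_fail"),
    ins("leave", defs=["rsp", "rbp"], uses=["rbp"]),
    ins("ret", uses=["rsp", "rax"]),
]

# The canary test (indices 1..4) writes nothing that the rest of the code reads.
print(leaked_values(EPILOGUE, 1, 5))
# The instruction that loads the return value does feed later code.
print(leaked_values(EPILOGUE, 0, 1))
```

A fuller analysis would also follow branches and memory locations, and would examine what the candidate sequence reads from the surrounding code, which this sketch ignores.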
The team also continued its work on a hybrid framework to detect and prevent memory attacks on unmodified client-side binaries. They have developed one of the very few decoupled binary-level techniques that offer complete memory safety for binary programs. Their current implementation is the only one that uses static analysis to determine relevant program information. The team examined when and why even complete debug symbol information is not sufficient to replace the semantic information lost during program translation and to prevent buffer overflows at run-time. They evaluated the effectiveness of their static analysis-based techniques on binaries that are stripped of debug information, and assessed whether advanced static reverse engineering and type inference algorithms can regenerate or predict symbol information with sufficient accuracy to improve the effectiveness of memory protection techniques for stripped binaries.
The technique the team implemented is inspired by SoftBound, a compiler-level mechanism to detect and prevent spatial memory errors during program execution (temporal errors are handled by its companion mechanism, CETS). With their binary-level implementation, the input binary is statically analyzed using the Ghidra and IDA Pro reverse engineering frameworks, which output information regarding the buffer bounds and the type referenced by each memory access (read/write) and pointer assignment instruction (called the owner) in the binary. At run-time, they employ the Pin virtual machine to keep track of owner information and check relevant buffer reads/writes to ensure fine-grained memory safety. They employ their framework to assess the effectiveness of binary-level memory safety techniques in different situations.
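The run-time side of this scheme can be pictured with a small simulation. The sketch below is plain Python and does not use the Pin API; it reads "owner" as the buffer that an instruction's pointer is meant to reference, which is an interpretation of the description above. The class names, the table layout, and the addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(frozen=True)
class Bounds:
    base: int   # lowest valid address of the buffer
    size: int   # buffer length in bytes

    def covers(self, addr: int, width: int) -> bool:
        """True if the access [addr, addr + width) lies inside the buffer."""
        return self.base <= addr and addr + width <= self.base + self.size

class OwnerTable:
    """Run-time view of a (hypothetical) static analysis result that maps
    pointer-creating instruction addresses to the bounds of their owner."""

    def __init__(self, owner_by_instr: Dict[int, Bounds]):
        self.owner_by_instr = dict(owner_by_instr)
        self.bounds_by_pointer: Dict[str, Bounds] = {}

    def on_pointer_assign(self, instr_addr: int, dst: str,
                          src: Optional[str] = None) -> None:
        """Give dst the bounds of src if known, else the instruction's owner."""
        bounds = self.bounds_by_pointer.get(src) if src is not None else None
        if bounds is None:
            bounds = self.owner_by_instr.get(instr_addr)
        if bounds is not None:
            self.bounds_by_pointer[dst] = bounds
        else:
            self.bounds_by_pointer.pop(dst, None)   # drop stale metadata

    def check_access(self, pointer: str, addr: int, width: int) -> bool:
        """Return False for an out-of-bounds read or write through pointer."""
        bounds = self.bounds_by_pointer.get(pointer)
        if bounds is None:
            return True   # no metadata: the access is left unchecked
        return bounds.covers(addr, width)

# Made-up example: the instruction at 0x401000 takes the address of a
# 16-byte buffer, and a later instruction copies that pointer.
BUF = 0x7FFD0000
table = OwnerTable({0x401000: Bounds(BUF, 16)})
table.on_pointer_assign(0x401000, "rdi")
table.on_pointer_assign(0x401008, "rsi", src="rdi")

print(table.check_access("rsi", BUF + 12, 4))   # last 4 bytes of the buffer
print(table.check_access("rsi", BUF + 13, 4))   # runs one byte past the end
```

Treating accesses without metadata as unchecked is one design choice among several. It avoids false alarms when static analysis cannot recover an owner, at the price of missing errors on those accesses.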
Dr. Michael Jantz and his team at the University of Tennessee have continued work on a static analysis and binary rewriting tool that inserts checks for spatial memory vulnerabilities during program startup. They have developed, tested, and validated their tool with multiple benchmarks (including xz and leela from SPEC CPU 2017 and several test cases from the SARD benchmark suite). They are currently working on testing their tool with other benchmarks from SPEC CPU 2017 and are beginning studies to compare this tool with existing approaches.
Background on the project can be found here.