High-Performance Regular Expression Processing for Cross-Domain Systems with High Assurance Requirements

pdf

Presented as part of the 2014 HCSS conference.

Abstract:

 

Regular expressions are an important means for specifying search patterns in data.  We recently responded to a customer requirement to provide regular expression support for a very high-assurance cross-domain system.  The customer desired that complex regular expressions be used to specify the precise form of data payloads that would be allowed to cross domain boundaries.  For such a critical system, it is essential that the processing of regular expressions is correct, and does not allow unintended data to transit from one domain to another.  It is also important that regular expression matching is as efficient as possible; otherwise, the overhead of cross-domain filtering becomes too high to be useful.

We have addressed both the "designed-in security" and efficiency aspects of our customer's requirements in the development of regular expression support for the Guardol domain-specific language for guards.  We introduce a special Guardol expression, regex_match, which accepts a regular expression and a string, and produces a boolean output that indicates whether the string matches the regular expression.  Every regex_match expression is translated into the application of a Deterministic Finite-State Automaton (DFA) to the given string.  This translation is carried out in the "middle end" of the Guardol compilation process using the HOL4 theorem prover, and thus is highly trusted.

The DFA is generated using Brzozowski's derivative technique, with modifications due to Owens, Reppy, and Turon. The DFA transition function, which takes a pair consisting of the current DFA state and the current character in the string, and returns the next state, is represented by a two-dimensional array. Thus matching a string is very quick, since there is one array indexing operation per character position in the string to be matched. Similarly, the final states are also represented in an array, so checking to see if the string is accepted is constant time.  We have exercised the regular expression matcher on host platforms as well as the Rockwell Collins Turnstile guard, and performance is excellent.

In future work, we will provide support for proofs of Guardol properties that include regular expressions.  The current Guardol code generator produces Ada code that is compiled and linked to form a guard executable; thus, to increase the trustworthiness of the Guardol backend, we will explore the elimination of the code generator/untrusted compiler part of the toolchain by compiling the functional form of Guardol directly to binary code using the verified CakeML compiler.  We will also study direct synthesis of a hardware-only implementation of the regular expression matcher for very high-performance guard implementations.

Presenter Bio:

Dr. David S. Hardin has made contributions in the areas of formal methods, computer architecture for High Assurance systems, as well as real-time and embedded Java.  He is currently a Principal Engineer in the Advanced Technology Center at Rockwell Collins, where he has served as Principal Investigator for several U.S. DoD Research and Development programs.  He is the editor of the book Design and Verification of Microprocessor Systems for High-Assurance Applications (Springer 2010), and a co-author of The Real-Time Specification for Java, as well as the Java 2 Micro Edition Connected Device Configuration/Foundation Profile standards.  He is author or co-author of more than 30 peer-reviewed publications, and is a co-inventor on nine U.S. patents.  In 1999, Dr. Hardin co-founded aJile Systems, a startup company focused on real-time and embedded Java technology, and served as aJile's Chief Technical Officer from 1999 to 2003.  Dr. Hardin was selected as a Rockwell Engineer of the Year for 1997 for his contributions to the development of the world's first Java microprocessor. His academic career includes BSEE and MSEE degrees from the Universrity of Kentucky, and a Ph.D. in Electrical and Computer Engineering from Kansas State University, in 1989.  Dr. Hardin is a proud native of the Commonwealth of Kentucky, and is a Kentucky Colonel.

Tags:
License: CC-2.5
Submitted by William Martin on