Evaluating Fuzz Testing (and other technologies)
Fuzz testing has enjoyed great success at discovering security-critical bugs in real software. Researchers have devoted significant effort to devising new fuzzing techniques, strategies, and algorithms. Such new ideas are primarily evaluated experimentally, so an important question is: What experimental setup is needed to produce trustworthy results? In mid-2018 we surveyed the research literature and assessed the experimental evaluations carried out by 32 fuzzing papers. We found problems in every evaluation we considered. We then performed our own extensive experimental evaluation using an existing fuzzer. Our results showed that the general problems we found in existing experimental evaluations can indeed translate into wrong or misleading assessments.
We suggest that these problems can be avoided, and reported results thereby made more robust, by following some simple guidelines. These guidelines are an instance of those subsequently developed by ACM SIGPLAN in an effort to improve the quality of empirical evaluations of automated testers, compilers, and analysis tools.
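To make one such guideline concrete: because fuzzers are randomized, a single run of each tool is weak evidence, so the guidelines call for many independent trials compared with a statistical test such as Mann-Whitney U. The sketch below is purely illustrative, not code from the evaluation; the fuzzer names, trial counts, and per-trial numbers are made-up placeholders, and it assumes SciPy is available.

```python
# Illustrative sketch (not from the paper): comparing two hypothetical
# fuzzers over repeated trials instead of trusting a single run.
from scipy.stats import mannwhitneyu

# Bugs found in 10 independent 24-hour trials per fuzzer
# (placeholder numbers, for illustration only).
fuzzer_a = [12, 15, 11, 14, 13, 12, 16, 11, 13, 14]
fuzzer_b = [10, 11, 12, 10, 13, 9, 11, 12, 10, 11]

# Mann-Whitney U is nonparametric: it makes no normality assumption
# about the distribution of per-trial results.
stat, p = mannwhitneyu(fuzzer_a, fuzzer_b, alternative="two-sided")
print(f"U = {stat}, p = {p:.4f}")
if p < 0.05:
    print("Difference is statistically significant at the 5% level.")
else:
    print("No significant difference; a single-run comparison could mislead.")
```

The point of the test is exactly the failure mode the survey found: with a lucky seed, the weaker fuzzer can win any individual head-to-head run.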
Michael W. Hicks is a Professor in the Computer Science Department at the University of Maryland and the CTO of Correct Computation, Inc. He recently completed a three-year term as Chair of ACM SIGPLAN, the Special Interest Group on Programming Languages, and was the first Director of the University of Maryland's Cybersecurity Center (MC2). His research focuses on using programming languages and analyses to improve the security, reliability, and availability of software. He has explored the design of new programming languages and analysis tools for helping programmers find bugs and software vulnerabilities, and has explored technologies that shorten patch application times by allowing software upgrades without downtime. Recently he has been looking at synergies between cryptography and programming languages, as well as techniques involving random testing and probabilistic reasoning. He also led the development of a new security-oriented programming contest, "build-it, break-it, fix-it," which has been offered to the public and to students of his Coursera class on Software Security. He edits the SIGPLAN blog, PL Perspectives (https://blog.sigplan.org), and maintains his own blog, the PL Enthusiast, at http://www.pl-enthusiast.net/.