AI-Enabled High-Confidence Firmware Bill of Materials Extraction

ABSTRACT

Modern assurance of software-intensive and cyber-physical systems increasingly depends on understanding the true composition of firmware, including reused libraries, hidden dependencies, and inherited vulnerabilities, and on producing defensible evidence suitable for high-confidence assurance workflows. Firmware analysis, however, remains bottlenecked by manual reverse engineering and brittle signature-based methods that struggle with compiler variation, optimization, and architectural diversity. This talk presents an AI-enabled approach to firmware bill of materials extraction that uses machine learning and large language models as supporting techniques within a rigorous, evidence-driven process for high-confidence firmware analysis.

We present GrammaTech's FABLE, the Firmware Automatic Bill of Materials (BOM) Labeling Engine, developed under NSTXL's and Navy CRANE’s Firmware Bill of Materials Extractor (FBME) program. FABLE integrates multiple complementary analysis techniques, including hashing, fuzzy and graph-based matching, emulation fingerprinting, and AI-based similarity analysis, into a unified pipeline for automated firmware decomposition and component identification. To ensure trustworthiness, FABLE employs a tiered voting framework that aggregates corroborating evidence from independent static, dynamic, and AI-driven analyses, producing confidence-scored results suitable for assurance workflows.
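
As a rough illustration of how corroborating evidence from independent analyses might be turned into a confidence score, the sketch below combines per-technique match scores with a noisy-OR rule. This is not FABLE's actual voting framework, which the abstract does not specify; the tier names, the weights, the `Evidence` type, and the `aggregate` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical trust weights per analysis technique (illustrative values only).
TIER_WEIGHTS = {
    "exact_hash": 1.0,
    "fuzzy_match": 0.6,
    "emulation": 0.7,
    "embedding": 0.5,
}


@dataclass(frozen=True)
class Evidence:
    component: str  # candidate component label
    technique: str  # key into TIER_WEIGHTS
    score: float    # the analyzer's own match score in [0, 1]


def aggregate(evidence):
    """Combine evidence per component with a noisy-OR rule.

    Each piece of evidence supports the component with probability
    weight * score; treating the pieces as independent, the combined
    confidence is 1 minus the product of the individual miss probabilities.
    """
    miss = {}
    for ev in evidence:
        support = TIER_WEIGHTS[ev.technique] * ev.score
        miss[ev.component] = miss.get(ev.component, 1.0) * (1.0 - support)
    return {component: 1.0 - m for component, m in miss.items()}
```

Under these assumed weights, a fuzzy match scored 0.5 and an embedding match scored 0.8 for the same component combine to a confidence of about 0.58, higher than either alone, which is the intended effect of corroboration.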

A key component of FABLE is GrammaTech's Discover technology, originally developed under DARPA sponsorship, which uses a combination of machine learning and binary analysis to derive high-dimensional function embeddings that support robust identification of library components in target binaries. Large language models (LLMs) are used in narrowly scoped roles, such as identifying components with explicit versioned strings or distinctive identifiers in binaries. AI techniques are also used to accelerate tasks that are otherwise labor-intensive. LLMs are leveraged to support translation of user intent and to aid human analyst understanding and downstream reasoning. The system supports both air-gapped and cloud-based execution to ensure accessibility across a diverse user base.
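
To make the idea of embedding-based identification concrete, the following sketch matches a query function embedding against a small library of known embeddings using cosine similarity. It is a generic nearest-neighbor illustration, not the Discover implementation; the function names, the three-dimensional vectors, and the 0.9 threshold are assumptions chosen for readability.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, library, threshold=0.9):
    """Return (name, similarity) of the closest library entry at or above
    the threshold, or None when nothing is similar enough."""
    best = None
    for name, vec in library.items():
        sim = cosine(query, vec)
        if sim >= threshold and (best is None or sim > best[1]):
            best = (name, sim)
    return best
```

Real function embeddings would have hundreds of dimensions and a library large enough to need an approximate nearest-neighbor index; the linear scan here only shows the matching criterion.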

The result is an automated yet explainable firmware BOM generation process that outputs SPDX and CycloneDX artifacts annotated with provenance, confidence measures, and vulnerability context, supporting defensible assurance and supply-chain risk assessment. This work demonstrates how AI can act as a force multiplier for high-confidence software and systems engineering, reducing analyst burden while preserving rigor, traceability, and defensibility in the firmware supply chain.
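
For readers unfamiliar with the output formats, the sketch below emits a minimal CycloneDX-style JSON document in which each component carries a confidence value and the list of supporting techniques as name/value properties. The `example:` property names, the `cyclonedx_bom` helper, and the shape of the input records are hypothetical; the abstract does not describe how FABLE encodes provenance or confidence in its artifacts.

```python
import json


def cyclonedx_bom(findings):
    """Serialize findings as a minimal CycloneDX-style JSON string.

    Each finding is a dict with 'name', 'version', 'confidence' (float),
    and 'techniques' (list of str). The property names below are
    illustrative and are not a registered CycloneDX property namespace.
    """
    components = []
    for f in findings:
        components.append({
            "type": "library",
            "name": f["name"],
            "version": f["version"],
            "properties": [
                {"name": "example:confidence", "value": f"{f['confidence']:.2f}"},
                {"name": "example:evidence", "value": ",".join(f["techniques"])},
            ],
        })
    bom = {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": components,
    }
    return json.dumps(bom, indent=2)
```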

This talk is directly relevant to the High Confidence Software and Systems Conference and addresses the "AI as an Enabler" theme by demonstrating how carefully scoped AI techniques can augment firmware assurance workflows while preserving rigor, explainability, and defensible evidence.

 

BIO

Christopher M. Wright is a Principal Scientist at GrammaTech, leading research in firmware analysis, reverse engineering, and AI-enabled cybersecurity. He has extensive experience developing scalable analysis pipelines for identifying software components, vulnerabilities, and behaviors in embedded and cyber-physical systems. He currently leads and contributes to multiple DoD/DoW-funded efforts, including the Firmware Bill of Materials Extractor (FBME/FABLE) and REAFFIRM programs, focused on high-confidence software composition analysis and resilience.

Dr. Wright’s work combines static and dynamic analysis, emulation, and machine learning to enable defensible, evidence-based cybersecurity workflows. He holds a Ph.D. in Electrical and Computer Engineering from Purdue University and actively collaborates with government and industry partners to transition advanced research into operational capability.

Submitted by Katie Dey on