Research Team Status

  • Names of researchers and positions
    (e.g., Research Scientist, PostDoc, Student (Undergrad/Masters/PhD))
     

    - Skyler Grandel, PhD student
    - Dung Thuy "Judy" Nguyen, PhD student
    - Kailani "Cai" Lemieux-Mack, PhD student
    - Yifan Zhang, PhD student
    - Zack Silver, undergraduate student
    - Evelyn Guo, undergraduate student

  • Any new collaborations with other universities/researchers?
    • N/A

Project Goals

  • What is the current project goal?
    • There are two current goals for this project:
      • Improved decompilation.  We are seeking to develop techniques,
         assisted by large language models, that improve the quality of
         source code obtained via decompilation.  Specifically, we want to
         ensure that decompiled source is comprehensible and recompilable,
         to improve its utility in reverse engineering pipelines (e.g., for
         malware analysis).  Doing so will help improve techniques for
         subsequently modeling novel, plausible samples (and downstream
         classifiers).
      • Improving neural network robustness.  We are seeking to improve our
         ability to validate that neural networks adhere to a given
         specification with stronger probabilistic guarantees.  We are also
         improving how neural models can be made more resilient against
         adversarial manipulation (e.g., due to backdoor attacks).  This goal
         aligns with our overall vision of ensuring that classifiers
         (especially security-critical malware classifiers) can withstand
         interference from adversaries seeking to undermine them.
  • How does the current goal factor into the long-term goal of the project?
    • This quarter's goals are well-aligned with our overall goal of
       improving malware classifiers.  Improving decompilation is critically
       relevant to comprehension and explainability, not only by humans but
       also by the neural models used to generate novel variants for
       improving classification.  Moreover, our foundational improvements to
       classifier robustness will help ensure that we can quantify their
       performance characteristics and provide stronger guarantees about the
       limits of their behavior.

Accomplishments

  • Address whether project milestones were met. If milestones were not met, explain why and describe the next steps.
    • We continue to be on track for our Year 2 goals.  Our work has
       culminated in ESORICS 2025 and NeurIPS 2025 publications, new
       submissions under review, and several invited talks.
       
  • What is the contribution to foundational cybersecurity research? Was there something discovered or confirmed?
    • On decompilation: using large language models, we have improved the
        re-executability of decompiled binaries by as much as 10% across
        two different datasets.  Decompiled source is typically of poor
        quality, lacking the semantics present in the original source code.
        Moreover, decompiled source often fails to re-compile due to
        missing headers, missing libraries, linking issues, and (when LLMs
        are involved) syntax errors.  These issues prevent full
        understanding of program behavior, especially when probative
        testing is performed (e.g., rewriting or changing portions of the
        decompiled source).  Furthermore, even when decompilation yields
        re-compilable source, the re-compiled program is almost never
        semantically equivalent to the original (i.e., it produces
        different outputs or crashes).  Building upon our previously
        reported COMCAT approach, we have leveraged in-context learning
        and retrieval-augmented generation to further refine decompiled
        source, yielding these gains in re-executability rates (an
        illustrative sketch of this refine-and-recheck loop appears after
        this list).
    • On neural network robustness and verification: we developed a new
        conformal prediction-based method that can provide probabilistic
        guarantees -- i.e., that a specification holds for a neural
        network with a certain probability.  The approach is scalable,
        nearly architecture-agnostic, and culminated in an accepted
        NeurIPS 2025 paper.  In a subsequent investigation, we observed
        that we can still find adversarial perturbations even when a given
        network has a very high probability (i.e., >99.999%) of being
        locally adversarially robust.  This provides guidance about the
        coverage of the model's input space and about how robustness
        should be evaluated (an illustrative conformal prediction sketch
        also appears after this list).
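    • Illustrative sketch (decompilation refinement): the following is a
        minimal, hypothetical Python sketch of the refine-and-recheck loop
        described above; it is not our actual COMCAT pipeline.  TF-IDF
        retrieval stands in for our retriever, the LLM client is left
        abstract, and a refinement is accepted only if the result
        recompiles and re-executes.  All function, parameter, and variable
        names here are illustrative assumptions.

        # Hypothetical sketch: refine decompiler output with retrieval-augmented
        # in-context examples, then check that the result recompiles and re-executes.
        import os
        import subprocess
        import tempfile
        from typing import Callable, List, Tuple

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        def retrieve_examples(query_src: str,
                              corpus: List[Tuple[str, str]],  # (decompiled, original) pairs
                              k: int = 3) -> List[Tuple[str, str]]:
            """Return the k corpus pairs whose decompiled half is most similar to query_src."""
            docs = [dec for dec, _ in corpus] + [query_src]
            tfidf = TfidfVectorizer(token_pattern=r"\w+").fit_transform(docs)
            sims = cosine_similarity(tfidf[-1], tfidf[:-1]).ravel()
            return [corpus[i] for i in sims.argsort()[::-1][:k]]

        def build_prompt(decompiled: str, examples: List[Tuple[str, str]]) -> str:
            """In-context learning prompt: retrieved (decompiled -> cleaned) pairs, then the query."""
            shots = "\n\n".join(f"Decompiled:\n{d}\nCleaned, recompilable C:\n{o}"
                                for d, o in examples)
            return (f"{shots}\n\nDecompiled:\n{decompiled}\n"
                    "Cleaned, recompilable C (add missing #includes, fix syntax):\n")

        def recompiles_and_runs(src: str, test_input: bytes) -> bool:
            """Compile with gcc and run once on a test input; True only if both succeed."""
            with tempfile.TemporaryDirectory() as tmp:
                c_path, exe = os.path.join(tmp, "prog.c"), os.path.join(tmp, "prog")
                with open(c_path, "w") as f:
                    f.write(src)
                if subprocess.run(["gcc", c_path, "-o", exe],
                                  capture_output=True).returncode != 0:
                    return False
                try:
                    run = subprocess.run([exe], input=test_input,
                                         capture_output=True, timeout=5)
                except subprocess.TimeoutExpired:
                    return False
                return run.returncode == 0

        def refine_decompiled(decompiled: str,
                              corpus: List[Tuple[str, str]],
                              llm: Callable[[str], str],  # any text-completion client
                              test_input: bytes = b"",
                              max_rounds: int = 3) -> str:
            """Iteratively ask the LLM to repair the source until it recompiles and runs."""
            src = decompiled
            for _ in range(max_rounds):
                src = llm(build_prompt(src, retrieve_examples(src, corpus)))
                if recompiles_and_runs(src, test_input):
                    break
            return src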
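    • Illustrative sketch (probabilistic guarantees): a minimal,
        hypothetical Python sketch of a split-conformal check of the kind
        described above; it is not the NeurIPS 2025 implementation.  The
        score function, sampler, and toy linear "network" are assumptions
        chosen purely for illustration.

        # Hypothetical split-conformal probabilistic robustness check (toy example).
        # Convention: score_fn(x) <= 0 means "the specification holds at input x".
        import math

        import numpy as np

        def conformal_quantile(scores: np.ndarray, alpha: float) -> float:
            """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
            n = len(scores)
            rank = math.ceil((n + 1) * (1 - alpha))  # conformal rank
            if rank > n:                             # too few samples for this alpha
                return float("inf")
            return float(np.sort(scores)[rank - 1])

        def probabilistic_certificate(score_fn, sampler, n: int, alpha: float):
            """Split-conformal argument: draw n i.i.d. inputs, score them, and return a
            threshold q such that a fresh input X from the same distribution satisfies
            P[score_fn(X) <= q] >= 1 - alpha.  If q <= 0, the specification holds with
            probability at least 1 - alpha."""
            scores = np.array([score_fn(sampler()) for _ in range(n)])
            q = conformal_quantile(scores, alpha)
            return q, q <= 0.0

        # Toy usage: local robustness of a linear "network" around a fixed input.
        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            w, center = rng.normal(size=10), rng.normal(size=10)
            f = lambda x: float(w @ x)                     # toy scalar "network"
            score = lambda x: abs(f(x) - f(center)) - 0.5  # <= 0 iff |f(x) - f(center)| <= 0.5
            sample = lambda: center + rng.uniform(-0.01, 0.01, size=10)
            q, ok = probabilistic_certificate(score, sample, n=2000, alpha=1e-3)
            print(f"conformal threshold q = {q:.4f}; certified at level 1 - alpha: {ok}")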
       
  • Impact of research
    • Internal to the university (coursework/curriculum)
      • N/A
    • External to the university (transition to industry/government (local/federal); patents, start-ups, software, etc.)
      • Improvements integrated into the Neural Network Verification tool: https://github.com/verivital/nnv/
      • PARDON source code: github.com/judydnguyen/PARDON-FedDG
      • Organized VNN-COMP, co-located with CAV/SAIV 2025
    • Any acknowledgements, awards, or references in media?
      • N/A

 

Publications and presentations

  • Add publication references in the publications section below. An author's copy or final version should be added in the report file(s) section. This is for NSA's review only.

- Preston Robinette, Thuy Dung Nguyen, Samuel Sasaki, and Taylor T.
 Johnson.  Trigger-Based Fragile Model Watermarking for Image
 Transformation Networks.  In ESORICS 2025.  Preprint: https://arxiv.org/pdf/2409.19442

- Navid Hashemi, Samuel Sasaki, Ipek Oguz, Meiyi Ma, Taylor Johnson.
 Scaling Data-Driven Probabilistic Robustness Analysis for Semantic
 Segmentation Neural Networks.  In NeurIPS 2025: https://neurips.cc/virtual/2025/poster/116265. 

- Invited paper/presentation at Allerton 2025: Is Neural Network
 Verification Useful and What Is Next?

- MIT LIDS Seminar, October 10, 2025: From Neural Network Verification
 to Formally Verifying Neuro-Symbolic Artificial Intelligence (AI).

- RMIT SCT DSAI Seminar (online), September 30, 2025: From Neural
 Network Verification to Formally Verifying Neuro-Symbolic Artificial
 Intelligence (AI)

- Dagstuhl Seminar 25392, September 21-26, 2025, with talk: Taylor Johnson,
 "Let's Verify ChatGPT: What Would We Verify & How Could We Get There?"

- Previously reported ICDCS 2025 paper, presented since last quarter:
 PARDON: Privacy-Aware and Robust Federated Domain Generalization.

(NB: the publication portal on sos-vo is not functioning as of this report; please see the references above.)

  • Optionally, upload technical presentation slides that may go into greater detail. For NSA's review only.