Research Team Status
- Names of researchers and positions
- Thuy Dung "Judy" Nguyen, PhD student
- Cailaini Lemieux Mack, PhD student
- Preston Robinette, PhD student
- Yuwei Yang, undergraduate researcher
- Any new collaborations with other universities/researchers?
- None to report; the GTRI and ASU collaborations remain ongoing.
Project Goals
- What is the current project goal?
- This quarter we have focused on purifying backdoor attacks in malware classifiers. We developed Post-training Backdoor Purification (PBP), a technique that takes a poisoned malware classifier as input and produces a nearly clean version of the classifier as output. It works regardless of the poisoning technique, the representation (i.e., feature vectors vs. byteplots), and the poisoning rate (i.e., the amount of training data controlled by the adversary). We published a paper based on this technique at the Network and Distributed System Security Symposium (NDSS 2025).
- We have also developed SUDS, a technique for removing steganographic material from inputs to diffusion models. Broadly, this technique removes stealthily embedded information from inputs (e.g., images), which helps prevent diffusion models from being misused to convey secrets.
- How does the current goal factor into the long-term goal of the project?
- PBP is directly applicable to malware classification models. It will aid in improving the robustness of malware classifiers that may be compromised by a variety of different adversarial poisoning techniques. This contributes to the overall goal of improving malware classification, especially as AI techniques increasingly become the standard for malware classification.
- SUDS applies more broadly to diffusion models, but it also provides insights into formally verifying machine learning models. SUDS could be applied to malware classifiers to further improve their robustness and performance in the presence of adversarial manipulation.
Accomplishments
- Address whether project milestones were met. If milestones were not met, explain why, and what are the next steps.
- We have completed our goal for year 1 and are on track for meeting milestones in year 2. PBP and SUDS are both important techniques for improving machine learning in the space of malware classification.
- PBP identifies the neurons in a neural network that are affected by poisoning using only a small fine-tuning set. The technique retains high clean accuracy (true positive/negative rates) while reducing the attack success rate to nearly zero, a major advance over the state of the art for backdoor purification (an illustrative sketch follows this list).
- SUDS destroys secret information embedded within input data fed to diffusion-based neural networks.
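To make the fine-tuning-based purification idea concrete, below is a minimal, hypothetical PyTorch sketch. The drift-based heuristic for flagging affected neurons, the `top_frac` threshold, and the function names are illustrative assumptions; this is not the published PBP algorithm, only an example of purifying a poisoned classifier with a small trusted fine-tuning set.

```python
# Hypothetical sketch: purify a poisoned classifier using a small clean
# fine-tuning set. Illustrative only; NOT the published PBP algorithm.
import copy
import torch
import torch.nn as nn
from torch.utils.data import DataLoader


def finetune(model: nn.Module, loader: DataLoader,
             epochs: int, lr: float) -> nn.Module:
    """Standard supervised fine-tuning on a small trusted data set."""
    model = copy.deepcopy(model)
    criterion = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            criterion(model(x), y).backward()
            opt.step()
    return model


def purify(poisoned: nn.Module, clean_loader: DataLoader,
           epochs: int = 5, lr: float = 1e-2,
           top_frac: float = 0.05) -> nn.Module:
    """Flag the neurons whose weights drift most during clean fine-tuning
    and overwrite only those, leaving the rest of the network untouched."""
    finetuned = finetune(poisoned, clean_loader, epochs, lr)
    purified = copy.deepcopy(poisoned)

    with torch.no_grad():
        for (_, w_p), (_, w_f) in zip(purified.named_parameters(),
                                      finetuned.named_parameters()):
            if w_p.dim() < 2:      # skip biases and normalization parameters
                continue
            # Per-output-neuron weight drift during clean fine-tuning; large
            # drift suggests the neuron was shaped by poisoned training data.
            drift = (w_f - w_p).flatten(1).norm(dim=1)
            k = max(1, int(top_frac * drift.numel()))
            idx = drift.topk(k).indices
            w_p[idx] = w_f[idx]    # correct only the flagged neurons
    return purified
```

The design choice illustrated here is minimal intervention: only a small trusted data set is needed and only the flagged neurons change, consistent with the report's claim that clean accuracy is retained while the attack success rate drops to nearly zero.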
- What is the contribution to foundational cybersecurity research? Was there something discovered or confirmed?
- This quarter, we completed our design of PBP, which purifies neural networks used for malware classification. PBP also applies to other data modalities, so it can aid in removing backdoor attacks against many types of neural architectures across many applications, including traditional computer vision, where it outperforms baseline techniques by a substantial margin.
- SUDS is a broadly applicable technique for removing steganographically embedded secrets from input data in order to prevent misuse of diffusion-based models. Like PBP, it reverses a learned process: here, the training process by which a diffusion model learns to embed a secret. This removes the secret from an input while retaining the original, non-secret content (see the sketch below).
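Below is a minimal sketch of the sanitization idea described above, assuming a pretrained denoiser that maps a noised image back to a clean one. The `denoiser` interface, the single noise-then-denoise step, and the noise level are illustrative assumptions rather than the published SUDS design.

```python
# Hypothetical sketch of diffusion-style sanitization in the spirit of the
# SUDS description above. Illustrative only; NOT the published SUDS design.
import torch
import torch.nn as nn


@torch.no_grad()
def sanitize(image: torch.Tensor, denoiser: nn.Module,
             noise_level: float = 0.25) -> torch.Tensor:
    """Return a copy of `image` with any embedded secret destroyed.

    image:    tensor in [0, 1] with shape (N, C, H, W).
    denoiser: model trained to reconstruct a clean image from a noised one
              (an assumed interface for this sketch).
    """
    # Forward step: drown the low-amplitude steganographic signal in noise.
    noised = image + noise_level * torch.randn_like(image)

    # Reverse step: reconstruct the perceptible content; the hidden payload,
    # now indistinguishable from noise, is not reconstructed.
    cleaned = denoiser(noised.clamp(0.0, 1.0))
    return cleaned.clamp(0.0, 1.0)
```

The sketch conveys only the report's core point: the visible content survives sanitization while the stealthily embedded information does not.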
- Impact of research
- Internal to the university (coursework/curriculum)
- PBP incorporated into CS6380, Principles of Computer Security Research (since last update)
- External to the university (transition to industry/government (local/federal); patents, start-ups, software, etc.)
- Keynote at MEMOCODE 2024 relating to this project and AI verification more broadly: https://memocode2024.github.io/keynotes.html
- Demonstration of SUDS at ECAI: "Sanitizing Hidden Information with Diffusion Models"
- Any acknowledgements, awards, or references in media?
- None to report
Publications and presentations
- Add the publication reference in the publications section below. An author's copy or the final version should be added in the report file(s) section. This is for NSA's review only.
- Optionally, upload technical presentation slides that may go into greater detail. For NSA's review only.
Report Materials