Security of Biomedical Large Language Models: Threats, Defenses, and Open Challenges

ABSTRACT
Large language models (LLMs) are increasingly deployed in biomedical and clinical settings, where failures can compromise patient safety, privacy, and trust. Unlike general-purpose applications, biomedical LLMs operate under strict regulatory constraints and interact with complex systems such as electronic health records, retrieval pipelines, and clinical workflows. This paper systematizes the security of biomedical LLMs through a system-level analysis of threats and defenses across the model lifecycle. We introduce a unified taxonomy of attacks grounded in four security objectives (privacy, safety, integrity, and reliability) and characterize adversarial capabilities ranging from black-box users to insider and deployment-layer attackers. We analyze how vulnerabilities arise during training, alignment, inference, and deployment, and map existing defenses to the attack surfaces they mitigate. Our analysis shows that many high-impact attacks, including prompt-based extraction of protected health information (PHI), adversarial persuasion, and retrieval poisoning, are immediately deployable against real-world systems, while effective defenses remain fragmented and costly. We conclude that securing biomedical LLMs is an ongoing systems security challenge that requires defense-in-depth strategies, inference-time verification, and continuous adversarial evaluation rather than one-time alignment.

Zahid Hassan Tushar is a PhD candidate in Information Systems at the University of Maryland, Baltimore County, specializing in artificial intelligence and machine learning. His research focuses on developing efficient, reliable, and trustworthy AI systems for high-dimensional data, with applications spanning Earth observation, medical imaging, and biomedical AI. He has contributed to NASA-funded projects on hyperspectral satellite data and has published in leading venues across computer vision and geoscience. His work combines foundation models, generative AI, and representation learning with an emphasis on real-world deployment challenges, including robustness, security, and system-level integration.

Submitted by Katie Dey on Fri, 04/10/2026 - 12:14

Hot Topics in the Science of Security Symposium (HotSoS)
