Can LLMs keep a secret? Testing privacy implications of Language Models via Contextual Integrity

Often, when talking about the privacy of chatbots and large language models, I get the question ‘Do people really share that type of stuff with models?!’ In this talk, I intend to show you that people do, in fact, share that data [COLM24]. I will then discuss how such sensitive data can flow from the input to the output of the model and cascade into downstream applications, creating potential problems and privacy violations [ICLR24].

Niloofar Mireshghallah is a postdoctoral scholar at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. She received her Ph.D. from the CSE department at UC San Diego in 2023. Her research interests are trustworthy machine learning and natural language processing. She received the National Center for Women & IT (NCWIT) Collegiate Award in 2020 for her work on privacy-preserving inference, was a finalist for the Qualcomm Innovation Fellowship in 2021, and received the 2022 Rising Star in Adversarial Machine Learning award.