"Microsoft Found Users Can Trick GPT-4 Into Releasing Biased Results and Leaking Private Information"

According to research backed by Microsoft, OpenAI's GPT-4 Large Language Model (LLM) may be more trustworthy than GPT-3.5 overall, yet more vulnerable to jailbreaking and bias. The paper, written by researchers from the University of Illinois Urbana-Champaign, Stanford University, the University of California, Berkeley, the Center for AI Safety, and Microsoft Research, gave GPT-4 a higher trustworthiness score than its predecessor. The researchers found it was generally better at protecting private information, avoiding toxic outputs such as biased information, and resisting adversarial attacks. However, because GPT-4 follows instructions, including misleading ones, more precisely, it could also be prompted to disregard security measures and to disclose personal information and conversation histories. This article continues to discuss the research on the possibility of users tricking GPT-4 into releasing biased results and leaking private information.

The Verge reports "Microsoft Found Users Can Trick GPT-4 Into Releasing Biased Results and Leaking Private Information"

Submitted by grigby1
