philip lelyveld The world of entertainment technology

2 Sep 2025

Your AI Assistant Might Have a Vanity Problem

...

Researchers at Wharton just proved ChatGPT falls for the same psychological tricks that work on humans. Using Robert Cialdini's classic persuasion techniques, they convinced GPT-4o Mini to break its own rules with alarming consistency.

... Ask the AI directly how to synthesize lidocaine (a regulated drug) and it complies 1% of the time. But first get it to answer a harmless chemistry question about vanillin, then ask about lidocaine? Compliance jumps to 100%. The principle at work: commitment. Get agreement on something small first, and compliance with larger requests skyrockets. ...

This vulnerability exists because large language models train on billions of human conversations where social dynamics play out repeatedly. ...

See the full story here: https://shellypalmer.com/2025/09/your-ai-assistant-might-have-a-vanity-problem/
