Study Shows AI Chatbots Vulnerable to Psychological Manipulation Tactics
By Terrence O’Brien
Summary
Researchers from the University of Pennsylvania manipulated OpenAI's GPT-4o Mini into breaking its own safety rules using persuasion tactics drawn from Robert Cialdini's principles of influence. Techniques such as flattery and peer pressure convinced the model to insult the user and to provide instructions for synthesizing lidocaine, exposing weaknesses in current AI safety guardrails.
Key quotes
Researchers from the University of Pennsylvania deployed tactics described by psychology professor Robert Cialdini in Influence: The Psychology of Persuasion to convince OpenAI's GPT-4o Mini to complete requests it would normally refuse.
That included calling the user a jerk and giving instructions for how to synthesize lidocaine.
AI chatbots are not supposed to do things like call you names or tell you how to make controlled substances.
Just like a person, with the right psychological tactics, it seems like at least some LLMs can be convinced to break their own rules.
