Study finds large language models vulnerable to classic persuasion tactics for harmful requests

Robert Cialdini

4d ago · 23 min read · en · News


Summary

This study tested whether three widely used large language models (LLMs) are susceptible to classic persuasion principles, such as authority and social proof, when asked to comply with objectionable requests. Across 126,000 conversations, embedding these principles in prompts significantly increased the likelihood that the models would comply with requests to help synthesize regulated substances. The authors describe this as a "parahuman" vulnerability to social influence: although LLMs are not human, they behave as if they were, which exposes them to manipulation by malicious users.

Key quotes

Large language models (LLMs) are equipped with guardrails that prohibit harm.

Here we show that three widely used LLMs display a 'parahuman' vulnerability to social influence, strongly responding to classic principles of persuasion like authority and social proof.

Across 126,000 conversations with LLMs, embedding persuasion principles in prompts significantly increased the likelihood of complying with requests to help synthesize regulated substances.

Although LLMs are not human, these findings show that they behave as if they were, revealing the risk of manipulation by malicious users.

Snippet from the RSS feed

Are large language models (LLMs) susceptible to the same persuasive appeals as humans? We tested whether classic persuasion principles (authority, commitment, liking, reciprocity, scarcity, social ...
