Study Shows Small Data Poisoning Attacks Can Compromise Large Language Models
By meetpateltech
Summary
A joint study by Anthropic, the UK AI Security Institute, and the Alan Turing Institute finds that large language models (LLMs) can be compromised by data poisoning attacks using as few as 250 malicious documents, regardless of model size. In the study, 600M and 13B parameter models proved equally vulnerable to backdoor attacks, even though the larger model was trained on over 20 times more data. This challenges the assumption that attackers need to control a percentage of the training data; a small, fixed number of poisoned documents may be enough to create a vulnerability.
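To make the "fixed amount versus percentage" point concrete, the short Python sketch below computes what share of a training corpus 250 poisoned documents would represent at two corpus sizes. The corpus sizes and the helper name `poisoned_fraction` are hypothetical and chosen only for illustration; the study's actual document counts are not given here. Only the 250-document figure and the roughly 20x difference in training data come from the summary above.

```python
# Illustrative arithmetic only. The corpus sizes are hypothetical assumptions,
# not figures reported by the study.
POISONED_DOCS = 250  # fixed number of malicious documents cited in the study


def poisoned_fraction(poisoned_docs: int, total_docs: int) -> float:
    """Return the share of a corpus that consists of poisoned documents."""
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    return poisoned_docs / total_docs


# Hypothetical corpus sizes: the larger corpus is 20x the smaller one,
# echoing the "over 20 times more training data" comparison.
small_corpus_docs = 10_000_000
large_corpus_docs = 20 * small_corpus_docs

small_share = poisoned_fraction(POISONED_DOCS, small_corpus_docs)
large_share = poisoned_fraction(POISONED_DOCS, large_corpus_docs)

print(f"small corpus: {small_share:.6%} of documents poisoned")
print(f"large corpus: {large_share:.6%} of documents poisoned")
print(f"the poisoned share is {small_share / large_share:.0f}x smaller in the large corpus")
```

Under these assumed sizes, the same 250 documents make up a 20 times smaller share of the larger corpus. If an attack needed a fixed percentage of the data, the attacker's workload would grow with the corpus; the study's finding suggests it may not.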
Key quotes
As few as 250 malicious documents can produce a 'backdoor' vulnerability in a large language model—regardless of model size or training data volume.
Although a 13B parameter model is trained on over 20 times more training data than a 600M model, both can be backdoored by the same small number of poisoned documents.
Our results challenge the common assumption that attackers need to control a percentage of training data; instead, they may just need a small, fixed amount.