Study Finds LLM Poisoning Attacks Require Only ~250 Documents Regardless of Model Size
[Submitted on 8 Oct 2025]
Summary
This research paper demonstrates that poisoning attacks on large language models (LLMs) require a near-constant number of poisoned documents (approximately 250) regardless of dataset or model size, contrary to the earlier assumption that an attacker must control a fixed percentage of the training corpus. The authors report the largest pretraining poisoning experiments to date, training models from 600M to 13B parameters on datasets of up to 260B tokens. They found that 250 poisoned documents compromised models of every size, even though the largest models trained on more than 20 times more clean data. The same dynamics were observed during fine-tuning, suggesting that injecting backdoors through data poisoning may be easier for large models than previously believed.
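To make the contrast between a fixed document count and a fixed corpus percentage concrete, here is a minimal arithmetic sketch. It is not the authors' code. The count of 250 documents and the 260B-token corpus come from the summary above; the poisoned-document length (`TOKENS_PER_POISON_DOC`) and the smaller corpus size are hypothetical values chosen only for illustration.

```python
def poisoned_fraction(num_poison_docs: int,
                      tokens_per_poison_doc: int,
                      corpus_tokens: int) -> float:
    """Return the share of training tokens contributed by poisoned documents."""
    return (num_poison_docs * tokens_per_poison_doc) / corpus_tokens


NUM_POISON_DOCS = 250          # figure reported in the summary
TOKENS_PER_POISON_DOC = 1_000  # hypothetical document length, for illustration only

# Corpus sizes in tokens. The large value matches the summary's upper bound;
# the small value is an illustrative corpus 20 times smaller, not a figure
# taken from the paper.
CORPUS_TOKENS = {
    "small (illustrative)": 13_000_000_000,
    "large (from summary)": 260_000_000_000,
}

fractions = {
    name: poisoned_fraction(NUM_POISON_DOCS, TOKENS_PER_POISON_DOC, size)
    for name, size in CORPUS_TOKENS.items()
}

for name, fraction in fractions.items():
    print(f"{name}: poisoned share of tokens = {fraction:.8%}")
```

Under these assumptions the same 250 documents make up a 20 times smaller share of the larger corpus. If attack success depended on the percentage of poisoned data, the attacker would need proportionally more documents for the larger corpus; the study's finding is that they do not.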
Key quotes
This work demonstrates for the first time that poisoning attacks instead require a near-constant number of documents regardless of dataset size.
We find that 250 poisoned documents similarly compromise models across all model and dataset sizes, despite the largest models training on more than 20 times more clean data.
Altogether, our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed as the number of poisons required does not scale up with model size.