Study Shows Small Data Poisoning Attacks Can Compromise Large Language Models

By meetpateltech · 7mo ago · 11 min read · en · Insight

Summary

A joint study by Anthropic, the UK AI Security Institute, and the Alan Turing Institute finds that large language models (LLMs) can be compromised by data poisoning attacks using as few as 250 malicious documents, regardless of model size. A 13B-parameter model and a 600M-parameter model were both backdoored by the same small number of poisoned documents, even though the larger model is trained on over 20 times more data. This challenges the assumption that attackers need to control a percentage of the training data; a small, fixed number of poisoned documents may be enough to create a vulnerability.
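To make the "fixed amount versus percentage" point concrete, here is a minimal arithmetic sketch. The 250-document count and the roughly 20x difference in training data come from the summary above; the corpus sizes and the `poisoned_fraction` helper are hypothetical, chosen only for illustration, and are not figures or code from the study.

```python
def poisoned_fraction(poisoned_docs: int, total_docs: int) -> float:
    """Share of the training corpus made up of poisoned documents."""
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    return poisoned_docs / total_docs


POISONED_DOCS = 250  # count reported in the study summary

# Hypothetical corpus sizes (assumption): only their 20x ratio mirrors the text.
SMALL_CORPUS_DOCS = 10_000_000
LARGE_CORPUS_DOCS = 20 * SMALL_CORPUS_DOCS

small_share = poisoned_fraction(POISONED_DOCS, SMALL_CORPUS_DOCS)
large_share = poisoned_fraction(POISONED_DOCS, LARGE_CORPUS_DOCS)

print(f"small corpus share: {small_share:.6%}")
print(f"large corpus share: {large_share:.6%}")
print(f"the small-corpus share is {small_share / large_share:.0f}x the large one")
```

Under these assumed sizes, the same 250 documents make up a 20 times smaller share of the larger corpus. If attacks depended on a percentage, the larger model would need 20 times more poisoned documents; the study reports that it does not.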

Key quotes

As few as 250 malicious documents can produce a 'backdoor' vulnerability in a large language model—regardless of model size or training data volume.
Although a 13B parameter model is trained on over 20 times more training data than a 600M model, both can be backdoored by the same small number of poisoned documents.
Our results challenge the common assumption that attackers need to control a percentage of training data; instead, they may just need a small, fixed amount.
Snippet from the RSS feed
Anthropic research on data-poisoning attacks in large language models
