Study Shows Small Data Poisoning Attacks Can Compromise Large Language Models

meetpateltech

7mo ago · 11 min read · en · Insight


Summary

A joint study by Anthropic, the UK AI Security Institute, and the Alan Turing Institute finds that large language models (LLMs) can be compromised by data-poisoning attacks using as few as 250 malicious documents, regardless of model size or training data volume. Both the 600M and the 13B parameter models were backdoored by the same small number of poisoned documents, even though the 13B model is trained on over 20 times more data. This challenges the common assumption that attackers need to control a percentage of the training data; a small, fixed amount may be enough.

Key quotes


As few as 250 malicious documents can produce a 'backdoor' vulnerability in a large language model—regardless of model size or training data volume.

Although a 13B parameter model is trained on over 20 times more training data than a 600M model, both can be backdoored by the same small number of poisoned documents.

Our results challenge the common assumption that attackers need to control a percentage of training data; instead, they may just need a small, fixed amount.
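To make the "fixed amount versus percentage" point concrete, the arithmetic can be sketched as follows. Only the 250-document figure and the roughly 20x data ratio come from the quotes above; the corpus sizes are hypothetical placeholders, not numbers from the study.

```python
# Illustrative arithmetic only: corpus sizes below are hypothetical,
# chosen to show the effect of the ~20x ratio mentioned in the quotes.

POISONED_DOCS = 250  # figure quoted in the source


def poisoned_fraction(corpus_docs: int, poisoned_docs: int = POISONED_DOCS) -> float:
    """Return the share of a training corpus made up of poisoned documents."""
    if corpus_docs <= 0:
        raise ValueError("corpus_docs must be positive")
    return poisoned_docs / corpus_docs


# Hypothetical document counts for a smaller and a larger training run.
small_corpus = 10_000_000
large_corpus = 20 * small_corpus  # "over 20 times more", using 20x as a floor

small_share = poisoned_fraction(small_corpus)
large_share = poisoned_fraction(large_corpus)

if __name__ == "__main__":
    print(f"small corpus: {small_share:.6%} poisoned")
    print(f"large corpus: {large_share:.6%} poisoned")
    print(f"share shrinks by a factor of {small_share / large_share:.0f}")
```

Under these assumptions the same 250 documents make up a 20 times smaller share of the larger corpus, which is why an attack that succeeds at a fixed count does not fit a percentage-based threat model.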

Snippet from the RSS feed

Anthropic research on data-poisoning attacks in large language models
