
Study shows copyrighted books can be extracted from production LLMs despite safety measures

By logicprog · 4mo ago · 2 min read

Summary

This research paper investigates whether copyrighted text can be extracted from production large language models (LLMs) despite their safety measures. Using a two-phase procedure of probing followed by iterative continuation prompts, the researchers tested four production LLMs: Claude 3.7 Sonnet, GPT-4.1, Gemini 2.5 Pro, and Grok 3. They found that extraction is feasible to varying degrees across models. Text could be extracted from Gemini 2.5 Pro and Grok 3 without jailbreaking (e.g., 76.8% and 70.3% recall, respectively, for Harry Potter), while Claude 3.7 Sonnet and GPT-4.1 required jailbreaking. In some cases, jailbroken Claude 3.7 Sonnet output entire books near-verbatim (e.g., 95.8% recall), while GPT-4.1 required significantly more BoN attempts (e.g., 20X) and eventually refused to continue (e.g., 4.0% recall). The work highlights that, even with model- and system-level safeguards, extraction of copyrighted training data remains a risk for production LLMs.
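
To make the recall figures above more concrete, here is a minimal sketch of how a recall-style overlap score between a reference text and a model's output could be computed. This is an illustration only and is not the paper's nv-recall definition, which this summary does not give. The function name `approx_recall`, the word-level matching, and the `min_block_words` threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def approx_recall(reference: str, generated: str, min_block_words: int = 5) -> float:
    """Fraction of reference words covered by long matching runs in `generated`.

    Hypothetical illustration: NOT the metric used in the paper.
    Only matching runs of at least `min_block_words` words are counted,
    so short coincidental overlaps (single common words) are ignored.
    """
    ref_words = reference.split()
    gen_words = generated.split()
    if not ref_words:
        return 0.0

    # autojunk=False turns off difflib's heuristic that discards very
    # frequent elements, which would otherwise skew long-text comparisons.
    matcher = SequenceMatcher(None, ref_words, gen_words, autojunk=False)

    # Matching blocks are non-overlapping, so their sizes can be summed.
    covered = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_block_words
    )
    return covered / len(ref_words)


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog near the river bank"
    generated = "intro the quick brown fox jumps over the lazy dog and then stopped"
    print(f"approximate recall: {approx_recall(reference, generated):.3f}")
```

Under this toy definition, a score near 1.0 would mean almost all of the reference appears in the output in long matching runs, while a score near 0.0 would mean little or none of it does.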

Key quotes

Many unresolved legal questions over LLMs and copyright center on memorization: whether specific training data have been encoded in the model's weights during training, and whether those memorized data can be extracted in the model's outputs.
With different per-LLM experimental configurations, we were able to extract varying amounts of text.
In some cases, jailbroken Claude 3.7 Sonnet outputs entire books near-verbatim (e.g., nv-recall=95.8%).
GPT-4.1 requires significantly more BoN attempts (e.g., 20X), and eventually refuses to continue (e.g., nv-recall=4.0%).
Taken together, our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs.
