Over 340 local news outlets block Internet Archive's Wayback Machine over AI scraping concerns

By jaredwiener

10d ago · 13 min read · en · News

Summary

Major newspaper chains including McClatchy, Advance Local, and Tribune Publishing have joined The New York Times, The Guardian, and USA Today in blocking the Internet Archive's Wayback Machine bots. This move stems from publishers' concerns that AI companies might scrape the Internet Archive's repositories to train large language models. While no publisher has confirmed that an AI company has already scraped their content from the Wayback Machine, the trend of blocking has accelerated significantly since Nieman Lab first reported on it in January 2026, with over 340 local news outlets now restricting access.

Key quotes

No news publisher has confirmed to Nieman Lab that an AI company has already scraped their content from the Wayback Machine.
In January, Nieman Lab broke the story that major news publishers — including The New York Times, The Guardian, and USA Today Co. — had started blocking the Internet Archive due to concerns that AI companies might scrape the nonprofit's repositories for training data.
McClatchy, Advance Local, Tribune Publishing and other major newspaper chains are restricting the nonprofit's archiving bots.