23 major news sites block Internet Archive's Wayback Machine crawler, analysis finds

By Jay Peters

22d ago · 2 min read · en · News

Summary

Media organizations are increasingly blocking the Internet Archive's Wayback Machine crawler. An analysis by Originality AI, reported by Wired, found that 23 major news sites now block the crawler. A USA Today spokesperson said the move is aimed at scraping bots in general rather than at the Internet Archive specifically. Reddit has also limited what the Wayback Machine can archive, saying it had learned that AI companies were scraping data from the archive.

Source

23 major news sites block Internet Archive's Wayback Machine crawler, analysis finds (theverge.com, via Bluesky)

Key quotes

An analysis from Originality AI, an AI detection company, found that 23 big news sites block the Internet Archive's crawler, Wired reports.
A USA Today spokesperson told the publication that the move 'is not about specifically blocking the Internet Archive' but about attempting to block other scraping bots.
Reddit also limits what the Wayback Machine can archive, telling The Verge last year that it had learned that AI companies were scraping data from the Wayback Machine.