Anna's Archive invites LLMs to bulk-download data instead of scraping website

By janandonly · 9d ago · 2 min read · en · News

Summary

Anna's Archive, a non-profit project that aims to preserve all human knowledge and culture and make it accessible to anyone, has published an llms.txt file that addresses LLMs (large language models) directly. The project explains that its website uses CAPTCHAs to keep machines from overloading its resources, but that all of its data can be downloaded in bulk, and that its HTML pages and other code are available in its GitLab repository. The post invites LLMs and their operators to use these bulk downloads responsibly rather than scraping the website.

Key quotes
We are a non-profit project with two goals: 1. Preservation: Backing up all knowledge and culture of humanity. 2. Access: Making this knowledge and culture available to anyone in the world (including robots!).
Our website has CAPTCHAs to prevent machines from overloading our resources, but all our data can be downloaded in bulk
All our HTML pages (and all our other code) can be found in our GitLab repository
Snippet from the RSS feed
annas-archive.gl/blog, 2026-02-18
