First reported by Hacker News
SNEWPapers: AI-Powered Archive of 6 Million Historical American Newspaper Articles (1730s–1960s)
SNEWPapers: AI-Powered Archive of 6 Million+ Historical Newspaper Stories
By
Brett Shinnebarger
1mo ago · 1 min read · en · Product
38/100
Stale
Bagelometer↗
Lacks bite. And filling. And a copy-editor at the bakery.
Score: 38 · Type: press release · Sentiment: positive
Summary
A creator developed SNEWPapers, an AI-powered newspaper archive that has processed 250 years of data, extracting over 6 million stories. The system separates ads from content, enables semantic search, provides an AI research assistant, full text extraction, and collection-building features. The creator claims this unique dataset is not available on Google or in any LLM.
Key quotes · 3 pulled
I taught machines to read newspapers, gave them 250 years of data, extracted everything (6 million+ stories so far), separated the ads from the content, and categorized it all.
You can search semantically or with your own AI research assistant and get the actual articles with full text extraction, as well as build and share collections.
As far as I know, this has never been done before, the data isn't on Google or in any LLM, only on SNEWPAPERS
