Google and Schema.org release first public dataset on structured data adoption across millions of domains
By
Luis Rijo
Summary
Schema.org, in collaboration with Google, has published its first public dataset of aggregate usage statistics for structured data vocabulary. The dataset, released on June 4, 2026, draws from Google's web crawling infrastructure and covers how different structured data types and properties are deployed across millions of domains. Available in CSV and JSON formats on GitHub, the dataset is updated monthly and provides developers and publishers with long-awaited transparency into structured data adoption patterns on the web.
Key quotes
Schema.org this week published its first public dataset of aggregate usage statistics for its structured data vocabulary
The dataset, announced on June 4, 2026, on the Schema.org blog, represents a collaboration between Google and the Schema.org community
offering developers and publishers a view into how different types and properties are actually deployed across millions of domains