Experiment reveals LLMs fabricate schema markup data rather than genuinely parsing it
By
Mark Williams-Cook
Summary
This experiment tests whether large language models actually parse schema markup or simply fabricate responses. The author placed fake company address data in invalid JSON-LD schema markup (on a page about ducks, with no visible address text) and asked various LLMs where the company was based. The LLMs confidently returned the fake address, and several claimed to have consulted the structured data. The experiment was picked up by Search Engine Roundtable. The author criticizes the GEO (Generative Engine Optimization) industry for treating the result as a win, arguing that it instead reveals LLMs' tendency to hallucinate rather than genuinely parse structured data.
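The setup described in the summary can be sketched in Python, as an illustration only and not the author's actual test page or method. The page content, the company name, the address, and the names `PAGE`, `PageScanner`, and `inspect_page` are all hypothetical. The sketch shows one plausible reading of the result: a strict JSON parser rejects the malformed block, yet the address string is still present in the raw HTML for anything that reads the page as plain text.

```python
import json
from html.parser import HTMLParser

# Hypothetical test page: the company name and address are invented for
# illustration. The trailing comma makes the JSON-LD block invalid.
PAGE = """<html><head>
<title>All about ducks</title>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Duck Co",
  "address": "1 Fake Street, Nowhere",
}
</script>
</head><body><p>Ducks are waterfowl that dabble and dive.</p></body></html>"""


class PageScanner(HTMLParser):
    """Collects raw JSON-LD script contents and body text separately."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.in_body = False
        self.jsonld_blocks = []
        self.visible_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_jsonld = True
            self.jsonld_blocks.append("")

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        if self.in_jsonld:
            # Script content may arrive in several chunks, so accumulate it.
            self.jsonld_blocks[-1] += data
        elif self.in_body and data.strip():
            self.visible_text.append(data.strip())


def inspect_page(page):
    """Return ([(is_valid_json, raw_block), ...], visible_body_text)."""
    scanner = PageScanner()
    scanner.feed(page)
    scanner.close()
    results = []
    for raw in scanner.jsonld_blocks:
        try:
            json.loads(raw)
            results.append((True, raw))
        except json.JSONDecodeError:
            results.append((False, raw))
    return results, " ".join(scanner.visible_text)


results, visible = inspect_page(PAGE)
for is_valid, raw in results:
    print("strict JSON parse succeeded:", is_valid)
    print("address present in raw block:", "1 Fake Street" in raw)
print("address present in visible text:", "1 Fake Street" in visible)
```

Under these assumptions, a system that answers with the address cannot have obtained it from a successful structured-data parse, which is consistent with the author's point that a claim of having "consulted the structured data" is not evidence of parsing.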
Source
bsky
searchenginejournal.com
Key quotes
I put a fake company address (inside beautifully invalid JSON-LD, on a page about ducks) into the head of an HTML document, mentioned no address anywhere in the visible text, and then asked various LLMs where the company was based.
They happily told me, several of them citing the 'structured data' they had so studiously consulted.
That is not the win the GEO industry thinks it is.