GPT-5 fails basic reasoning test: Can't count the 'b's in "blueberry"
By minimaxir
Summary
The article discusses OpenAI's release of GPT-5 and the widespread disappointment, in the AI community and beyond, with its performance. It highlights a viral test in which GPT-5 fails a simple task: asked to count the 'b's in "blueberry", it answers three instead of two. The article explains why such seemingly trivial questions are adversarial for LLMs, which process text as tokens rather than as individual letters, and examines the broader implications for AI capabilities and hype versus reality.
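To illustrate the token-based explanation, here is a minimal Python sketch contrasting the character-level view with a token-level view. The split into "blue" and "berry" and the integer IDs are hypothetical, chosen only for illustration; they are not GPT-5's actual tokenization.

```python
# Character-level view: what a human (or an ordinary program) works with.
word = "blueberry"
char_count = word.count("b")

# Token-level view: a HYPOTHETICAL split, not GPT-5's real tokenizer output.
hypothetical_tokens = ["blue", "berry"]

# HYPOTHETICAL vocabulary IDs standing in for what a model receives as input.
hypothetical_vocab = {"blue": 1001, "berry": 1002}
token_ids = [hypothetical_vocab[t] for t in hypothetical_tokens]

print(char_count)  # 2
print(token_ids)   # [1001, 1002]
```

Under this assumption, the model's input is the list of IDs, which carries no explicit record of which letters each token contains. Counting letters is trivial on the string but has to be inferred indirectly from the IDs.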
Key quotes
Michael Paulauski asked GPT-5 through the ChatGPT app interface 'how many b's are there in blueberry?'
A simple question that a human child could answer correctly, but ChatGPT states that there are three b's in blueberry when there are clearly only two.
It's an adversarial question for LLMs, but it's not unfair.