
LiveBrowseComp reveals LLM search agents rely on memorized knowledge, not genuine web searching

[Submitted on 27 May 2026]

Summary

This paper introduces Intrinsic Knowledge Dependence (IKD): LLM-based search agents often rely on knowledge from pre-training rather than genuine web searching when answering questions on benchmarks like BrowseComp. Agents answer up to 44.5% of BrowseComp questions without tools and generate more than half of their search queries from internally produced hypotheses rather than retrieved leads. To address this, the authors present LiveBrowseComp, a benchmark of 335 human-authored questions based on facts published within 90 days, so that answers cannot be derived from model training data. On LiveBrowseComp, all evaluated agents score below 2% closed-book accuracy, and search-augmented scores drop by 25-40 points relative to BrowseComp, suggesting that static benchmarks conflate memory with genuine search capability.
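To make the reported quantities concrete, here is a minimal Python sketch of how closed-book accuracy, search-augmented accuracy, and the share of queries seeded by internal hypotheses could be computed from per-question logs. The `Episode` record, its field names, and the toy data are all hypothetical; the summary does not describe the paper's actual logging format or scoring code, so treat this as an illustration of the definitions only.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One benchmark question attempted by an agent (hypothetical record)."""
    closed_book_correct: bool     # correct answer with no tool access
    search_correct: bool          # correct answer with search tools enabled
    queries_from_hypothesis: int  # queries seeded by the model's own guess
    queries_from_leads: int       # queries seeded by retrieved evidence


def closed_book_accuracy(episodes):
    """Percentage of questions answered correctly without tools."""
    return 100.0 * sum(e.closed_book_correct for e in episodes) / len(episodes)


def search_accuracy(episodes):
    """Percentage of questions answered correctly with search tools."""
    return 100.0 * sum(e.search_correct for e in episodes) / len(episodes)


def hypothesis_query_share(episodes):
    """Percentage of all issued queries that came from internal hypotheses."""
    from_hypothesis = sum(e.queries_from_hypothesis for e in episodes)
    total = from_hypothesis + sum(e.queries_from_leads for e in episodes)
    return 100.0 * from_hypothesis / total if total else 0.0


# Toy data for illustration only; these are not results from the paper.
toy_episodes = [
    Episode(True, True, 3, 1),
    Episode(False, True, 1, 2),
    Episode(True, True, 2, 0),
    Episode(False, False, 0, 3),
]

if __name__ == "__main__":
    print("closed-book accuracy (%):", closed_book_accuracy(toy_episodes))
    print("search-augmented accuracy (%):", search_accuracy(toy_episodes))
    print("hypothesis-seeded queries (%):", hypothesis_query_share(toy_episodes))
```

Under these definitions, a benchmark where closed-book accuracy is high cannot distinguish an agent that found the answer from one that already knew it, which is the concern the summary raises about static benchmarks.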

Key quotes

Agents answer up to 44.5% of BrowseComp questions without tools, generate more than half of their search queries from internally produced hypotheses rather than retrieved leads
These results suggest that static search benchmarks can reward memory-backed verification rather than evidence-driven discovery, conflating what agents already know with what they can find
On LiveBrowseComp, all evaluated agents fall below 2% closed-book accuracy, search-augmented scores drop by 25-40 points relative to BrowseComp, and prior model rankings no longer reliably predict performance
Snippet from the RSS feed
Are LLM-based search agents genuinely searching, or using the web to verify what they already know? We study this question on BrowseComp with three diagnostics. Our analysis reveals Intrinsic Knowledge Dependence (IKD): even with tool access, agents often
