LiveBrowseComp reveals LLM search agents rely on memorized knowledge, not genuine web searching

[Submitted on 27 May 2026]

4d ago · 2 min read · en · Insight

Score: 75 · Type: analysis · Sentiment: neutral

Summary

This paper introduces the concept of Intrinsic Knowledge Dependence (IKD), showing that LLM-based search agents often rely on pre-trained knowledge rather than genuine web searching when answering questions on benchmarks like BrowseComp. Agents answer up to 44.5% of questions without tools and generate over half their search queries from internal hypotheses. To address this, the authors introduce LiveBrowseComp, a new benchmark with 335 human-authored questions based on facts published within 90 days, ensuring answers cannot be derived from model training data. On LiveBrowseComp, all agents score below 2% closed-book accuracy, and search-augmented scores drop 25-40 points compared to BrowseComp, revealing that static benchmarks conflate memory with genuine search capability.
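As a rough illustration of the two ideas in this summary (a freshness window on the underlying facts, and the comparison of closed-book accuracy against search-augmented accuracy), here is a minimal Python sketch. It is not the authors' evaluation code: the `Question` fields, the `is_fresh` and `accuracy` helpers, the exact-match scoring, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Question:
    text: str
    answer: str
    fact_published: date  # hypothetical field: when the underlying fact appeared


def is_fresh(q: Question, as_of: date, window_days: int = 90) -> bool:
    """True if the fact was published within `window_days` before `as_of`."""
    age = as_of - q.fact_published
    return timedelta(0) <= age <= timedelta(days=window_days)


def accuracy(predictions: list[str], questions: list[Question]) -> float:
    """Exact-match accuracy after trimming and lowercasing (an assumed metric)."""
    if not questions:
        return 0.0
    hits = sum(
        p.strip().lower() == q.answer.strip().lower()
        for p, q in zip(predictions, questions)
    )
    return hits / len(questions)


# Toy data, invented purely for this example.
as_of = date(2026, 5, 27)
pool = [
    Question("q1", "Alpha", date(2026, 4, 1)),
    Question("q2", "Beta", date(2025, 1, 1)),   # too old, filtered out
    Question("q3", "Gamma", date(2026, 5, 20)),
]
fresh = [q for q in pool if is_fresh(q, as_of)]

closed_book = ["alpha", "unknown"]   # answers given without tools
with_search = ["Alpha", "gamma "]    # answers given with search access

# Gap between what the agent can find and what it already knows.
gap = accuracy(with_search, fresh) - accuracy(closed_book, fresh)
```

A low closed-book score on the fresh subset is what suggests the search-augmented score reflects retrieval rather than recall; on a static benchmark the same gap can be much smaller because the closed-book baseline is already high.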

Key quotes

Agents answer up to 44.5% of BrowseComp questions without tools, generate more than half of their search queries from internally produced hypotheses rather than retrieved leads

These results suggest that static search benchmarks can reward memory-backed verification rather than evidence-driven discovery, conflating what agents already know with what they can find

On LiveBrowseComp, all evaluated agents fall below 2% closed-book accuracy, search-augmented scores drop by 25-40 points relative to BrowseComp, and prior model rankings no longer reliably predict performance

Snippet from the RSS feed

Are LLM-based search agents genuinely searching, or using the web to verify what they already know? We study this question on BrowseComp with three diagnostics. Our analysis reveals Intrinsic Knowledge Dependence (IKD): even with tool access, agents often
