Empirical Study Finds Grep Outperforms Vector Retrieval in LLM Agentic Search Systems

[Submitted on 14 May 2026]

Summary

This paper presents an empirical study comparing grep-based and vector retrieval in LLM agentic search systems. Using a 116-question sample from LongMemEval, the study tests both retrieval strategies across multiple agent harnesses (Chronos, Claude Code, Codex, Gemini CLI) and two tool-calling paradigms (inline vs. file-based results). Experiment 1 finds that grep generally yields higher accuracy than vector retrieval, though overall scores depend heavily on the harness and tool-calling style used. Experiment 2 examines how performance degrades as irrelevant conversation history is mixed in, comparing grep-only and vector-only retrieval under increasing distraction.
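To make the two retrieval styles concrete, here is a minimal sketch of what each might look like over a small conversation history. It is not the paper's implementation: the `SESSIONS` data and the function names are hypothetical, and the "vector" side uses bag-of-words cosine similarity as a stand-in for the learned embeddings a real vector store would use.

```python
import math
import re
from collections import Counter

# Hypothetical toy conversation history; not data from the study.
SESSIONS = [
    "User: I adopted a beagle named Biscuit last spring.",
    "User: My flight to Lisbon leaves on the 9th of June.",
    "User: I am thinking about getting a puppy or a small hound.",
]


def grep_search(pattern, docs):
    """Return indices of docs matching a case-insensitive regex (like grep -i)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [i for i, doc in enumerate(docs) if rx.search(doc)]


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(query, docs, k=1):
    """Return indices of the top-k docs by similarity, skipping zero scores.

    Assumption: term counts stand in for an embedding model here.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for score, i in scored[:k] if score > 0]


if __name__ == "__main__":
    print(grep_search(r"biscuit", SESSIONS))
    print(vector_search("when is my flight to Lisbon", SESSIONS))
```

In this sketch the grep path needs the agent to guess a literal pattern, while the vector path ranks every session against the whole query. Which one serves an agent better is exactly the kind of question the study measures empirically.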

Key quotes

grep generally yields higher accuracy than vector retrieval in our comparisons in experiment 1
overall scores still depend strongly on which harness and tool-calling style is used, even when the underlying conversation data are the same
existing literature lacks a systematic comparison of how retrieval strategy choice interacts with agent architecture and tool-calling paradigm
how tool outputs are presented to the model and how performance changes when searches must cope with more irrelevant surrounding text, remain under-explored in agent loops
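The distinction between inline and file-based tool results can be illustrated with a short sketch. This is one assumed interpretation, not the study's actual harness code: `present_inline`, `present_as_file`, the truncation limit, and the `results.txt` filename are all hypothetical.

```python
import os
import tempfile


def present_inline(hits, max_chars=200):
    """Inline style: return the (truncated) hits directly in the tool response."""
    return "\n".join(hit[:max_chars] for hit in hits)


def present_as_file(hits, directory):
    """File-based style: write hits to disk and return only a short pointer.

    The agent would then need a further tool call to read the file.
    """
    path = os.path.join(directory, "results.txt")  # hypothetical filename
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(hits))
    return f"{len(hits)} results written to {path}"


if __name__ == "__main__":
    hits = ["session 12: flight to Lisbon", "session 40: hotel in Lisbon"]
    print(present_inline(hits))
    with tempfile.TemporaryDirectory() as tmp:
        print(present_as_file(hits, tmp))
```

Under this reading, the inline style places retrieved text straight into the model's context, whereas the file-based style keeps the context short and adds an extra read step.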
Snippet from the RSS feed
Recent advances in Large Language Model (LLM) agents have enabled complex agentic workflows where models autonomously retrieve information, call tools, and reason over large corpora to complete tasks on behalf of users. Despite the growing adoption of ret
