Open-source AI search agent Harness-1 outperforms GPT-5.4 on information retrieval tasks
By
Carl Franzen
Summary
Researchers from UIUC, UC Berkeley, and Chroma developed Harness-1, a 20-billion-parameter open-source AI search agent built on OpenAI's gpt-oss-20B model. It scores 73% on average at correctly recalling relevant information from a curated dataset, outperforming GPT-5.4 (70.9%). The key insight is that improving the environment in which AI models work may be more effective than simply scaling up model size and training data.
Key quotes
Harness-1 achieves a massive leap in performance, scoring 73% average on its ability to recall relevant information correctly from a curated dataset, outperforming even GPT-5.4 (70.9%)
Harness-1 suggests that the future of agentic AI lies in building better environments for models to work within, rather than just training larger models on more data
