
Alibaba's Tongyi DeepResearch: Open-Source AI Research Agent Matches OpenAI Performance

By meander_water · 7mo ago · 1 min read

Summary

Alibaba's Tongyi DeepResearch is presented as the first fully open-source web agent to achieve performance comparable to OpenAI's DeepResearch across multiple benchmarks. The article reports state-of-the-art results on an academic reasoning task (32.9 on Humanity's Last Exam), on complex information-seeking tasks (43.4 on BrowseComp and 46.7 on BrowseComp-ZH), and on a user-centric benchmark (75 on xbench-DeepSearch). According to the article, these results systematically outperform existing proprietary and open-source deep research agents.

Key quotes

Tongyi DeepResearch, the first fully open‑source Web Agent to achieve performance on par with OpenAI's DeepResearch across a comprehensive suite of benchmarks
Tongyi DeepResearch demonstrates state‑of‑the‑art results, scoring 32.9 on the academic reasoning task Humanity's Last Exam (HLE)
Achieving a score of 75 on the user‑centric xbench‑DeepSearch benchmark, systematically outperforming all existing proprietary and open‑source Deep Research agents
Snippet from the RSS feed
From Chatbot to Autonomous Agent: We are proud to present Tongyi DeepResearch, the first fully open‑source Web Agent to achieve performance on par with OpenAI’s DeepResearch across a comprehensive suite of bench
