GLM 5.2 Open-Weight Model Beats Claude Code on IDOR Detection Benchmark

Austin Theriault

6d ago · 10 min read · en · Insight

technology, cybersecurity, programming, AI & machine learning

Summary

Zhipu AI's open-weight model GLM 5.2 outperformed Claude Code on an insecure direct object reference (IDOR) detection benchmark (39% vs. 32% F1), and did so at a lower cost per vulnerability found, roughly $0.17. It still trailed Semgrep's multimodal pipeline (53–61% F1), but its strong showing among prompt-only models challenges the assumption that open-weight models are inherently inferior to proprietary frontier models on cybersecurity tasks.
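For readers unfamiliar with the two metrics quoted here, the following is a minimal sketch of how an F1 score and a cost-per-vulnerability figure are typically computed. The function names and all counts and dollar amounts are hypothetical, chosen only to make the arithmetic easy to follow; they are not taken from the benchmark, and the benchmark's own scoring method may differ.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    if true_positives == 0:
        # With no correct findings, precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


def cost_per_finding(total_cost_usd: float, true_positives: int) -> float:
    """Return total spend divided by the number of real vulnerabilities found."""
    if true_positives == 0:
        raise ValueError("cost per finding is undefined with no true positives")
    return total_cost_usd / true_positives


# Hypothetical numbers for illustration only (NOT benchmark data):
# 30 real IDORs reported, 50 false alarms, 40 real IDORs missed, $6.00 total spend.
example_f1 = f1_score(true_positives=30, false_positives=50, false_negatives=40)
example_cost = cost_per_finding(total_cost_usd=6.00, true_positives=30)

print(f"F1: {example_f1:.0%}")
print(f"Cost per vulnerability found: ${example_cost:.2f}")
```

Because F1 penalizes both false alarms and missed vulnerabilities, a tool cannot score well simply by flagging everything or by reporting only its safest findings.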

Source

Hacker News · GLM 5.2 Open-Weight Model Beats Claude Code on IDOR Detection Benchmark · semgrep.dev

Key quotes


GLM 5.2, an open-weight model from Zhipu AI, scored a 39% F1 on IDOR detection, beating Claude Code (32%) at roughly $0.17 per vulnerability found.

Among models given nothing but a prompt, the best open-weight option was no longer the obvious underdog.

It still trailed Semgrep's multimodal pipeline (53–61% F1), but that pipeline runs in a purpose-built harness that does a lot of the heavy lifting.

Snippet from the RSS feed

Among models given nothing but a prompt, the best open-weight option beat Claude Opus 4.8.

