GLM 5.2 Open-Weight Model Beats Claude Code on IDOR Detection Benchmark
By Austin Theriault
Summary
Zhipu AI's open-weight model GLM 5.2 outperformed Claude Code on an insecure direct object reference (IDOR) detection benchmark, scoring 39% F1 against 32%, and did so at a lower cost per vulnerability found (roughly $0.17). It still trailed Semgrep's multimodal pipeline (53–61% F1), but its strong showing among prompt-only models challenges the assumption that open-weight models are inherently inferior to proprietary frontier models on cybersecurity tasks.
Key quotes
GLM 5.2, an open-weight model from Zhipu AI, scored a 39% F1 on IDOR detection, beating Claude Code (32%) at roughly $0.17 per vulnerability found.
Among models given nothing but a prompt, the best open-weight option was no longer the obvious underdog.
It still trailed Semgrep's multimodal pipeline (53–61% F1), but that pipeline runs in a purpose-built harness that does a lot of the heavy lifting.
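For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how an F1 score and a cost-per-finding figure are typically computed. The function names and the input counts are hypothetical and chosen only for illustration; they are not taken from the benchmark, and the benchmark's exact cost accounting is an assumption here.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when nothing was found."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


def cost_per_finding(total_cost_usd: float, true_positives: int) -> float:
    """Total spend divided by confirmed findings (assumed definition)."""
    if true_positives == 0:
        return float("inf")
    return total_cost_usd / true_positives


# Hypothetical counts for illustration only; not benchmark data.
example_f1 = f1_score(true_positives=10, false_positives=20, false_negatives=10)
example_cost = cost_per_finding(total_cost_usd=2.0, true_positives=10)
print(f"F1 = {example_f1:.2f}, cost per finding = ${example_cost:.2f}")
```

Because F1 penalizes both missed vulnerabilities and false alarms, a tool that flags everything or flags almost nothing scores poorly, which is why it is a common headline number for detection benchmarks.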