Benchmark Comparison: Qwen3.6-35B-A3B Outperforms Claude Opus 4.7 in Pelican Image Generation Test
By simonw
Summary
The article compares two AI language models released the same morning: Qwen3.6-35B-A3B from Alibaba and Claude Opus 4.7 from Anthropic. The author uses his "pelican riding a bicycle" benchmark to evaluate how well each model generates an image. Qwen3.6 ran locally on a MacBook Pro M5 via LM Studio as a 20.9GB quantized model and produced the better pelican. The author awards the round to Qwen 3.6 and says Opus "managed to mess up."
Key quotes
For anyone who has been taking my pelican riding a bicycle benchmark seriously as a robust way to test models, here are pelicans from this morning's two big model releases: Qwen3.6-35B-A3B from Alibaba and Claude Opus 4.7 from Anthropic.
Here's the Qwen 3.6 pelican, generated using this 20.9GB Qwen3.6-35B-A3B-UD-Q4_K_S.gguf quantized model by Unsloth, running on my MacBook Pro M5 via LM Studio (and the llm-lmstudio plugin)
And here's one I got from Anthropic's brand new Claude Opus 4.7
I'm giving this one to Qwen 3.6. Opus managed to mess up