
Distilling DeepSeek V4 Pro’s Thinking Style Into Qwen3.6-35B-A3B

12d ago · 1 min read · News

Source: Twitter / X (modelscope.cn)
Snippet from the RSS feed
Insights from Zhihu contributor and individual developer Lynn.

TL;DR: After reading ReAct, Lynn distilled the thinking style of DeepSeek V4 Pro into Qwen3.6-35B-A3B with LoRA. The result: GPQA-Diamond up 7.6 percentage points, unclosed empty answers down from 12 to 1, and average agent orchestration time down from 60s to 26s. The goal was not "more CoT," but faster task decomposition, delegation, and verification.

The starting point was simple: after reading Yao Shunyu's ReAct work, the author wanted a local model that could act as an agent orchestrator. ReAct's core idea is clear: let the model alternate between Thought and Action, instead of going to either extreme
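The Thought/Action alternation described above can be illustrated with a minimal sketch. This is not Lynn's orchestrator or the distilled model: `react_loop`, `scripted_policy`, and `add_tool` are hypothetical names, and the policy is a hard-coded stand-in for a model so the loop can run on its own.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical step format: (thought, action name, action input).
Step = Tuple[str, str, str]


def react_loop(
    policy: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Alternate Thought and Action until the policy emits 'finish'."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = policy(transcript)
        transcript.append(f"Thought: {thought}")
        if action == "finish":
            transcript.append(f"Answer: {arg}")
            return arg, transcript
        transcript.append(f"Action: {action}[{arg}]")
        tool = tools.get(action)
        # Feed the tool result (or an error note) back as an observation.
        observation = tool(arg) if tool else f"unknown tool '{action}'"
        transcript.append(f"Observation: {observation}")
    return None, transcript  # step budget exhausted without an answer


def scripted_policy(transcript: List[str]) -> Step:
    """Stand-in for a model: call the tool once, then finish (assumption)."""
    observations = [t for t in transcript if t.startswith("Observation: ")]
    if not observations:
        return ("I should compute the sum with a tool.", "add", "2 3")
    result = observations[-1].split(": ", 1)[1]
    return ("The tool returned the result.", "finish", result)


def add_tool(arg: str) -> str:
    """Toy tool: add whitespace-separated integers."""
    return str(sum(int(x) for x in arg.split()))


answer, transcript = react_loop(scripted_policy, {"add": add_tool}, "What is 2 + 3?")
for line in transcript:
    print(line)
```

In a real setup the policy would be a model call that parses the next Thought and Action from generated text; the step budget (`max_steps`) is one simple guard against runs that never produce a final answer.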
