FeedBagel

Distilling DeepSeek V4 Pro’s Thinking Style Into Qwen3.6-35B-A3B

12d ago · 1 min read · News

Source

Twitter / X · modelscope.cn

Snippet from the RSS feed

Insights from Zhihu contributor and individual developer Lynn.

TL;DR: After reading ReAct, Lynn distilled the thinking style of DeepSeek V4 Pro into Qwen3.6-35B-A3B with LoRA. The result: GPQA-Diamond up 7.6 percentage points, unclosed empty answers down from 12 to 1, and average agent orchestration time down from 60s to 26s. The goal was not “more CoT,” but faster task decomposition, delegation, and verification.

The starting point was simple: after reading Yao Shunyu’s ReAct work, the author wanted a local model that could act as an agent orchestrator. ReAct’s core idea is clear: let the model alternate between Thought and Action, instead of going to either extreme
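To make the Thought/Action alternation concrete, here is a minimal sketch of a ReAct-style loop. It is not Lynn's orchestrator: the `scripted_model` function is a hypothetical stand-in for an LLM call, and the `TOOLS` registry and the `add` tool are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; names and behaviour are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}

Step = Tuple[str, str, str]  # (thought, action, argument)


def scripted_model(history: List[str]) -> Step:
    """Stand-in for an LLM call (assumption): decides the next step from the trace."""
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        return ("I need the sum before answering.", "add", "2+3")
    result = observations[-1].split(": ", 1)[1]
    return ("I have the result, so I can finish.", "finish", result)


def react_loop(
    question: str,
    model: Callable[[List[str]], Step],
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate Thought and Action until the model emits a 'finish' action."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = model(history)
        history.append(f"Thought: {thought}")
        history.append(f"Action: {action}[{arg}]")
        if action == "finish":
            return arg, history
        tool = TOOLS.get(action)
        observation = tool(arg) if tool else f"unknown tool {action}"
        # The tool result is fed back so the next Thought can use it.
        history.append(f"Observation: {observation}")
    return "", history  # step budget exhausted without a final answer


if __name__ == "__main__":
    answer, trace = react_loop("What is 2+3?", scripted_model)
    print("\n".join(trace))
    print("Answer:", answer)
```

The step budget (`max_steps`) is one simple way to bound orchestration time; a real orchestrator would replace `scripted_model` with calls to the local model.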
