DeepSeek releases V4 preview models: largest open weights AI with 1.6T parameters
By
Simon Willison
Summary
Chinese AI lab DeepSeek has released the first preview models of its V4 series: DeepSeek-V4-Pro and DeepSeek-V4-Flash. Both are Mixture of Experts models with a 1 million token context window. The Pro version has 1.6 trillion total parameters (49B active), making it the largest open weights model to date, ahead of competitors such as Kimi K2.6 and GLM-5.1. The Flash version has 284B total parameters (13B active). Both models are released under the standard MIT license and are presented as offering frontier-competitive performance at a fraction of the cost.
Key quotes
I think this makes DeepSeek-V4-Pro the new largest open weights model.
It's larger than Kimi K2.6 (1.1T) and GLM-5.1 (754B) and more than twice the size of DeepSeek V3.2 (685B).
Both models are 1 million token context Mixture of Experts.
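The size comparisons in the quotes above are easy to check by hand. The following is a minimal sketch that uses only the parameter counts quoted in this summary, expressed in billions. The dictionary and function names are illustrative assumptions, not part of any DeepSeek tooling.

```python
# Total parameter counts in billions, as quoted in the summary and key quotes.
total_params_b = {
    "DeepSeek-V4-Pro": 1600,
    "Kimi K2.6": 1100,
    "GLM-5.1": 754,
    "DeepSeek V3.2": 685,
    "DeepSeek-V4-Flash": 284,
}

# Active parameters per token (billions) for the two Mixture of Experts releases.
active_params_b = {
    "DeepSeek-V4-Pro": 49,
    "DeepSeek-V4-Flash": 13,
}


def size_ratio(model: str, baseline: str) -> float:
    """How many times larger `model` is than `baseline`, by total parameters."""
    return total_params_b[model] / total_params_b[baseline]


def active_fraction(model: str) -> float:
    """Share of a model's total parameters that are active for each token."""
    return active_params_b[model] / total_params_b[model]


largest = max(total_params_b, key=total_params_b.get)
print(largest)                                                   # DeepSeek-V4-Pro
print(f"{size_ratio('DeepSeek-V4-Pro', 'DeepSeek V3.2'):.2f}")   # 2.34
print(f"{active_fraction('DeepSeek-V4-Pro'):.1%}")               # 3.1%
print(f"{active_fraction('DeepSeek-V4-Flash'):.1%}")             # 4.6%
```

On these figures, V4-Pro is about 2.3 times the size of V3.2, which matches the "more than twice" claim, and only about 3% of its parameters are active for any given token.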