DeepSeek releases V4 preview models: largest open weights AI with 1.6T parameters

Simon Willison

1mo ago · 4 min read · en · News

Summary

Chinese AI lab DeepSeek has released the first preview models of its V4 series: DeepSeek-V4-Pro and DeepSeek-V4-Flash. Both are Mixture of Experts models with a 1 million token context window. The Pro version has 1.6 trillion total parameters (49B active), which the author believes makes it the largest open weights model to date, ahead of Kimi K2.6 (1.1T) and GLM-5.1 (754B). The Flash version has 284B total parameters (13B active). Both models are released under the standard MIT license and are positioned as offering frontier-competitive performance at a fraction of the cost.

Key quotes

I think this makes DeepSeek-V4-Pro the new largest open weights model.

It's larger than Kimi K2.6 (1.1T) and GLM-5.1 (754B) and more than twice the size of DeepSeek V3.2 (685B).

Both models are 1 million token context Mixture of Experts.
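
The size comparisons quoted above are simple arithmetic, and a short sketch makes them easy to check. The parameter counts below are the rounded figures given in the text (in billions); the dictionary layout and the helper names `active_fraction` and `size_ratio` are illustrative assumptions, not anything published by DeepSeek.

```python
# Parameter counts in billions, taken from the rounded figures quoted above.
# The data layout and helper names are hypothetical, for illustration only.
V4_MODELS = {
    "DeepSeek-V4-Pro": {"total": 1600, "active": 49},
    "DeepSeek-V4-Flash": {"total": 284, "active": 13},
}
OTHER_MODELS = {
    "Kimi K2.6": 1100,
    "GLM-5.1": 754,
    "DeepSeek V3.2": 685,
}


def active_fraction(total: float, active: float) -> float:
    """Share of a Mixture of Experts model's parameters that are active."""
    return active / total


def size_ratio(larger: float, smaller: float) -> float:
    """How many times larger one total parameter count is than another."""
    return larger / smaller


if __name__ == "__main__":
    for name, params in V4_MODELS.items():
        share = active_fraction(params["total"], params["active"])
        print(f"{name}: {share:.1%} of parameters active")

    pro_total = V4_MODELS["DeepSeek-V4-Pro"]["total"]
    for name, total in OTHER_MODELS.items():
        print(f"DeepSeek-V4-Pro is {size_ratio(pro_total, total):.2f}x the size of {name}")
```

Because the inputs are rounded (1.6T rather than an exact count), the ratios are approximate, but they are consistent with the "more than twice the size of DeepSeek V3.2" claim.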

Snippet from the RSS feed

Chinese AI lab DeepSeek’s last model release was V3.2 (and V3.2 Speciale) last December. They just dropped the first of their hotly anticipated V4 series in the shape of two …
