New Variant of DeepSeek AI Model Developed by German Lab TNG Technology Consulting GmbH

saubeidl

11mo ago · 8 min read · en · News

Summary

German lab TNG Technology Consulting GmbH has developed a new variant of DeepSeek's R1-0528 model that is reported to be 200% faster than R1-0528 itself. The speed gain is attributed to TNG's Assembly-of-Experts (AoE) method, a technique for building LLMs by selectively merging weight tensors.

Key quotes

Like its predecessor, DeepSeek-R1 — which rocked the AI and global business communities with how cheaply it was trained and how well it performed on reasoning tasks, all available to developers and enterprises for free — R1-0528 is already being adapted and remixed by other A

This gain is made possible by TNG’s Assembly-of-Experts (AoE) method — a technique for building LLMs by selectively merging the weight tensors
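
To make "selectively merging the weight tensors" concrete, here is a minimal sketch of the general idea. It is not TNG's AoE implementation: the function `merge_selected`, the tensor names, the plain linear interpolation, and the toy values are all assumptions chosen for illustration, and a real merge would operate on full model checkpoints rather than short Python lists.

```python
from typing import Callable, Dict, List

# Toy stand-ins: a "tensor" is a flat list of floats and a "state dict" maps
# tensor names to tensors. These are illustrative, not a real checkpoint format.
Tensor = List[float]
StateDict = Dict[str, Tensor]


def merge_selected(
    base: StateDict,
    donor: StateDict,
    select: Callable[[str], bool],
    weight: float = 0.5,
) -> StateDict:
    """Interpolate only the selected tensors; copy everything else from base.

    Hypothetical helper: `weight` is the share taken from the donor model
    (0.0 keeps the base tensor unchanged, 1.0 takes the donor tensor).
    """
    merged: StateDict = {}
    for name, base_tensor in base.items():
        if select(name) and name in donor:
            donor_tensor = donor[name]
            if len(donor_tensor) != len(base_tensor):
                raise ValueError(f"shape mismatch for tensor {name!r}")
            merged[name] = [
                (1.0 - weight) * b + weight * d
                for b, d in zip(base_tensor, donor_tensor)
            ]
        else:
            # Tensor not selected for merging: keep the base model's weights.
            merged[name] = list(base_tensor)
    return merged


# Two toy "parent models" with made-up tensor names and values.
base = {"attn.weight": [1.0, 2.0], "experts.0.weight": [0.0, 4.0]}
donor = {"attn.weight": [9.0, 9.0], "experts.0.weight": [2.0, 0.0]}

# Merge only tensors whose (assumed) name marks them as expert weights.
merged = merge_selected(
    base, donor, select=lambda name: name.startswith("experts."), weight=0.5
)
```

In this sketch the attention tensor is carried over from the base model untouched, while the expert tensor becomes an even blend of both parents. Which tensors a real method selects, and how it weights them, is exactly the part the quote leaves unspecified.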
