RNG: Amazon deploys flat datacenter networks using quasi-random graphs for up to 45% cost savings
[Submitted on 16 Apr 2026 (v1), last revised 21 May 2026 (this version, v3)]
Summary
This paper presents RNG, a flat datacenter network design that Amazon has deployed in production and that the authors describe as the first of its kind. RNG uses quasi-random graph topologies to obtain cost and fault-tolerance benefits that were known in theory but hard to realize in practice because of routing and cabling difficulties. The design introduces a new distributed routing protocol that exploits random graph properties to find many edge-disjoint paths between pairs of endpoints, and a novel passive optical device that shuffles cables internally, bringing cabling complexity close to that of fat trees. RNG matches or exceeds fat tree performance across a range of traffic patterns while being up to 45% cheaper, and is now the default datacenter network for most Amazon workloads.
Key quotes
We design and deploy in production the first flat datacenter networks.
RNG has a new distributed routing protocol that exploits the properties of random graphs to find a large number of edge disjoint paths between pairs of endpoints.
It uses a novel passive optical device that internally shuffles cables, which makes its cabling complexity similar to that of fat trees.
We show that RNG matches or exceeds the performance of fat trees for a range of traffic patterns, despite being up to 45% cheaper.
RNG is now the default datacenter network for most workloads at Amazon.
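To make the "edge disjoint paths" idea in the quotes above concrete, here is a minimal Python sketch. It is not the paper's distributed routing protocol, which this summary does not describe in detail. It assumes a plain random regular graph as a stand-in for RNG's quasi-random topology and counts edge-disjoint paths centrally with a unit-capacity max-flow computation. The function names `random_regular_graph` and `edge_disjoint_paths`, and the sizes used, are hypothetical and chosen only for illustration.

```python
import random
from collections import deque


def random_regular_graph(n, d, seed=0):
    """Sample a simple d-regular graph on n nodes (illustrative stand-in topology).

    Pairs up "stubs" at random and retries whenever a self-loop or a
    duplicate edge appears. Returns a set of (u, v) tuples with u < v.
    """
    if (n * d) % 2 or d >= n:
        raise ValueError("need n * d even and d < n")
    rng = random.Random(seed)
    while True:
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        edges = set()
        simple = True
        for a, b in zip(stubs[::2], stubs[1::2]):
            edge = (min(a, b), max(a, b))
            if a == b or edge in edges:
                simple = False
                break
            edges.add(edge)
        if simple:
            return edges


def edge_disjoint_paths(edges, src, dst):
    """Count edge-disjoint paths between src and dst in an undirected graph.

    Uses BFS augmenting paths (Edmonds-Karp) with capacity 1 per link,
    so the max-flow value equals the number of edge-disjoint paths.
    """
    if src == dst:
        raise ValueError("src and dst must differ")
    cap, adj = {}, {}
    for u, v in edges:
        cap[(u, v)] = 1
        cap[(v, u)] = 1
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    count = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if dst not in parent:
            return count
        # Push one unit of flow back along the path that was found.
        v = dst
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        count += 1


if __name__ == "__main__":
    links = random_regular_graph(n=16, d=4, seed=1)
    print(edge_disjoint_paths(links, 0, 15))
```

The count between two switches can never exceed their degree `d`, since each path has to leave the source on a different link. How close a real topology gets to that bound, and how a distributed protocol finds those paths without a global view, is what the paper itself addresses.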