TCP_NODELAY: Why Modern Distributed Systems Should Disable Nagle's Algorithm by Default
By eieio
Summary
The article discusses persistent latency problems in modern distributed systems caused by Nagle's algorithm, which dates from 1984 and stays active on TCP sockets unless the TCP_NODELAY option is set. The author argues that the algorithm, designed for 1980s network conditions with limited bandwidth and high packet loss, is now counterproductive in today's high-speed, low-latency environments. The article explains how Nagle's algorithm batches small writes into fewer packets to reduce header overhead, and how that batching adds significant latency, especially in interactive applications. The author advocates enabling TCP_NODELAY by default in modern systems and suggests that the problem the algorithm originally solved (small-packet overhead) is no longer relevant on today's network infrastructure.
Key quotes
The first thing I check when debugging latency issues in distributed systems is whether TCP_NODELAY is enabled. And it's not just me. Every distributed system builder I know has lost hours to latency issues quickly fixed by enabling this simple socket option, suggesting that the default behavior is wrong, and perhaps that the whole concept is outmoded.
Nagle's algorithm is a simple, elegant solution to a problem that no longer exists. It was designed for a world of 300 baud modems and 10 Mbps Ethernet, where packet overhead was a real concern. Today, we have gigabit Ethernet, 10 gigabit Ethernet, and even 100 gigabit Ethernet. The overhead of a few extra bytes is negligible.
The problem is that Nagle's algorithm introduces latency. It waits for either an ACK from the previous packet or for enough data to fill a full-sized packet before sending. This can add tens or even hundreds of milliseconds to the round-trip time, which is unacceptable in modern distributed systems.
We should be enabling TCP_NODELAY by default. The original problem that Nagle's algorithm was designed to solve—small packet overhead—is no longer a problem. The overhead of a few extra bytes is negligible on modern networks.
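As a rough illustration of the quotes above, the sketch below does two things in Python. `nagle_would_delay` is a deliberately simplified model of the send rule described in the quotes (hold a small segment while earlier data is still unacknowledged); it is not the kernel's actual implementation. `new_nodelay_socket` shows how the option can be set through the standard `socket` module. Both function names and the 1460-byte segment size are assumptions made for this example and do not come from the article.

```python
import socket


def nagle_would_delay(pending_bytes: int, mss: int, unacked_in_flight: bool) -> bool:
    """Simplified model of Nagle's rule (illustrative, not kernel code).

    A full-sized segment is sent immediately. A smaller one is held back
    only while previously sent data has not yet been acknowledged.
    """
    if pending_bytes >= mss:
        return False
    return unacked_in_flight


def new_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value for TCP_NODELAY turns Nagle's algorithm off for this socket.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    # Assumed segment size of 1460 bytes, typical for Ethernet without TCP options.
    print("small write, ACK outstanding, delayed:",
          nagle_would_delay(10, 1460, unacked_in_flight=True))

    with new_nodelay_socket() as sock:
        # getsockopt returns a non-zero integer when the option is set.
        enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
        print("TCP_NODELAY enabled:", enabled)
```

The option has to be set on each socket individually, so in practice it is usually applied right after a connection is created or accepted. Many client libraries and frameworks expose their own switch for it, which is worth checking before setting it by hand.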
