Why Traditional Latency Measurement Tools Provide Misleading Results

dempedempe

6mo ago · 14 min read · en · Insight

Summary

The article critiques traditional latency measurement tools and methodologies, arguing they provide misleading results. Based on a workshop by Gil Tene (CTO of Azul Systems), it explains how common approaches like averages, percentiles, and histograms fail to capture the true nature of latency distributions, especially tail latency. The article advocates for better visualization tools like HDR histograms and coordinated omission correction to understand latency behavior accurately, particularly for high-performance systems where tail latency matters most.
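To make the point about averages and percentiles concrete, here is a minimal sketch on made-up data (the sample values and the `percentile` helper are assumptions for illustration, not taken from the article). It shows a mean that describes no real request and a 99th percentile that looks healthy while a small fraction of requests stall for a full second.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[min(rank, len(ordered)) - 1]


# Hypothetical workload: 995 requests take 1 ms, 5 requests stall for 1000 ms.
latencies_ms = [1.0] * 995 + [1000.0] * 5

mean_ms = statistics.mean(latencies_ms)   # matches neither the fast path nor the stall
p50_ms = percentile(latencies_ms, 50)     # typical request
p99_ms = percentile(latencies_ms, 99)     # still on the fast path, so the stall is invisible
p999_ms = percentile(latencies_ms, 99.9)  # the tail finally shows up
max_ms = max(latencies_ms)

print(f"mean={mean_ms:.3f} p50={p50_ms} p99={p99_ms} p99.9={p999_ms} max={max_ms}")
```

Reporting the maximum and several high percentiles side by side, rather than a single number, is one simple way to avoid being misled by either summary on its own.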

Key quotes

Okay, maybe not everything you know about latency is wrong. But now that I have your attention, we can talk about why the tools and methodologies you use to measure and reason about latency are likely horribly flawed.

In fact, they're not just flawed, they're probably lying to your face.

The problem with averages is that they hide the outliers, and with latency, the outliers are often what matter most.

Percentiles are better than averages, but they still don't tell the whole story about latency distributions.

Coordinated omission is the practice of measuring latency only when the system is ready to respond, which completely misses the worst-case scenarios.
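The last quote can be illustrated with a small sketch of one common correction strategy: back-filling the samples a load generator would have recorded had it kept sending requests on schedule during a stall. This assumes a fixed intended send interval. The function name and the data are hypothetical, and the approach is similar in spirit to the expected-interval recording offered by HDR histogram libraries rather than a copy of any particular API.

```python
def correct_coordinated_omission(samples_ms, expected_interval_ms):
    """Back-fill samples that were never taken because the tester waited on a slow response.

    For each observed latency longer than the intended send interval, add the
    latencies that requests scheduled during the stall would have experienced:
    value - interval, value - 2 * interval, ... down to one interval.
    """
    if expected_interval_ms <= 0:
        raise ValueError("expected_interval_ms must be positive")
    corrected = []
    for value in samples_ms:
        corrected.append(value)
        missing = value - expected_interval_ms
        while missing >= expected_interval_ms:
            corrected.append(missing)
            missing -= expected_interval_ms
    return corrected


# Hypothetical run: one request intended every 10 ms, with a single 100 ms stall.
observed_ms = [1, 1, 1, 100, 1]
corrected_ms = correct_coordinated_omission(observed_ms, expected_interval_ms=10)

print(len(observed_ms), len(corrected_ms))
print(sorted(corrected_ms))
```

In the uncorrected data the stall is one sample in five. After back-filling, the requests that should have been issued during those 100 ms are counted as well, so the slow period carries weight in proportion to how long it lasted.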
