Health Checking in Load Balancing: Client-Side vs Server-Side Approaches and Failure Detection
By singhsanjay12
Summary
The article examines health checking mechanisms in load balancing systems, comparing client-side versus server-side approaches. It explains how zombie instances can survive health checks despite being unable to process requests, leading to user-facing failures. The analysis covers how different health check implementations affect failure detection speed, accuracy, and application complexity, with practical implications for system reliability and user experience.
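To make the zombie case concrete, here is a minimal sketch and not the article's implementation. It assumes a zombie is an instance whose process still answers a shallow probe while a dependency it needs for real requests is down. The `Instance` class, its fields, and the `routable` helper are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical backend: the process may be up while a dependency is down."""
    name: str
    process_up: bool = True
    dependency_ok: bool = True  # assumption: real requests need this dependency

    def shallow_check(self) -> bool:
        # Only confirms the process answers; says nothing about real work.
        return self.process_up

    def deep_check(self) -> bool:
        # Also exercises the dependency that real requests rely on.
        return self.process_up and self.dependency_ok

    def handle_request(self) -> bool:
        # A real request succeeds only if the dependency is reachable.
        return self.process_up and self.dependency_ok


def routable(instances, check):
    """Return the names of instances the given check would keep in rotation."""
    return [i.name for i in instances if check(i)]


# "b" is the zombie: alive as a process, unable to serve requests.
pool = [Instance("a"), Instance("b", dependency_ok=False)]

print(routable(pool, Instance.shallow_check))  # ['a', 'b']
print(routable(pool, Instance.deep_check))     # ['a']
```

With the shallow check, the zombie stays in rotation and every request routed to it fails. The deeper check removes it, at the cost of a more expensive probe that itself depends on the downstream service.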
Key quotes
A service reports healthy. The load balancer believes it. A request lands on it and times out. Another follows. Then ten more. By the time the system reacts, hundreds of requests have drained into a broken instance while users stared at a spinner.
Health checking sounds simple: ask if something is alive, stop sending traffic if it isn't. In practice, the mechanism behind that check, and who performs it, determines how fast your system detects failure, how accurately it responds, and how much of that complexity leaks into your application code.
Why zombie instances survive health checks, and what the choice between server-side and client-side load balancing means for how fast your system detects and reacts to failure.
