EPaxos*: A Simplified and Corrected Variant of Egalitarian Paxos for Distributed Consensus

By

otrack

6mo ago · 2 min read

Summary

This academic paper presents EPaxos*, a simplified and corrected variant of Egalitarian Paxos, a leaderless distributed consensus protocol. The original Egalitarian Paxos is complex, ambiguously specified, and contains nontrivial bugs; the authors address this with a simpler failure-recovery algorithm that they have rigorously proved correct. EPaxos* also generalizes Egalitarian Paxos to cover optimal failure thresholds while keeping the key benefits of leaderless operation: non-zero throughput with up to f crashes out of n = 2f+1 processes, and fast command execution in two message delays under certain conditions.
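The failure threshold quoted from the paper (n = 2f+1 processes, up to f crashes) is simple arithmetic, and a small sketch may make it concrete. The function names below are hypothetical and chosen for illustration only. The majority check is the standard counting argument for crash-tolerant consensus, not the EPaxos* recovery algorithm, and it says nothing about the conditions under which the two-message-delay fast path applies.

```python
def max_crash_faults(n: int) -> int:
    """Largest f such that n >= 2f + 1 (hypothetical helper)."""
    if n < 1:
        raise ValueError("need at least one replica")
    return (n - 1) // 2


def majority_quorum(n: int) -> int:
    """Size of a simple majority of n replicas."""
    return n // 2 + 1


def majority_alive(n: int, crashed: int) -> bool:
    """True if the replicas still running form a majority.

    Assumption: a live majority is treated here as the condition
    for continued progress; the paper's exact liveness conditions
    are not reproduced.
    """
    if not 0 <= crashed <= n:
        raise ValueError("crashed must be between 0 and n")
    return n - crashed >= majority_quorum(n)


if __name__ == "__main__":
    for n in (3, 5, 7):
        f = max_crash_faults(n)
        print(
            f"n={n}: tolerates f={f} crashes, "
            f"majority={majority_quorum(n)}, "
            f"alive with f crashes={majority_alive(n, f)}, "
            f"alive with f+1 crashes={majority_alive(n, f + 1)}"
        )
```

With n = 2f+1, exactly f+1 replicas remain after f crashes, which is still a majority; one more crash removes it.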

Key quotes

Egalitarian Paxos introduced an alternative, leaderless approach, that allows replicas to order commands collaboratively.
Not relying on a single leader allows the protocol to maintain non-zero throughput with up to f crashes of any processes out of a total of n = 2f+1.
Egalitarian Paxos has served as a foundation for many other replication protocols. But unfortunately, the protocol is very complex, ambiguously specified and suffers from nontrivial bugs.
In this paper, we present EPaxos* -- a simpler and correct variant of Egalitarian Paxos.
Our key technical contribution is a simpler failure-recovery algorithm, which we have rigorously proved correct.

Snippet from the RSS feed

Classical state-machine replication protocols, such as Paxos, rely on a distinguished leader process to order commands. Unfortunately, this approach makes the leader a single point of failure and increases the latency for clients that are not co-located w
