Enabling XDP for Egress Traffic: A Performance Breakthrough for Linux Packet Processing
By loopholelabs
Summary
This article details a breakthrough technique that enables XDP (eXpress Data Path), previously limited to ingress traffic processing, to work for egress traffic as well. The authors discovered a loophole in how the Linux kernel determines packet direction, allowing them to apply XDP's high-performance packet processing to outgoing traffic. The solution delivers 10x better performance than current alternatives, works seamlessly with existing Docker/Kubernetes containers, and requires zero kernel modifications. The post provides implementation details and outlines how container and VM workloads can immediately benefit from this advancement with minimal effort.
Key quotes
XDP (eXpress Data Path) is the fastest packet processing framework in linux - but it only works for incoming (ingress) traffic.
We discovered how to use it for outgoing (egress) traffic by exploiting a loophole in how the linux kernel determines packet direction.
Our technique delivers 10x better performance than current solutions, works with existing Docker/Kubernetes containers, and requires zero kernel modifications.
This post not only expands on the overall implementation but also outlines how existing container and VM workloads can immediately take advantage with minimal effort.