AVX-512 Optimization Boosts Linux RAID Performance by Up to 41% on AMD Ryzen 9 9950X
Written by Michael Larabel in Linux Storage on 12 June 2026 at 06:42 AM EDT.
Summary
Eric Biggers, a Linux cryptography expert at Google, has written an AVX-512 optimized xor_gen() function for the Linux kernel's software RAID code. The function generates and validates parity blocks for RAID5/RAID6, and the optimized version is reported to be up to 41% faster on AMD's Ryzen 9 9950X processor. The work continues Biggers' history of x86_64 and AVX-512 optimizations in the kernel's crypto and RAID subsystems.
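For readers unfamiliar with XOR parity, the idea behind generating and validating a parity block can be sketched in plain C. This is a scalar, byte-at-a-time illustration only: the function names xor_blocks_sketch() and parity_ok_sketch() are hypothetical, and the code does not reproduce the kernel's xor_gen() signature or Biggers' AVX-512 implementation. An AVX-512 version would presumably apply the same XOR to 64 bytes per vector register instead of one byte per step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not the kernel's xor_gen()): XOR `nsrc` source
 * blocks of `len` bytes each into `dest`. With the data blocks as
 * sources, `dest` becomes the parity block. With the parity block plus
 * all surviving data blocks as sources, `dest` becomes the missing block.
 */
void xor_blocks_sketch(uint8_t *dest, const uint8_t *const *srcs,
                       size_t nsrc, size_t len)
{
    assert(nsrc > 0);
    memcpy(dest, srcs[0], len);
    for (size_t s = 1; s < nsrc; s++)
        for (size_t i = 0; i < len; i++)
            dest[i] ^= srcs[s][i];
}

/*
 * Hypothetical validation helper: returns 1 if `parity` equals the XOR
 * of all data blocks (every byte position XORs to zero), otherwise 0.
 */
int parity_ok_sketch(const uint8_t *parity, const uint8_t *const *srcs,
                     size_t nsrc, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t acc = parity[i];
        for (size_t s = 0; s < nsrc; s++)
            acc ^= srcs[s][i];
        if (acc != 0)
            return 0;
    }
    return 1;
}
```

Because XOR is its own inverse, the same routine serves both to compute parity and to rebuild a single lost block, which is why speeding up this one primitive matters for both normal writes and recovery.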
Key quotes
Biggers has written an AVX-512 optimized xor_gen() function for the RAID code.
The Linux kernel's xor_gen() function is used for generating and validating parity blocks such as for RAID5/RAID6.
Eric Biggers of Google worked on some pretty nice Intel/AMD x86_64 optimizations over the years.
