Python 3.15's Tail-Calling Interpreter Shows 15% Performance Gain on Windows x86-64

lumpa

5mo ago · 10 min read · en · Insight

Score: 100 · Type: analysis · Sentiment: positive

Summary

The article reports performance gains from the tail-calling interpreter in Python 3.15 over the computed goto interpreter on two platforms. The author had earlier apologized for communicating performance results without noticing a compiler bug, and now partially retracts that apology: on pyperformance, the tail-calling interpreter is about 5% faster on macOS AArch64 (XCode Clang) and roughly 15% faster on Windows x86-64 (MSVC).
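For readers unfamiliar with the two dispatch styles named in the summary, here is a minimal sketch of the difference in control flow. It is a toy stack machine, not CPython's code: the opcodes, handler names, and `program` are all hypothetical. Python does not eliminate tail calls, so the second version only shows the shape of the technique (each handler ends by calling the next one), whereas a C interpreter relies on the compiler turning those calls into jumps.

```python
# Toy stack machine. Opcodes and handlers are hypothetical, not CPython's.
PUSH, ADD, MUL, HALT = range(4)


def run_loop(code):
    """Central dispatch: a single loop decides which opcode runs next."""
    stack, pc = [], 0
    while True:
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


def dispatch(code, pc, stack):
    """Look up the handler for the current opcode and call it."""
    return HANDLERS[code[pc]](code, pc, stack)


# Tail-call style: one small function per opcode, and the last thing
# each handler does is call the next handler.
def op_push(code, pc, stack):
    stack.append(code[pc + 1])
    return dispatch(code, pc + 2, stack)


def op_add(code, pc, stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)
    return dispatch(code, pc + 1, stack)


def op_mul(code, pc, stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)
    return dispatch(code, pc + 1, stack)


def op_halt(code, pc, stack):
    return stack.pop()


HANDLERS = {PUSH: op_push, ADD: op_add, MUL: op_mul, HALT: op_halt}


def run_tail(code):
    return dispatch(code, 0, [])


# (2 + 3) * 4
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
print(run_loop(program), run_tail(program))
```

Both functions compute the same result for `program`. The sketch says nothing about speed: the measured gains quoted here come from how C compilers optimize many small handler functions compared with one very large one.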

Key quotes

I can proudly say today that I am partially retracting that apology, but only for two platforms—macOS AArch64 (XCode Clang) and Windows x86-64 (MSVC).

In our own experiments, the tail calling interpreter for CPython was found to beat the computed goto interpreter by 5% on pyperformance on AArch64 macOS using XCode Clang, and roughly 15% on pyperformance on Windows x86-64 (MSVC).

Some time ago I posted an apology piece for Python’s tail calling results. I apologized for communicating performance results without noticing a compiler bug had occurred.

Snippet from the RSS feed

Python 3.15’s interpreter for Windows x86-64 should hopefully be 15% faster
