AutoKernel: Autonomous AI System for GPU Kernel Optimization in PyTorch Models
By frozenseven · 2mo ago · 7 min read
Summary
AutoKernel is an autonomous AI system that optimizes GPU kernels for PyTorch models. Inspired by autonomous AI research agents, it profiles a model to identify bottleneck kernels, extracts each one as a standalone Triton or CUDA C++ kernel, and then hands it to an agent that modifies the kernel, runs a fixed evaluation, and keeps or reverts the change in an open-ended loop. The goal is to automate GPU kernel optimization so that developers can provide a PyTorch model and receive optimized kernels without manual intervention.
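The keep-or-revert loop in the summary can be illustrated with a small, self-contained sketch. This is not AutoKernel's code: the names `optimize`, `propose`, and `evaluate` are hypothetical, the "kernel" is reduced to a dictionary of two tile sizes, and `evaluate` is a made-up cost function standing in for a measured benchmark. The real system is described as looping forever, whereas this sketch stops after a fixed number of steps.

```python
import math
import random


def evaluate(config):
    """Hypothetical stand-in for a fixed benchmark: lower is better.

    A real evaluation would time the kernel on a GPU; this toy cost
    is simply smallest at block_m=64, block_n=128.
    """
    return (abs(math.log2(config["block_m"]) - 6)
            + abs(math.log2(config["block_n"]) - 7))


def propose(config, rng):
    """Return a modified copy of config: halve or double one tile size."""
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    factor = rng.choice([0.5, 2])
    # Clamp to an assumed legal range of 16..256.
    candidate[key] = int(min(256, max(16, candidate[key] * factor)))
    return candidate


def optimize(config, propose, evaluate, steps=200, seed=0):
    """Greedy loop: modify, evaluate, keep if better, otherwise revert."""
    rng = random.Random(seed)
    best = dict(config)
    best_score = evaluate(best)
    history = [best_score]
    for _ in range(steps):
        candidate = propose(best, rng)
        score = evaluate(candidate)
        if score < best_score:
            best, best_score = candidate, score  # keep the change
        # Otherwise the candidate is discarded, which reverts the change.
        history.append(best_score)
    return best, best_score, history


start = {"block_m": 16, "block_n": 16}
best, best_score, history = optimize(start, propose, evaluate, steps=50, seed=1)
```

Because a change is kept only when the score improves, the best score recorded in `history` never gets worse from one step to the next. That property is what makes it safe to leave such a loop running unattended.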
Key quotes
Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton or CUDA C++ kernels.
AutoKernel applies the same philosophy to GPU kernel optimization: agent modifies one file, runs a fixed evaluation, keeps or reverts, repeats forever.
Profile the model to find which GPU kernels are bottlenecks
Extract each bottleneck as a standalone Triton or CUDA C++ kernel
Optimize each kernel autonomously
Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels. - RightNow-AI/autokernel
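The first quoted step, finding which kernels are bottlenecks, can be sketched as a ranking over per-kernel timings. The source does not say how AutoKernel decides what counts as a bottleneck, so the coverage threshold below is an assumption, `select_bottlenecks` is a hypothetical helper, and the timings are invented for illustration rather than measured.

```python
def select_bottlenecks(kernel_times_ms, coverage=0.8):
    """Pick the slowest kernels until they cover `coverage` of total time.

    kernel_times_ms maps a kernel name to its total time in milliseconds.
    The coverage rule is an assumed heuristic, not AutoKernel's documented one.
    """
    total = sum(kernel_times_ms.values())
    if total <= 0:
        return []
    # Slowest kernels first.
    ranked = sorted(kernel_times_ms.items(), key=lambda kv: kv[1], reverse=True)
    selected, covered = [], 0.0
    for name, elapsed in ranked:
        if covered / total >= coverage:
            break
        selected.append(name)
        covered += elapsed
    return selected


# Invented timings, for illustration only.
example_times = {"matmul": 60.0, "softmax": 25.0, "layernorm": 10.0, "gelu": 5.0}
bottlenecks = select_bottlenecks(example_times, coverage=0.8)
# bottlenecks == ["matmul", "softmax"]
```

Each selected name would then be extracted and optimized on its own, which matches the per-kernel structure of the remaining two steps.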
