🚀 Just published: "OpenEvolve: Open-Source Evolutionary Code Optimization with Real-World GPU Kernel Discovery"
We built the first open-source implementation of Google's AlphaEvolve system and used it to automatically discover GPU kernel optimizations that outperform human engineers!
Key results:
- 21.8% average decode speed improvement on Apple Silicon - 36.7% improvement on long-context transformer attention - Discovered novel vectorization patterns and 2-pass softmax algorithm
The system evolved a Metal kernel for Qwen3's Grouped Query Attention from a basic 3-pass implementation into something with sophisticated Apple Silicon optimizations that would take experts months to discover manually. The evolved kernel automatically found the optimal vec<T,8> operations for 128-dim attention heads and fused softmax computation with value accumulation.
Really excited about the potential here - imagine evolutionary algorithms automatically discovering optimizations across all our AI infrastructure. What would you want to optimize with this approach?