Record-Remix-Replay: Hierarchical GPU Kernel Optimization using Evolutionary Search
📰 ArXiv cs.AI
arXiv:2604.11109v1 Announce Type: cross
Abstract: As high-performance computing and AI workloads become increasingly dependent on GPUs, maintaining high performance across rapidly evolving hardware generations has become a major challenge. Developers often spend months tuning scientific applications to fully exploit new architectures, navigating a complex optimization space that spans algorithm design, source implementation, compiler flags and pass sequences, and kernel launch parameters. Existi