ARGUS: Agentic GPU Optimization Guided by Data-Flow Invariants

📰 ArXiv cs.AI

arXiv:2604.18616v1 (Announce Type: cross)

Abstract: LLM-based coding agents can generate functionally correct GPU kernels, yet their performance remains far below hand-optimized libraries on critical computations such as matrix multiplication, attention, and Mixture-of-Experts (MoE). Peak GPU performance requires coordinated reasoning over tightly coupled optimizations, including tiling, shared-memory staging, software pipelining, and instruction scheduling, while existing agents rely on sparse pa

Published 22 Apr 2026
