Model2Kernel: Model-Aware Symbolic Execution For Safe CUDA Kernels

📰 ArXiv cs.AI

Model2Kernel uses model-aware symbolic execution to check the CUDA kernels behind GPU-accelerated inference for memory-safety bugs.

Advanced · Published 27 Mar 2026

Action Steps

Identify model-dependent tensor layouts and memory indexing patterns
Apply model-aware symbolic execution to detect memory-safety bugs
Validate CUDA kernels for correctness and safety
Integrate Model2Kernel into production inference systems
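
The first two steps can be illustrated with a minimal Python sketch. It is not Model2Kernel's implementation or API. It assumes a hypothetical kernel that reads a [num_tokens, num_heads, head_size] tensor with a fixed launch width of 128 threads per block, and it replaces real symbolic execution with a simpler corner check that works only because this index expression grows monotonically in every variable. All names (ModelConfig, BLOCK_THREADS, flat_index, is_memory_safe) are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical model-dependent shape parameters."""
    num_heads: int
    head_size: int


# Assumed fixed launch width of the hypothetical kernel (threads per block).
BLOCK_THREADS = 128


def flat_index(token: int, head: int, tid: int, cfg: ModelConfig) -> int:
    """Row-major offset into a [num_tokens, num_heads, head_size] tensor."""
    return (token * cfg.num_heads + head) * cfg.head_size + tid


def max_accessed_index(num_tokens: int, cfg: ModelConfig, guarded: bool) -> int:
    """Largest offset any thread touches.

    The offset increases monotonically in token, head, and tid, so the
    maximum is reached at the upper corner of the launch grid.
    """
    max_tid = BLOCK_THREADS - 1
    if guarded:
        # Models a kernel-side guard such as `if (tid < head_size)`.
        max_tid = min(max_tid, cfg.head_size - 1)
    return flat_index(num_tokens - 1, cfg.num_heads - 1, max_tid, cfg)


def is_memory_safe(num_tokens: int, cfg: ModelConfig, guarded: bool) -> bool:
    """True if no thread can read past the end of the tensor buffer."""
    buffer_len = num_tokens * cfg.num_heads * cfg.head_size
    return max_accessed_index(num_tokens, cfg, guarded) < buffer_len


if __name__ == "__main__":
    wide = ModelConfig(num_heads=32, head_size=128)
    narrow = ModelConfig(num_heads=32, head_size=64)
    print(is_memory_safe(4, wide, guarded=False))
    print(is_memory_safe(4, narrow, guarded=False))
    print(is_memory_safe(4, narrow, guarded=True))
```

In this sketch the same unguarded kernel stays in bounds for the model with head_size 128 but overruns the buffer for the model with head_size 64, which is why the check has to take the model configuration as an input. The corner check only catches reads past the end of the buffer. It does not flag threads that stay in bounds but read a neighboring head's data.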

Who Needs to Know This

This research is relevant to software engineers and AI researchers who work on large language models and GPU-accelerated inference systems, because it helps ensure that CUDA kernels are safe and reliable.

Key Insight

💡 Model-aware symbolic execution can effectively detect memory-safety bugs in CUDA kernels

Full Article

Abstract:
arXiv:2603.24595v1 (announce type: cross)
The widespread adoption of large language models (LLMs) has made GPU-accelerated inference a critical part of modern computing infrastructure. Production inference systems rely on CUDA kernels to implement core transformer operations, yet these kernels are highly susceptible to memory-safety bugs due to model-dependent tensor layouts, intricate memory indexing, and massive thread-level parallelism. Such bugs can corrupt model weights, crash infer
