Your Coding Agent Should Do AI System Engineering — Ben Burtenshaw, Hugging Face
An agent-written RMSNorm kernel hit 1.88x speedups on H100s. A finetuned Qwen3 0.6B hit 35% on LiveCodeBench. Neither result required a systems engineer, just coding agents with the right skills loaded.
Ben Burtenshaw from Hugging Face walks through three levels: using Claude Code interactively to write and benchmark CUDA kernels distributed as versioned repos on the Hub, a zero-shot task where an agent finetunes a model end-to-end from a single prompt, and a multi-agent research lab running parallel experiments overnight on Hub compute while a reporter agent pushes results to a live Trackio dashboard. The through line is skills: file-based context that turns a zero-shot failure into a few-shot workflow. CUDA programming and ML training pipelines were deep specializations that took years to learn. Skills compress that timeline to hours.
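For readers unfamiliar with the operation behind the kernel result, here is a minimal pure-Python sketch of what RMSNorm computes, using the standard definition. It is not the kernel from the talk. The function names `rms_norm` and `max_abs_error` and the default `eps` are assumptions made for illustration. A reference like this is the kind of thing a generated kernel's output could be checked against before any speed is measured.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMSNorm for one vector: y_i = x_i / sqrt(mean(x^2) + eps) * w_i.

    `eps` is an illustrative default, not a value taken from the talk.
    """
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def max_abs_error(candidate, reference):
    """Largest element-wise gap between a candidate output and the reference."""
    return max(abs(c - r) for c, r in zip(candidate, reference))
```

A correctness check then reduces to comparing `max_abs_error(kernel_output, rms_norm(x, weight))` against a tolerance chosen for the data type in use.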
Speaker info:
- https://x.com/ben_burtenshaw
- https://www.linkedin.com/in/ben-burtenshaw/
- https://github.com/burtenshaw