Lessons from Trillion Token Deployments at Fortune 500s — Alessandro Cappelli, Adaptive ML
95% of GenAI pilots fail to reach production. Alessandro Cappelli's argument is that this isn't a deployment problem or a prompt engineering problem — it's a feedback integration problem. Instruction fine-tuning and proprietary models give you a demo. Only reinforcement learning gives you a systematic way to incorporate defects, business metrics, and production signals and keep improving.
This talk covers what a production-grade RL pipeline looks like at Fortune 500 scale: synthetic data as a byproduct of environment training rather than a prerequisite, mock environments where agents can fail safely before touching real systems, and LLM judges that replace expensive annotation campaigns with a rubric-definition exercise that takes hours rather than weeks. The throughline is that agents raise the stakes on all of this — more tokens, less tolerance for errors, direct access to live databases — and RL was designed for exactly that problem.
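To make the mock-environment and rubric-judge ideas concrete, here is a minimal sketch in Python. It is not taken from the talk: the names `MockOrdersEnv`, `RubricItem`, `rubric_reward`, and `run_episode` are hypothetical, the order data is invented for illustration, and each rubric check is a plain function standing in for what would be an LLM judge call in a real pipeline. The point is only the shape: an agent acts against an in-memory stand-in for a live database, and a weighted rubric turns the outcome into a scalar reward.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class RubricItem:
    """One rubric criterion. `check` is a stand-in for an LLM judge call."""
    name: str
    weight: int
    check: Callable[[str, Dict[str, str]], bool]


@dataclass
class MockOrdersEnv:
    """In-memory stand-in for a live orders database (illustrative data only)."""
    orders: Dict[str, str] = field(
        default_factory=lambda: {"A1": "open", "A2": "shipped"}
    )

    def step(self, action: str, order_id: str) -> str:
        # Failures are returned as text so the agent can fail safely.
        if order_id not in self.orders:
            return f"error: unknown order {order_id}"
        if action != "cancel":
            return f"error: unsupported action {action}"
        if self.orders[order_id] != "open":
            return f"error: order {order_id} is {self.orders[order_id]}"
        self.orders[order_id] = "cancelled"
        return f"ok: cancelled {order_id}"


def rubric_reward(rubric: List[RubricItem], transcript: str,
                  state: Dict[str, str]) -> float:
    """Weighted fraction of rubric criteria that pass, in [0, 1]."""
    total = sum(item.weight for item in rubric)
    earned = sum(item.weight for item in rubric if item.check(transcript, state))
    return earned / total


# Hypothetical rubric for the task "cancel order A1 without causing errors".
RUBRIC = [
    RubricItem("order_cancelled", 7, lambda t, s: s.get("A1") == "cancelled"),
    RubricItem("no_errors", 3, lambda t, s: "error" not in t),
]


def run_episode(actions: List[Tuple[str, str]]) -> float:
    """Replay a list of (action, order_id) pairs in a fresh mock environment."""
    env = MockOrdersEnv()
    transcript = "\n".join(env.step(action, oid) for action, oid in actions)
    return rubric_reward(RUBRIC, transcript, env.orders)
```

Calling `run_episode([("cancel", "A1")])` scores a clean episode, while `run_episode([("cancel", "A2"), ("cancel", "A1")])` completes the task but loses the "no_errors" criterion. In an RL setup, this scalar would serve as the reward signal for the policy update; swapping the lambda checks for judge-model calls leaves the rest of the structure unchanged.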
Speaker info:
- https://www.linkedin.com/in/alessandro-cappelli-aa8060172