Scaling the Next Paradigm of Heterogeneous Intelligence — Adrian Bertagnoli, Callosum

AI Engineer · Intermediate · Large Language Models
A mixture of Qwen 3 VL 8B and Kimi K2.5 beat the state of the art on Video Web Arena, outperforming the leading GPT and Gemini models by 18 and 25 percent respectively while costing 3.7 times less and running 3 times faster. It worked because visual web navigation decomposes into subtasks that do not all need a frontier model: routing zoom and visual parsing to a smaller model alone produced 11x speed and 43x cost improvements on those steps.

Adrian Bertagnoli from Callosum argues that the GPU cluster era of identical hardware and monolithic models is ending. Heterogeneous intelligence treats model architectures, chip types, and workflows as variables to optimize together. A second result: running recursive long-context reasoning tasks on Cerebras instead of a frontier model cuts cost by 7x and latency by 5x while matching accuracy. Callosum is building the automation layer that routes each task to the right chip and model, so that nobody has to make a bespoke decision for every subtask.

Speaker info: https://www.linkedin.com/in/adrian-bertagnoli-bb3467178/
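The routing idea in the description (send lightweight visual subtasks to a smaller model and keep the frontier model for the rest) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `ModelProfile` type, the subtask labels, the `route` and `plan_cost` helpers, and the cost and latency numbers are hypothetical placeholders, not Callosum's system and not measurements from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical description of a model backend."""
    name: str
    cost_per_call: float  # placeholder relative cost units, not real pricing
    latency_s: float      # placeholder seconds per call, not a measurement


# Placeholder profiles; the numbers are invented for the example.
SMALL = ModelProfile("small-vision-model", cost_per_call=1.0, latency_s=0.5)
FRONTIER = ModelProfile("frontier-model", cost_per_call=40.0, latency_s=5.0)

# Subtask kinds the description says do not need a frontier model.
# The label strings themselves are assumptions.
LIGHTWEIGHT_SUBTASKS = {"zoom", "visual_parsing"}


def route(subtask: str) -> ModelProfile:
    """Pick the small model for lightweight subtasks, the frontier model otherwise."""
    return SMALL if subtask in LIGHTWEIGHT_SUBTASKS else FRONTIER


def plan_cost(subtasks: list[str]) -> tuple[float, float]:
    """Return (total cost, total latency) when subtasks run sequentially."""
    chosen = [route(s) for s in subtasks]
    total_cost = sum(m.cost_per_call for m in chosen)
    total_latency = sum(m.latency_s for m in chosen)
    return total_cost, total_latency


if __name__ == "__main__":
    steps = ["visual_parsing", "zoom", "planning", "visual_parsing"]
    routed_cost, routed_latency = plan_cost(steps)
    # Baseline: every step goes to the frontier model.
    baseline_cost = FRONTIER.cost_per_call * len(steps)
    baseline_latency = FRONTIER.latency_s * len(steps)
    print(f"routed:   cost={routed_cost}, latency={routed_latency}s")
    print(f"baseline: cost={baseline_cost}, latency={baseline_latency}s")
```

A static lookup table like this is the simplest possible router. The automation layer described in the talk would presumably choose models and chips dynamically, which this sketch does not attempt.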
