Baseten CEO Tuhin Srivastava on Custom Models, and Building the Inference Cloud
No Priors: AI, Machine Learning, Tech, & Startups
Baseten CEO and co-founder Tuhin Srivastava sits down with Sarah Guo and Elad Gil to discuss the rapid growth of AI inference demand, Baseten’s 30x growth, and why inference is becoming the strategic “last market.” Srivastava argues that the application layer will persist because companies with unique user signals can encode value into workflows and post-train specialized models, citing examples such as Abridge and support workflows. The conversation also covers GPU capacity constraints, Baseten’s multi-cloud fabric across 18 clouds and 90 clusters, long-term contracting dynamics, the importance of the software layer for stickiness, evolving workloads, multi-chip possibilities, and operational lessons at scale.
Sign up for new podcasts every week. Email feedback to show@no-priors.com
Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Tuhinone
Chapters:
00:31 Baseten growth
01:55 Why the app layer wins
05:57 Serving frontier customers
07:55 Open source model mix
09:21 Chinese models and geopolitics
13:07 Custom inference dominates
14:22 Post training acquisition
17:10 When to invest in custom models
18:35 Supply crunch and data centers
22:25 Longer GPU Contracts
24:09 What Makes a Winner
26:07 Multi Chip Future
28:19 Runtime Roadmap
31:08 Scaling Edge Cases
33:48 Hiring and Leadership
36:44 Operations Pager Culture
38:19 Efficiency Drives Demand
40:41 Concierge Everything Future
42:34 Conclusion