Stanford CS153 Frontier Systems | Andreas Blattmann from Black Forest Labs on Visual Intelligence

Stanford Online · Advanced · 🎨 Image & Video AI · 1d ago
For more information about Stanford's online Artificial Intelligence programs, visit: https://stanford.io/ai

To follow along with the course schedule and syllabus, visit: https://cs153.stanford.edu/

In this CS153 “Frontier Systems” session, Anjney Midha welcomes Andreas Blattmann, co-founder of Black Forest Labs and co-creator of Stable Diffusion, for a discussion of the visual intelligence frontier and how frontier AI “factories” scale. Blattmann recounts his path from mechanical engineering to a PhD lab in Heidelberg, where he helped develop latent diffusion as a way to train image generators efficiently, which enabled the 2022 release of Stable Diffusion. They contrast earlier unimodal content-creation models with today’s push toward unified multimodal systems that span images, video, and audio and add action prediction for computer use and robotics, with an emphasis on observation and interaction loops.

Using Flux as a case study, they cover pre-training, mid-training, post-training, and distillation for speed, how customer feedback drove work on image editing and character consistency, and why open weights enable customization. They also discuss Self Flow for multimodal alignment, safety guardrails, EU compliance, data labeling strategies, the tradeoffs between diffusion and autoregressive models, and skepticism about explicit 3D representations.

Guest Speaker: Andreas Blattmann is the co-founder of Black Forest Labs (BFL), the German generative AI startup behind the FLUX text-to-image foundation model, backed by Andreessen Horowitz and other major venture firms. Before founding BFL, he was a generative AI researcher at LMU Munich, NVIDIA, and Stability AI, where he made significant contributions to image and video generation. He is a co-inventor of Latent Diffusion, the generative modeling technique that produced the open-source text-to-image system Stable Diffusion (which he co-developed) and now powers cutting-edge models, including FLUX, Midjourney, and OpenAI's DALL-E 3, with applications extending into audio generation and medi
