A Technical History of Generative Media

Latent Space · Advanced ·🎨 Image & Video AI ·8mo ago
Today we are joined by Gorkem and Batuhan from Fal.ai, the fastest growing generative media inference provider. They recently raised a $125M Series C and crossed $100M ARR. We covered how they pivoted from dbt pipelines to diffusion models inference, what were the models that really changed the trajectory of image generation, and the future of AI videos. Enjoy! 00:00 - Introductions 04:58 - History of Major AI Models and Their Impact on Fal.ai 07:06 - Pivoting to Generative Media and Strategic Business Decisions 10:46 - Technical discussion on CUDA optimization and kernel development 12:42 - Inference Engine Architecture and Kernel Reusability 14:59 - Performance Gains and Latency Trade-offs 15:50 - Discussion of model latency importance and performance optimization 17:56 - Importance of Latency and User Engagement 18:46 - Impact of Open Source Model Releases and Competitive Advantage 19:00 - Partnerships with closed source model developers 20:06 - Collaborations with Closed-Source Model Providers 21:28 - Serving Audio Models and Infrastructure Scalability 22:29 - Serverless GPU infrastructure and technical stack 23:52 - GPU Prioritization: H100s and Blackwell Optimization 25:00 - Discussion on ASICs vs. General Purpose GPUs 26:10 - Architectural Trends: MMDiTs and Model Innovation 27:35 - Rise and Decline of Distillation and Consistency Models 28:15 - Draft Mode and Streaming in Image Generation Workflows 29:46 - Generative Video Models and the Role of Latency 30:14 - Auto-Regressive Image Models and Industry Reactions 31:35 - Discussion of OpenAI's Sora and competition in video generation 34:44 - World Models and Creative Applications in Games and Movies 35:27 - Video Models’ Revenue Share and Open-Source Contributions 36:40 - Rise of Chinese Labs and Partnerships 38:03 - Top Trending Models on Hugging Face and ByteDance's Role 39:29 - Monetization Strategies for Open Models 40:48 - Usage Distribution and Model Turnover on FAL 42:11 - Revenue Share vs. Open Model
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Latent Space · Latent Space · 0 of 60

← Previous Next →
1 Ep 18: Petaflops to the People — with George Hotz of tinycorp
Ep 18: Petaflops to the People — with George Hotz of tinycorp
Latent Space
2 FlashAttention-2: Making Transformers 800% faster AND exact
FlashAttention-2: Making Transformers 800% faster AND exact
Latent Space
3 RWKV: Reinventing RNNs for the Transformer Era
RWKV: Reinventing RNNs for the Transformer Era
Latent Space
4 Generating your AI Media Empire - with Youssef Rizk of Wondercraft.ai
Generating your AI Media Empire - with Youssef Rizk of Wondercraft.ai
Latent Space
5 RAG is a hack - with Jerry Liu of LlamaIndex
RAG is a hack - with Jerry Liu of LlamaIndex
Latent Space
6 The End of Finetuning — with Jeremy Howard of Fast.ai
The End of Finetuning — with Jeremy Howard of Fast.ai
Latent Space
7 Why AI Agents Don't Work (yet) - with Kanjun Qiu of Imbue
Why AI Agents Don't Work (yet) - with Kanjun Qiu of Imbue
Latent Space
8 Powering your Copilot for Data - with Artem Keydunov from Cube.dev
Powering your Copilot for Data - with Artem Keydunov from Cube.dev
Latent Space
9 Beating GPT-4 with Open Source Models - with Michael Royzen of Phind
Beating GPT-4 with Open Source Models - with Michael Royzen of Phind
Latent Space
10 The State of Silicon and the GPU Poors - with Dylan Patel of SemiAnalysis
The State of Silicon and the GPU Poors - with Dylan Patel of SemiAnalysis
Latent Space
11 The "Normsky" architecture for AI coding agents — with Beyang Liu + Steve Yegge of SourceGraph
The "Normsky" architecture for AI coding agents — with Beyang Liu + Steve Yegge of SourceGraph
Latent Space
12 The AI-First Graphics Editor - with Suhail Doshi of Playground AI
The AI-First Graphics Editor - with Suhail Doshi of Playground AI
Latent Space
13 The Accidental AI Canvas - with Steve Ruiz of tldraw
The Accidental AI Canvas - with Steve Ruiz of tldraw
Latent Space
14 The Origin and Future of RLHF: the secret ingredient for ChatGPT - with Nathan Lambert
The Origin and Future of RLHF: the secret ingredient for ChatGPT - with Nathan Lambert
Latent Space
15 The Four Wars of the AI Stack - Dec 2023 Recap
The Four Wars of the AI Stack - Dec 2023 Recap
Latent Space
16 The State of AI in production — with David Hsu of Retool
The State of AI in production — with David Hsu of Retool
Latent Space
17 Building an open AI company - with Ce and Vipul of Together AI
Building an open AI company - with Ce and Vipul of Together AI
Latent Space
18 Truly Serverless Infra for AI Engineers - with Erik Bernhardsson of Modal
Truly Serverless Infra for AI Engineers - with Erik Bernhardsson of Modal
Latent Space
19 A Brief History of the Open Source AI Hacker - with Ben Firshman of Replicate
A Brief History of the Open Source AI Hacker - with Ben Firshman of Replicate
Latent Space
20 Open Source AI is AI we can Trust — with Soumith Chintala of Meta AI
Open Source AI is AI we can Trust — with Soumith Chintala of Meta AI
Latent Space
21 Making Transformers Sing - with Mikey Shulman of Suno
Making Transformers Sing - with Mikey Shulman of Suno
Latent Space
22 A Comprehensive Overview of Large Language Models - Latent Space Paper Club
A Comprehensive Overview of Large Language Models - Latent Space Paper Club
Latent Space
23 Why Google failed to make GPT-3 -- with David Luan of Adept
Why Google failed to make GPT-3 -- with David Luan of Adept
Latent Space
24 Personal AI Meetup - Bee, BasedHardware, LangChain LangFriend, Deepgram EmilyAI
Personal AI Meetup - Bee, BasedHardware, LangChain LangFriend, Deepgram EmilyAI
Latent Space
25 Supervise the Process of AI Research — with Jungwon Byun and Andreas Stuhlmüller of Elicit
Supervise the Process of AI Research — with Jungwon Byun and Andreas Stuhlmüller of Elicit
Latent Space
26 Breaking down the OG GPT Paper by Alec Radford
Breaking down the OG GPT Paper by Alec Radford
Latent Space
27 High Agency Pydantic over VC Backed Frameworks — with Jason Liu of Instructor
High Agency Pydantic over VC Backed Frameworks — with Jason Liu of Instructor
Latent Space
28 This World Does Not Exist — Joscha Bach, Karan Malhotra, Rob Haisfield (WorldSim, WebSim, Liquid AI)
This World Does Not Exist — Joscha Bach, Karan Malhotra, Rob Haisfield (WorldSim, WebSim, Liquid AI)
Latent Space
29 LLM Asia Paper Club Survey Round
LLM Asia Paper Club Survey Round
Latent Space
30 How to train a Million Context LLM — with Mark Huang of Gradient.ai
How to train a Million Context LLM — with Mark Huang of Gradient.ai
Latent Space
31 How AI is Eating Finance - with Mike Conover of Brightwave
How AI is Eating Finance - with Mike Conover of Brightwave
Latent Space
32 How To Hire AI Engineers (ft. James Brady and Adam Wiggins of Elicit)
How To Hire AI Engineers (ft. James Brady and Adam Wiggins of Elicit)
Latent Space
33 State of the Art: Training 70B LLMs on 10,000 H100 clusters
State of the Art: Training 70B LLMs on 10,000 H100 clusters
Latent Space
34 The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka
The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka
Latent Space
35 Training Llama 2, 3 & 4: The Path to Open Source AGI — with Thomas Scialom of Meta AI
Training Llama 2, 3 & 4: The Path to Open Source AGI — with Thomas Scialom of Meta AI
Latent Space
36 [LLM Paper Club] Llama 3.1 Paper: The Llama Family of Models
[LLM Paper Club] Llama 3.1 Paper: The Llama Family of Models
Latent Space
37 Synthetic data + tool use for LLM improvements 🦙
Synthetic data + tool use for LLM improvements 🦙
Latent Space
38 RLHF vs SFT to break out of local maxima 📈
RLHF vs SFT to break out of local maxima 📈
Latent Space
39 The Winds of AI Winter (Q2 Four Wars of the AI Stack Recap)
The Winds of AI Winter (Q2 Four Wars of the AI Stack Recap)
Latent Space
40 Segment Anything 2: Memory + Vision = Object Permanence — with Nikhila Ravi and Joseph Nelson
Segment Anything 2: Memory + Vision = Object Permanence — with Nikhila Ravi and Joseph Nelson
Latent Space
41 Answer.ai & AI Magic with Jeremy Howard
Answer.ai & AI Magic with Jeremy Howard
Latent Space
42 Is finetuning GPT4o worth it?
Is finetuning GPT4o worth it?
Latent Space
43 Personal benchmarks vs HumanEval - with Nicholas Carlini of DeepMind
Personal benchmarks vs HumanEval - with Nicholas Carlini of DeepMind
Latent Space
44 Building AGI with OpenAI's Structured Outputs API
Building AGI with OpenAI's Structured Outputs API
Latent Space
45 Q* for model distillation 🍓
Q* for model distillation 🍓
Latent Space
46 Finetuning LoRAs on BILLIONS of tokens 🤖
Finetuning LoRAs on BILLIONS of tokens 🤖
Latent Space
47 Cursor UX team is CRACKED 💻
Cursor UX team is CRACKED 💻
Latent Space
48 Choosing the BEST OpenAI model 🏆
Choosing the BEST OpenAI model 🏆
Latent Space
49 How will OpenAI voice mode change API design?
How will OpenAI voice mode change API design?
Latent Space
50 STEALING OpenAI models data 🥷
STEALING OpenAI models data 🥷
Latent Space
51 [Paper Club] 🍓 On Reasoning: Q-STaR and Friends!
[Paper Club] 🍓 On Reasoning: Q-STaR and Friends!
Latent Space
52 [Paper Club] Writing in the Margins: Chunked Prefill KV Caching for Long Context Retrieval
[Paper Club] Writing in the Margins: Chunked Prefill KV Caching for Long Context Retrieval
Latent Space
53 The Ultimate Guide to Prompting - with Sander Schulhoff from LearnPrompting.org
The Ultimate Guide to Prompting - with Sander Schulhoff from LearnPrompting.org
Latent Space
54 llm.c's Origin and the Future of LLM Compilers - Andrej Karpathy at CUDA MODE
llm.c's Origin and the Future of LLM Compilers - Andrej Karpathy at CUDA MODE
Latent Space
55 Prompt Engineer is NOT a job 📝
Prompt Engineer is NOT a job 📝
Latent Space
56 Prompt Mining LLMs for better prompts ⛏️
Prompt Mining LLMs for better prompts ⛏️
Latent Space
57 The six pillars of few-shot prompting 🔧
The six pillars of few-shot prompting 🔧
Latent Space
58 Language Agents: From Reasoning to Acting — with Shunyu Yao of OpenAI, Harrison Chase of LangGraph
Language Agents: From Reasoning to Acting — with Shunyu Yao of OpenAI, Harrison Chase of LangGraph
Latent Space
59 [Paper Club] Who Validates the Validators? Aligning LLM-Judges with Humans (w/ Eugene Yan)
[Paper Club] Who Validates the Validators? Aligning LLM-Judges with Humans (w/ Eugene Yan)
Latent Space
60 Can you separate intelligence and knowledge?
Can you separate intelligence and knowledge?
Latent Space

Related AI Lessons

How to Write Better AI Image Prompts for Midjourney (With Examples That Actually Work)
Learn to write effective AI image prompts for Midjourney with actionable examples and techniques
Medium · ChatGPT
Image to Video AI: The Complete Workflow Playbook That Actually Produces Results
Learn a step-by-step workflow for image-to-video AI that produces results, from preparation to delivery
Medium · AI
Image Harvest v1.0.2: Internationalization, Free Pro Trial & Quality-of-Life Improvements
Learn about Image Harvest v1.0.2, a Chrome extension with internationalization, free pro trial, and quality-of-life improvements, and how to utilize it for privacy-first image extraction
Dev.to · kyriewen
Pix2Pix: Image-to-Image Translation using Conditional GANs
Learn how to use Pix2Pix for image-to-image translation with conditional GANs, a powerful technique for generating realistic images
Medium · Deep Learning

Chapters (28)

Introductions
4:58 History of Major AI Models and Their Impact on Fal.ai
7:06 Pivoting to Generative Media and Strategic Business Decisions
10:46 Technical discussion on CUDA optimization and kernel development
12:42 Inference Engine Architecture and Kernel Reusability
14:59 Performance Gains and Latency Trade-offs
15:50 Discussion of model latency importance and performance optimization
17:56 Importance of Latency and User Engagement
18:46 Impact of Open Source Model Releases and Competitive Advantage
19:00 Partnerships with closed source model developers
20:06 Collaborations with Closed-Source Model Providers
21:28 Serving Audio Models and Infrastructure Scalability
22:29 Serverless GPU infrastructure and technical stack
23:52 GPU Prioritization: H100s and Blackwell Optimization
25:00 Discussion on ASICs vs. General Purpose GPUs
26:10 Architectural Trends: MMDiTs and Model Innovation
27:35 Rise and Decline of Distillation and Consistency Models
28:15 Draft Mode and Streaming in Image Generation Workflows
29:46 Generative Video Models and the Role of Latency
30:14 Auto-Regressive Image Models and Industry Reactions
31:35 Discussion of OpenAI's Sora and competition in video generation
34:44 World Models and Creative Applications in Games and Movies
35:27 Video Models’ Revenue Share and Open-Source Contributions
36:40 Rise of Chinese Labs and Partnerships
38:03 Top Trending Models on Hugging Face and ByteDance's Role
39:29 Monetization Strategies for Open Models
40:48 Usage Distribution and Model Turnover on FAL
42:11 Revenue Share vs. Open Model
Up next
Inside image generation’s Renaissance moment — the OpenAI Podcast Ep. 19
OpenAI
Watch →