Stanford CME296 Diffusion & Large Vision Models | Spring 2026 | Lecture 6 - Model Training

Stanford Online · Beginner · 🎨 Image & Video AI · 4h ago
Learn more details about this course: https://online.stanford.edu/courses/cme296-diffusion-and-large-vision-models

To follow along with the course schedule and syllabus, visit: https://cme296.stanford.edu/syllabus/

Chapters:
00:00:00 Introduction
00:07:45 Training lifecycle overview
00:10:39 Loss parameterization
00:16:08 Timestep sampling
00:22:27 Logit normal distribution
00:25:44 Sampling shift for different resolutions
00:43:56 Representation alignment (REPA)
00:49:21 Pre-training
00:52:46 Continued training (CT)
00:54:02 Supervised finetuning (SFT)
00:54:39 Preference tuning
00:56:48 Reward feedback learning (ReFL)
01:00:16 Flow-GRPO
01:03:23 Diffusion-DPO
01:05:06 Prompt enhancement (PE)
01:10:00 DreamBooth, low-rank adaptation (LoRA)
01:16:35 Distillation
01:21:42 Progressive distillation (PD)
01:25:14 InstaFlow
01:30:18 Consistency models (CM)
01:34:23 Distribution matching distillation (DMD)
01:39:16 Adversarial diffusion distillation (ADD)

For more information about Stanford’s graduate programs, visit: https://online.stanford.edu/graduate-education

Afshine Amidi is an Adjunct Lecturer at Stanford University.
Shervine Amidi is an Adjunct Lecturer at Stanford University.

View the course playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNdy8rt2rZ4T2xM0OjADnfu

