Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 4 - LLM Training

Stanford Online · Beginner · 🧠 Large Language Models · 5mo ago
For more information about Stanford’s graduate programs, visit: https://online.stanford.edu/graduate-education

October 17, 2025

This lecture covers:
• Pretraining
• Quantization
• Hardware optimization
• Supervised finetuning (SFT)
• Parameter-efficient finetuning (LoRA)

To follow along with the course schedule and syllabus, visit: https://cme295.stanford.edu/syllabus/

Chapters:
00:00:00 Introduction
00:07:19 Pretraining
00:13:26 FLOPs, FLOPS
00:16:34 Scaling laws, Chinchilla law
00:24:49 Training optimizations overview
00:31:09 Data parallelism with ZeRO
00:35:51 Model parallelism
00:38:2…

Chapters (14)

0:00 Introduction
7:19 Pretraining
13:26 FLOPs, FLOPS
16:34 Scaling laws, Chinchilla law
24:49 Training optimizations overview
31:09 Data parallelism with ZeRO
35:51 Model parallelism
38:26 Flash Attention
52:37 Quantization
56:00 Mixed precision training
1:02:31 Supervised finetuning
1:09:21 Instruction tuning
1:37:53 Parameter-efficient finetuning with LoRA
1:45:16 QLoRA
