Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 4 - LLM Training

Stanford Online · Beginner ·🧠 Large Language Models ·5mo ago

For more information about Stanford’s graduate programs, visit: https://online.stanford.edu/graduate-education October 17, 2025 This lecture covers: • Pretraining • Quantization • Hardware optimization • Supervised finetuning (SFT) • Parameter-efficient finetuning (LoRA) To follow along with the course schedule and syllabus, visit: https://cme295.stanford.edu/syllabus/ Chapters: 00:00:00 Introduction 00:07:19 Pretraining 00:13:26 FLOPs, FLOPS 00:16:34 Scaling laws, Chinchilla law 00:24:49 Training optimizations overview 00:31:09 Data parallelism with ZeRO 00:35:51 Model parallelism 00:38:2…

Watch on YouTube ↗ (saves to browser)

Chapters (14)

Introduction

7:19 Pretraining

13:26 FLOPs, FLOPS

16:34 Scaling laws, Chinchilla law

24:49 Training optimizations overview

31:09 Data parallelism with ZeRO

35:51 Model parallelism

38:26 Flash Attention

52:37 Quantization

56:00 Mixed precision training

1:02:31 Supervised finetuning

1:09:21 Instruction tuning

1:37:53 Parameter-efficient finetuning with LoRA

1:45:16 QLoRA

Playlist

Uploads from Stanford Online · Stanford Online · 0 of 60

← Previous Next →

Statistical Learning: 13.2 Introduction to Multiple Testing and Family Wise Error Rate

Statistical Learning: 13.2 Introduction to Multiple Testing and Family Wise Error Rate

Stanford Online

Statistical Learning: 13.1 Introduction to Hypothesis Testing II

Statistical Learning: 13.1 Introduction to Hypothesis Testing II

Stanford Online

Statistical Learning: 12.R.3 Hierarchical Clustering

Statistical Learning: 12.R.3 Hierarchical Clustering

Stanford Online

Statistical Learning: 12.R.2 K means Clustering

Statistical Learning: 12.R.2 K means Clustering

Stanford Online

Statistical Learning: 12.R.1 Principal Components

Statistical Learning: 12.R.1 Principal Components

Stanford Online

Statistical Learning: 13.R.1 Bonferroni and Holm II

Statistical Learning: 13.R.1 Bonferroni and Holm II

Stanford Online

Statistical Learning: 12.6 Breast Cancer Example

Statistical Learning: 12.6 Breast Cancer Example

Stanford Online

Statistical Learning: 12.5 Matrix Completion

Statistical Learning: 12.5 Matrix Completion

Stanford Online

Statistical Learning: 12.4 Hierarchical Clustering

Statistical Learning: 12.4 Hierarchical Clustering

Stanford Online

Statistical Learning: 12.3 k means Clustering

Statistical Learning: 12.3 k means Clustering

Stanford Online

Statistical Learning: 13.1 Introduction to Hypothesis Testing

Statistical Learning: 13.1 Introduction to Hypothesis Testing

Stanford Online

Stanford Seminar - Introduction to Web3

Stanford Seminar - Introduction to Web3

Stanford Online

Stanford Seminar - Designing Equitable Online Experiences

Stanford Seminar - Designing Equitable Online Experiences

Stanford Online

Stanford CS330: Deep Multi-Task & Meta Learning I 2021 I Lecture 1

Stanford CS330: Deep Multi-Task & Meta Learning I 2021 I Lecture 1

Stanford Online

Stanford Seminar - Perceiving, Understanding, and Interacting through Touch

Stanford Seminar - Perceiving, Understanding, and Interacting through Touch

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 2

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 2

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 3

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 3

Stanford Online

Stanford CS330: Deep Multi-Task & Meta Learning I 2021 I Lecture 4

Stanford CS330: Deep Multi-Task & Meta Learning I 2021 I Lecture 4

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 5

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 5

Stanford Online

Stanford Seminar - Evolution of a Web3 Company

Stanford Seminar - Evolution of a Web3 Company

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 6

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 6

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 7

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 7

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 8

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 8

Stanford Online

Stanford Seminar - Designing Human-Centered AI Systems for Human-AI Collaboration

Stanford Seminar - Designing Human-Centered AI Systems for Human-AI Collaboration

Stanford Online

The Sh*tFixers: Bob Sutton Interviews David Kelley, Design Thinking Superstar

The Sh*tFixers: Bob Sutton Interviews David Kelley, Design Thinking Superstar

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 9

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 9

Stanford Online

Women Rise: Sheri Sheppard

Women Rise: Sheri Sheppard

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 10

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 10

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 11

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 11

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 12

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 12

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 13

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 13

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 14

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 14

Stanford Online

Stanford Webinar - Cloud Computing: What’s on the Horizon with Dr. Timothy Chou

Stanford Webinar - Cloud Computing: What’s on the Horizon with Dr. Timothy Chou

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 15

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 15

Stanford Online

Stanford Seminar - Multi-Sensory Neural Objects: Modeling, Inference, and Applications in Robotics

Stanford Seminar - Multi-Sensory Neural Objects: Modeling, Inference, and Applications in Robotics

Stanford Online

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 16

Stanford CS330: Deep Multi-task & Meta Learning I 2021 I Lecture 16

Stanford Online

Stanford Seminar - Toward Better Human-AI Group Decisions

Stanford Seminar - Toward Better Human-AI Group Decisions

Stanford Online

Stanford CS330: Deep Multi-Task & Meta Learning I 2021 I Lecture 17

Stanford CS330: Deep Multi-Task & Meta Learning I 2021 I Lecture 17

Stanford Online

Stanford CS330: Deep Multi-Task & Meta Learning I 2021 I Lecture 18

Stanford CS330: Deep Multi-Task & Meta Learning I 2021 I Lecture 18

Stanford Online

Stanford Webinar - Web3 Considered: Possible Futures for Decentralization and Digital Ownership

Stanford Webinar - Web3 Considered: Possible Futures for Decentralization and Digital Ownership

Stanford Online

Stanford Seminar - Ethics Governance-in-the-Making: Bridging Ethics Work & Governance Menlo Report

Stanford Seminar - Ethics Governance-in-the-Making: Bridging Ethics Work & Governance Menlo Report

Stanford Online

Stanford Seminar - Towards Generalizable Autonomy: Duality of Discovery & Bias

Stanford Seminar - Towards Generalizable Autonomy: Duality of Discovery & Bias

Stanford Online

Stanford Seminar - ML Explainability Part 1 I Overview and Motivation for Explainability

Stanford Seminar - ML Explainability Part 1 I Overview and Motivation for Explainability

Stanford Online

Stanford Seminar - ML Explainability Part 2 I Inherently Interpretable Models

Stanford Seminar - ML Explainability Part 2 I Inherently Interpretable Models

Stanford Online

Stanford Seminar - ML Explainability Part 3 I Post hoc Explanation Methods

Stanford Seminar - ML Explainability Part 3 I Post hoc Explanation Methods

Stanford Online

Kratika Gupta talks about Stanford's Product Management Program

Kratika Gupta talks about Stanford's Product Management Program

Stanford Online

Stanford Webinar - CRISPR - 10 Years of Genome Editing and More

Stanford Webinar - CRISPR - 10 Years of Genome Editing and More

Stanford Online

Stanford Seminar - Making Teamwork an Objective Discipline - Sid Sijbrandij CEO & Chairman of GitLab

Stanford Seminar - Making Teamwork an Objective Discipline - Sid Sijbrandij CEO & Chairman of GitLab

Stanford Online

Stanford Seminar - ML Explainability Part 4 I Evaluating Model Interpretations/Explanations

Stanford Seminar - ML Explainability Part 4 I Evaluating Model Interpretations/Explanations

Stanford Online

Stanford Seminar - Adaptable Robotic Manipulation Using Tactile Sensors

Stanford Seminar - Adaptable Robotic Manipulation Using Tactile Sensors

Stanford Online

Stanford Seminar - ML Explainability Part 5 I Future of Model Understanding

Stanford Seminar - ML Explainability Part 5 I Future of Model Understanding

Stanford Online

Meet Joe Lapin, Innovation and Entrepreneurship Program Completer

Meet Joe Lapin, Innovation and Entrepreneurship Program Completer

Stanford Online

Stanford Seminar: Social Media Scrutiny of Frontline Professionals & Implications for Accountability

Stanford Seminar: Social Media Scrutiny of Frontline Professionals & Implications for Accountability

Stanford Online

Stanford Seminar - Alphy and Alphy Reflect: creating a reflective mirror to advance women

Stanford Seminar - Alphy and Alphy Reflect: creating a reflective mirror to advance women

Stanford Online

Stanford Webinar - The Digital Future of Health

Stanford Webinar - The Digital Future of Health

Stanford Online

Stanford CS229M - Lecture 1: Overview, supervised learning, empirical risk minimization

Stanford CS229M - Lecture 1: Overview, supervised learning, empirical risk minimization

Stanford Online

Stanford CS229M - Lecture 2: Asymptotic analysis, uniform convergence, Hoeffding inequality

Stanford CS229M - Lecture 2: Asymptotic analysis, uniform convergence, Hoeffding inequality

Stanford Online

Stanford CS229M - Lecture 3: Finite hypothesis class, discretizing infinite hypothesis space

Stanford CS229M - Lecture 3: Finite hypothesis class, discretizing infinite hypothesis space

Stanford Online

Stanford Seminar - Decentralized Finance (DeFi)

Stanford Seminar - Decentralized Finance (DeFi)

Stanford Online

Stanford CS229M - Lecture 4: Advanced concentration inequalities

Stanford CS229M - Lecture 4: Advanced concentration inequalities

Stanford Online

5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems

5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems

Dave Ebbelaar (LLM Eng)