Talk: Kernels Deep Dive (Ben Burtenshaw)
In this talk, Ben Burtenshaw from Hugging Face breaks down why optimized kernels are critical for real-world deep learning performance and how the Hugging Face Kernels ecosystem makes them easier to build and use.
He covers memory-bound bottlenecks, the kernel-builder workflow, reproducible multi-hardware builds with Nix, and practical PyTorch/Transformers integration patterns that reduce setup time from hours to seconds.
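As a rough illustration of the memory-bound bottleneck and kernel fusion topics mentioned above, here is a minimal back-of-the-envelope sketch. It is not taken from the talk: the function names, the fp16 element size, and the cost model (one read and one write of the whole tensor per unfused elementwise op, one FLOP per element per op) are all simplifying assumptions chosen for illustration.

```python
# Hypothetical cost model (an assumption, not from the talk): each unfused
# elementwise op reads the full tensor from global memory and writes it back,
# while a fused kernel reads the input once and writes the output once.

def elementwise_traffic_bytes(num_elements, num_ops, bytes_per_element=2, fused=False):
    """Estimate global-memory traffic for a chain of unary elementwise ops."""
    per_pass = 2 * num_elements * bytes_per_element  # one read + one write
    if fused:
        return per_pass  # intermediates stay in registers
    return per_pass * num_ops  # every op round-trips through memory


def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte moved; low values suggest a memory-bound op."""
    return flops / bytes_moved


num_elements = 1 << 20   # 1,048,576 fp16 values
num_ops = 3              # e.g. scale, add bias, activation

# Assumes one FLOP per element per op.
flops = num_elements * num_ops

unfused_bytes = elementwise_traffic_bytes(num_elements, num_ops, fused=False)
fused_bytes = elementwise_traffic_bytes(num_elements, num_ops, fused=True)

unfused_intensity = arithmetic_intensity(flops, unfused_bytes)
fused_intensity = arithmetic_intensity(flops, fused_bytes)

print(f"unfused: {unfused_bytes} bytes, {unfused_intensity} FLOP/byte")
print(f"fused:   {fused_bytes} bytes, {fused_intensity} FLOP/byte")
```

Under these assumptions, fusing three ops cuts memory traffic by a factor of three while the arithmetic stays the same, which is the basic reason fused kernels help when an operation is limited by memory bandwidth rather than compute. Real kernels have additional costs (launch overhead, caching, non-unary ops) that this sketch ignores.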
## Chapters
0:00 Intro and speaker background
1:35 Why Hugging Face Kernels matters
2:05 Compute vs memory bottlenecks in deep learning
3:30 Fused kernels and why they speed things up
5:05 Talk agenda and ecosystem overview
5:35 Kernel pain points: fragmentation and long installs
7:12 Supporting older, cheaper hardware for the community
8:18 Goal: from CMake errors to one-line kernel usage
8:54 Kernels + kernel-builder architecture
10:00 Reproducible builds with Nix and support matrix
11:45 Kernel project structure (`build.toml`, sources, torch extension)
12:23 Publishing kernels to the Hugging Face Hub
13:25 Real-world gain: faster FlashAttention setup
14:18 Docs, repos, and how to get started
16:00 Verifying compatibility and loading kernels in Python
17:20 Managing local cache with `hf cache ls`
17:55 Kernelizing PyTorch layers with hub mappings
19:32 Transformers integration (`use_kernels=True`)
20:48 Performance chart and closing resources