Deploy a Hugging Face Transformers Model from the Model Hub to Amazon SageMaker
Key Takeaways
This video shows how to deploy Hugging Face Transformers models from the Model Hub to Amazon SageMaker for inference, using the HuggingFaceModel class and the SageMaker Python SDK.
Full Transcript
Hi everyone, my name is Philipp. I'm a machine learning engineer at Hugging Face and also the tech lead for our partnership with AWS. I will show you in a few minutes how you can deploy Hugging Face Transformers models from the Hugging Face Hub to Amazon SageMaker for inference. Amazon SageMaker is a fully managed machine learning service that provides everyone with the ability to build, train, and deploy machine learning models quickly. Together with AWS, we have built an inference-optimized solution to deploy Transformers models as SageMaker endpoints. Let's get started.
We are going to use a SageMaker notebook instance to deploy our model. Using a SageMaker notebook instance is not a requirement: you can also use SageMaker Studio or deploy a model from your local machine.
The first step is to choose a model from the Hugging Face Hub. The Hub offers over 10,000 different models fine-tuned for NLP, vision, and speech. In this example, we want to use a model fine-tuned on question answering, and we also want it to be quick, so we select the distilbert-base-cased-distilled-squad model. We copy the model ID and go back to our notebook instance.
To be able to deploy a model from the Hub to SageMaker, we need to create a HuggingFaceModel class. This model class contains all the configuration for our Hub model, such as the model ID and our task, in this case question answering. After that, we can create our HuggingFaceModel and run the .deploy method. The .deploy method will create our SageMaker endpoint, and we can pass in the initial instance count, which defines how many instances we want our endpoint deployed on, and the instance type. In this case I went with the m5 instance, which is CPU-based. The deploy will now create our SageMaker endpoint, which takes around 5 minutes.
The endpoint has been successfully deployed to SageMaker and can now be used for inference. The benefit of using the SageMaker Python SDK is that the deploy method automatically returns a predictor class, which we can use to send requests to our endpoint with the .predict method. For that, we need to define our inputs. In this example, since it's a question-answering model, we have our context, "My name is Philipp and I live in Nuremberg. This model is used with SageMaker for inference," and our question, "What is used for inference?" Our request has been successfully executed and the model predicted the correct answer. What is used for inference? It's SageMaker.
After you are done testing or requesting your endpoint, you can run predictor.delete_endpoint. This will clean up everything on SageMaker and make sure everything is deleted properly. If you want to learn more about Hugging Face on Amazon SageMaker, check out huggingface.co/sagemaker.
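The steps walked through in the transcript (configure the Hub model, deploy, predict, delete the endpoint) can be sketched roughly as follows. This is a minimal sketch, not the exact notebook from the video: the helper names build_qa_request and deploy_query_cleanup are hypothetical, the instance type ml.m5.xlarge is an assumption (the video only says "m5"), and the transformers_version, pytorch_version, and py_version values are assumptions that should be checked against the combinations your installed SageMaker SDK supports. The deployment function needs AWS credentials and an IAM execution role, and it creates billable resources.

```python
def build_qa_request(question: str, context: str) -> dict:
    """Build the request payload for a question-answering endpoint."""
    return {"inputs": {"question": question, "context": context}}


def deploy_query_cleanup(role: str, payload: dict):
    """Deploy the Hub model, send one request, then delete the endpoint."""
    # Imported lazily so build_qa_request works without the SageMaker SDK.
    from sagemaker.huggingface import HuggingFaceModel

    # Hub configuration: model ID copied from the Hub, plus the pipeline task.
    hub = {
        "HF_MODEL_ID": "distilbert-base-cased-distilled-squad",
        "HF_TASK": "question-answering",
    }

    model = HuggingFaceModel(
        env=hub,
        role=role,                    # IAM role with SageMaker permissions
        transformers_version="4.26",  # assumption: pick a supported version
        pytorch_version="1.13",       # assumption
        py_version="py39",            # assumption
    )

    # Creating the endpoint takes several minutes.
    predictor = model.deploy(
        initial_instance_count=1,      # number of instances behind the endpoint
        instance_type="ml.m5.xlarge",  # assumption: a CPU-based m5 instance
    )
    try:
        return predictor.predict(payload)
    finally:
        # Always clean up so the endpoint does not keep running.
        predictor.delete_endpoint()


payload = build_qa_request(
    question="What is used for inference?",
    context=(
        "My name is Philipp and I live in Nuremberg. "
        "This model is used with SageMaker for inference."
    ),
)
# result = deploy_query_cleanup(role="<your-sagemaker-execution-role-arn>", payload=payload)
```

The deploy call is left commented out so that the payload helper can be tried on its own. Wrapping predict in try/finally is a design choice here, so that the endpoint is deleted even if the request fails.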
Original Description
To deploy a model directly from the Hugging Face Model Hub to Amazon SageMaker, we need to define two environment variables when creating the HuggingFaceModel:
- HF_MODEL_ID: defines the model ID, which is automatically loaded from huggingface.co/models when the SageMaker endpoint is created. The 🤗 Hub provides more than 10,000 models, all available through this environment variable.
- HF_TASK: defines the task for the 🤗 Transformers pipeline being used. A full list of tasks can be found here.
https://huggingface.co/blog/deploy-hugging-face-models-easily-with-amazon-sagemaker
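As an illustration of the two environment variables described above, the following sketch builds the dictionary that would be passed as the env argument of HuggingFaceModel. The helper name hub_env is hypothetical, and the emptiness check is an added convenience that is not part of the SageMaker SDK.

```python
def hub_env(model_id: str, task: str) -> dict:
    """Return the environment variables for deploying a Hub model."""
    if not model_id or not task:
        raise ValueError("both model_id and task must be non-empty")
    return {
        "HF_MODEL_ID": model_id,  # model repository ID on huggingface.co/models
        "HF_TASK": task,          # Transformers pipeline task name
    }


env = hub_env("distilbert-base-cased-distilled-squad", "question-answering")
print(env)
```

The resulting dictionary would then be supplied when constructing the model, for example HuggingFaceModel(env=env, role=..., ...), with the remaining arguments depending on your account and SDK version.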