Deploy a Hugging Face Transformers Model from the Model Hub to Amazon SageMaker

HuggingFace · Intermediate ·🧠 Large Language Models ·4y ago

Key Takeaways

This video shows how to deploy Hugging Face Transformers models from the Model Hub to Amazon SageMaker for inference, using the HuggingFaceModel class and the SageMaker Python SDK.

Full Transcript

Hi everyone, my name is Philipp. I'm a machine learning engineer at Hugging Face and also the tech lead for our partnership with AWS. I will show you in a few minutes how you can deploy Hugging Face Transformers models from the Hugging Face Hub to Amazon SageMaker for inference. Amazon SageMaker is a fully managed machine learning service that provides everyone with the ability to build, train, and deploy machine learning models quickly. Together with AWS, we have built an inference-optimized solution to deploy Transformers models as SageMaker endpoints. Let's get started.

We are going to use a SageMaker notebook instance to deploy our model. Using a SageMaker notebook instance is not a requirement to deploy your model: you can also use SageMaker Studio or deploy a model from your local machine.

The first step is to choose a model from the Hugging Face Hub. The Hugging Face Hub offers over 10,000 different models fine-tuned for NLP, vision, and speech. In this example, we want to use a model fine-tuned for question answering, and we also want it to be quick, so we select the distilbert-base-cased-distilled-squad model. We copy the model ID and go back to our notebook instance.

To be able to deploy a model from the Hub to SageMaker, we need to create a HuggingFaceModel class. This model class contains all the configuration for our Hub model, such as the model ID and our task, in this case question answering. After that, we can create our Hugging Face model and run the .deploy method. The .deploy method will create our SageMaker endpoint, and we can pass in the initial instance count (basically, how many instances we want our endpoint to be deployed on) and the instance type. In this case, I went with the m5 instance, which is a CPU-based instance. The deploy will now create our SageMaker endpoint, which takes around 5 minutes.

The endpoint has been successfully deployed to SageMaker and can now be used for inference. The benefit of using the SageMaker Python SDK is that the deploy method automatically returns a predictor class, which we can use to send requests to our endpoint with the .predict method. For that, we need to define our inputs. In this example, since it's a question answering model, we have our context, "My name is Philipp and I live in Nuremberg. This model is used with SageMaker for inference.", and our question, "What is used for inference?" Our request has been successfully executed, and the model predicted the correct answer. What is used for inference? It's SageMaker.

After you are done testing or requesting your endpoint, you can run predictor.delete_endpoint. This will clean up everything on SageMaker and make sure everything is deleted properly. If you want to learn more about Hugging Face on Amazon SageMaker, check out huggingface.co maker
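
The request step described in the transcript can be sketched roughly as follows. This is a minimal illustration, not the code shown in the video: the helper names `build_qa_request` and `ask` are hypothetical, and the `{"inputs": {"question": ..., "context": ...}}` payload shape is an assumption based on how question-answering pipelines are commonly called through the Hugging Face inference container. The `predictor` argument can be any object with a `.predict(data)` method, such as the one returned by `.deploy`.

```python
def build_qa_request(question: str, context: str) -> dict:
    """Build a JSON-serializable payload for a question-answering endpoint.

    The {"inputs": {...}} layout is an assumption about the request format;
    check it against the container version you deploy.
    """
    if not question.strip() or not context.strip():
        raise ValueError("question and context must both be non-empty")
    return {"inputs": {"question": question, "context": context}}


def ask(predictor, question: str, context: str):
    """Send one request through any object that exposes .predict(data)."""
    return predictor.predict(build_qa_request(question, context))
```

With a deployed endpoint, a call might look like `ask(predictor, "What is used for inference?", "My name is Philipp and I live in Nuremberg. This model is used with SageMaker for inference.")`. Keeping the payload builder separate from the network call makes it possible to check the request shape without a live endpoint.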

Original Description

To deploy a model directly from the Hugging Face Model Hub to Amazon SageMaker, we need to define two environment variables when creating the HuggingFaceModel:
- HF_MODEL_ID: defines the model ID, which will be automatically loaded from huggingface.co/models when creating the SageMaker endpoint. The 🤗 Hub provides 10,000+ models, all available through this environment variable.
- HF_TASK: defines the task for the 🤗 Transformers pipeline being used. A full list of tasks can be found here.

https://huggingface.co/blog/deploy-hugging-face-models-easily-with-amazon-sagemaker
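
As a rough sketch of how these two variables might be passed to the SageMaker Python SDK: the helper names `hub_env` and `deploy_from_hub` are hypothetical, the `ml.m5.xlarge` instance size is an assumption (the video only mentions an m5 instance), and the framework version strings are an assumed combination that should be checked against what your installed SDK supports. The IAM role is left as a parameter because it depends on your AWS account.

```python
def hub_env(model_id: str, task: str) -> dict:
    """Return the two environment variables named in the description."""
    if not model_id or not task:
        raise ValueError("both model_id and task are required")
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


def deploy_from_hub(env: dict, role: str,
                    instance_type: str = "ml.m5.xlarge",  # assumed size
                    instance_count: int = 1):
    """Create a HuggingFaceModel from Hub settings and deploy an endpoint.

    Calling this needs the sagemaker package, AWS credentials, and an IAM
    role with SageMaker permissions. It creates billable resources.
    """
    # Imported lazily so hub_env() can be used without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    model = HuggingFaceModel(
        env=env,
        role=role,
        # Assumed version combination; adjust to one your SDK release supports.
        transformers_version="4.26",
        pytorch_version="1.13",
        py_version="py39",
    )
    # deploy() returns a predictor object bound to the new endpoint.
    return model.deploy(
        initial_instance_count=instance_count,
        instance_type=instance_type,
    )
```

For the model chosen in the video, the configuration would be `hub_env("distilbert-base-cased-distilled-squad", "question-answering")`.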

Playlist

Uploads from HuggingFace · HuggingFace · 48 of 60

This video demonstrates how to deploy a Hugging Face Transformers model from the Model Hub to Amazon SageMaker for inference. The process involves creating a HuggingFaceModel class, defining environment variables, and using the SageMaker SDK to deploy the model.

Key Takeaways
  1. Choose a model from the Hugging Face Model Hub
  2. Open a working environment: a SageMaker notebook instance, SageMaker Studio, or your local machine
  3. Define the environment variables (HF_MODEL_ID and HF_TASK)
  4. Create a HuggingFaceModel with that configuration
  5. Deploy the model using the SageMaker SDK
  6. Test the endpoint with a sample input
  7. Clean up resources using predictor.delete_endpoint()
💡 The HuggingFaceModel class and SageMaker SDK provide a streamlined way to deploy Transformers models from the Model Hub to Amazon SageMaker for inference.
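
Because an endpoint keeps running until it is deleted, one way to make the final cleanup step harder to forget is to wrap the predictor in a context manager. This is only a sketch of that idea and is not shown in the video: `temporary_endpoint` is a hypothetical helper, and it assumes nothing about the predictor beyond a `delete_endpoint()` method.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(predictor):
    """Yield the predictor and always delete its endpoint afterwards."""
    try:
        yield predictor
    finally:
        # Runs even if a request inside the with-block raises.
        predictor.delete_endpoint()
```

Used as `with temporary_endpoint(predictor) as p: p.predict(data)`, the endpoint is removed once the block exits, whether the requests succeeded or not. This suits short experiments, and would not fit an endpoint that is meant to stay up.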
