SLMs and LLMs: When to use them? | Amazon Web Services

Amazon Web Services · Advanced ·☁️ DevOps & Cloud ·12mo ago

Key Takeaways

The video discusses the use cases for Small Language Models (SLMs) and Large Language Models (LLMs), highlighting the benefits and limitations of each, and explores scenarios where a hybrid approach combining both SLMs and LLMs can be effective.

Full Transcript

Nolan Chen: Hello, I'm Nolan Chen, partner solutions architect at AWS.

Andrew Walko: And I'm Andrew Walko. I lead the field engineering team at Arcee AI.

Nolan Chen: Andrew, can you give us a quick recap as to what a small language model, or SLM, is?

Andrew Walko: Absolutely. So, when we're talking about small language models versus large language models, there are a couple of categories that we're focusing on. With LLMs, you are typically able to handle more complex use cases and complex data types. However, they're also very expensive to run, and that means that you cannot run them in your own environment. Small language models, by contrast, have a lower parameter count, which means they can be run with lower latency and lower cost, can be run in your own environment, and additionally have an increased ability to be adapted or fine-tuned.

Nolan Chen: Thanks, Andrew. So, at least in theory, SLMs are less complex, less costly, more easily adaptable, and can run in your own environment. But could you now give us some examples of real-world use cases or workloads that would be ideal for an SLM?

Andrew Walko: Absolutely. So there are really two categories of use cases that we would think about. One is the general-purpose small language models that are out there, and the other is domain-adapted small language models. So let's first start with the general-purpose ones. The most common use case here is chatbots. Small language models do a great job at acting as chatbots for businesses, specifically chatbots that use retrieval augmented generation, because with retrieval augmented generation, or RAG, you're actually passing that information to the model when asking a question. This means that the SLM has the context it needs to answer that question and can do it in a very quick and inexpensive manner. We've also seen a lot of use cases do really well around data labeling or data tagging, as well as sentiment analysis and many others.
And this is perfect for general-purpose models that you don't actually need to tune. However, one of the large benefits of small language models is the fact that they can be adapted or fine-tuned. And when we're looking at fine-tuned models, this is where businesses can get real benefits: you can be a financial institute with a model that's fine-tuned to do financial analysis, or a healthcare company fine-tuning a model that does really well at summarizing doctor transcripts for patients, and the list goes on. You're able to fine-tune these models to be really performant at the tasks that are most relevant for your business.

Nolan Chen: Got it. So I think there are a lot of companies out there that can definitely benefit from having a chatbot, doing labeling, and doing sentiment analysis. And you listed two important industries here, finance and healthcare, that can benefit from SLMs. But that said, you talked about all these benefits of SLMs. When should companies still use LLMs instead of SLMs?

Andrew Walko: Yeah, great question. And we don't think that LLMs are pointless, right? It's just that they should be used for the right task. So, for example, if you have really complex use cases, let's say you are a research lab doing research around protein synthesis, this is where an LLM can come into play, and there are lots of news stories about how companies have been able to utilize them effectively. Additionally, if you're doing really complex analysis, this is where LLMs do a good job. Also, if you need to utilize a massive context window, so you are sending in lots of documents, and I'm not talking hundreds of documents, I'm talking about thousands of documents, there are LLMs that allow you to pass in that much information. So depending on the use case, you're able to use LLMs for those really, really highly complex tasks.
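
Andrew's earlier description of retrieval augmented generation, where the retrieved information is passed to the model together with the question, can be sketched roughly as below. This is only an illustration and not something shown in the video: the `score` and `build_rag_prompt` helpers and the sample knowledge base are hypothetical, and simple word overlap stands in for the embedding-based retrieval a real system would likely use. The resulting prompt is what would be sent to an SLM.

```python
from typing import List


def score(question: str, passage: str) -> int:
    # Toy relevance score (an assumption): count of shared lowercase words.
    return len(set(question.lower().split()) & set(passage.lower().split()))


def build_rag_prompt(question: str, passages: List[str], top_k: int = 2) -> str:
    # Rank passages by relevance and keep only the best top_k as context.
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    context = "\n".join(f"- {p}" for p in ranked[:top_k])
    # The retrieved context travels with the question in a single prompt.
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Hypothetical knowledge base for a support chatbot.
knowledge_base = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Refunds require the original receipt.",
]
prompt = build_rag_prompt("How long do refunds take?", knowledge_base)
print(prompt)
```

Because the model only has to read a short, relevant context rather than recall facts on its own, a smaller model can often handle this kind of question.
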
Nolan Chen: Are there any tasks or workflows where you have a hybrid approach, with both an LLM and an SLM working together?

Andrew Walko: Yeah, absolutely. And that's actually one that we see all the time. So let's take, for example, a document processing workflow. You might start with your document and want to do a couple of things. Maybe you want to extract some data. You can pass that to an SLM. Then you might also want to do, let's say, sentiment analysis. You can use an SLM for that task as well. But then let's say that you actually want to pass the extracted data, the document itself, the sentiment analysis, and some data that you scraped from the web, let's say it's some research, to a model. You can then pass all of that to your LLM to do an overall analysis, and then that output can have a task done with it. So here we see a perfect example where you can combine small language models with large language models, and one of the things that actually works well here is intelligent model routing.

Nolan Chen: Awesome. So, we'll get to that topic next, but thank you for showing how companies can benefit from LLMs or SLMs on their own or even put them together in a hybrid solution.

Andrew Walko: Absolutely. Thanks, Nolan.

Nolan Chen: Thank you, Andrew.
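
The document processing workflow described above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the implementation used by AWS or Arcee AI: `process_document`, `ModelFn`, and `PipelineResult` are hypothetical names, and `fake_slm` and `fake_llm` are stubs standing in for real model endpoints so the flow can run on its own.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns text.
ModelFn = Callable[[str], str]


@dataclass
class PipelineResult:
    extracted: str
    sentiment: str
    analysis: str


def process_document(document: str, web_research: str,
                     call_slm: ModelFn, call_llm: ModelFn) -> PipelineResult:
    # Narrow, well-defined tasks go to the small model.
    extracted = call_slm(f"Extract the key fields from this document:\n{document}")
    sentiment = call_slm(f"Classify the sentiment of this document:\n{document}")

    # The document, both SLM outputs, and the web research are combined
    # and sent to the large model for the overall analysis.
    combined = "\n\n".join([
        f"DOCUMENT:\n{document}",
        f"EXTRACTED DATA:\n{extracted}",
        f"SENTIMENT:\n{sentiment}",
        f"WEB RESEARCH:\n{web_research}",
    ])
    analysis = call_llm(f"Analyze the following material as a whole:\n{combined}")
    return PipelineResult(extracted, sentiment, analysis)


# Stub models (assumptions) so the sketch runs without a real endpoint.
def fake_slm(prompt: str) -> str:
    return "positive" if prompt.startswith("Classify") else "invoice_total=120"


def fake_llm(prompt: str) -> str:
    return f"analysis over {len(prompt)} characters of context"


result = process_document("Great service, total was 120.",
                          "Industry average is 100.",
                          fake_slm, fake_llm)
print(result.sentiment)
print(result.analysis)
```

Passing the models in as plain functions keeps the pipeline independent of any particular provider, so the SLM and LLM can be swapped without touching the workflow itself.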

Original Description

In part 2 of this 5 part video series on Small Language Models with Arcee AI, Andrew Walko and Nolan Chen discuss common use cases for SLMs and LLMs. Learn more - http://go.aws/4laWv7r

The video explains the differences between SLMs and LLMs, and how to choose the right model for specific use cases, including chatbots, data labeling, and sentiment analysis. It also explores hybrid approaches that combine SLMs and LLMs for more complex tasks.

Key Takeaways
  1. Identify the use case for the language model
  2. Determine the required complexity and latency
  3. Choose between SLM and LLM based on the use case
  4. Fine-tune the model for domain-specific tasks
  5. Consider a hybrid approach for complex tasks
💡 SLMs and LLMs have different strengths and weaknesses, and the choice between them depends on the specific use case and requirements.
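
The takeaways above can be read as a rough decision procedure. The sketch below is one hypothetical way to encode it, not a rule given in the video: the `UseCase` fields, the `choose_model` function, and the `LARGE_CONTEXT_DOCS` cutoff are all assumptions chosen for illustration (the speakers only say that "thousands of documents" favors an LLM).

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    complex_reasoning: bool    # e.g. research-grade or very complex analysis
    context_documents: int     # documents sent to the model per request
    has_simple_subtasks: bool  # e.g. extraction or sentiment steps
    domain_specific: bool      # would benefit from fine-tuning


# Hypothetical cutoff for "thousands of documents".
LARGE_CONTEXT_DOCS = 1000


def choose_model(uc: UseCase) -> str:
    needs_llm = uc.complex_reasoning or uc.context_documents >= LARGE_CONTEXT_DOCS
    if needs_llm and uc.has_simple_subtasks:
        # SLMs handle the narrow steps, an LLM handles the final analysis.
        return "hybrid"
    if needs_llm:
        return "llm"
    if uc.domain_specific:
        return "fine-tuned slm"
    return "general-purpose slm"


# A RAG chatbot over a handful of documents.
print(choose_model(UseCase(False, 5, False, False)))
# A document workflow ending in a complex overall analysis.
print(choose_model(UseCase(True, 4, True, False)))
```

In practice the inputs would come from measuring the workload rather than from hand-set flags, and the cutoff would depend on the context window of the models being considered.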
