SLMs and LLMs: When to use them? | Amazon Web Services

Amazon Web Services · Advanced · ☁️ DevOps & Cloud · 12mo ago

Skills: LLM Engineering 90% · LLM Foundations 80% · Prompting Basics 60%

Key Takeaways

The video discusses the use cases for Small Language Models (SLMs) and Large Language Models (LLMs), highlighting the benefits and limitations of each, and explores scenarios where a hybrid approach combining both SLMs and LLMs can be effective.

Full Transcript

Hello, I'm Nolan Chen, partner solutions architect at AWS.

And I'm Andrew Walko. I lead the field engineering team at Arcee AI.

Andrew, can you give us a quick recap of what a small language model, or SLM, is?

Absolutely. So, when we're talking about small language models versus large language models, there are a couple of categories that we're focusing on. With LLMs, you are typically able to handle more complex use cases and more complex data types. However, they're also very expensive to run, and that means that you cannot run them in your own environment. Small language models, by contrast, have a lower parameter count, which means they can be run with lower latency and lower cost, can be run in your own environment, and additionally have an increased ability to be adapted or fine-tuned.

Thanks, Andrew. So, at least in theory, SLMs are less complex, less costly, more easily adaptable, and can run in your own environment. But could you now give us some examples of real-world use cases or workloads that would be ideal for an SLM?

Absolutely. So there are really two categories of use cases that we would think about. One is the general-purpose small language models that are out there, and the other is domain-adapted small language models. So let's first start with the general-purpose ones, and the most common use case here is chatbots. Small language models do a great job of acting as chatbots for businesses, and specifically chatbots that use retrieval-augmented generation, because with retrieval-augmented generation, or RAG, you're actually passing that information in to the model when asking a question. This means that the SLM has the context it needs to answer that question and can do it in a very quick and inexpensive manner. We've also seen a lot of use cases do really well around data labeling or data tagging, as well as sentiment analysis and many others.
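The RAG pattern Andrew describes, where the relevant information is passed to the model along with the question, can be sketched roughly as follows. This is a minimal illustration and not Arcee AI's or AWS's implementation: the word-overlap retriever stands in for a real vector search, and `retrieve`, `build_rag_prompt`, `answer_with_rag`, and the `slm` callable are all hypothetical names introduced here.

```python
import re
from typing import Callable, List


def _keywords(text: str) -> set:
    # Crude tokenizer: lowercase words longer than 3 characters (assumed stopword filter).
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def retrieve(question: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by keyword overlap with the question (stand-in for a real retriever)."""
    q_words = _keywords(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & _keywords(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(question: str, passages: List[str]) -> str:
    """Place the retrieved passages in the prompt so the model has the context it needs."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


def answer_with_rag(
    question: str,
    documents: List[str],
    slm: Callable[[str], str],  # hypothetical: any function that sends a prompt to an SLM
    top_k: int = 2,
) -> str:
    passages = retrieve(question, documents, top_k)
    return slm(build_rag_prompt(question, passages))
```

In practice `slm` would wrap whatever endpoint hosts the small model. Because the answer is grounded in the retrieved passages, the model itself does not need to have memorized the business's data.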
These tasks are perfect for general-purpose models that you don't actually need to tune. However, one of the large benefits of small language models is the fact that they can be adapted or fine-tuned. And when we're looking at fine-tuned models, this is where businesses can get real benefits, because you can be a financial institution that has a model fine-tuned to do financial analysis. You can be a healthcare company that is fine-tuning a model that does really well at summarizing doctor transcripts for patients, and the list goes on. You're able to fine-tune these models to be really performant at the tasks that are most relevant for your business.

Got it. So I think there are a lot of companies out there that can definitely benefit from having a chatbot, doing labeling, and doing sentiment analysis. And you listed two important industries here, finance and healthcare, that can benefit from SLMs. But that said, you talked about all these benefits of SLMs. When should companies still use LLMs instead of SLMs?

Yeah, great question. And we don't think that LLMs are pointless, right? It's just that they should be used for the right task. So, for example, if you have really complex use cases, let's say that you are a research lab and you are doing research around protein synthesis. This is where an LLM can come into play, and there are lots of news stories about how companies have been able to utilize them effectively. Additionally, if you're doing really complex analysis, this is where LLMs do a good job. Also, if you need to utilize a massive context window, so you are sending in lots of documents, and I'm not talking hundreds of documents, I'm talking about getting to thousands of documents that you are sending, there are LLMs that allow you to pass in that much information. So depending on the use case, you're able to use LLMs for those really highly complex tasks.
Are there any tasks or workflows where you have a hybrid approach, where you have both an LLM and an SLM working together?

Yeah, absolutely. And that's actually one that we see all the time. So let's take, for example, a document processing workflow. You might start with your document, and you want to do a couple of things. Maybe you want to extract some data, so you want to do a little bit of data extraction here. You can pass that to an SLM to do that task. Then you might also want to do, let's say, sentiment analysis. You can use an SLM for that task as well. But then let's say that you actually want to pass the extracted data, the document itself, the sentiment analysis, and, let's say, some data that you scraped from the web, maybe some research, and you want to pass all of that to a model. You can then pass that to your LLM to do an overall analysis, and then that output can have a task done with it. So here we see a perfect example where you can combine small language models with large language models, and one of the things that actually works well here is when we get into the topic of intelligent model routing.

Awesome. So, we'll get to that topic next, but thank you for showing how companies can benefit from LLMs or SLMs on their own, or even put them together in a hybrid solution.

Absolutely. Thanks, Nolan.

Thank you, Andrew.
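The hybrid document processing workflow from the conversation could be wired together along these lines. This is a sketch under stated assumptions rather than the speakers' implementation: `HybridPipeline` and its three model callables are hypothetical, the prompt wording is invented for illustration, and each callable is assumed to take a prompt string and return the model's text.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical model interface: prompt in, generated text out.
Model = Callable[[str], str]


@dataclass
class HybridPipeline:
    slm_extract: Model    # small model used for data extraction
    slm_sentiment: Model  # small model used for sentiment analysis
    llm_analyze: Model    # large model used for the overall analysis

    def run(self, document: str, web_research: str = "") -> Dict[str, str]:
        # Steps 1 and 2: narrow, cheap tasks go to the SLMs.
        extracted = self.slm_extract(
            f"Extract the key fields from this document:\n{document}"
        )
        sentiment = self.slm_sentiment(
            f"Classify the sentiment of this document:\n{document}"
        )
        # Step 3: everything gathered so far goes to the LLM in one request.
        combined = (
            f"Document:\n{document}\n\n"
            f"Extracted data:\n{extracted}\n\n"
            f"Sentiment:\n{sentiment}\n\n"
            f"Web research:\n{web_research}\n\n"
            "Provide an overall analysis."
        )
        analysis = self.llm_analyze(combined)
        return {"extracted": extracted, "sentiment": sentiment, "analysis": analysis}
```

Keeping the models behind plain callables means each stage can be swapped independently, for example replacing one SLM with a fine-tuned variant without touching the LLM step.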

Original Description

In part 2 of this 5-part video series on Small Language Models with Arcee AI, Andrew Walko and Nolan Chen discuss common use cases for SLMs and LLMs. Learn more - http://go.aws/4laWv7r

The video explains the differences between SLMs and LLMs, and how to choose the right model for specific use cases, including chatbots, data labeling, and sentiment analysis. It also explores hybrid approaches that combine SLMs and LLMs for more complex tasks.

Key Takeaways

Identify the use case for the language model
Determine the required complexity and latency
Choose between SLM and LLM based on the use case
Fine-tune the model for domain-specific tasks
Consider a hybrid approach for complex tasks
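The takeaways above can be read as a rough decision rule. The sketch below is one possible encoding of it, not a rule stated in the video: the `UseCase` fields, the `choose_model` function, and the 1000-document threshold (loosely based on the "thousands of documents" remark) are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    complex_reasoning: bool    # e.g. research-grade or highly complex analysis
    context_documents: int     # number of documents sent with a single request
    has_simple_subtasks: bool  # e.g. extraction, tagging, sentiment analysis


def choose_model(uc: UseCase, large_context_threshold: int = 1000) -> str:
    """Return "slm", "llm", or "hybrid" for a use case (illustrative heuristic only)."""
    needs_llm = (
        uc.complex_reasoning or uc.context_documents >= large_context_threshold
    )
    if needs_llm and uc.has_simple_subtasks:
        # Simple steps go to SLMs; the final, complex step goes to an LLM.
        return "hybrid"
    if needs_llm:
        return "llm"
    # Chatbots with RAG, labeling, and sentiment analysis fall through to here.
    return "slm"
```

For example, a RAG chatbot that sends a handful of passages per request maps to `"slm"`, while a document workflow that ends in a complex overall analysis maps to `"hybrid"`.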

💡 SLMs and LLMs have different strengths and weaknesses, and the choice between them depends on the specific use case and requirements.
