Zero-shot Prompting Explained

Elvis Saravia · Beginner ·✍️ Prompt Engineering ·2y ago

Key Takeaways

The video explains the concept of zero-shot prompting in large language models, demonstrating how models like GPT-3.5 Turbo can perform tasks like sentiment analysis without fine-tuning or providing examples. It highlights the importance of instruction tuning and the potential need for demonstrations and examples in real-world applications.

Full Transcript

Hi everyone. In this video I want to talk a little bit about zero-shot prompting. When we are using these large language models, like GPT-3.5 Turbo and the latest GPT-4, or Claude, or any of these language models that have been trained and are great at performing all sorts of tasks, as we saw in the previous video, typically the way we prompt these models is by an approach or method called zero-shot prompting.

Now, what do we mean by zero-shot prompting? Here's an example to illustrate what we mean by that, and I will explain in a minute what that actually entails. I'm going to take this prompt, which I already tested and demonstrated in the previous recording, where we talked about some examples of prompting. This was a text classification example. I'm going to take it to the playground, because it's easier to show there and to demonstrate how it works with the GPT-3.5 Turbo model.

This one is doing what we refer to as sentiment analysis; you can also call it sentiment classification. The idea with this task is that you pass the model some input, and the model predicts the sentiment: whether it's neutral, negative, or positive. Just to improve this prompt a bit, I'm going to add this here: this is a prompt that is meant to classify text. The model will understand that this is that type of task just by looking at the structure and the way I have designed this system prompt, and also by the use of this output indicator, whose importance I mentioned in a previous guide.

You can see here that the model predicted this to be neutral, which is the correct label, or the correct class, for this particular input that we have here. So that looks to be working. The question is: how does this model know that it should perform this particular task and classify the input text here into one of these labels? How does it have knowledge and understanding of this task?

The reason is that this model has been trained on large-scale web data, but it has also been trained on all sorts of datasets out there that might already have examples of something that looks like sentiment classification. There are tons of datasets out there, and there's a lot of content out there that might already have this structure, so the model, out of the box, kind of understands how to perform the task. For this task you might not need to do what we refer to as fine-tuning, or tune the model to perform this task well.

So at first glance we see that the model is performing really well. We see that the assistant sent us this "neutral", and it looks to be working okay. You can test it out by trying a different input here. I'm just going to try "I am feeling excited today". I'm going to run it again, and you can see that this one is positive. So you can see that this model does have some knowledge of this particular task; it knows the sentiment that this input text is eliciting. That's very good to see.

Now, I'm trying different examples here, but in reality, as a developer or as a researcher, you may need to put together large datasets to evaluate whether this model is doing it correctly. For now this is out of scope, but it is something we are going to discuss in a later video. We will be publishing something about fine-tuning later down the road, and we will also be using this particular use case. Text classification is a very popular use case, and we will share how we try different types of prompting techniques and how that compares with something like fine-tuning.

I want to go back here. I did mention in this guide a really important resource that discusses this idea of instruction tuning. With instruction tuning, basically, you need something like prompt-response or input-response pairs, where you're training the model so that when it sees those inputs, it is going to have a certain type of response. So if you're fine-tuning these models and the model has seen something that looks quite similar to this type of task, it will have an understanding of how to perform the task.

A lot of these models have those zero-shot capabilities that we can leverage, and that's really key and important for how we use these models today. If you use something like ChatGPT, when you go there you're not thinking, "Oh, I need to provide the model additional knowledge, or provide the model examples of how to perform the task." No, you go there, and essentially what you expect as a user is for the model to be able to perform that task really well.

However, I must say that in reality a lot of real-world applications of large language models require you to put together demonstrations to steer the model toward the results that you want to see. For that we have what we refer to as few-shot in-context learning, or few-shot prompting, and that's something we will be discussing in a future video as well. Look forward to that; it will be an interesting one that we will share with some examples.

So that's the idea of zero-shot. Here you can see that I am not really providing the model any examples. What would it look like if I were providing examples? Again, we will discuss this in a future video, but for now you can see that this model potentially has the capability to do this type of text classification.

In fact, if you go back to the examples that we shared in the previous guide, you will see all these foundational tasks that we ask the model to perform, like text summarization, information extraction, and question answering. If you look at these examples, you will see that these are all zero-shot prompts, in the sense that we're not really giving the model any examples of how to perform the task. We're just telling it, "Here is a piece of text, do something with it": summarize the text, extract information, and so on. We just expect the model to do it really well.

The good thing is that a lot of researchers are working really hard for these models to be able to perform really well in the zero-shot setting. Realistically speaking, today it is the case that for some tasks, at least the more common tasks, it will work. For a lot of things like information extraction, the model might be able to do the task in a zero-shot setting. But in a lot of cases in the real world, when you're deploying models and so on, you may need to consider adding demonstrations and examples to better steer the model to get the results that you really want for your task.

So that's a little bit about zero-shot prompting. Hopefully that clarifies a little bit what it is. If you enjoyed the video or found it useful, please leave a like and subscribe to the channel. We'll be posting a lot more new videos about all of these prompting techniques. If you have any questions about those, leave them in the comments. If you have any ideas for videos that you would like to see, or maybe a concept that needs further explanation, feel free to comment on that as well. I'll be looking at all of those and deciding which ones make sense to do a video on. That's it for today. Thank you so much for watching the video, and see you in the next one.
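As a rough illustration of the prompt structure described in the transcript (an instruction, the input text, and an output indicator, with no demonstrations), here is a minimal Python sketch. The exact prompt wording, the function names, and the `stub_complete` stand-in are assumptions made for this example; the video itself uses the playground rather than code, and a real model client would replace the stub. The small `accuracy` helper hints at the evaluation step the transcript mentions but leaves out of scope.

```python
from typing import Callable, Iterable, Optional, Tuple

# Labels named in the transcript's sentiment task.
LABELS = ("neutral", "negative", "positive")


def build_zero_shot_prompt(text: str) -> str:
    """Instruction + input + output indicator, with no demonstrations.

    The wording is an assumption for illustration, not the exact prompt
    shown in the video.
    """
    return (
        "Classify the text into neutral, negative, or positive.\n"
        f"Text: {text}\n"
        "Sentiment:"
    )


def parse_label(reply: str) -> Optional[str]:
    """Map a free-form reply onto a known label, or None if nothing matches."""
    cleaned = reply.strip().lower()
    for label in LABELS:
        if cleaned.startswith(label):
            return label
    return None


def accuracy(
    complete: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of (text, gold_label) pairs the model classifies correctly."""
    pairs = list(examples)
    if not pairs:
        return 0.0
    hits = sum(
        parse_label(complete(build_zero_shot_prompt(text))) == gold
        for text, gold in pairs
    )
    return hits / len(pairs)


def stub_complete(prompt: str) -> str:
    """Hypothetical stand-in for a model call; it is NOT a language model.

    Swap in a real client call here when testing against an actual model.
    """
    return "Positive" if "excited" in prompt else "Neutral"


if __name__ == "__main__":
    print(build_zero_shot_prompt("I am feeling excited today"))
    print(parse_label(stub_complete(build_zero_shot_prompt("I am feeling excited today"))))
```

Keeping the prompt builder separate from the model call makes it straightforward to compare this zero-shot prompt later against a variant that includes demonstrations, using the same labeled examples.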

Original Description

To learn how to build with LLMs, check out my new courses here: https://dair-ai.thinkific.com/ Use code YOUTUBE20 to get an extra 20% off. The discount is limited to the first 500 students so make sure to enroll early. --- In this video, I explain the idea behind zero-shot prompting, what enables it, and how it can be used with LLMs. More in our guide: https://www.promptingguide.ai/techniques/zeroshot #ai #llms #promptengineering #machinelearning #programming


This video teaches the concept of zero-shot prompting and its applications in large language models, highlighting the importance of prompt engineering and instruction tuning. It demonstrates how to use zero-shot prompting for tasks like sentiment analysis and text classification.

Key Takeaways
  1. Understand the concept of zero-shot prompting
  2. Learn how to design effective zero-shot prompts
  3. Apply zero-shot prompting to tasks like sentiment analysis and text classification
  4. Experiment with different prompt engineering techniques
  5. Consider the need for fine-tuning and demonstrations in real-world applications
💡 Zero-shot prompting can be a powerful tool for leveraging the capabilities of large language models, but it may require careful prompt engineering and instruction tuning to achieve desired results.
