Zero-shot Prompting Explained
Key Takeaways
The video explains zero-shot prompting in large language models, showing how a model like GPT-3.5 Turbo can perform a task such as sentiment analysis without fine-tuning and without any examples in the prompt. It highlights the role of instruction tuning in enabling this, and notes that real-world applications often still need demonstrations and examples to steer the model.
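As a rough illustration of the takeaway above, here is a minimal sketch of how a zero-shot sentiment prompt could be assembled. The function name `build_zero_shot_prompt` and the exact instruction wording are assumptions made for illustration; the video only establishes that the prompt has an instruction, the input text, and an output indicator, with no examples.

```python
# Minimal sketch of a zero-shot sentiment prompt (names and wording are illustrative).
LABELS = ("neutral", "negative", "positive")


def build_zero_shot_prompt(text: str) -> str:
    """Build a prompt with an instruction, the input, and an output indicator.

    No demonstrations are included, which is what makes it zero-shot.
    """
    instruction = (
        "Classify the text into " + ", ".join(LABELS[:-1]) + " or " + LABELS[-1] + "."
    )
    # "Sentiment:" is the output indicator: it signals where the label should go.
    return f"{instruction}\nText: {text}\nSentiment:"


prompt = build_zero_shot_prompt("I am feeling excited today")
print(prompt)
```

The resulting string would be sent to whichever model you are using; the call itself is left out because it depends on the provider.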
Full Transcript
Hi everyone. In this video I want to talk a little bit about zero-shot prompting. When we are using large language models like GPT-3.5 Turbo, the latest GPT-4, Claude, or any of these language models that have been trained and are great at performing all sorts of tasks, as we saw in the previous video, typically the way we prompt these models is by an approach or method called zero-shot prompting.
Now, what do we mean by zero-shot prompting? Here's an example to illustrate it, and I will explain in a minute what it actually entails. I'm going to take this prompt, which I already tested and demonstrated in the previous recording where we talked about some examples of prompting. This was a text classification example. It's easier to show in the playground and demonstrate how it works with the GPT-3.5 Turbo model. This one is doing what we refer to as sentiment analysis; you can also call it sentiment classification. The idea with this task is that you pass the model some input, and the model predicts the sentiment: neutral, negative, or positive.
Just to improve this prompt a bit, I'm going to add this here: it's a prompt that is meant to classify text. The model will understand that this is that type of task just by looking at the structure and the way I have designed this system prompt, and also by the use of this output indicator, whose importance I mentioned in a previous guide. You can see here that the model predicted this to be neutral, which is the correct label, or the correct class, for this particular input. So that looks to be working.
So the question is, how does this model know that it should perform this particular task and classify the input text into one of these labels? How does it have knowledge and understanding of this task? The reason is that this model has been trained on large-scale web data, but it has also been trained on all sorts of datasets out there that might already have examples of something that looks like sentiment classification. There are tons of datasets and a lot of content out there that might already have this structure, so the model out of the box kind of understands how to perform the task. For this task you might not need to do what we refer to as fine-tuning, that is, tuning the model to perform this task well.
At first glance, we see that the model is performing really well. The assistant sent us "neutral", so it looks to be working okay. You can test it out by trying a different input. I'm going to try "I am feeling excited today" and run it again, and you can see that this one is positive. So the model does have some knowledge of this particular task; it knows the sentiment that this input text is eliciting. That's very good to see.
Now, I'm trying different examples here, but in reality, as a developer or a researcher, you may need to put together large datasets to evaluate whether this model is doing it correctly. For now this is out of scope, but it is something we are going to discuss in a later video. We will be publishing something about fine-tuning later down the road, and we will also be using this particular use case. Text classification is a very popular use case, and we will share how we try different types of prompting techniques and how that compares with something like fine-tuning.
I want to go back here. I did mention in this guide a really important resource that discusses this idea of instruction tuning. With instruction tuning, you basically need something like prompt-response or input-response pairs, where you're training the model so that when it sees those inputs, it is going to have a certain type of response. So if you're fine-tuning these models and the model has seen something that looks quite similar to this type of task, it will have an understanding of how to perform the task. A lot of these models have those zero-shot capabilities that we can leverage, and that's really key for how we use these models today. If you use something like ChatGPT, when you go there you're not thinking, "Oh, I need to provide the model additional knowledge, or provide the model examples of how to perform the task." No, you go there, and what you expect as a user is for the model to be able to perform that task really well.
However, I must say that in reality a lot of real-world applications of large language models require you to put together demonstrations to better steer the model toward the results you want to see. For that we have what we refer to as few-shot in-context learning, or few-shot prompting, and that's something we will also be discussing in a future video. So look forward to that; it will be an interesting one that we will share with some examples as well.
So that's the idea of zero-shot. Here you can see that I am not providing the model any examples. What would it look like if I were providing examples? Again, we will discuss this in a future video, but for now you can see that this model potentially has the capability to do this type of text classification. In fact, if you go back to the examples we shared in the previous guide, you will see all these foundational tasks that we ask the model to perform, like text summarization, information extraction, and question answering. If you look at these examples, you will see that they are all zero-shot prompts, in the sense that we're not giving the model any examples of how to perform the task. We're just telling it, "Here is a piece of text, do something with it": summarize the text, extract information, and so on. We just expect the model to do it really well.
The good thing is that a lot of researchers are working really hard for these models to perform well in the zero-shot setting. Realistically speaking, today it is the case that for some tasks, at least the more common tasks, it will work. For a lot of things like information extraction, the model might be able to do the task in a zero-shot setting. But in a lot of cases in the real world, when you're deploying models and so on, you may need to consider adding demonstrations and examples to better steer the model and get the results that you really want for your task.
So that's a little bit about zero-shot prompting. Hopefully that clarifies a little bit what it is. If you enjoyed the video or found it useful, please give it a like and subscribe to the channel. We'll be posting a lot more new videos about all of these prompting techniques. If you have any questions about those, leave them in the comments. If you have any ideas for videos that you would like to see, or maybe a concept that needs further explanation, feel free to comment on that too. I'll be looking at all of those and deciding which ones make sense to do a video on. That's it for today. Thank you so much for watching the video, and see you in the next one.
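The transcript notes that trying a few inputs by hand is not enough and that a labeled dataset is needed to check whether zero-shot classification is really working. A minimal sketch of such a check follows. Everything here is an assumption for illustration: `fake_model` is a hypothetical stand-in for a real model call, and the three labeled rows are toy examples invented for this sketch (only the first input comes from the video), not results from any real evaluation.

```python
from typing import Callable, Iterable, Tuple

LABELS = {"neutral", "negative", "positive"}


def parse_label(completion: str) -> str:
    """Normalize a raw completion to one of LABELS, or 'unknown' if it does not match."""
    cleaned = completion.strip().lower().rstrip(".")
    return cleaned if cleaned in LABELS else "unknown"


def accuracy(classify: Callable[[str], str], dataset: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (text, gold_label) pairs where the parsed prediction equals the gold label."""
    examples = list(dataset)
    if not examples:
        raise ValueError("dataset must contain at least one example")
    correct = sum(parse_label(classify(text)) == gold for text, gold in examples)
    return correct / len(examples)


def fake_model(text: str) -> str:
    # Hypothetical stand-in for sending a zero-shot prompt to a real LLM.
    # It deliberately returns messy casing and punctuation to exercise parse_label.
    return " Positive." if "excited" in text.lower() else "Neutral"


# Toy labeled examples, made up for illustration only.
dataset = [
    ("I am feeling excited today", "positive"),
    ("The meeting is at 3 pm.", "neutral"),
    ("This was a terrible experience.", "negative"),
]

score = accuracy(fake_model, dataset)
print(f"zero-shot accuracy on toy data: {score:.2f}")
```

In practice `fake_model` would be replaced by a function that builds the zero-shot prompt and calls the model, and the same `accuracy` function could then be reused to compare zero-shot prompts against prompts that include demonstrations.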
Original Description
To learn how to build with LLMs, check out my new courses here: https://dair-ai.thinkific.com/
Use code YOUTUBE20 to get an extra 20% off. The discount is limited to the first 500 students so make sure to enroll early.
---
In this video, I explain the idea behind zero-shot prompting, what enables it, and how it can be used with LLMs.
More in our guide: https://www.promptingguide.ai/techniques/zeroshot
#ai #llms #promptengineering #machinelearning #programming
Playlist
Uploads from Elvis Saravia · Elvis Saravia · 35 of 60