GPT-3 explained with examples. Possibilities, and implications.
Key Takeaways
The video explains what GPT-3 is, what it can do, and what its implications are. It covers few-shot learning, text completion, a showcase of community-built applications, and open questions about training data, compute, and energy use.
Full Transcript
[Laughter] [Music]

Hello! Today we are going to discuss a very trendy topic. It has made the news and it fills up my Twitter feed these days: the topic is OpenAI's GPT-3. In this video we will discuss three things. What is GPT-3? What can GPT-3 do? We will have some cool apps and examples. And how much do we know, or not know, about GPT-3, and what might its potential problems and limitations be?

Let's start with the first question: what is GPT-3? No, GPT-3 is not a time zone. GPT stands for Generative Pre-trained Transformer, and it is more or less a very cool and sophisticated language, or sequence, continuator. The word "Transformer" refers to the Transformer of 2017, invented by Google. 2018's GPT uses the Transformer modules and has a total of 110 million parameters, but 2019's GPT-2 brought 1.5 billion parameters to the game. This GPT language model, trained on next-word prediction, can generate continuations of text so well that OpenAI was afraid to release the model, worried that it might be used for generating deepfakes and false news texts. Well, this is 2020, and this year has brought us many things, but it has also given us GPT-3, which has no less than 175 billion parameters. What will 2021 bring us?

The GPT-3 architecture is the same as that of its predecessors, GPT-2 and GPT, only bigger. Does size matter, though? Well, kind of. With many parameters the model can solve more complicated problems, but with many parameters one should also train on larger and larger amounts of training data to avoid overfitting.

Do you remember science fiction movies where the AI can finally connect to the internet and becomes extremely powerful, reaching the singularity and surpassing human intelligence? Well, we are not in that situation yet, but we have to acknowledge the sheer amount of data the 175 billion parameters were trained on. GPT-3 has more or less seen the whole internet, and this has implications we do not grasp yet.

So what can GPT-3 do after it has seen so much? This leads us to the next part of the video. As the GPT-3 paper already says in its title, GPT-3 is very good at few-shot learning. Few-shot means that only very few training examples are needed to get good performance on the task. When we were younger... who are we kidding, this was a few weeks ago. So again: when we were only a little bit younger, we had to fine-tune pre-trained models like BERT or GPT-2 on not-so-small data sets with task-specific annotation. Nowadays, results similar to fine-tuning can be achieved just by giving GPT-3 a few task-specific examples. GPT-3 seems to do it all. Or can it, though? We will discuss this towards the end of the video. For now, let's see where GPT-3 shines.

While the paper discussed benchmarks, or not-so-benchmark cases, we will see here what the community on Twitter could do with the GPT-3 API. If you are interested in playing with it yourself, you can request access and hope you get it.

Few-shot is the first topic in our showcase, where Michael posts about his successful attempt to convert legal text into commoners' English. After being shown only two examples, GPT-3 was able to create more results of the same kind. See for yourself.

Well, you could say that all this is a very good pattern-matching algorithm, which it more or less is. So here is an example of another kind of pattern matching, where Paul Katsen created a spreadsheet function with GPT-3 which can look up state populations, Twitter usernames, and employers, and do math. Needless to say, AI before GPT-3 could do these things, but only after being trained on lots of such data, and it could hardly do all the things enumerated here at once.

Not convinced yet? We have more. Yash built a bot with GPT-3 that generates financial statements. Just amazing. Unsurprisingly, since it is a language model, GPT-3 is perfect for text completion, like generating automated answers to emails, so you do not have to write those yourself anymore. Do you? Emails are perhaps not surprising. What about poetry, preserving verse and ensuring rhymes?

I am not so happy about this example in particular, since I hoped coding was one of the last things AI would do, but here it is: check out these examples by Sharif Shameem, where GPT-3 generates code from just descriptions of what one wants to do.

What about chatbots with impersonation skills? If you always wanted to chat with Elon Musk but he was too busy to chat with you, you can chat with GPT-3 impersonating him instead. Many deepfake storms ahead.

This beautiful example used Google to extract the text from an image and then processed the text with GPT-3 to extract ingredients, find an emoji, determine whether it is unhealthy, and give a definition. OMG, GPT-3 can also be a search engine if you want. Are there any limits? One more example, this time a funny one, where Merzmensch Kosmopol asked GPT-3 about God. He certainly has no questions anymore.

Now let us speak about the elephant in the room, coming to our last question: what if this few-shot behavior is just happening because GPT-3 has seen it all during training, and the tasks we come up with are not so new to the model at the end of the day? Well, we just do not know. And Emily Bender just said it: "I don't find anything linguistically interesting about massive language models, especially without detailed information about their training data." She has a point here. In particular, we need to know what all of the training data was when thinking about harmful biases, like in this example with really unpleasant results.

The huge amount of data also means a lot of compute power used during training. The GPT-3 paper says: practical large-scale pre-training requires large amounts of computation, which is energy-intensive; training the GPT-3 consumed several thousand petaflop/s-days of compute during pre-training, compared to tens of petaflop/s-days for a smaller GPT-2 model. This is a problem for normal researchers, especially for academia in many countries where compute resources are rather scarce.

The second problem is, of course, the environmental impact of the energy consumption during long training sessions. The OpenAI paper brings a defense point about energy consumption. The paper says: though models like GPT-3 consume significant resources during training, they can be surprisingly efficient once trained; even with the full GPT-3, generating 100 pages of content from a pre-trained model can cost on the order of 0.4 kilowatt-hours, or only a few cents in energy costs. This is a fair point, but we ask ourselves: what if this training of bigger and bigger models on even more training data becomes the trend of the future, and training and developing huge models that consume lots of energy just goes on and on?

GPT-3 performs so well, as seen in the Twitter examples or in the GPT-3 paper on sentence generation, machine translation, and so on, that it is normal that the model seems very, very attractive to large-scale industry, and it seems like the future will have a lot of applications with GPT-3 built into them. Well, I do not want to make anyone worry about this, but the GPT-3 paper also shows that the model is very susceptible to adversarial attacks. These attacks show how problematic it might be if GPT-3, deployed somewhere, is targeted with malicious adversarial attacks. More about adversarial attacks in the next video.

Well, a lot is still to be investigated here, and I hope that OpenAI will tell us more about the data it has trained the model on, for a start. If OpenAI becomes more open and grants wider access to the model, the AI research landscape will certainly change. Just an example: during the last years, the most frequent and probably the most annoying paper reviewer question was "Why didn't you use BERT?" Ms. Coffee Bean is already preparing herself for the next reviewer question at conferences: "Why didn't you use GPT-3?"

Anyway, until we know more, Ms. Coffee Bean seems to agree with this opinion here, which in our translation sounds like this: GPT is like a small child that looked at the whole internet, does not understand anything, but can repeat everything. And [name unclear] also delivers an example where, if asked in a certain manner, GPT-3 responds that the sun has an eye, and a blade of grass has an eye too. So we have seen with this example that this child does not have a lot of common sense. Only that this child can solve what AI has been trying to solve for decades, and this child will become some kind of an oracle, I'm afraid. Well, these are certainly interesting times to be alive.
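The few-shot setup described in the transcript, where a couple of worked examples are shown and the model is left to continue the pattern, can be sketched as plain prompt construction. This is only an illustrative sketch: the helper `build_few_shot_prompt`, the labels, and the example sentences are all hypothetical and not taken from the video or the paper, and no call to the GPT-3 API is made.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str]],
    query: str,
    input_label: str = "Legal text",
    output_label: str = "Plain English",
) -> str:
    """Concatenate worked examples and a new input into one prompt string."""
    blocks = []
    for source, target in examples:
        blocks.append(f"{input_label}: {source}\n{output_label}: {target}")
    # The last block leaves the output empty, so a text-continuation
    # model would be expected to fill it in by following the pattern.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


# Hypothetical examples, invented purely for illustration.
examples = [
    ("The lessee shall remit payment on the first day of each month.",
     "The renter must pay on the first of every month."),
    ("The parties hereto agree to indemnify one another.",
     "Both sides agree to cover each other's losses."),
]

prompt = build_few_shot_prompt(
    examples, "The licensor reserves all rights not expressly granted."
)
print(prompt)
```

The resulting string would be sent as the text to continue. The point of the sketch is that the "training" here is just two examples placed in the input, with no fine-tuning step.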
Original Description
What is going on in AI research lately? GPT-3 crashed the party, let's see what it is and what it can do. Hoping we do not forget how problematic it might also become.
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/
Outline:
* 00:00 What is GPT-3?
* 02:45 What can GPT-3 do? A Twitter Showcase
* 07:18 How much do we know about GPT-3?
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, pay us a coffee to boost our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
GPT-3 Paper 📄: Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan et al. "Language models are few-shot learners." arXiv preprint arXiv:2005.14165 (2020). https://arxiv.org/pdf/2005.14165.pdf
🎵 Music: Glitch by Audionautix is licensed under a Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/)
Artist: http://audionautix.com/
✍️ Arabic Subtitles by Ali Haidar Ahmad https://www.linkedin.com/in/ali-ahmad-0706a51bb/ .
🔗 Links:
YouTube: https://www.youtube.com/channel/UCobqgqE4i5Kf7wrxRxhToQA/
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
#AICoffeeBreak #OpenAI #GPT3 #gpt #MsCoffeeBean #MachineLearning #AI #research