GPT-3 explained with examples. Possibilities, and implications.

AI Coffee Break with Letitia · Beginner ·📄 Research Papers Explained ·5y ago


Key Takeaways

The video explains what GPT-3 is, what it can do, and what its implications are. It covers few-shot learning, text completion, and potential applications built on OpenAI's Transformer-based GPT-3 model.

Full Transcript

[Laughter] [Music]

Hello! Today we are going to discuss a very trendy topic. It has made the news and it fills up my Twitter feed these days. The topic is OpenAI's GPT-3. In this video we will discuss three things: What is GPT-3? What can GPT-3 do? (We will have some cool apps and examples.) And how much do we know, or not know, about GPT-3, and what might the potential problems and limitations be?

Let's start with the first question: what is GPT-3? No, GPT-3 is not a time zone. GPT stands for Generative Pre-trained Transformer and is more or less a very cool and sophisticated language, or sequence, continuator. The word "Transformer" refers to the Transformer of 2017, invented by Google. 2018's GPT uses the Transformer modules and has a total of 110 million parameters, but 2019's GPT-2 brought 1.5 billion parameters to the game. The GPT-2 language model, trained on next-word prediction, can generate continuations of text so well that OpenAI was afraid to release the model, worried that it might be used for generating deep fakes and fake news. Well, this is 2020, and this year has brought us many things, but it has also given us GPT-3, which has no fewer than 175 billion parameters. What will 2021 bring us?

The GPT-3 architecture is the same as that of its predecessors GPT-2 and GPT, only bigger. Does size matter, though? Well, kind of. With many parameters the model can solve more complicated problems, but with many parameters one should also train on larger and larger amounts of training data to avoid overfitting. Do you remember science fiction movies where the AI can finally connect to the internet and becomes extremely powerful, reaching the singularity and surpassing human intelligence? Well, we are not in that situation yet, but we have to acknowledge the sheer amount of data the 175 billion parameters were trained on. GPT-3 has more or less seen the whole internet, and this has implications we do not grasp yet.

So what can GPT-3 do after it has seen so much? This leads us to the next part of the video. As the GPT-3 paper already says in its title, GPT-3 is very good at few-shot learning. Few-shot means that only very few training examples are needed to get good performance on the task. When we were younger (who are we kidding, this was a few weeks ago), so again, when we were only a little bit younger, we had to fine-tune pre-trained models like BERT or GPT-2 on not-so-small data sets with task-specific annotation. Nowadays, results similar to fine-tuning can be achieved just by giving GPT-3 a few task-specific examples. GPT-3 seems to do it all. Or can it, though? We will discuss this towards the end of the video.

For now, let's see where GPT-3 shines. While the paper discusses benchmarks, or not-so-benchmark cases, we will see here what the community on Twitter could do with the GPT-3 API. If you are interested in playing with it yourself, you can request access and hope you get it.

Few-shot is the first topic in our showcase, where Michael posts about his successful attempt to convert legal text into commoner's English. By showing only two examples to GPT-3, the model was able to create even more similar results. See for yourself. Well, you could say that all this is a very good pattern matching algorithm, which it more or less is. So here is an example of another kind of pattern matching, where Paul Katsen created one spreadsheet function with GPT-3 which can look up state populations, Twitter usernames and employers, and do math. Needless to say, AI before GPT-3 could do these things, but only after being trained on lots of such data, and could hardly do all the things enumerated here at once.

Not convinced yet? We have more. Yash built a bot with GPT-3 that generates financial statements. Just amazing. Unsurprisingly, since it is a language model, GPT-3 is perfect for text completion, like generating automated answers to emails, so you do not have to write those yourself anymore, do you? Emails are perhaps not surprising. What about poetry, preserving verse and ensuring rhymes?

I am not so happy about this example in particular, since I hoped coding was one of the last things AI would do, but here it is: check out these examples by Sharif Shameem, where GPT-3 generates code from just descriptions of what one wants to do. What about chatbots with impersonation skills? If you always wanted to chat with Elon Musk but he was too busy to chat with you, you can chat with GPT-3 impersonating him instead. Many deep fake storms ahead. This beautiful example used Google to extract the text from an image and then processed the text with GPT-3 to extract ingredients, find an emoji, determine if it's unhealthy, and give a definition. OMG, GPT-3 can also be a search engine if you want. Are there any limits? One more example, this time a funny one, where "merc's mensch cosmopol" [name unclear] asked GPT-3 about God. He certainly has no questions anymore.

Now, let us speak about the elephant in the room, coming to our last question: what if this few-shot behavior is just happening because GPT-3 has seen it all during training, and the tasks we come up with are not so new to the model at the end of the day? Well, we just do not know. And Emily Bender just said it: "I don't find anything linguistically interesting about massive language models, especially without detailed information about their training data." She has a point here. In particular, we need to know what all of the training data was when thinking about harmful biases, like in this example with really unpleasant results.

The huge amount of data also means a lot of compute power used during training. The GPT-3 paper says that practical large-scale pre-training requires large amounts of computation, which is energy-intensive: training GPT-3 consumed several thousand petaflop/s-days of compute during pre-training, compared to tens of petaflop/s-days for a smaller GPT-2 model. This is a problem for normal researchers, especially for academia in many countries where compute resources are rather scarce.

The second problem is, of course, the environmental impact of the energy consumption during long training sessions. The OpenAI paper brings a defense point about energy consumption. The paper says that though models like GPT-3 consume significant resources during training, they can be surprisingly efficient once trained: even with the full GPT-3, generating 100 pages of content from a pre-trained model can cost on the order of 0.4 kilowatt-hours, or only a few cents in energy costs. This is a fair point, but we ask ourselves: what if training bigger and bigger models on even more training data becomes the trend of the future, and training and developing huge models that consume lots of energy just goes on and on?

GPT-3 performs so well, as seen in the Twitter examples or in the GPT-3 paper, on sentence generation, machine translation and so on, so it's normal that the model seems very, very attractive to large-scale industry, and it seems like the future will have a lot of applications with GPT-3 built into them. Well, I do not want to make anyone worry about this, but the GPT-3 paper also shows that the model is very susceptible to adversarial attacks. These attacks show how problematic it might be if GPT-3, deployed somewhere, is targeted with malicious adversarial attacks. More about adversarial attacks in the next video.

Well, a lot is still to be investigated here, and I hope that OpenAI will tell us more about the data it has trained the model on, for a start. If OpenAI becomes more open and grants wider access to the model, the AI research landscape will certainly change. Just an example: during the last years, the most frequent and probably the most annoying paper reviewer question was "Why didn't you use BERT?" Ms. Coffee Bean already prepares herself for the next reviewer question at conferences: "Why didn't you use GPT-3?"

Anyway, until we know more, Ms. Coffee Bean seems to agree with this opinion here, which in our translation sounds like this: GPT is like a small child that looked at the whole internet, does not understand anything, but can repeat everything. And "yearn help good" [name unclear] also delivers an example where, if asked in a certain manner, GPT-3 responds that the sun has an eye and a blade of grass has an eye too. So we have seen with this example that this child does not have a lot of common sense, only that this child can solve what AI has been trying to solve for decades, and this child will become some kind of an oracle, I'm afraid. Well, these are certainly interesting times to be alive.
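To make the few-shot idea from the showcase more concrete, here is a minimal sketch of how such a prompt could be assembled as plain text: two worked examples followed by an open-ended query for the model to continue. The helper name `build_few_shot_prompt`, the labels, and the example sentences are all made up for illustration; they are not taken from the tweet shown in the video, and no API call is made here.

```python
# A minimal sketch of a few-shot prompt, assuming a plain-text completion model.
# build_few_shot_prompt is a hypothetical helper; the example pairs are invented.

def build_few_shot_prompt(examples, query,
                          source_label="Legal text",
                          target_label="Plain English"):
    """Join (source, target) example pairs and leave the last target open."""
    blocks = []
    for source, target in examples:
        blocks.append(f"{source_label}: {source}\n{target_label}: {target}")
    # The final block has no target, so the model is asked to continue from here.
    blocks.append(f"{source_label}: {query}\n{target_label}:")
    return "\n\n".join(blocks)


examples = [
    ("The lessee shall remit payment on the first day of each month.",
     "The tenant must pay rent on the first of every month."),
    ("The party of the first part shall indemnify the party of the second part.",
     "The first party will cover the second party's losses."),
]

prompt = build_few_shot_prompt(
    examples,
    "This agreement shall terminate upon thirty days' written notice.",
)
print(prompt)
```

The resulting string would be sent to the language model as the text to continue. Nothing is fine-tuned: the two examples inside the prompt are the only task-specific data the model receives.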

Original Description

What is going on in AI research lately? GPT-3 crashed the party, let's see what it is and what it can do. Hoping we do not forget how problematic it might also become.

➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

Outline:
* 00:00 What is GPT-3?
* 02:45 What can GPT-3 do? A Twitter Showcase
* 07:18 How much do we know about GPT-3?

🔥 Optionally, pay us a coffee to boost our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak

GPT-3 Paper 📄: Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan et al. "Language models are few-shot learners." arXiv preprint arXiv:2005.14165 (2020). https://arxiv.org/pdf/2005.14165.pdf

🎵 Music: Glitch by Audionautix is licensed under a Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/) Artist: http://audionautix.com/

✍️ Arabic Subtitles by Ali Haidar Ahmad https://www.linkedin.com/in/ali-ahmad-0706a51bb/

🔗 Links:
YouTube: https://www.youtube.com/channel/UCobqgqE4i5Kf7wrxRxhToQA/
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/

#AICoffeeBreak #OpenAI #GPT3 #gpt #MsCoffeeBean #MachineLearning #AI #research

This video introduces GPT-3, a powerful language model with 175 billion parameters, and explores its capabilities, such as few-shot learning, text completion, and potential applications, including chatbots, poetry generation, and code generation. The video also discusses the implications of GPT-3 and its potential impact on the AI research landscape.
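To put the numbers quoted in the video side by side, here is a small back-of-the-envelope sketch. The parameter counts and the 0.4 kWh figure are the ones stated in the transcript; the electricity price is purely an assumption chosen for illustration, and the variable and function names are hypothetical.

```python
# Back-of-the-envelope arithmetic using the figures quoted in the video.
GPT_PARAMS = 110e6    # 2018's GPT, as stated in the transcript
GPT2_PARAMS = 1.5e9   # 2019's GPT-2
GPT3_PARAMS = 175e9   # 2020's GPT-3


def growth(old, new):
    """Return how many times larger `new` is than `old`."""
    return new / old


print(f"GPT   -> GPT-2: about {growth(GPT_PARAMS, GPT2_PARAMS):.1f}x more parameters")
print(f"GPT-2 -> GPT-3: about {growth(GPT2_PARAMS, GPT3_PARAMS):.1f}x more parameters")

# Inference energy quoted from the GPT-3 paper: about 0.4 kWh per 100 generated pages.
ENERGY_KWH_PER_100_PAGES = 0.4
ASSUMED_PRICE_USD_PER_KWH = 0.10  # assumption, not a figure from the video or paper

cost = ENERGY_KWH_PER_100_PAGES * ASSUMED_PRICE_USD_PER_KWH
print(f"Energy cost for 100 pages at the assumed price: about ${cost:.2f}")
```

Under the assumed price, the result lands in the "few cents" range the paper mentions. The training cost, which the video raises as the real concern, is not covered by this sketch.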

Key Takeaways

Understand the basics of GPT-3 and its architecture
Explore few-shot learning and its applications
Experiment with GPT-3 for text completion and other tasks
Evaluate the implications of GPT-3 on AI research and society
Compare few-shot prompting of GPT-3 with fine-tuning pre-trained models such as BERT or GPT-2

💡 GPT-3 has the potential to revolutionize natural language processing and machine learning, but its development and deployment require careful consideration of its implications and potential risks.
