Image Completion AI - Predict Pixels Just Like Text Predictions [Image-GPT]

bycloud · Advanced ·🧠 Large Language Models ·5y ago

Key Takeaways

The video discusses Image-GPT, an OpenAI model that completes images by predicting one pixel at a time, similar to how predictive text systems suggest the next word. It applies the same next-token prediction logic as GPT-style language models such as GPT-3, but to pixels instead of words, and can generate coherent, natural-looking completions.
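
As a rough illustration of the pixel-by-pixel idea, here is a minimal Python sketch of autoregressive completion. It is a toy stand-in, not Image-GPT itself: the real model is a large transformer that conditions on all previous pixels, while this sketch only counts which pixel value tends to follow the previous one. The names `train_bigram` and `complete`, the "R"/"B" color codes, and the 4 by 4 image size are all assumptions made up for the example.

```python
import random
from collections import Counter, defaultdict


def train_bigram(images):
    """Count, in raster order, which pixel value follows which.

    Toy assumption: the next pixel depends only on the previous one.
    Image-GPT conditions on the whole sequence of earlier pixels.
    """
    counts = defaultdict(Counter)
    for img in images:
        flat = [p for row in img for p in row]  # flatten rows left to right
        for prev, nxt in zip(flat, flat[1:]):
            counts[prev][nxt] += 1
    return counts


def complete(prefix, total_pixels, counts, rng):
    """Extend a known prefix one pixel at a time until the image is full."""
    if not prefix:
        raise ValueError("need at least one known pixel to start from")
    pixels = list(prefix)
    while len(pixels) < total_pixels:
        options = counts.get(pixels[-1])
        if not options:
            # Context never seen in training: fall back to repeating it.
            pixels.append(pixels[-1])
            continue
        values = list(options.keys())
        weights = list(options.values())
        # Sample the next pixel in proportion to how often it was observed.
        pixels.append(rng.choices(values, weights=weights, k=1)[0])
    return pixels


# Hypothetical training set: one solid red and one solid blue 4x4 image.
red = [["R"] * 4 for _ in range(4)]
blue = [["B"] * 4 for _ in range(4)]
counts = train_bigram([red, blue])
rng = random.Random(0)

# Top half is all red: the completion stays red.
print(complete(["R"] * 8, 16, counts, rng))
# Top half ends with a row of blue: the completion continues in blue.
print(complete(["R"] * 4 + ["B"] * 4, 16, counts, rng))
```

In this toy setup the visible pixels fully determine the continuation, which loosely mirrors the penguin example in the transcript: a red-only crop is completed in red, while leaving a line of blue pixels pushes the rest toward blue.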

Full Transcript

predictive text system, where it can accurately guess what you want to say when you are texting someone by suggesting a few words on top of your keyboard so you can easily finish your sentence. For some of them, it's trained upon a large amount of text beforehand to be able to learn what comes next after what you typed. Alright, predictive text. Well, what if you apply the same logic, where you predict or autocomplete the next pixel of an image? Just like when you keep tapping the predictive text and it creates a long and confusing paragraph, but it produces an image instead.

So you may have already heard of GPT-2 or even GPT-3, which are language models that can produce extremely convincing text that completes whatever you input into the AI model, or generate a coherent paragraph out of nowhere. So OpenAI made this language model where they dumped all the text on the internet into the model, gave it a super large number of parameters, which is like 175 billion, and brute-forced a near-perfect text generator that can basically do anything. I think OpenAI has completed one of the most important steps towards perfecting data science, which is basically brute forcing first. Like in math: in order for mathematicians to come up with a mathematical formula, it first requires them to brute force and explore the area to understand how things work in that math realm.

OK, so back to the topic of autocompleting or predicting pixels. OpenAI recently published a paper where they use the exact same logic as GPT-3 text learning, but on images instead. The concept is fairly straightforward: the AI basically looks at all the previous pixels and adds a new one based on them. Then it takes in the newly generated one too and generates the next pixel, and it just continues until it has generated a whole image from the input. Right now, apparently they only provide a 32 by 32 image size, so that's all we can admire for now. So how exactly can it predict a cropped image?
Generating natural views seems to be the most fun for this AI. If the crop is exactly where the birds are standing, the output has some really interesting results. Here's one where it even generated reflections too, which is crazy. The same thing goes for grassland: it generates all kinds of things that could potentially be in the grassland. Mount Rushmore has a body or legs sometimes, which is hilarious. I also trimmed another image of a bird, and it logically generates back the bird's tail and the branch, which is amazing. Removing the grounds of a castle also made some pretty interesting generations, and with the pixel art of a castle the AI gets really creative and comes up with all kinds of cool castle designs.

For simple logos like Photoshop's, it can sometimes generate back the letters P and S, but sometimes it just goes funny. The same thing happens with Google's logo: there are a few that are very close to spelling out Google, or just a lot of doodles, which is pretty funny. I tried various other logos too. The Firefox one did really well at generating back, but Apple's just kind of deformed.

So how much can the present pixels affect the generation? I use the Socially Awesome Awkward Penguin as an example. If you cut at the halfway point, without the presence of the blue pixels, it just completes the whole image in red, which is pretty logical. But if you leave a line of blue pixels, the following generations will mostly be blue too.

For some more fun examples, I tested some meme formats. It does not exactly produce a proper meme, but the AI seems to know what exactly the objective might be. Like, this is a human billboard. No, I mean penguin billboard. As for the pictures of animals or human faces, it runs really accurately, like the Shiba right here, you can see, or even the guy in the butterfly meme, who is now wearing a suit. What it's struggling with is generating Saitama's body. Yes, this is amazing. It has some problems generating from half of a face without making it look deformed, or probably it's my problem.

A guy on Twitter called Michael phrasing (hope I pronounced that right) had some more fascinating results than me. He was able to produce something longer, which I have no idea how to do. For example, the ocean one right here: I'm guessing this is a boat, and it is a really cool one in my opinion. It's like an extending view into the deep sea that was generated by Image-GPT. But the best results he had are on the human faces. According to him, it takes around one hour to complete each of these results.

In the end, not gonna lie, this is probably the best example of brute force out there. OpenAI literally is testing the limit of brute forcing, and this is going so well. I love it. If you want to play with this AI, I'll link a Colab down in the description, which is made by Alfredo. Join my Discord if you have any questions or just for fun, follow me on Twitter if you haven't, and now see you all next time. [Music]

Original Description

This is what happens when we dumped the entire internet's images into an AI, love this. Hope I can see more efficient or higher resolution results soon in the future~

Join my Discord: https://discord.gg/NhJZGtH
My Twitter: https://twitter.com/bycloudai
Support me on Patreon: https://www.patreon.com/bycloud

OpenAI's blog on Image-GPT: https://openai.com/blog/image-gpt/
Image-GPT Transformer Colab: https://colab.research.google.com/github/apeguero1/image-gpt/blob/master/Transformers_Image_GPT.ipynb#scrollTo=P9tObiU-qVgv

The pixel art of a castle is made by https://twitter.com/helzinko
Music: Steaminwaffles - Wait
Background Art: https://twitter.com/bynicalcynical

Playlist

Uploads from bycloud · bycloud · 12 of 60

1 Can Deepfake work on Anime?
2 AI that Can Copy Voices
3 Live Action Is Terrible So AI Turned It Back Into Anime
4 2 AIs Enhance Anime to 4K 240FPS, but is it good?
5 IRL to Anime With Cartoonization AI
6 How Does AI Generated Songs Sound Like? [OpenAI Jukebox]
7 AI Makes Any Images Cinematic [3D Photo Inpainting]
8 AI Generates Anime Faces, And It's Getting Even Better [StyleGAN2]
9 Tech Behind The Meme: Dame Da Ne AI - Single Image Deepfake
10 AI Generates New Light Source for Images [PaintingLight]
11 Depixelizing Doom Guy? Mona Lisa in Real Life? The "Upscaling" AI: PULSE
12 Image Completion AI - Predict Pixels Just Like Text Predictions [Image-GPT]
13 AI Generates 3D Human Model from 2D Image [PIFuHD - FacebookAI]
14 AI Assisted Masking - Save Your Precious Time Right Now [AE Rotobrush 2]
15 This AI Reconstruct Real Life Objects From Just Images [NeRF]
16 Image Restoration AI - Upscale and Restore Faces with DFDNet
17 Best Image Colorization AI 2020
18 Image Decomposition AI - Edit Highlights and Textures Easily [Appearance Eraser]
19 Deepfake With Audio Only [Wav2Lip]
20 Copy IRL, Paste on your PC [AR Cut & Paste]
21 This AI Transform Faces into Hyper-Realistic Cartoon Characters [Toonify]
22 This AI Restores Old Photos with Damages Automatically!
23 Anime Filter with AI - Snapchat vs. TikTok
24 AI Reduces Bandwidth Problems for Video Calls [NVIDIA Maxine]
25 AI Motion Capture - Track Your Hands & Body WITHOUT Bodysuit [FrankMocap]
26 AI Converts Cartoon Characters To Real Life [Pixel2Style2Pixel]
27 AI Sky Replacement with SkyAR
28 Better Than DAIN? NEW BEST Tool for Boosting Video's FPS with AI [RIFE/Flowframes]
29 AI That Paints Anything Stroke By Stroke
30 What Happens When AI Robots Design Themselves
31 Deepfake Movements with 1 image ONLY [Liquid Warping GAN]
32 ANYTHING can be a "Green Screen" Now [Real-Time High-Resolution Background Matting]
33 AI Transform any Image into Sketch or Line Art [ArtLine]
34 AI That Could Soon Replace Vector Artists [DALL-E]
35 Photoshop Detector AI Is Useless
36 The Future Of Online Shopping
37 How The Future of Image Search Would Look Like
38 Everyone Can Make 3D Animations Easily Now! [Monster Mash]
39 3D Video Stabilization with AI [NSFF]
40 OpenAI’s Sarcastic Chat Bot [GPT-3 API Beta]
41 You Describe & AI Photoshops Faces For You [StyleCLIP]
42 You Only Need Audio To Deepfake Now! Might look slightly cursed tho [PCAVS]
43 This AI Transfers Anime Back Into Sketch [Anime2Sketch]
44 AI Learns To Play CS:GO By Watching Humans Play!
45 How AI Fixes The Horrendous CR7 Statue
46 Best Vocal Isolation & Instrumental Extraction 2021 [lalal.ai vs Spleeter]
47 Face Enhance AI Restores Extremely Blurry Faces [GPEN]
48 AI That Only Needs 1 Image To Deepfake [SimSwap]
49 The Amazing AI Behind the TikTok JoJo Pose Challenge [BoostMonocularDepth + 3DP]
50 StyleGAN3!? - What AI Actually Sees When Generating Faces [Alias-Free GAN]
51 AI generated art goes brrrrr [VQGAN+CLIP]
52 AI That Doodles Any Given Description
53 Best AI Motion Capture 2021 - OpenPose vs DeepMotion
54 Anime Image Enhance AI Has Gone To The Next Level [Real-ESRGAN]
55 This Video's Voice Is Entirely Made From Audio Deepfake
56 I Can’t Sing So I Cloned My Voice w/ AI To Cover Goodbye Sengen (English Cover)
57 Best Background Removal - AIs Removes BG Without Green Screen And It's Amazing. [RVM]
58 How I Deepfaked VTuber Gawr Gura with AI
59 AI Magic Removal - Removes ANYTHING & Inpaints For You [LaMa]
60 I Did NOT Expect AI Anime Filter To Be This Good [AnimeGANv2]

The video teaches how Image-GPT works and its capabilities in generating coherent images. It also discusses the potential applications and limitations of this technology.

Key Takeaways
  1. Understand the basics of predictive text systems
  2. Learn how Image-GPT applies the same logic to predict pixels
  3. Experiment with Image-GPT to generate images
  4. Evaluate the results and limitations of Image-GPT
💡 Image-GPT can generate coherent and natural-looking images by predicting pixels, but its capabilities are limited by the quality of the training data and the complexity of the images.
