Image Completion AI - Predict Pixels Just Like Text Predictions [Image-GPT]
Key Takeaways
The video discusses Image-GPT, an AI model that predicts pixels to complete images, similar to how predictive text systems work. It uses the same logic as GPT-3, but for images, and can generate coherent and natural-looking images.
Full Transcript
Predictive text system, where it can guess accurately what you want to say when you are texting someone by suggesting a few words on top of your keyboard for you to easily finish your sentence. For some of them, it's trained on a large amount of text beforehand to be able to learn what comes next after the words you typed, a.k.a. predictive text. Well, what if you apply the same logic, where you predict or autocomplete the next pixel for an image? Just like when you keep tapping the predictive text, where it creates a long and confusing paragraph, but instead it produces an image.

So you may have already heard of GPT-2 or even GPT-3, which are language models where you can produce extremely convincing text which completes whatever you input into the AI model, or generate a paragraph out of nowhere that is coherent. So OpenAI made this language model where it dumped all the text on the internet into that model, with a super large number of parameters, which is like 175 billion, to brute force a near-perfect text generator that can basically do anything. I think OpenAI has completed one of the most important steps towards perfecting data science, which is basically brute forcing first. Like in math, in order for mathematicians to come up with a mathematical formula, first it'll require them to brute force and explore the area to understand how things work in that math realm.

OK, so back to the topic of autocompleting or predicting pixels. OpenAI recently published a paper where they use the exact same logic as GPT-3 text learning, but on images instead. The concept is fairly straightforward: the AI just basically looks at all the previous pixels and adds a new one based on all the previous ones. Then it takes in the newly generated one too and generates the next pixel, and this just continues until you generate a whole image for the input. Right now, apparently they only provide 32 times 32 image size, so that's all we can admire for right now. So how exactly can it predict a cropped image?
Generating natural views seems to be the best fit for this AI. If the crop is exactly where the birds are standing, the output has some really interesting results. Here's one where it even generated reflections too, which is crazy. The same thing goes for grassland: it generates all kinds of things that could potentially be in the grassland. Mount Rushmore has a body or leg sometimes, which is hilarious. I also trimmed another image of a bird, and it logically generates back the bird's tail and the branch, which is amazing. Removing the grounds of a castle also made some pretty interesting generations, and with the pixel art of a castle, the AI gets really creative with it and has all kinds of cool castle designs.

For simple logos like Photoshop, it sometimes can generate back the letters P and S, but sometimes it just goes funny. The same thing happens to Google's logo: there are a few that are very close to spelling out Google, or just a lot of doodles, which is pretty funny. I tried various other logos too; the Firefox one did really well generating back, but Apple just kind of deformed.

So how much can the present pixels affect the generation? I use the Socially Awesome Awkward Penguin as an example. If you cut at the halfway point without the presence of the blue pixels, it just completes the whole image in red, which is pretty logical. But if you leave a line of blue pixels, the following generations will mostly be blue too.

For some more fun examples, I tested on some meme formats. It does not exactly produce a proper meme, but the AI seems to know what exactly the objective might be. Like, this is a human billboard. No, I mean penguin billboard. As for the pictures of animals or human faces, it runs really accurately, like the Shiba right here you can see, or even the guy in the butterfly meme is now wearing a suit. What it's struggling with is generating Saitama's body. Yes, this is amazing. It has some problems generating from half of the face to make it look not deformed, or probably it's my
problem. A guy on Twitter called Michael phrasing (hope I pronounced that right) had some more fascinating results than me. He was able to produce something longer, which I have no idea how to do. For example, the ocean one right here: I'm guessing this is a boat, and it is a really cool one in my opinion. It's like an extended view into the deep sea that was generated by GPT image. But the best results he had are on the human faces. According to him, it takes around one hour to complete each of these results.

In the end, not gonna lie, this is probably the best example of brute force out there. OpenAI literally is testing the limit of brute forcing, and this is going so well, I love it. If you want to play with this AI, I'll link a Colab down in the description, which is made by Alfredo. Join my Discord if you have any questions, or join just for fun. Follow me on Twitter if you haven't, and now see you all next time. [Music]
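The pixel-by-pixel completion described in the transcript (look at all previous pixels, add one, feed it back in, repeat until the image is full) can be illustrated with a small Python sketch. This is not Image-GPT and not OpenAI's code: `toy_next_pixel` is a hypothetical stand-in for a trained model that simply copies one of the most recent pixels, and the raster ordering, colour names, and `window` parameter are assumptions made for the example. It only shows the shape of the loop, and why the last pixels of the prompt (all red versus a final line of blue) steer what comes next.

```python
import random

WIDTH = 32   # the transcript mentions 32 x 32 outputs
HEIGHT = 32


def toy_next_pixel(context, rng, window=WIDTH):
    """Hypothetical stand-in for a trained model.

    Picks the next pixel by sampling one of the last `window` pixels
    of the context. A real model would predict a learned distribution.
    """
    recent = context[-window:]
    return rng.choice(recent)


def complete_image(prompt_pixels, total=WIDTH * HEIGHT, seed=0):
    """Autoregressive completion in raster order (assumed ordering)."""
    if not prompt_pixels:
        raise ValueError("the toy model needs at least one prompt pixel")
    rng = random.Random(seed)
    pixels = list(prompt_pixels)
    while len(pixels) < total:
        # Each new pixel depends on everything generated so far,
        # including the pixels this loop has already appended.
        pixels.append(toy_next_pixel(pixels, rng))
    return pixels


# Prompt 1: the top half of the image, entirely red.
red_prompt = ["red"] * (WIDTH * HEIGHT // 2)
red_result = complete_image(red_prompt)

# Prompt 2: same size, but the last row of the prompt is blue.
blue_prompt = ["red"] * (WIDTH * (HEIGHT // 2 - 1)) + ["blue"] * WIDTH
blue_result = complete_image(blue_prompt)

print(set(red_result[len(red_prompt):]))    # {'red'}
print(set(blue_result[len(blue_prompt):]))  # {'blue'}
```

Because this toy only ever copies recent pixels, its completions are entirely red or entirely blue. The transcript reports "mostly blue" for the real model, which samples from a learned distribution rather than copying.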
Original Description
This is what happens when we dumped the entire internet's images into an AI, love this. Hope I can see more efficient or higher resolution results soon in the future~
Join my Discord
https://discord.gg/NhJZGtH
My Twitter
https://twitter.com/bycloudai
Support me on Patreon
https://www.patreon.com/bycloud
OpenAI's blog on Image-GPT
https://openai.com/blog/image-gpt/
Image-GPT Transformer Colab
https://colab.research.google.com/github/apeguero1/image-gpt/blob/master/Transformers_Image_GPT.ipynb#scrollTo=P9tObiU-qVgv
The pixel art of a castle is made by
https://twitter.com/helzinko
music
Steaminwaffles - Wait
Background Art
https://twitter.com/bynicalcynical