Image Completion AI - Predict Pixels Just Like Text Predictions [Image-GPT]

bycloud · Advanced · 🧠 Large Language Models · 5y ago

Key Takeaways

The video discusses Image-GPT, an AI model that predicts pixels to complete images, similar to how predictive text systems work. It uses the same logic as GPT-3, but for images, and can generate coherent and natural-looking images.

Full Transcript

... predictive text system, where it can accurately guess what you want to say when you are texting someone by suggesting a few words on top of your keyboard so you can easily finish your sentence. Some of them are trained on a large amount of text beforehand to learn what comes next after what you typed, a.k.a. predictive text. Well, what if you apply the same logic to predict, or autocomplete, the next pixel of an image? Just like when you keep tapping the predictive text and it creates a long and confusing paragraph, but it produces an image instead.

You may have already heard of GPT-2 or even GPT-3, which are language models that can produce extremely convincing text that completes whatever you input into the AI model, or generate a coherent paragraph out of nowhere. OpenAI made this language model by dumping all the text on the internet into it and, with a super large number of parameters, something like 175 billion, brute-forced a near-perfect text generator that can basically do anything. I think OpenAI has completed one of the most important steps towards perfecting data science, which is basically brute-forcing first. It's like in math: for mathematicians to come up with a mathematical formula, they first have to brute-force and explore the area to understand how things work in that math realm.

OK, so back to the topic of autocompleting, or predicting, pixels. OpenAI recently published a paper where they use the exact same logic as GPT-3's text learning, but on images instead. The concept is fairly straightforward: the AI basically looks at all the previous pixels and adds a new one based on all of them. Then it takes in the newly generated one too and generates the next pixel, and this just continues until a whole image is generated for the input. Right now, apparently, they only provide a 32×32 image size, so that's all we can admire for now.

So how exactly can it predict a cropped image? Generating natural views seems to be the best fit for this AI. If the crop is exactly where the birds are standing, the output has some really interesting results. Here's one where it even generated reflections too, which is crazy. The same goes for grassland: it generates all kinds of things that could potentially be in the grassland. Mount Rushmore has a body or a leg sometimes, which is hilarious. I also trimmed another image of a bird, and it logically generates back the bird's tail and the branch, which is amazing. Removing the grounds of a castle also made some pretty interesting generations, and with the pixel art of a castle the AI gets really creative and comes up with all kinds of cool castle designs.

For simple logos like Photoshop, it can sometimes generate back the letters P and S, but sometimes it just goes funny. The same thing happens to Google's logo: there are a few that are very close to spelling out Google, or just a lot of doodles, which is pretty funny. I tried various other logos too. The Firefox one did really well at generating back, but Apple just kind of deformed.

So how much can the present pixels affect the generation? I used the Socially Awesome Awkward Penguin as an example. If you cut at the halfway point, without any blue pixels present, it just completes the whole image in red, which is pretty logical. But if you leave a line of blue pixels, the following generations will mostly be blue too.

For some more fun examples, I tested it on some meme formats. It does not exactly produce a proper meme, but the AI seems to know what exactly the object might be. Like, this is a human billboard. No, I mean penguin billboard. Yeah. As for pictures of animals or human faces, it runs really accurately, like the Shiba you can see right here, or even the guy in the butterfly meme, who is now wearing a suit. What it struggles with is generating Saitama's body. Yes, this is amazing. It has some problems generating from half of a face without making it look deformed, or it's probably my problem.

A guy on Twitter called Michael [surname unclear], hope I pronounced that right, had some more fascinating results than me. He was able to produce something longer, which I have no idea how to do. For example, the ocean one right here: I'm guessing this is a boat, and it is a really cool one in my opinion. It's like an extending view into the deep sea that was generated by Image-GPT. But the best results he had are on human faces. According to him, it takes around one hour to complete each of these results.

In the end, [unclear] is probably the best example of brute force out there. OpenAI literally is testing the limit of brute-forcing, and this is going so well. I love it. If you want to play with this AI, I'll link a Colab down in the description, which is made by Alfredo. Join my Discord if you have any questions, or join just for fun. Follow me on Twitter if you haven't, and now, see you all next time. [Music]
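
The pixel-by-pixel loop described in the transcript can be sketched roughly as follows. This is only a minimal illustration, not OpenAI's implementation: `toy_next_pixel`, `complete_image`, and the two-colour palette are hypothetical names invented here, and the copy-the-pixel-above rule is a stand-in for a trained model, which would instead output a learned probability distribution over the next pixel. The image is flattened row by row, so a 32×32 image would take 32 × 32 = 1,024 prediction steps when generated from scratch.

```python
from typing import Callable, List

Pixel = int  # index into a small colour palette (hypothetical: 0 = red, 1 = blue)


def toy_next_pixel(prefix: List[Pixel], width: int) -> Pixel:
    """Stand-in for the trained model: pick the next pixel from all previous ones.

    Assumption for illustration only: copy the pixel directly above, or the
    previous pixel while still on the first row.
    """
    if len(prefix) >= width:
        return prefix[-width]  # pixel directly above in row-by-row order
    if prefix:
        return prefix[-1]      # first row: repeat the previous pixel
    return 0                   # empty image: default palette entry


def complete_image(
    known: List[Pixel],
    width: int,
    height: int,
    predict: Callable[[List[Pixel], int], Pixel] = toy_next_pixel,
) -> List[List[Pixel]]:
    """Fill in the missing pixels of a cropped image one pixel at a time."""
    pixels = list(known)  # the pixels that survived the crop, flattened row by row
    while len(pixels) < width * height:
        # Each newly generated pixel becomes part of the context for the next one.
        pixels.append(predict(pixels, width))
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


RED, BLUE = 0, 1
top_half_red = [RED] * (4 * 2)              # a 4x4 image cropped after two red rows
with_blue_line = top_half_red + [BLUE] * 4  # the same crop plus one row of blue

print(complete_image(top_half_red, 4, 4))
print(complete_image(with_blue_line, 4, 4))
```

With this toy rule, the all-red crop is completed entirely in red, while the crop that keeps one blue row is completed in blue, which loosely mirrors the penguin experiment in the video: whatever pixels are kept steer everything generated after them.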

Original Description

This is what happens when we dumped the entire internet's images into an AI, love this. Hope I can see more efficient or higher resolution results soon in the future~

Join my Discord: https://discord.gg/NhJZGtH
My Twitter: https://twitter.com/bycloudai
Support me on Patreon: https://www.patreon.com/bycloud

OpenAI's blog on Image-GPT: https://openai.com/blog/image-gpt/
Image-GPT Transformer Colab: https://colab.research.google.com/github/apeguero1/image-gpt/blob/master/Transformers_Image_GPT.ipynb#scrollTo=P9tObiU-qVgv

The pixel art of a castle is made by https://twitter.com/helzinko
Music: Steaminwaffles - Wait
Background Art: https://twitter.com/bynicalcynical

The video explains how Image-GPT completes images by predicting one pixel at a time, and shows where it works well and where it fails on cropped photos, logos, and memes.

Key Takeaways

Understand the basics of predictive text systems
Learn how Image-GPT applies the same logic to predict pixels
Experiment with Image-GPT to generate images
Evaluate the results and limitations of Image-GPT

💡 Image-GPT can generate coherent and natural-looking images by predicting pixels, but in the video it is limited to 32×32 output and sometimes deforms logos and half-cropped faces.
