What is Google Deep Dream? (Basic Theory) | Deep Dream Series #1

Aleksa Gordić - The AI Epiphany · Beginner ·📰 AI News & Updates ·5y ago

Skills: LLM Foundations80%CV Basics70%Prompting Basics60%

Key Takeaways

The video explains the basic theory behind Google Deep Dream, a neural network-based algorithm that amplifies features in input images, and demonstrates its application using tools like PyTorch and Gaussian kernel.

Full Transcript

i'm starting a new video series with this video about the algorithm called uh deep dream and in this video we're going to cover the basic theory behind the algorithm and in the next one we'll go through the code i just open sourced a couple of days ago as you can see it here on the screen and it will give you some really psychedelic looking uh imagery uh but that's not the the point obviously of of this uh of this series uh it's gonna teach you you're gonna learn a lot of more about deep learning itself about like uh computational graphs in pi torch uh and also how to create some really nice artwork using uh the uh this this code but first let's jump to some basic theory about the deep dream algorithm itself so what's the idea behind this algorithm so given an input image the the thing that happens is that the neural network uh see pre-trained on some like classification test or something like on imagenet or mit places 365 basically sees some something in that image certain features and whatever the network sees we just uh tell it to just amplify those exact features and by doing that you get the image on the right that you can see on the screen i found the story how it all started really interesting so mort vince of the guy that actually created the deep dream algorithm the initial uh idea uh actually woke up like around two am and had a nightmare he just sat down on his computer coded this up in maybe 15 minutes and instantaneously he got some really interesting results uh what's even more uncommon maybe is that they did not publish a paper on this they just wrote the blog and subsequently they open sourced uh a simple implementation of the deep dream algorithm and you can see on the screen they started getting some really weird creatures and results uh titled like pig snail or the dogfish it just sparked a lot of imagination from the very beginning and those creatures came from this uh cloud image you can see in the top left and also the the interesting thing is that depending on what you give to the network what you input as the as the as the input for example if you if you give it some scene it will start populating it with some towers or buildings and if you give it some kind of maybe a leaf or something it will start populating it with insects and animals so it also depends on the on the data that the network itself was trained on not only on the input image but i'll get into a bit more details a bit later but deep dream was not only about like creating this psychedelic crazy looking images it was actually a part of the effort to better understand neural networks uh a field called interpretability of neural networks and you can see in the in the bottom left those examples so what they actually told the network was hey uh tell me what do you see what do you think dumbbell is and going in reverse and reconstructing the image they got these four images in the bottom left and you can see that the network actually learned to always expect to see some muscular hands of a weightlifter next to the dumbbell or a bodybuilder whatever and did not manage to actually extract the concept itself of a dumbbell how we humans actually think about the dumbbell itself as a independent object not dependent on the context where we usually see it and let's get a bit deeper into how the thing actually works so what you do you have a pre-trained neural network as i already said uh pre-trained on some classification data set like imagenet or mit places 365 and you pass in the input image and you feed it through to a certain layer and when you come to a certain layer whatever activations you have there you want to amplify them and you do that by doing the gradient ascent and not descent for example you took a simple l2 or msc loss on those activations and you wanted to do a classical procedure that's a minimization uh what you would end up with would be the that the input image would either become black or more probably just some random noise image on the other hand if you tell it to maximize instead of minimize those activations you get the deep dream results that we're familiar with as the image on the right here now the interesting thing is that depending on which exact layer in the network you're trying to optimize maximize in this case you get different interesting results so if you take some shallower layers from like some of the first layers in the neural network and you try and do this uh procedure i just described you'll get this low-level app patterns really interesting patterns and colors edges and corners the stuff that we know the usual low-level uh vision um stuff but on the other hand if you took some deeper layer in the network you would start getting some more increasingly complex features such as like maybe animal eyes or snout or something i'm assuming of course here that the network was trained on imagenet which contained a lot of like images of dogs and cats and animals in general but if we had some other data set then those high level features would be some other things and not eyes or snouts now i already mentioned imagenet and mit places 365 but those are not the only data sets out there obviously you can use some other ones but just that i use these two and these two are really common in the deep dream community so you can see on the left here the image was produced by a network that was trained on a lot of imagery that contained animals on the right you can see uh the network that was trained on the mit places which contains some scenery human-built objects uh pubs cafes and stuff like that and that's the reason it has these uh human-like built uh things uh ingrained into the image itself going a bit more deeper into the uh theory uh so i already mentioned gradient ascent so basically the only difference is you just change the sign uh when you do the update for a single pixel you don't do the minus where you use the learning rate and the the gradients you just switch it to plus but there are also some more advanced stuff like uh image pyramid so the uh the idea there is so first what what's image pyramid so it's basically you have your your base image and you just down sample it by some ratio let's say 1.5 and then you do it again and again again and you just stack those images and that's the image pyramid so now why do we do that now the thing is there is something called receptive field of network and when you feed all of those images from the image pyramid with different sizes to the network you will see different features in all of them some will be more fine-grained some will be like coarser features and combining all of those across the layers you'll get a really nice looking deep dream image so why they work is that uh when you have a smaller image in the input a single neuron somewhere in some deeper layer will be able to cover most of the pixels in that input image and they will be able to manipulate and tweak those pixels uh whereas when you have bigger image that pixel will only can will only be able to change and modify just a small area of the input image so the algorithm proceeds like this you just start with the smallest resolution you do the deep dream then you just up sample the image then again you do the deep dream and then again again until you get to the base image of the pyramid the biggest resolution and you just stop there are two more important things that you need to implement in order to have some nice images uh one is uh the gradient smoothing and normalization so basically if you only took gradients that you get from this optimization and just blindly apply them the image will quickly explode out of the the bounds that we are keeping it in so what you have to do is to just treat the gradients as a as a as a simple image and take a gaussian kernel and just uh filter out the through the image your convolution with the with the with the kernel and that will just smooth now the gradients and additionally you just want to take the maybe standard deviation of the of the whole thing and just scale all of those gradients with that standard deviation and that will just keep the things uh scoped aside from that it's quite uh usual to see to to just uh after applying the gradient of sand step you just want to clip the image so that you keep it in the designated boundary that you want to keep it in and those are usually image net normalized boundaries that should be enough for you to understand the next coding video where we'll go step through the code and just try and run this ourselves and get some really awesome looking images yeah and by the way don't forget to check out the description where i've linked the github code uh feel free to just play around with it before i create the actual video of explaining how to use it i guess most of you are coders or experienced machine learning engineers already so go ahead and play and as always if you found this video useful and only if you found it useful uh please consider subscribing and sharing the video that would mean a lot to me and yeah see you in the next video

Original Description

❤️ Become The AI Epiphany Patreon ❤️ ► https://www.patreon.com/theaiepiphany ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Learn the basic theory behind the Deep Dream algorithm. You'll learn about: ✔️ What Google Deep Dream is and how it all began ✔️Impacts of shallower/deeper layers on the final result ✔️Basic concepts: gradient ascent, image pyramids, etc. ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ ✅ DeepDream in PyTorch: https://github.com/gordicaleksa/pytorch-deepdream ✅ Original blog: https://ai.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ ⌚️ Timetable: 1:10 - How it all started in 2015 3:30 - Simple explanation of how it works 4:30 - Impact of using shallow layers 5:30 - Dataset matters (ImageNet vs Places 365) 6:10 - More details on how it works ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ 💰 BECOME A PATREON OF THE AI EPIPHANY ❤️ If these videos, GitHub projects, and blogs help you, consider helping me out by supporting me on Patreon! The AI Epiphany ► https://www.patreon.com/theaiepiphany One-time donations: https://www.paypal.com/paypalme/theaiepiphany Much love! ❤️ ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ 💡 The AI Epiphany is a channel dedicated to simplifying the field of AI using creative visualizations and in general, a stronger focus on geometrical and visual intuition, rather than the algebraic and numerical "intuition". ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ 👋 CONNECT WITH ME ON SOCIAL LinkedIn ► https://www.linkedin.com/in/aleksagordic/ Twitter ► https://twitter.com/gordic_aleksa Instagram ► https://www.instagram.com/aiepiphany/ Facebook ► https://www.facebook.com/aiepiphany/ 👨‍👩‍👧‍👦 JOIN OUR DISCORD COMMUNITY: Discord ► https://discord.gg/peBrCpheKE 📢 SUBSCRIBE TO MY MONTHLY AI NEWSLETTER: Substack ► https://aiepiphany.substack.com/ 💻 FOLLOW ME ON GITHUB FOR COOL PROJECTS: GitHub ► https://github.com/gordicaleksa 📚 FOLLOW ME ON MEDIUM: Medium ► https://gordicaleksa.medium.com/ ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ #deepdream #deeplearning #art #ai

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Aleksa Gordić - The AI Epiphany · Aleksa Gordić - The AI Epiphany · 10 of 60

← Previous Next →

Intro | Neural Style Transfer #1

Intro | Neural Style Transfer #1

Aleksa Gordić - The AI Epiphany

Basic Theory | Neural Style Transfer #2

Basic Theory | Neural Style Transfer #2

Aleksa Gordić - The AI Epiphany

Optimization method | Neural Style Transfer #3

Optimization method | Neural Style Transfer #3

Aleksa Gordić - The AI Epiphany

Advanced Theory | Neural Style Transfer #4

Advanced Theory | Neural Style Transfer #4

Aleksa Gordić - The AI Epiphany

Anyone can make deepfakes now!

Anyone can make deepfakes now!

Aleksa Gordić - The AI Epiphany

What is Computer Vision? | The Art of Creating Seeing Machines

What is Computer Vision? | The Art of Creating Seeing Machines

Aleksa Gordić - The AI Epiphany

Feed-forward method | Neural Style Transfer #5

Feed-forward method | Neural Style Transfer #5

Aleksa Gordić - The AI Epiphany

Alan Turing | Computing Machinery and Intelligence

Alan Turing | Computing Machinery and Intelligence

Aleksa Gordić - The AI Epiphany

Feed-forward method (training) | Neural Style Transfer #6

Feed-forward method (training) | Neural Style Transfer #6

Aleksa Gordić - The AI Epiphany

What is Google Deep Dream? (Basic Theory) | Deep Dream Series #1

What is Google Deep Dream? (Basic Theory) | Deep Dream Series #1

Aleksa Gordić - The AI Epiphany

Semantic Segmentation in PyTorch | Neural Style Transfer #7

Semantic Segmentation in PyTorch | Neural Style Transfer #7

Aleksa Gordić - The AI Epiphany

How to get started with Machine Learning

How to get started with Machine Learning

Aleksa Gordić - The AI Epiphany

How to learn PyTorch? (3 easy steps) | 2021

How to learn PyTorch? (3 easy steps) | 2021

Aleksa Gordić - The AI Epiphany

PyTorch or TensorFlow?

PyTorch or TensorFlow?

Aleksa Gordić - The AI Epiphany

3 Machine Learning Projects For Beginners (Highly visual) | 2021

3 Machine Learning Projects For Beginners (Highly visual) | 2021

Aleksa Gordić - The AI Epiphany

Machine Learning Projects (Intermediate level) | 2021

Machine Learning Projects (Intermediate level) | 2021

Aleksa Gordić - The AI Epiphany

Cheapest (0$) Deep Learning Hardware Options | 2021

Cheapest (0$) Deep Learning Hardware Options | 2021

Aleksa Gordić - The AI Epiphany

How to learn deep learning? (Transformers Example)

How to learn deep learning? (Transformers Example)

Aleksa Gordić - The AI Epiphany

How do transformers work? (Attention is all you need)

How do transformers work? (Attention is all you need)

Aleksa Gordić - The AI Epiphany

Developing a deep learning project (case study on transformer)

Developing a deep learning project (case study on transformer)

Aleksa Gordić - The AI Epiphany

Vision Transformer (ViT) - An image is worth 16x16 words | Paper Explained

Vision Transformer (ViT) - An image is worth 16x16 words | Paper Explained

Aleksa Gordić - The AI Epiphany

GPT-3 - Language Models are Few-Shot Learners | Paper Explained

GPT-3 - Language Models are Few-Shot Learners | Paper Explained

Aleksa Gordić - The AI Epiphany

Google DeepMind's AlphaFold 2 explained! (Protein folding, AlphaFold 1, a glimpse into AlphaFold 2)

Google DeepMind's AlphaFold 2 explained! (Protein folding, AlphaFold 1, a glimpse into AlphaFold 2)

Aleksa Gordić - The AI Epiphany

Attention Is All You Need (Transformer) | Paper Explained

Attention Is All You Need (Transformer) | Paper Explained

Aleksa Gordić - The AI Epiphany

Graph Attention Networks (GAT) | GNN Paper Explained

Graph Attention Networks (GAT) | GNN Paper Explained

Aleksa Gordić - The AI Epiphany

Graph Convolutional Networks (GCN) | GNN Paper Explained

Graph Convolutional Networks (GCN) | GNN Paper Explained

Aleksa Gordić - The AI Epiphany

Graph SAGE - Inductive Representation Learning on Large Graphs | GNN Paper Explained

Graph SAGE - Inductive Representation Learning on Large Graphs | GNN Paper Explained

Aleksa Gordić - The AI Epiphany

PinSage - Graph Convolutional Neural Networks for Web-Scale Recommender Systems | Paper Explained

PinSage - Graph Convolutional Neural Networks for Web-Scale Recommender Systems | Paper Explained

Aleksa Gordić - The AI Epiphany

OpenAI CLIP - Connecting Text and Images | Paper Explained

OpenAI CLIP - Connecting Text and Images | Paper Explained

Aleksa Gordić - The AI Epiphany

Temporal Graph Networks (TGN) | GNN Paper Explained

Temporal Graph Networks (TGN) | GNN Paper Explained

Aleksa Gordić - The AI Epiphany

Graph Neural Network Project Update! (I'm coding GAT from scratch)

Graph Neural Network Project Update! (I'm coding GAT from scratch)

Aleksa Gordić - The AI Epiphany

Graph Attention Network Project Walkthrough

Graph Attention Network Project Walkthrough

Aleksa Gordić - The AI Epiphany

How to get started with Graph ML? (Blog walkthrough)

How to get started with Graph ML? (Blog walkthrough)

Aleksa Gordić - The AI Epiphany

DQN - Playing Atari with Deep Reinforcement Learning | RL Paper Explained

DQN - Playing Atari with Deep Reinforcement Learning | RL Paper Explained

Aleksa Gordić - The AI Epiphany

AlphaGo - Mastering the game of Go with deep neural networks and tree search | RL Paper Explained

AlphaGo - Mastering the game of Go with deep neural networks and tree search | RL Paper Explained

Aleksa Gordić - The AI Epiphany

DeepMind's AlphaGo Zero and AlphaZero | RL paper explained

DeepMind's AlphaGo Zero and AlphaZero | RL paper explained

Aleksa Gordić - The AI Epiphany

OpenAI - Solving Rubik's Cube with a Robot Hand | RL paper explained

OpenAI - Solving Rubik's Cube with a Robot Hand | RL paper explained

Aleksa Gordić - The AI Epiphany

MuZero - Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | RL Paper explained

MuZero - Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | RL Paper explained

Aleksa Gordić - The AI Epiphany

EfficientNetV2 - Smaller Models and Faster Training | Paper explained

EfficientNetV2 - Smaller Models and Faster Training | Paper explained

Aleksa Gordić - The AI Epiphany

Implementing DeepMind's DQN from scratch! | Project Update

Implementing DeepMind's DQN from scratch! | Project Update

Aleksa Gordić - The AI Epiphany

MLP-Mixer: An all-MLP Architecture for Vision | Paper explained

MLP-Mixer: An all-MLP Architecture for Vision | Paper explained

Aleksa Gordić - The AI Epiphany

DeepMind's Android RL Environment - AndroidEnv

DeepMind's Android RL Environment - AndroidEnv

Aleksa Gordić - The AI Epiphany

When Vision Transformers Outperform ResNets without Pretraining | Paper Explained

When Vision Transformers Outperform ResNets without Pretraining | Paper Explained

Aleksa Gordić - The AI Epiphany

Non-Parametric Transformers | Paper explained

Non-Parametric Transformers | Paper explained

Aleksa Gordić - The AI Epiphany

Chip Placement with Deep Reinforcement Learning | Paper Explained

Chip Placement with Deep Reinforcement Learning | Paper Explained

Aleksa Gordić - The AI Epiphany

Text Style Brush - Transfer of text aesthetics from a single example | Paper Explained

Text Style Brush - Transfer of text aesthetics from a single example | Paper Explained

Aleksa Gordić - The AI Epiphany

Graphormer - Do Transformers Really Perform Bad for Graph Representation? | Paper Explained

Graphormer - Do Transformers Really Perform Bad for Graph Representation? | Paper Explained

Aleksa Gordić - The AI Epiphany

GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation | Paper Explained

GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation | Paper Explained

Aleksa Gordić - The AI Epiphany

VQ-VAEs: Neural Discrete Representation Learning | Paper + PyTorch Code Explained

VQ-VAEs: Neural Discrete Representation Learning | Paper + PyTorch Code Explained

Aleksa Gordić - The AI Epiphany

VQ-GAN: Taming Transformers for High-Resolution Image Synthesis | Paper Explained

VQ-GAN: Taming Transformers for High-Resolution Image Synthesis | Paper Explained

Aleksa Gordić - The AI Epiphany

Multimodal Few-Shot Learning with Frozen Language Models | Paper Explained

Multimodal Few-Shot Learning with Frozen Language Models | Paper Explained

Aleksa Gordić - The AI Epiphany

Focal Transformer: Focal Self-attention for Local-Global Interactions in Vision Transformers

Focal Transformer: Focal Self-attention for Local-Global Interactions in Vision Transformers

Aleksa Gordić - The AI Epiphany

AudioCLIP: Extending CLIP to Image, Text and Audio | Paper Explained

AudioCLIP: Extending CLIP to Image, Text and Audio | Paper Explained

Aleksa Gordić - The AI Epiphany

RMA: Rapid Motor Adaptation for Legged Robots | Paper Explained

RMA: Rapid Motor Adaptation for Legged Robots | Paper Explained

Aleksa Gordić - The AI Epiphany

DALL-E: Zero-Shot Text-to-Image Generation | Paper Explained

DALL-E: Zero-Shot Text-to-Image Generation | Paper Explained

Aleksa Gordić - The AI Epiphany

DETR: End-to-End Object Detection with Transformers | Paper Explained

DETR: End-to-End Object Detection with Transformers | Paper Explained

Aleksa Gordić - The AI Epiphany

DINO: Emerging Properties in Self-Supervised Vision Transformers | Paper Explained!

DINO: Emerging Properties in Self-Supervised Vision Transformers | Paper Explained!

Aleksa Gordić - The AI Epiphany

DeepMind DetCon: Efficient Visual Pretraining with Contrastive Detection | Paper Explained

DeepMind DetCon: Efficient Visual Pretraining with Contrastive Detection | Paper Explained

Aleksa Gordić - The AI Epiphany

Do Vision Transformers See Like Convolutional Neural Networks? | Paper Explained

Do Vision Transformers See Like Convolutional Neural Networks? | Paper Explained

Aleksa Gordić - The AI Epiphany

Fastformer: Additive Attention Can Be All You Need | Paper Explained

Fastformer: Additive Attention Can Be All You Need | Paper Explained

Aleksa Gordić - The AI Epiphany

This video teaches the basic theory behind Google Deep Dream and its application in image processing, demonstrating how to amplify features in input images using neural networks. Viewers will learn about the algorithm's history, its components, and how to implement it using various tools. The video is intended for beginners interested in deep learning and computer vision.

Key Takeaways

Feed a base image to the network and perform gradient ascent to amplify activations
Downsample the image and feed it to the network again to get different features at different resolutions
Combine features across layers to get a nice-looking Deep Dream image
Upsample an image
Apply Deep Dream
Repeat until reaching base image resolution
Implement gradient smoothing and normalization
Filter gradients with a Gaussian kernel

💡 The Deep Dream algorithm uses gradient ascent to amplify activations in a neural network, resulting in surreal and dream-like images, and its implementation involves upsampling, downsampling, and combining features across layers.

🔒 Pro feature: Ask AI to explain this lesson →

More on: LLM Foundations

View skill →

Getting Started with Vertex AI Gemini 1.5 Flash

I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)

I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)

How to use the ChatGPT API with Python!!

How to use the ChatGPT API with Python!!

Nicholas Renotte

Gemini 2.5: Create an interactive plot of economic data

Gemini 2.5: Create an interactive plot of economic data

Google DeepMind

LangChain Chatbots: Building a Personalized AI Assistant

LangChain Chatbots: Building a Personalized AI Assistant

Analytics Vidhya

Auto-generating meeting notes with Python

Auto-generating meeting notes with Python

Related Reads

Is the AI bubble about to burst? A data scientist’s honest take

A data scientist shares their honest take on whether the AI bubble is about to burst, providing an informed perspective on the technology's potential and limitations

Is the AI bubble about to burst? A data scientist’s honest take

A data scientist shares their honest take on whether the AI bubble is about to burst, providing a grounded perspective on the technology

Medium · Machine Learning

Is the AI bubble about to burst? A data scientist’s honest take

A data scientist shares their honest take on whether the AI bubble is about to burst, providing a grounded perspective on the technology

Medium · Data Science

Is the AI bubble about to burst? A data scientist’s honest take

A data scientist shares their honest take on whether the AI bubble is about to burst, providing a grounded perspective on the technology

Medium · Startup

Chapters (5)

1:10 How it all started in 2015

3:30 Simple explanation of how it works

4:30 Impact of using shallow layers

5:30 Dataset matters (ImageNet vs Places 365)

6:10 More details on how it works

Tackling Malaria in Africa with Technology at the Huawei ICT Competition