What is Google Deep Dream? (Basic Theory) | Deep Dream Series #1

Aleksa Gordić - The AI Epiphany · Beginner ·📰 AI News & Updates ·5y ago

Key Takeaways

The video explains the basic theory behind Google Deep Dream, a neural network-based algorithm that amplifies features in input images, and demonstrates its application using tools like PyTorch and Gaussian kernel.

Full Transcript

i'm starting a new video series with this video about the algorithm called uh deep dream and in this video we're going to cover the basic theory behind the algorithm and in the next one we'll go through the code i just open sourced a couple of days ago as you can see it here on the screen and it will give you some really psychedelic looking uh imagery uh but that's not the the point obviously of of this uh of this series uh it's gonna teach you you're gonna learn a lot of more about deep learning itself about like uh computational graphs in pi torch uh and also how to create some really nice artwork using uh the uh this this code but first let's jump to some basic theory about the deep dream algorithm itself so what's the idea behind this algorithm so given an input image the the thing that happens is that the neural network uh see pre-trained on some like classification test or something like on imagenet or mit places 365 basically sees some something in that image certain features and whatever the network sees we just uh tell it to just amplify those exact features and by doing that you get the image on the right that you can see on the screen i found the story how it all started really interesting so mort vince of the guy that actually created the deep dream algorithm the initial uh idea uh actually woke up like around two am and had a nightmare he just sat down on his computer coded this up in maybe 15 minutes and instantaneously he got some really interesting results uh what's even more uncommon maybe is that they did not publish a paper on this they just wrote the blog and subsequently they open sourced uh a simple implementation of the deep dream algorithm and you can see on the screen they started getting some really weird creatures and results uh titled like pig snail or the dogfish it just sparked a lot of imagination from the very beginning and those creatures came from this uh cloud image you can see in the top left and also the the interesting thing is that depending on what you give to the network what you input as the as the as the input for example if you if you give it some scene it will start populating it with some towers or buildings and if you give it some kind of maybe a leaf or something it will start populating it with insects and animals so it also depends on the on the data that the network itself was trained on not only on the input image but i'll get into a bit more details a bit later but deep dream was not only about like creating this psychedelic crazy looking images it was actually a part of the effort to better understand neural networks uh a field called interpretability of neural networks and you can see in the in the bottom left those examples so what they actually told the network was hey uh tell me what do you see what do you think dumbbell is and going in reverse and reconstructing the image they got these four images in the bottom left and you can see that the network actually learned to always expect to see some muscular hands of a weightlifter next to the dumbbell or a bodybuilder whatever and did not manage to actually extract the concept itself of a dumbbell how we humans actually think about the dumbbell itself as a independent object not dependent on the context where we usually see it and let's get a bit deeper into how the thing actually works so what you do you have a pre-trained neural network as i already said uh pre-trained on some classification data set like imagenet or mit places 365 and you pass in the input image and you feed it through to a certain layer and when you come to a certain layer whatever activations you have there you want to amplify them and you do that by doing the gradient ascent and not descent for example you took a simple l2 or msc loss on those activations and you wanted to do a classical procedure that's a minimization uh what you would end up with would be the that the input image would either become black or more probably just some random noise image on the other hand if you tell it to maximize instead of minimize those activations you get the deep dream results that we're familiar with as the image on the right here now the interesting thing is that depending on which exact layer in the network you're trying to optimize maximize in this case you get different interesting results so if you take some shallower layers from like some of the first layers in the neural network and you try and do this uh procedure i just described you'll get this low-level app patterns really interesting patterns and colors edges and corners the stuff that we know the usual low-level uh vision um stuff but on the other hand if you took some deeper layer in the network you would start getting some more increasingly complex features such as like maybe animal eyes or snout or something i'm assuming of course here that the network was trained on imagenet which contained a lot of like images of dogs and cats and animals in general but if we had some other data set then those high level features would be some other things and not eyes or snouts now i already mentioned imagenet and mit places 365 but those are not the only data sets out there obviously you can use some other ones but just that i use these two and these two are really common in the deep dream community so you can see on the left here the image was produced by a network that was trained on a lot of imagery that contained animals on the right you can see uh the network that was trained on the mit places which contains some scenery human-built objects uh pubs cafes and stuff like that and that's the reason it has these uh human-like built uh things uh ingrained into the image itself going a bit more deeper into the uh theory uh so i already mentioned gradient ascent so basically the only difference is you just change the sign uh when you do the update for a single pixel you don't do the minus where you use the learning rate and the the gradients you just switch it to plus but there are also some more advanced stuff like uh image pyramid so the uh the idea there is so first what what's image pyramid so it's basically you have your your base image and you just down sample it by some ratio let's say 1.5 and then you do it again and again again and you just stack those images and that's the image pyramid so now why do we do that now the thing is there is something called receptive field of network and when you feed all of those images from the image pyramid with different sizes to the network you will see different features in all of them some will be more fine-grained some will be like coarser features and combining all of those across the layers you'll get a really nice looking deep dream image so why they work is that uh when you have a smaller image in the input a single neuron somewhere in some deeper layer will be able to cover most of the pixels in that input image and they will be able to manipulate and tweak those pixels uh whereas when you have bigger image that pixel will only can will only be able to change and modify just a small area of the input image so the algorithm proceeds like this you just start with the smallest resolution you do the deep dream then you just up sample the image then again you do the deep dream and then again again until you get to the base image of the pyramid the biggest resolution and you just stop there are two more important things that you need to implement in order to have some nice images uh one is uh the gradient smoothing and normalization so basically if you only took gradients that you get from this optimization and just blindly apply them the image will quickly explode out of the the bounds that we are keeping it in so what you have to do is to just treat the gradients as a as a as a simple image and take a gaussian kernel and just uh filter out the through the image your convolution with the with the with the kernel and that will just smooth now the gradients and additionally you just want to take the maybe standard deviation of the of the whole thing and just scale all of those gradients with that standard deviation and that will just keep the things uh scoped aside from that it's quite uh usual to see to to just uh after applying the gradient of sand step you just want to clip the image so that you keep it in the designated boundary that you want to keep it in and those are usually image net normalized boundaries that should be enough for you to understand the next coding video where we'll go step through the code and just try and run this ourselves and get some really awesome looking images yeah and by the way don't forget to check out the description where i've linked the github code uh feel free to just play around with it before i create the actual video of explaining how to use it i guess most of you are coders or experienced machine learning engineers already so go ahead and play and as always if you found this video useful and only if you found it useful uh please consider subscribing and sharing the video that would mean a lot to me and yeah see you in the next video

Original Description

❤️ Become The AI Epiphany Patreon ❤️ ► https://www.patreon.com/theaiepiphany ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Learn the basic theory behind the Deep Dream algorithm. You'll learn about: ✔️ What Google Deep Dream is and how it all began ✔️Impacts of shallower/deeper layers on the final result ✔️Basic concepts: gradient ascent, image pyramids, etc. ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ ✅ DeepDream in PyTorch: https://github.com/gordicaleksa/pytorch-deepdream ✅ Original blog: https://ai.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ ⌚️ Timetable: 1:10 - How it all started in 2015 3:30 - Simple explanation of how it works 4:30 - Impact of using shallow layers 5:30 - Dataset matters (ImageNet vs Places 365) 6:10 - More details on how it works ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ 💰 BECOME A PATREON OF THE AI EPIPHANY ❤️ If these videos, GitHub projects, and blogs help you, consider helping me out by supporting me on Patreon! The AI Epiphany ► https://www.patreon.com/theaiepiphany One-time donations: https://www.paypal.com/paypalme/theaiepiphany Much love! ❤️ ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ 💡 The AI Epiphany is a channel dedicated to simplifying the field of AI using creative visualizations and in general, a stronger focus on geometrical and visual intuition, rather than the algebraic and numerical "intuition". ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ 👋 CONNECT WITH ME ON SOCIAL LinkedIn ► https://www.linkedin.com/in/aleksagordic/ Twitter ► https://twitter.com/gordic_aleksa Instagram ► https://www.instagram.com/aiepiphany/ Facebook ► https://www.facebook.com/aiepiphany/ 👨‍👩‍👧‍👦 JOIN OUR DISCORD COMMUNITY: Discord ► https://discord.gg/peBrCpheKE 📢 SUBSCRIBE TO MY MONTHLY AI NEWSLETTER: Substack ► https://aiepiphany.substack.com/ 💻 FOLLOW ME ON GITHUB FOR COOL PROJECTS: GitHub ► https://github.com/gordicaleksa 📚 FOLLOW ME ON MEDIUM: Medium ► https://gordicaleksa.medium.com/ ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ #deepdream #deeplearning #art #ai
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Aleksa Gordić - The AI Epiphany · Aleksa Gordić - The AI Epiphany · 10 of 60

1 Intro | Neural Style Transfer #1
Intro | Neural Style Transfer #1
Aleksa Gordić - The AI Epiphany
2 Basic Theory | Neural Style Transfer #2
Basic Theory | Neural Style Transfer #2
Aleksa Gordić - The AI Epiphany
3 Optimization method | Neural Style Transfer #3
Optimization method | Neural Style Transfer #3
Aleksa Gordić - The AI Epiphany
4 Advanced Theory | Neural Style Transfer #4
Advanced Theory | Neural Style Transfer #4
Aleksa Gordić - The AI Epiphany
5 Anyone can make deepfakes now!
Anyone can make deepfakes now!
Aleksa Gordić - The AI Epiphany
6 What is Computer Vision? | The Art of Creating Seeing Machines
What is Computer Vision? | The Art of Creating Seeing Machines
Aleksa Gordić - The AI Epiphany
7 Feed-forward method | Neural Style Transfer #5
Feed-forward method | Neural Style Transfer #5
Aleksa Gordić - The AI Epiphany
8 Alan Turing | Computing Machinery and Intelligence
Alan Turing | Computing Machinery and Intelligence
Aleksa Gordić - The AI Epiphany
9 Feed-forward method (training) | Neural Style Transfer #6
Feed-forward method (training) | Neural Style Transfer #6
Aleksa Gordić - The AI Epiphany
What is Google Deep Dream? (Basic Theory) | Deep Dream Series #1
What is Google Deep Dream? (Basic Theory) | Deep Dream Series #1
Aleksa Gordić - The AI Epiphany
11 Semantic Segmentation in PyTorch | Neural Style Transfer #7
Semantic Segmentation in PyTorch | Neural Style Transfer #7
Aleksa Gordić - The AI Epiphany
12 How to get started with Machine Learning
How to get started with Machine Learning
Aleksa Gordić - The AI Epiphany
13 How to learn PyTorch? (3 easy steps) | 2021
How to learn PyTorch? (3 easy steps) | 2021
Aleksa Gordić - The AI Epiphany
14 PyTorch or TensorFlow?
PyTorch or TensorFlow?
Aleksa Gordić - The AI Epiphany
15 3 Machine Learning Projects For Beginners (Highly visual) | 2021
3 Machine Learning Projects For Beginners (Highly visual) | 2021
Aleksa Gordić - The AI Epiphany
16 Machine Learning Projects (Intermediate level) | 2021
Machine Learning Projects (Intermediate level) | 2021
Aleksa Gordić - The AI Epiphany
17 Cheapest (0$) Deep Learning Hardware Options | 2021
Cheapest (0$) Deep Learning Hardware Options | 2021
Aleksa Gordić - The AI Epiphany
18 How to learn deep learning? (Transformers Example)
How to learn deep learning? (Transformers Example)
Aleksa Gordić - The AI Epiphany
19 How do transformers work? (Attention is all you need)
How do transformers work? (Attention is all you need)
Aleksa Gordić - The AI Epiphany
20 Developing a deep learning project (case study on transformer)
Developing a deep learning project (case study on transformer)
Aleksa Gordić - The AI Epiphany
21 Vision Transformer (ViT) - An image is worth 16x16 words | Paper Explained
Vision Transformer (ViT) - An image is worth 16x16 words | Paper Explained
Aleksa Gordić - The AI Epiphany
22 GPT-3 - Language Models are Few-Shot Learners | Paper Explained
GPT-3 - Language Models are Few-Shot Learners | Paper Explained
Aleksa Gordić - The AI Epiphany
23 Google DeepMind's AlphaFold 2 explained! (Protein folding, AlphaFold 1, a glimpse into AlphaFold 2)
Google DeepMind's AlphaFold 2 explained! (Protein folding, AlphaFold 1, a glimpse into AlphaFold 2)
Aleksa Gordić - The AI Epiphany
24 Attention Is All You Need (Transformer) | Paper Explained
Attention Is All You Need (Transformer) | Paper Explained
Aleksa Gordić - The AI Epiphany
25 Graph Attention Networks (GAT) | GNN Paper Explained
Graph Attention Networks (GAT) | GNN Paper Explained
Aleksa Gordić - The AI Epiphany
26 Graph Convolutional Networks (GCN) | GNN Paper Explained
Graph Convolutional Networks (GCN) | GNN Paper Explained
Aleksa Gordić - The AI Epiphany
27 Graph SAGE - Inductive Representation Learning on Large Graphs | GNN Paper Explained
Graph SAGE - Inductive Representation Learning on Large Graphs | GNN Paper Explained
Aleksa Gordić - The AI Epiphany
28 PinSage - Graph Convolutional Neural Networks for Web-Scale Recommender Systems | Paper Explained
PinSage - Graph Convolutional Neural Networks for Web-Scale Recommender Systems | Paper Explained
Aleksa Gordić - The AI Epiphany
29 OpenAI CLIP - Connecting Text and Images | Paper Explained
OpenAI CLIP - Connecting Text and Images | Paper Explained
Aleksa Gordić - The AI Epiphany
30 Temporal Graph Networks (TGN) | GNN Paper Explained
Temporal Graph Networks (TGN) | GNN Paper Explained
Aleksa Gordić - The AI Epiphany
31 Graph Neural Network Project Update! (I'm coding GAT from scratch)
Graph Neural Network Project Update! (I'm coding GAT from scratch)
Aleksa Gordić - The AI Epiphany
32 Graph Attention Network Project Walkthrough
Graph Attention Network Project Walkthrough
Aleksa Gordić - The AI Epiphany
33 How to get started with Graph ML? (Blog walkthrough)
How to get started with Graph ML? (Blog walkthrough)
Aleksa Gordić - The AI Epiphany
34 DQN - Playing Atari with Deep Reinforcement Learning | RL Paper Explained
DQN - Playing Atari with Deep Reinforcement Learning | RL Paper Explained
Aleksa Gordić - The AI Epiphany
35 AlphaGo - Mastering the game of Go with deep neural networks and tree search | RL Paper Explained
AlphaGo - Mastering the game of Go with deep neural networks and tree search | RL Paper Explained
Aleksa Gordić - The AI Epiphany
36 DeepMind's AlphaGo Zero and AlphaZero | RL paper explained
DeepMind's AlphaGo Zero and AlphaZero | RL paper explained
Aleksa Gordić - The AI Epiphany
37 OpenAI - Solving Rubik's Cube with a Robot Hand | RL paper explained
OpenAI - Solving Rubik's Cube with a Robot Hand | RL paper explained
Aleksa Gordić - The AI Epiphany
38 MuZero - Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | RL Paper explained
MuZero - Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | RL Paper explained
Aleksa Gordić - The AI Epiphany
39 EfficientNetV2 - Smaller Models and Faster Training | Paper explained
EfficientNetV2 - Smaller Models and Faster Training | Paper explained
Aleksa Gordić - The AI Epiphany
40 Implementing DeepMind's DQN from scratch! | Project Update
Implementing DeepMind's DQN from scratch! | Project Update
Aleksa Gordić - The AI Epiphany
41 MLP-Mixer: An all-MLP Architecture for Vision | Paper explained
MLP-Mixer: An all-MLP Architecture for Vision | Paper explained
Aleksa Gordić - The AI Epiphany
42 DeepMind's Android RL Environment - AndroidEnv
DeepMind's Android RL Environment - AndroidEnv
Aleksa Gordić - The AI Epiphany
43 When Vision Transformers Outperform ResNets without Pretraining | Paper Explained
When Vision Transformers Outperform ResNets without Pretraining | Paper Explained
Aleksa Gordić - The AI Epiphany
44 Non-Parametric Transformers | Paper explained
Non-Parametric Transformers | Paper explained
Aleksa Gordić - The AI Epiphany
45 Chip Placement with Deep Reinforcement Learning | Paper Explained
Chip Placement with Deep Reinforcement Learning | Paper Explained
Aleksa Gordić - The AI Epiphany
46 Text Style Brush - Transfer of text aesthetics from a single example | Paper Explained
Text Style Brush - Transfer of text aesthetics from a single example | Paper Explained
Aleksa Gordić - The AI Epiphany
47 Graphormer - Do Transformers Really Perform Bad for Graph Representation? | Paper Explained
Graphormer - Do Transformers Really Perform Bad for Graph Representation? | Paper Explained
Aleksa Gordić - The AI Epiphany
48 GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation | Paper Explained
GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation | Paper Explained
Aleksa Gordić - The AI Epiphany
49 VQ-VAEs: Neural Discrete Representation Learning | Paper + PyTorch Code Explained
VQ-VAEs: Neural Discrete Representation Learning | Paper + PyTorch Code Explained
Aleksa Gordić - The AI Epiphany
50 VQ-GAN: Taming Transformers for High-Resolution Image Synthesis | Paper Explained
VQ-GAN: Taming Transformers for High-Resolution Image Synthesis | Paper Explained
Aleksa Gordić - The AI Epiphany
51 Multimodal Few-Shot Learning with Frozen Language Models | Paper Explained
Multimodal Few-Shot Learning with Frozen Language Models | Paper Explained
Aleksa Gordić - The AI Epiphany
52 Focal Transformer: Focal Self-attention for Local-Global Interactions in Vision Transformers
Focal Transformer: Focal Self-attention for Local-Global Interactions in Vision Transformers
Aleksa Gordić - The AI Epiphany
53 AudioCLIP: Extending CLIP to Image, Text and Audio | Paper Explained
AudioCLIP: Extending CLIP to Image, Text and Audio | Paper Explained
Aleksa Gordić - The AI Epiphany
54 RMA: Rapid Motor Adaptation for Legged Robots | Paper Explained
RMA: Rapid Motor Adaptation for Legged Robots | Paper Explained
Aleksa Gordić - The AI Epiphany
55 DALL-E: Zero-Shot Text-to-Image Generation | Paper Explained
DALL-E: Zero-Shot Text-to-Image Generation | Paper Explained
Aleksa Gordić - The AI Epiphany
56 DETR: End-to-End Object Detection with Transformers | Paper Explained
DETR: End-to-End Object Detection with Transformers | Paper Explained
Aleksa Gordić - The AI Epiphany
57 DINO: Emerging Properties in Self-Supervised Vision Transformers | Paper Explained!
DINO: Emerging Properties in Self-Supervised Vision Transformers | Paper Explained!
Aleksa Gordić - The AI Epiphany
58 DeepMind DetCon: Efficient Visual Pretraining with Contrastive Detection | Paper Explained
DeepMind DetCon: Efficient Visual Pretraining with Contrastive Detection | Paper Explained
Aleksa Gordić - The AI Epiphany
59 Do Vision Transformers See Like Convolutional Neural Networks? | Paper Explained
Do Vision Transformers See Like Convolutional Neural Networks? | Paper Explained
Aleksa Gordić - The AI Epiphany
60 Fastformer: Additive Attention Can Be All You Need | Paper Explained
Fastformer: Additive Attention Can Be All You Need | Paper Explained
Aleksa Gordić - The AI Epiphany

This video teaches the basic theory behind Google Deep Dream and its application in image processing, demonstrating how to amplify features in input images using neural networks. Viewers will learn about the algorithm's history, its components, and how to implement it using various tools. The video is intended for beginners interested in deep learning and computer vision.

Key Takeaways
  1. Feed a base image to the network and perform gradient ascent to amplify activations
  2. Downsample the image and feed it to the network again to get different features at different resolutions
  3. Combine features across layers to get a nice-looking Deep Dream image
  4. Upsample an image
  5. Apply Deep Dream
  6. Repeat until reaching base image resolution
  7. Implement gradient smoothing and normalization
  8. Filter gradients with a Gaussian kernel
💡 The Deep Dream algorithm uses gradient ascent to amplify activations in a neural network, resulting in surreal and dream-like images, and its implementation involves upsampling, downsampling, and combining features across layers.

Related AI Lessons

The AI Moat Paradox: The Better Models Become, the Less Models Matter
The AI moat paradox suggests that as AI models improve, their importance may decrease, and understanding this concept is crucial for AI professionals and businesses.
Medium · AI
170,927 AI Papers Reveal the Biggest Research Shifts of the First Half of 2026
Discover the biggest AI research shifts of 2026 based on 170,927 papers, and learn how to apply these trends to your work
Medium · Machine Learning
170,927 AI Papers Reveal the Biggest Research Shifts of the First Half of 2026
Discover the major research shifts in AI from 170,927 papers published in the first half of 2026, and learn how to analyze trends in AI research
Medium · Data Science
[PoV] When Everyone Is Smart, No One Is
In a world where AI makes everyone smart, the value of intelligence decreases, and new challenges arise
Medium · AI

Chapters (5)

1:10 How it all started in 2015
3:30 Simple explanation of how it works
4:30 Impact of using shallow layers
5:30 Dataset matters (ImageNet vs Places 365)
6:10 More details on how it works
Up next
Man dies after horror Gold Coast house fire; high-speed Sydney motorway pursuit | 9 News Australia
9 News Australia
Watch →