Feed-forward method | Neural Style Transfer #5

Aleksa Gordić - The AI Epiphany · Advanced ·📄 Research Papers Explained ·6y ago

Key Takeaways

This video explains the feed-forward method for neural style transfer using CNN-based neural networks

Full Transcript

okay so this video is gonna be really nice and short I went ahead and reconstructed original Johnson's paper on fast neural style transfer since the CMM based approach as opposed to the Gatos approach which was optimization based let's see if some results I got using the for pre-trained models I trained beforehand so in the left column you can see the style images I used and on the right you can see the output from the models I trained so keep in mind that the the model in the top row is the only one that was fully trained so it's all like it did to e box of the whole ms cocoa dataset so that's around 83,000 images whereas the three bottom ones still need to be additionally trained now I'm just gonna do a quick time-lapse of the setup because we already did that in the previous coding video [Music] [Applause] [Music] once you have your con environment configured there are only two more steps to get some results so the first one is go ahead and run this resource download or script and it will just go ahead and download the pre trained models that I've uploaded right here to this your URL on Dropbox it will then unzip them and place them in this folder here Marvel's binaries so you'll have for pre-trained models on your disposal and the second step is just go ahead and run this sterilization script it's located here just run it with the default parameters and we'll get some results this is the result we get using the default content image that's located here just go to data cannon images and this one is used by default so I'm just gonna go and show you a little bit about how the sterilization script itself brooks i'm gonna go ahead and stop this one here mm-hmm oops and I'm just gonna show you where it's actually saved it saved to the output images directory here so you can be able you'll be able to find it later on in this directory okay let's see how the script works let me close this one and zoom this in a little bit like this so what we have here is a couple of default locations first one is for continum ages and we already saw that one that's in here we have output images where the images will get dumped binaries where we have our models and this is just basic error checking here basically checking if if we only have Pytor models inside of the of this binaries here directory and I just create the output directory where we'll dump images and here the arguments those are just the content image that you want to use that you want to pass through the model then we have the you wanna set the width of the output image and it's set to 500 by default here we have default model that's called mosaic and we just wrap all of those parameters called stylized static image function let's see what this function actually does so we go up side here we can first see we first want to check out if the device has GPU or only CPU support and then we basically just create this image path we pass it through this function called prepare image which will kind of add a badge I mention and do some normalization will check the function a bit later and I just instantiate here the transfer transformer model which is the actual model that performs the stylization we figure out the state so we just load the state from by torch model that's in binaries here we print some metadata and we finally load the weights inside of the the model and we just put it into this eval mode which is really important if you want to do inference I want to do this when I call this function here and finally we just wrap this into this context torch no gret which will basically forbid pi torch to calculate radians which will which would kind of be a huge memory overhead so we call the model on the can image we just paste it to CPU we convert it to non PI here and because we had the badge dimension that we previously added we gotta extract the zeroth bet here and that's the stylized image when we just save the image that's the image we saw a couple of minutes ago that was dumped here into your output image folder this one just check out that prepare image we saw in the utils here so this is how it looks like you basically just loads damage from the specified path specified by this variable image path and that will just basically create an umpire image normalized to zero to one range and then we'll pretty much add this normalization transform that byte order provides us with and we'll use image net mean and standard deviation to normalize our images after we apply the transformation here we just push to push the this this image onto the GPU if you have one and we just add this patch dimension in front because that's that's because models are always expecting dispatch dimensions so pretty pretty simple stuff in the next video I'll I'll cover training that they'll be a bit more challenging we're gonna figure out how to use tensor board and just debug and visualize our our training metrics so let me just go ahead and try one more image here so I'm gonna use image called figures and let me set the width here to 350 say we'll use the same model mosaic and if I start the one if I run it this is what we get so how good the output image will be actually depends also on the sides that you put here in the width so if I put 550 say here and I run that one it will have different stylization and it's much nicer if you ask me the next video will cover training until that time just go ahead and play with this repo try and figure out how the things work and I'll be a really nice here learning experience if you like the content go ahead and subscribe like and share the videos if you think they can bring value to somebody else see you next time [Music]

Original Description

❤️ Become The AI Epiphany Patreon ❤️ ► https://www.patreon.com/theaiepiphany ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ The 5th video in the NST series! 🎨 You'll learn about: ✔️ fast, feed-forward, CNN-based neural style transfer! You don't need to read the research paper to be able to play with the code and make awesome stuff! You'll probably need some basic understanding if you want to train the models yourself - but I'll cover that in the next video! ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ ✅ GitHub code: https://github.com/gordicaleksa/pytorch-nst-feedforward ✅ Paper (Johnson et al.): https://arxiv.org/pdf/1603.08155.pdf ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ ⌚️ Timetable: 0:00 - GitHub repo walk-through 0:50 - Setup timelapse (check the repo for details) 1:03 - step1: Download pre-trained models 1:30 - step2: Run the stylization script 2:10 - Stylization script walk-through ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ 💰 BECOME A PATREON OF THE AI EPIPHANY ❤️ If these videos, GitHub projects, and blogs help you, consider helping me out by supporting me on Patreon! The AI Epiphany ► https://www.patreon.com/theaiepiphany One-time donations: https://www.paypal.com/paypalme/theaiepiphany Much love! ❤️ ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ 💡 The AI Epiphany is a channel dedicated to simplifying the field of AI using creative visualizations and in general, a stronger focus on geometrical and visual intuition, rather than the algebraic and numerical "intuition". ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ 👋 CONNECT WITH ME ON SOCIAL LinkedIn ► https://www.linkedin.com/in/aleksagordic/ Twitter ► https://twitter.com/gordic_aleksa Instagram ► https://www.instagram.com/aiepiphany/ Facebook ► https://www.facebook.com/aiepiphany/ 👨‍👩‍👧‍👦 JOIN OUR DISCORD COMMUNITY: Discord ► https://discord.gg/peBrCpheKE 📢 SUBSCRIBE TO MY MONTHLY AI NEWSLETTER: Substack ► https://aiepiphany.substack.com/ 💻 FOLLOW ME ON GITHUB FOR COOL PROJECTS: GitHub ► https://github.com/gordicaleksa 📚 FOLLOW ME ON MEDIUM: Medium ► https://gordicaleksa.medium.com/ ▬▬▬▬
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Aleksa Gordić - The AI Epiphany · Aleksa Gordić - The AI Epiphany · 7 of 60

1 Intro | Neural Style Transfer #1
Intro | Neural Style Transfer #1
Aleksa Gordić - The AI Epiphany
2 Basic Theory | Neural Style Transfer #2
Basic Theory | Neural Style Transfer #2
Aleksa Gordić - The AI Epiphany
3 Optimization method | Neural Style Transfer #3
Optimization method | Neural Style Transfer #3
Aleksa Gordić - The AI Epiphany
4 Advanced Theory | Neural Style Transfer #4
Advanced Theory | Neural Style Transfer #4
Aleksa Gordić - The AI Epiphany
5 Anyone can make deepfakes now!
Anyone can make deepfakes now!
Aleksa Gordić - The AI Epiphany
6 What is Computer Vision? | The Art of Creating Seeing Machines
What is Computer Vision? | The Art of Creating Seeing Machines
Aleksa Gordić - The AI Epiphany
Feed-forward method | Neural Style Transfer #5
Feed-forward method | Neural Style Transfer #5
Aleksa Gordić - The AI Epiphany
8 Alan Turing | Computing Machinery and Intelligence
Alan Turing | Computing Machinery and Intelligence
Aleksa Gordić - The AI Epiphany
9 Feed-forward method (training) | Neural Style Transfer #6
Feed-forward method (training) | Neural Style Transfer #6
Aleksa Gordić - The AI Epiphany
10 What is Google Deep Dream? (Basic Theory) | Deep Dream Series #1
What is Google Deep Dream? (Basic Theory) | Deep Dream Series #1
Aleksa Gordić - The AI Epiphany
11 Semantic Segmentation in PyTorch | Neural Style Transfer #7
Semantic Segmentation in PyTorch | Neural Style Transfer #7
Aleksa Gordić - The AI Epiphany
12 How to get started with Machine Learning
How to get started with Machine Learning
Aleksa Gordić - The AI Epiphany
13 How to learn PyTorch? (3 easy steps) | 2021
How to learn PyTorch? (3 easy steps) | 2021
Aleksa Gordić - The AI Epiphany
14 PyTorch or TensorFlow?
PyTorch or TensorFlow?
Aleksa Gordić - The AI Epiphany
15 3 Machine Learning Projects For Beginners (Highly visual) | 2021
3 Machine Learning Projects For Beginners (Highly visual) | 2021
Aleksa Gordić - The AI Epiphany
16 Machine Learning Projects (Intermediate level) | 2021
Machine Learning Projects (Intermediate level) | 2021
Aleksa Gordić - The AI Epiphany
17 Cheapest (0$) Deep Learning Hardware Options | 2021
Cheapest (0$) Deep Learning Hardware Options | 2021
Aleksa Gordić - The AI Epiphany
18 How to learn deep learning? (Transformers Example)
How to learn deep learning? (Transformers Example)
Aleksa Gordić - The AI Epiphany
19 How do transformers work? (Attention is all you need)
How do transformers work? (Attention is all you need)
Aleksa Gordić - The AI Epiphany
20 Developing a deep learning project (case study on transformer)
Developing a deep learning project (case study on transformer)
Aleksa Gordić - The AI Epiphany
21 Vision Transformer (ViT) - An image is worth 16x16 words | Paper Explained
Vision Transformer (ViT) - An image is worth 16x16 words | Paper Explained
Aleksa Gordić - The AI Epiphany
22 GPT-3 - Language Models are Few-Shot Learners | Paper Explained
GPT-3 - Language Models are Few-Shot Learners | Paper Explained
Aleksa Gordić - The AI Epiphany
23 Google DeepMind's AlphaFold 2 explained! (Protein folding, AlphaFold 1, a glimpse into AlphaFold 2)
Google DeepMind's AlphaFold 2 explained! (Protein folding, AlphaFold 1, a glimpse into AlphaFold 2)
Aleksa Gordić - The AI Epiphany
24 Attention Is All You Need (Transformer) | Paper Explained
Attention Is All You Need (Transformer) | Paper Explained
Aleksa Gordić - The AI Epiphany
25 Graph Attention Networks (GAT) | GNN Paper Explained
Graph Attention Networks (GAT) | GNN Paper Explained
Aleksa Gordić - The AI Epiphany
26 Graph Convolutional Networks (GCN) | GNN Paper Explained
Graph Convolutional Networks (GCN) | GNN Paper Explained
Aleksa Gordić - The AI Epiphany
27 Graph SAGE - Inductive Representation Learning on Large Graphs | GNN Paper Explained
Graph SAGE - Inductive Representation Learning on Large Graphs | GNN Paper Explained
Aleksa Gordić - The AI Epiphany
28 PinSage - Graph Convolutional Neural Networks for Web-Scale Recommender Systems | Paper Explained
PinSage - Graph Convolutional Neural Networks for Web-Scale Recommender Systems | Paper Explained
Aleksa Gordić - The AI Epiphany
29 OpenAI CLIP - Connecting Text and Images | Paper Explained
OpenAI CLIP - Connecting Text and Images | Paper Explained
Aleksa Gordić - The AI Epiphany
30 Temporal Graph Networks (TGN) | GNN Paper Explained
Temporal Graph Networks (TGN) | GNN Paper Explained
Aleksa Gordić - The AI Epiphany
31 Graph Neural Network Project Update! (I'm coding GAT from scratch)
Graph Neural Network Project Update! (I'm coding GAT from scratch)
Aleksa Gordić - The AI Epiphany
32 Graph Attention Network Project Walkthrough
Graph Attention Network Project Walkthrough
Aleksa Gordić - The AI Epiphany
33 How to get started with Graph ML? (Blog walkthrough)
How to get started with Graph ML? (Blog walkthrough)
Aleksa Gordić - The AI Epiphany
34 DQN - Playing Atari with Deep Reinforcement Learning | RL Paper Explained
DQN - Playing Atari with Deep Reinforcement Learning | RL Paper Explained
Aleksa Gordić - The AI Epiphany
35 AlphaGo - Mastering the game of Go with deep neural networks and tree search | RL Paper Explained
AlphaGo - Mastering the game of Go with deep neural networks and tree search | RL Paper Explained
Aleksa Gordić - The AI Epiphany
36 DeepMind's AlphaGo Zero and AlphaZero | RL paper explained
DeepMind's AlphaGo Zero and AlphaZero | RL paper explained
Aleksa Gordić - The AI Epiphany
37 OpenAI - Solving Rubik's Cube with a Robot Hand | RL paper explained
OpenAI - Solving Rubik's Cube with a Robot Hand | RL paper explained
Aleksa Gordić - The AI Epiphany
38 MuZero - Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | RL Paper explained
MuZero - Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | RL Paper explained
Aleksa Gordić - The AI Epiphany
39 EfficientNetV2 - Smaller Models and Faster Training | Paper explained
EfficientNetV2 - Smaller Models and Faster Training | Paper explained
Aleksa Gordić - The AI Epiphany
40 Implementing DeepMind's DQN from scratch! | Project Update
Implementing DeepMind's DQN from scratch! | Project Update
Aleksa Gordić - The AI Epiphany
41 MLP-Mixer: An all-MLP Architecture for Vision | Paper explained
MLP-Mixer: An all-MLP Architecture for Vision | Paper explained
Aleksa Gordić - The AI Epiphany
42 DeepMind's Android RL Environment - AndroidEnv
DeepMind's Android RL Environment - AndroidEnv
Aleksa Gordić - The AI Epiphany
43 When Vision Transformers Outperform ResNets without Pretraining | Paper Explained
When Vision Transformers Outperform ResNets without Pretraining | Paper Explained
Aleksa Gordić - The AI Epiphany
44 Non-Parametric Transformers | Paper explained
Non-Parametric Transformers | Paper explained
Aleksa Gordić - The AI Epiphany
45 Chip Placement with Deep Reinforcement Learning | Paper Explained
Chip Placement with Deep Reinforcement Learning | Paper Explained
Aleksa Gordić - The AI Epiphany
46 Text Style Brush - Transfer of text aesthetics from a single example | Paper Explained
Text Style Brush - Transfer of text aesthetics from a single example | Paper Explained
Aleksa Gordić - The AI Epiphany
47 Graphormer - Do Transformers Really Perform Bad for Graph Representation? | Paper Explained
Graphormer - Do Transformers Really Perform Bad for Graph Representation? | Paper Explained
Aleksa Gordić - The AI Epiphany
48 GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation | Paper Explained
GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation | Paper Explained
Aleksa Gordić - The AI Epiphany
49 VQ-VAEs: Neural Discrete Representation Learning | Paper + PyTorch Code Explained
VQ-VAEs: Neural Discrete Representation Learning | Paper + PyTorch Code Explained
Aleksa Gordić - The AI Epiphany
50 VQ-GAN: Taming Transformers for High-Resolution Image Synthesis | Paper Explained
VQ-GAN: Taming Transformers for High-Resolution Image Synthesis | Paper Explained
Aleksa Gordić - The AI Epiphany
51 Multimodal Few-Shot Learning with Frozen Language Models | Paper Explained
Multimodal Few-Shot Learning with Frozen Language Models | Paper Explained
Aleksa Gordić - The AI Epiphany
52 Focal Transformer: Focal Self-attention for Local-Global Interactions in Vision Transformers
Focal Transformer: Focal Self-attention for Local-Global Interactions in Vision Transformers
Aleksa Gordić - The AI Epiphany
53 AudioCLIP: Extending CLIP to Image, Text and Audio | Paper Explained
AudioCLIP: Extending CLIP to Image, Text and Audio | Paper Explained
Aleksa Gordić - The AI Epiphany
54 RMA: Rapid Motor Adaptation for Legged Robots | Paper Explained
RMA: Rapid Motor Adaptation for Legged Robots | Paper Explained
Aleksa Gordić - The AI Epiphany
55 DALL-E: Zero-Shot Text-to-Image Generation | Paper Explained
DALL-E: Zero-Shot Text-to-Image Generation | Paper Explained
Aleksa Gordić - The AI Epiphany
56 DETR: End-to-End Object Detection with Transformers | Paper Explained
DETR: End-to-End Object Detection with Transformers | Paper Explained
Aleksa Gordić - The AI Epiphany
57 DINO: Emerging Properties in Self-Supervised Vision Transformers | Paper Explained!
DINO: Emerging Properties in Self-Supervised Vision Transformers | Paper Explained!
Aleksa Gordić - The AI Epiphany
58 DeepMind DetCon: Efficient Visual Pretraining with Contrastive Detection | Paper Explained
DeepMind DetCon: Efficient Visual Pretraining with Contrastive Detection | Paper Explained
Aleksa Gordić - The AI Epiphany
59 Do Vision Transformers See Like Convolutional Neural Networks? | Paper Explained
Do Vision Transformers See Like Convolutional Neural Networks? | Paper Explained
Aleksa Gordić - The AI Epiphany
60 Fastformer: Additive Attention Can Be All You Need | Paper Explained
Fastformer: Additive Attention Can Be All You Need | Paper Explained
Aleksa Gordić - The AI Epiphany

Related AI Lessons

I Spent Weeks Looking for a Research Gap Before I Realized I Was Searching the Wrong Way
Learn how to effectively find research gaps by changing your approach, a crucial skill for AI researchers and academics
Medium · AI
ICMI 2026 Reviews [D]
Learn how to interpret ICMI 2026 reviews and improve your paper's acceptance chances
Reddit r/MachineLearning
Workshop submission for main conference paper under review [D]
Learn how to navigate submitting a paper to a non-archival workshop before the final decision of a main conference like ECCV
Reddit r/MachineLearning
Kept context-switching between arxiv, OpenReview, GitHub, and HuggingFace for every paper, so I built this. Chrome extension + website with everything inline, plus citation graph + SPECTER2 neighbors. 3M papers, free, feedback welcome [P]
Streamline your research with a new Chrome extension and website that integrates 3M papers from arxiv, OpenReview, GitHub, and HuggingFace, including citation graphs and SPECTER2 neighbors, and provide feedback to improve it
Reddit r/MachineLearning

Chapters (5)

GitHub repo walk-through
0:50 Setup timelapse (check the repo for details)
1:03 step1: Download pre-trained models
1:30 step2: Run the stylization script
2:10 Stylization script walk-through
Up next
Beyond Big Vendors: ERP Systems Explained #shorts
Digital Transformation with Eric Kimberling
Watch →