New TECH: Vision Transformer 2023 on Image Classification | AI

Discover AI · Advanced · 🧬 Deep Learning · 3y ago


Key Takeaways

The video discusses the application of Vision Transformers in image classification, specifically in medical image classification, and compares their performance to Convolutional Neural Networks (CNNs). It also explores the use of transfer learning techniques based on Vision Transformers for mammogram classification.

Full Transcript

Hello community. Vision Transformers 2023: we look at the technology on an image classification task in oncology. In one of my last videos I showed you the symmetry in convolutional NN layers, because of the algebraic properties of its inherent group structure. In convolutional neural networks we have locality and translation equivariance inherent to the model. In Transformers we lack those two features, but if we train on a large dataset, self-attention allows us to integrate information across the entire image, and our MLP layers are local and translationally equivariant. So we achieve something a convolutional neural network cannot achieve because of its design. Great publication.

Now, with recent Transformers we have the pre-training phase, where we have an MLP layer on top of our encoder stack with a hidden layer, and the fine-tuning, where we have our Transformer encoder stack with a single linear layer. Currently we have 43 methods for how to do a Vision Transformer. Let's apply it to an application: cancer screening. I use here a platform from Switzerland with more than 411 peer-reviewed scientific journals, and this is the journal we're going to have a look at today.

Okay, so let's start talking about Vision-Transformer-Based Transfer Learning for Mammogram Classification, a highly interesting article from January 2023. You know what I like especially about this? Here we have South Korea, a National Institute of Technology, Ethiopia, we have Italy. This is so nice, a real cooperation. They now have a detailed study of Vision Transformers compared to convolutional neural networks, and they have some amazing insights, and this is the reason why I picked this publication from January 2023.
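The remark above that self-attention integrates information across the entire image can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it is a single attention head with random weights, and the sizes (16 patch tokens, 8 dimensions) and all names are assumptions chosen for illustration. It shows that changing one patch changes the output at a far-away patch, which a single small convolution kernel would not do.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over patch tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # row i: how token i attends to every token
    return weights @ v, weights

rng = np.random.default_rng(0)
num_patches, dim = 16, 8  # assumed: a 4x4 grid of patches, 8-dim embeddings
tokens = rng.normal(size=(num_patches, dim))
# Random stand-ins for learned projection matrices (illustrative only)
w_q, w_k, w_v = (0.1 * rng.normal(size=(dim, dim)) for _ in range(3))

out, weights = self_attention(tokens, w_q, w_k, w_v)

# Perturb only the LAST patch and look at the output for the FIRST patch
perturbed = tokens.copy()
perturbed[-1] += 1.0
out_perturbed, _ = self_attention(perturbed, w_q, w_k, w_v)
first_patch_changed = not np.allclose(out[0], out_perturbed[0])
print("attention matrix shape:", weights.shape)
print("first patch output depends on last patch:", first_patch_changed)
```

Every row of the attention matrix sums to one and spans all patches, so each output token is a weighted mix of the whole image from the first layer on.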
So they show that Vision Transformers have a better performance, and they develop a transfer learning technique based on the Vision Transformer to classify breast mass mammograms. What is their target? To improve the early diagnosis of breast cancer. This is the topic we're going to talk about, and this is the content of this scientific publication. So here they go on to tell you about breast cancer. And remember, we look at this only from the technical perspective of a neural network, nothing else. I'm not a medical doctor; please go see your oncologist if you have any questions. We are looking here only at the computer science level.

Okay, here we go. CNNs are computationally expensive due to the multiple convolutions at different feature levels. A CNN lacks the ability to handle rotation and scale invariance without augmentation, and fails to encode relative spatial information. So you see, they address some shortcomings in the current technology. But they also tell us, and this is an important message, that finding good datasets for training is a challenge in the medical image domain. Now, on the one hand I understand about privacy and your private sphere, but on the other hand, if you're working in a hospital, I think you should have access to the medical images so that you are able to support your oncologist.

So here we go: there are several approaches that have been used to deal with the lack of training image datasets, but this is not the point. We are interested here in a Vision Transformer based transfer learning method for a very specific task, mammogram classification. This is all we are interested in on a technical level. And they tell us that when analyzing mammograms, CNNs concentrate on a particular region, because of their inherent local properties, disregarding the rest of the image, which causes the CNNs to miss some crucial details that would have been discovered if the entire image was examined at once. So you see, this is the problem that you have with the current technology, and then they
tell us that Vision Transformers have recently gained prominence in the field of computer vision, surpassing CNNs in tasks that require natural image classification. Remember, we are here at image classification, so this is a pure classification task. We know this from the Transformer architecture; I showed you this when we had the classification of sentences. Now, interesting that they tell us that ViTs outperformed the most advanced CNN models. This is what they tell us. Beautiful.

They start with a little bit about the architecture. Of course they are not working with words or subwords, so they now have visual tokens, but otherwise the architecture of the ViT encoder stack is more or less identical: you have your multi-head attention network, you have your feed-forward network with one or two hidden layers, and this is it, more or less. The multi-head attention network is in charge of creating the attention maps, not for words or word pieces now, but for the provided embedded visual tokens. Beautiful. Everything as you know it; I don't have to tell you any more about it.

So, interesting to know is the dataset they have. As always, when you train a Transformer you have to have data, and you have to have a lot of data. They say here: we use the Digital Database for Screening Mammography, and this is to train and to test the Vision Transformer based transfer learning system. But in this database they have only, and "only" is the reason why I tell you this, 13,000 images: close to 6,000 images of benign and 7.1k images of malignant tissues. So remember what I told you: if you want to train a Vision Transformer, you need millions of pictures if you really want to reach high performance. So what they tell us is that a database with six thousand or seven thousand medical images is not enough, and this is the reason why they explored transfer learning in this particular scientific publication. Yes, the pictures, I
don't need to go into the details here. Interestingly, for a medical publication, they talk to you about the Vision Transformer architecture, and this is highly interesting because they are really excellent in the explanation. If you want to have a look at this, you will find it spot on. Great.

They say: okay, normally in natural language processing, in our NLP models, we have a one-dimensional array, which is our sentence. Here, images are of course two-dimensional, with an X and a Y component, and they tell you the images are divided into smaller patches. Then each patch is flattened into a vector, and the flattened patches are mapped to D dimensions using a linear projection layer. And you're not going to believe it: remember, in BERT we had the CLS token, the classification token or start token; now we also have this here as a prefix, and like in BERT we have positional embeddings. So you see, it is more or less the same structure; only the input embedding of the patches is a little bit different from what we know from our tokenizer structure.

They tell us the Transformer encoder network, what I call the encoder stack, is a stack of L identical layers. The MLP with a single hidden layer was used to implement the classification head during pre-training, so here we are in the pre-training phase, as I showed you before. Then, when you do the fine-tuning, you throw away this head and you just use a single linear layer for the classification task. And in this implementation you have a GELU nonlinearity; this is from the original publication, you know this.

Now the nice thing is transfer learning. Transfer learning was employed using Vision Transformer models pre-trained on a large image dataset. This is an image dataset that has 14 million images, compared to the 13k medical images that they have here. As you can see, this is now a large database with 14 million images, so this is really where you can use this dataset as a pre-training dataset
and then you come with your 13k medical images and do the fine-tuning on top of this. So this is what they tell you here; please have a look yourself. They say the objective was to use the Vision Transformer knowledge from the large natural image dataset, ImageNet, with 14 million pictures, and then to classify two classes, benign and malignant tissues. So of course they detach the pre-trained prediction head and replace it, you're not going to believe it, with a feed-forward layer that has two classes for the downstream task. This is absolutely identical to what I showed you in my last 10 videos about Transformer architecture, the encoder stack for BERT and the decoder stack for GPT: identical architectural design. So this is then used for fine-tuning.

Here, as I told you, the weights of the ImageNet pre-trained Vision Transformers, from this huge 14 million image dataset, were then utilized as the starting point. Like in BERT: you pre-train on a huge dataset so that the system learns English, and then on top you do the transfer learning, the fine-tuning, with a new objective function. It is all the same.

Now the interesting thing, and this is amazing for a medical paper: they use three state-of-the-art Vision Transformers. They say: hey, I don't want to look only at the classical ViT model that I showed you before, I also want to have this Swin model and a Pyramid Vision Transformer model. So, really interesting: they chose three advanced Vision Transformer methods and they compare them. This is the window-shifting architecture, if you want to have a look at the details, and down here you have the spatial-reduction attention in this PVT design of our Transformer. But this is not the point; I just want to show you what they've done. They said: hey, what about we pre-train the Vision Transformer models on the mammogram dataset, so this is the small 13k dataset, from scratch, and then use this architecture and
compare this, trained from scratch on a small dataset? And we already know from the mathematical, algebraic group construct that this will not deliver results, because I showed you, if you look at the CNN design of the architecture and the Vision Transformer, that we have completely opposite symmetries within the system, and you need a lot of data to really achieve a high level of performance with Vision Transformers. So no wonder they found this. And then they compare the transfer learning Vision Transformers with, of course, a convolutional neural network.

Implementation details, yes, but just jump to the results with me. The proposed Vision Transformer based transfer learning model, so here the classical Vision Transformer, not pre-trained only on the 13k medical images, but pre-trained on 14 million images and then fine-tuned on the 13k medical images, exhibited superior performance. So this is the winner: first you pre-train it on the huge dataset, exactly as we expected, and then you fine-tune it on your particular downstream task, and this was a classification in the pre-screening of an image.

What else? This provides strong evidence that Vision Transformer based transfer learning is effective in improving the deep learning approach to breast mammograms, thereby improving the early diagnostic techniques for the detection of breast cancer. Beautiful. Our technology has a real-world application that will help oncologists with their work. This is exactly why we do this, why we build those systems, why we train those systems: they should have an impact in the real world.

So this is more or less the result they achieved. You can have a look at the details, the bar charts and so on, but this is not the point of what I want to show you. Yes, they compared it with ResNet, of course. The discussion: this was something I wanted to show you. So the Vision Transformer based transfer learning approach provided the
highest quantitative and statistical measures for classifying breast mammograms in a two-class system, as benign or malignant tissues. This is important information, because if you want to jump into this field now, if you want to be a medical doctor or a computer scientist in the medical field, you should know where the current state of the art is, what approaches are implemented, which approach has the highest quantitative results, and where you can start. It doesn't mean that the approach before is now not at all relevant; maybe you can combine it, you can take elements out of different methods and methodologies and form something new. But it is just interesting to see the power of the Vision Transformer based transfer learning approach as of January 2023: they proclaim that for a particular task, classifying breast mammograms, this is the superior technology when they apply it.

And they tell us, and I think this is amazing: the prime reason for the better performance of Vision Transformers is the ability to capture global information from the early layers of our encoder stack, and the deep self-attention mechanism, which of course is inherent in each layer, in each Transformer block that we have, and which enables the features in each patch to be carefully analyzed for decision making. Unbelievable to see this in a publication that is focused so clearly on medical images; it is absolutely in line with what we would expect from pure mathematical reasoning.

What else? Additionally, they showed that training their models from scratch was not so effective, because of the small number of images. We know this. Then they conclude: therefore transfer learning provided better results, as it used weights in our neural network that were pre-trained on a large dataset, as I showed you, on 14 million images, such as ImageNet, and leverages that knowledge to learn from a smaller dataset, such as here this classical, focused medical image dataset of just 13,000
images for medical image classification, where they then fine-tune their model. Great publication; it gives us a lot of insight into what is going on currently and where the research is going in medical R&D. I just wanted to show you this, if you jump into this topic: they are at a high level, so you can apply everything I showed you in my last videos, from the mathematical approach, from the topological approach, when we constructed specific topological spaces and vector spaces and embeddings and structures, and when we talked about the Transformer architecture of GPT and of BERT. All of this comes together here for a medical application.

Conclusion: consequently, they found that Vision Transformer based transfer learning is effective for breast mammography image classification, providing superior performance with less computational complexity. And this is really what you want. Less computational complexity means lower cost and less time to compute, and you want to have a superior performance in the very early detection of possibly malignant tissues. Great. So this Vision Transformer based transfer learning outperformed the current convolutional neural network based transfer learning for breast mammogram classification. This is the result of this publication in January 2023.

You have the authors; it was funded by the South Korean government, interesting; and then you have all the different references. This is the publication I definitely wanted to show you. Have a look at it; it is really outstanding because it gives you a lot of technical, computer science oriented information from their experience when they applied Vision Transformer based transfer learning to a particular downstream task, a classification task on medical images.

As you can see here, a new perspective to boost Vision Transformers for medical image classification: it is a hot topic, and we need more data, because we do not have enough medical images available, and people are trying to find new training
algorithms to cope with this shortage of images. If you want to see at which level the images are analyzed, have a look at this too, from word embedding and vision embedding: a highly inspirational preprint you should read if you want to understand how Vision Transformers see in their particular layers. On addressing the shortage of available images to train on, look at what Adobe is doing with its cloud. Here we have how many hours you have to train to achieve a given performance, and, no, let's skip it, this is too much. So this is it for today: Vision Transformers on cancer screening in 2023. I hope you enjoyed it, and I'll see you in my next video.
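The input pipeline described in the transcript (split the image into patches, flatten each patch, project it linearly to D dimensions, prepend a classification token, add positional embeddings) can be sketched in a few lines of NumPy. This is a rough illustration under assumed sizes (a 32x32 RGB image, 8x8 patches, D = 64) with random stand-ins for the learned parameters. The function and variable names are hypothetical and not taken from the paper.

```python
import numpy as np

def patch_embed(image, patch, w_proj, cls_token, pos_embed):
    """Turn an (H, W, C) image into a sequence of ViT input tokens."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must divide into patches"
    gh, gw = h // patch, w // patch
    # Split into a (gh, gw) grid of patches, each of shape (patch, patch, c)
    patches = image.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    # Flatten every patch into one vector: (num_patches, patch*patch*c)
    flat = patches.reshape(gh * gw, patch * patch * c)
    # Linear projection of the flattened patches to D dimensions
    tokens = flat @ w_proj
    # Prepend the classification token, as BERT does with [CLS]
    tokens = np.vstack([cls_token, tokens])
    # Add one positional embedding per token (including the class token)
    return tokens + pos_embed

rng = np.random.default_rng(0)
size, patch, channels, dim = 32, 8, 3, 64          # assumed sizes
num_patches = (size // patch) ** 2                  # 16 patches
image = rng.random((size, size, channels))
# Random stand-ins for parameters that would normally be learned
w_proj = 0.02 * rng.normal(size=(patch * patch * channels, dim))
cls_token = np.zeros((1, dim))
pos_embed = 0.02 * rng.normal(size=(num_patches + 1, dim))

tokens = patch_embed(image, patch, w_proj, cls_token, pos_embed)
print(tokens.shape)
```

The resulting sequence has one more token than there are patches; the extra first token is the one whose final encoder output is fed to the classification head.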

Original Description

Understand state-of-the-art tech in Vision Technology, eg medical image classification, beginning of 2023. We learn the current tech of Vision Transformer vs CNN in a medical real-world application: "Vision-Transformer-Based Transfer Learning for Mammogram Classification". An in-depth analysis of Convolutional Neural Networks vs Vision Transformers for medical image classification, to improve the early diagnosis of breast cancer in support of oncologists. Research should have a positive impact on this world. Scientific publication (all rights with authors): Ayana, G.; Dese, K.; Dereje, Y.; Kebede, Y.; Barki, H.; Amdissa, D.; Husen, N.; Mulugeta, F.; Habtamu, B.; Choe, S.-W. Vision-Transformer-Based Transfer Learning for Mammogram Classification. Diagnostics 2023, 13, 178. https://doi.org/10.3390/diagnostics13020178 Other relevant links, as mentioned: https://paperswithcode.com/dataset/imagenet https://paperswithcode.com/methods/category/vision-transformer #ai #vision #medicalimaging #transformer #cancer


This video teaches how to apply Vision Transformers to image classification tasks, particularly in medical image classification, and how to use transfer learning techniques to improve performance. It also compares the performance of Vision Transformers to CNNs and discusses the advantages of using Vision Transformers.

Key Takeaways

Pre-train Vision Transformers on a large image dataset
Fine-tune Vision Transformers on a medical image dataset
Compare the performance of Vision Transformers to CNNs
Use transfer learning techniques to improve performance
Apply Vision Transformers to medical image classification tasks

💡 Vision Transformers can outperform CNNs in image classification tasks, particularly in medical image classification, and transfer learning techniques can be used to improve performance.
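The pre-train then fine-tune recipe in the takeaways comes down to keeping the pre-trained encoder weights and swapping the classification head for a new two-class one (benign vs. malignant). The NumPy sketch below illustrates only that head swap. The parameter dictionary, its key names, and the single-matrix "encoder" are hypothetical stand-ins for a real pre-trained checkpoint and a full Transformer stack, and no training step is shown.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, pretrain_classes, downstream_classes = 64, 1000, 2  # assumed sizes

# Stand-in for weights loaded from an ImageNet-pre-trained model (hypothetical keys)
pretrained = {
    "encoder": rng.normal(size=(dim, dim)),          # placeholder for the encoder stack
    "head_w": rng.normal(size=(dim, pretrain_classes)),
    "head_b": np.zeros(pretrain_classes),
}

def replace_head(params, num_classes, rng):
    """Keep the encoder weights, drop the old head, attach a fresh linear head."""
    new_params = {k: v for k, v in params.items() if not k.startswith("head_")}
    feature_dim = new_params["encoder"].shape[1]
    new_params["head_w"] = rng.normal(scale=0.02, size=(feature_dim, num_classes))
    new_params["head_b"] = np.zeros(num_classes)
    return new_params

def classify(params, cls_feature):
    """Forward pass: placeholder encoder, then the single linear head."""
    hidden = np.tanh(cls_feature @ params["encoder"])
    return hidden @ params["head_w"] + params["head_b"]

finetune = replace_head(pretrained, downstream_classes, rng)
logits = classify(finetune, rng.normal(size=dim))
print(logits.shape)  # one score per downstream class
```

In a real setup the encoder would be the full pre-trained Transformer stack, and fine-tuning would then update the new head (and usually the encoder as well) on the mammogram images.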
