Vokenization Explained!

Connor Shorten · Beginner ·📄 Research Papers Explained ·5y ago


This video explains Vokenization, a new approach to visually supervising language models that achieves performance gains on language-only tasks such as the GLUE benchmark and SQuAD question answering. It works by constructing a token-image matching (each matched image is a "voken") and training the language model to classify the voken that corresponds to each token, using a weakly supervised loss function.

Thanks for watching! Please Subscribe!

Paper Links:
Vokenization: https://arxiv.org/pdf/2010.06775.pdf
ImageBERT: https://arxiv.org/pdf/2001.07966.pdf
VilBERT: https://arxiv.org/pdf/1908.02265.pdf
LXMERT: https://arxiv.org/pdf/1908.07490.pdf
UNITER: https://arxiv.org/pdf/1909.11740.pdf
Visual Genome: https://visualgenome.org/
12-in-1: Multi-task Vision and Language Representation Learning: https://arxiv.org/pdf/1912.02315.pdf
How Context Affects Language Models' Factual Predictions: https://arxiv.org/pdf/2005.04611.pdf
Fusion of medical imaging and electronic health records using deep learning: a systematic review and implementation guidelines: https://www.nature.com/articles/s41746-020-00341-z
ConVIRT: https://arxiv.org/pdf/2010.00747.pdf
Climbing towards NLU: https://arxiv.org/pdf/2010.00747.pdf
Weak Supervision: A New Programming Paradigm for Machine Learning: http://ai.stanford.edu/blog/weak-supervision/

Chapters
0:00 Introduction
1:16 Idea of Vision-Language Models
2:40 Overview of Vokenization
3:38 Voken Examples
4:45 Weak Supervision
6:00 Image Retrieval for Supervision
7:47 What is Grounded Language?
8:25 Issues with Existing Datasets
10:28 Exciting Results for Vision-Language!
13:07 Multi-Modal Learning
14:45 On Meaning, Form, and Understanding
16:04 Information Retrieval in NLP
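
To make the token-image matching and voken classification described above more concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the paper's implementation: the function names `assign_vokens` and `voken_classification_loss` are hypothetical, the embeddings are random toy vectors standing in for learned token and image representations, and the matching step is a plain cosine-similarity nearest-neighbor lookup.

```python
import numpy as np

def assign_vokens(token_emb, image_emb):
    """For each token, return the index of the most similar image (its voken).

    Hypothetical helper: assumes tokens and images already live in a shared
    embedding space, which in practice would have to be learned.
    """
    # Normalize rows so the dot product equals cosine similarity.
    t = token_emb / np.linalg.norm(token_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    return np.argmax(t @ v.T, axis=1)

def voken_classification_loss(logits, voken_ids):
    """Mean cross-entropy of per-token logits over images against voken ids."""
    # Subtract the row maximum for a numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(voken_ids)), voken_ids].mean())

rng = np.random.default_rng(0)
image_emb = rng.normal(size=(5, 8))  # 5 candidate images, 8-dim toy embeddings
# 3 toy tokens placed very close to images 2, 0 and 4 (an assumption for the demo).
token_emb = image_emb[[2, 0, 4]] + 0.01 * rng.normal(size=(3, 8))

voken_ids = assign_vokens(token_emb, image_emb)  # weak labels from retrieval
logits = rng.normal(size=(3, 5))                 # stand-in for language model outputs
loss = voken_classification_loss(logits, voken_ids)
print(voken_ids, loss)
```

The voken ids act as weak labels: they come from a retrieval step rather than from human annotation, and the language model is then trained to predict them per token.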





