Llama 3.2: Best Multimodal Model Yet? (Vision Test)

Mervin Praison · Beginner ·👁️ Computer Vision ·1y ago
Llama 3.2 Vision Model is one of the top-performing models you can run locally on your computer for free! In this video, we put Llama 3.2's multimodal capabilities to the test. 🧑‍💻🌐 From image recognition, CAPTCHA solving, QR code scanning, to text extraction, we examine how well the 90-billion parameter and 11-billion parameter models perform. 🔍 What's in this video? Testing image analysis and modification suggestions 🛋️ Attempting CAPTCHA and QR code recognition 📸 Finding Wally and identifying people in an image 🤔 Extracting tables and generating HTML/CSS code for designs 💻 🚀 The results show some impressive strengths but also reveal a few limitations! Curious about Llama 3.2's capabilities? Watch the full video to find out. 🔗 Links: Patreon: https://patreon.com/MervinPraison Ko-fi: https://ko-fi.com/mervinpraison Discord: https://discord.gg/nNZu5gGT59 Twitter / X : https://twitter.com/mervinpraison GPU for 50% of it's cost: https://bit.ly/mervin-praison Coupon: MervinPraison (A6000, A5000) 0:00 - Introduction to Llama 3.2 Vision model 0:18 - Test setup using Together.ai 0:40 - Image analysis and simplification suggestions 1:26 - CAPTCHA text extraction 1:53 - Traffic light CAPTCHA test (failed) 2:21 - QR code URL extraction attempt (failed) 2:35 - "Where's Waldo?" test (failed) 3:01 - Person identification test (failed) 3:18 - Table extraction from image 3:54 - HTML/CSS code generation from image 4:45 - Generated code output review 5:11 - Overall performance summary 5:23 - Conclusion
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Mervin Praison · Mervin Praison · 0 of 60

← Previous Next →
1 Build GCP Infra using Pulumi in YAML format
Build GCP Infra using Pulumi in YAML format
Mervin Praison
2 How to Convert a Pulumi YAML File to Python Format
How to Convert a Pulumi YAML File to Python Format
Mervin Praison
3 Speed Up AWS EKS: A Complete Guide to Performance Tuning & Debugging!
Speed Up AWS EKS: A Complete Guide to Performance Tuning & Debugging!
Mervin Praison
4 Learn GCP GKE to AWS EKS Migration in Just 5 Minutes: Quick Guide
Learn GCP GKE to AWS EKS Migration in Just 5 Minutes: Quick Guide
Mervin Praison
5 AWS & Kubernetes: The Definitive Guide to Data Persistence with PV and PVC
AWS & Kubernetes: The Definitive Guide to Data Persistence with PV and PVC
Mervin Praison
6 ChatGPT Voice Conversation RELEASED! It's AMAZING!! (Demo)
ChatGPT Voice Conversation RELEASED! It's AMAZING!! (Demo)
Mervin Praison
7 How to Install Mistral 7B in Minutes: Quick & Easy Guide! ✅
How to Install Mistral 7B in Minutes: Quick & Easy Guide! ✅
Mervin Praison
8 Code Llama Install Locally: 🐍💻 Elevate Your Python Skills!
Code Llama Install Locally: 🐍💻 Elevate Your Python Skills!
Mervin Praison
9 Orca Mini: Your Ultimate Guide to Install and Test on Mac & Linux 💻
Orca Mini: Your Ultimate Guide to Install and Test on Mac & Linux 💻
Mervin Praison
10 Quick & Easy Vicuna Setup on Mac and Linux 💻
Quick & Easy Vicuna Setup on Mac and Linux 💻
Mervin Praison
11 Quick Guide: Llama2 Local Installation and ChatGPT with pip! Python🛠️
Quick Guide: Llama2 Local Installation and ChatGPT with pip! Python🛠️
Mervin Praison
12 Query PDFs Like a Pro with Local GPT: Full Setup Guide! 📜
Query PDFs Like a Pro with Local GPT: Full Setup Guide! 📜
Mervin Praison
13 LM Studio: EASIEST way to Run Large Language Models Locally!
LM Studio: EASIEST way to Run Large Language Models Locally!
Mervin Praison
14 AMAZING ChatGPT Vision is OUT! 🤯 14+ Examples (Step-by-Step) FULL Tutorial
AMAZING ChatGPT Vision is OUT! 🤯 14+ Examples (Step-by-Step) FULL Tutorial
Mervin Praison
15 Unbelievable! Build ANY App Instantly with Smol AI! 😲🔥
Unbelievable! Build ANY App Instantly with Smol AI! 😲🔥
Mervin Praison
16 Amazing! AutoGen Made Easy: A Step-by-Step Beginners Guide 📚
Amazing! AutoGen Made Easy: A Step-by-Step Beginners Guide 📚
Mervin Praison
17 How to Set Up LoLLMS and Run LLMs Locally! 🚀 Step-by-Step Tutorial
How to Set Up LoLLMS and Run LLMs Locally! 🚀 Step-by-Step Tutorial
Mervin Praison
18 GPT4All: INSANE Way to Run Large Language Models Locally! 😲 Step-By-Step Tutorial
GPT4All: INSANE Way to Run Large Language Models Locally! 😲 Step-By-Step Tutorial
Mervin Praison
19 Incredible AI-Powered NPCs in Unity Game Engine: Step by Step Tutorial!🤯
Incredible AI-Powered NPCs in Unity Game Engine: Step by Step Tutorial!🤯
Mervin Praison
20 MemGPT 🧠 LLM as Operating System. It's INSANE! Step-by-Step Tutorial 🤯
MemGPT 🧠 LLM as Operating System. It's INSANE! Step-by-Step Tutorial 🤯
Mervin Praison
21 Text Generation Web UI: MIND-BLOWING Way to Run LLM Locally! 🤯
Text Generation Web UI: MIND-BLOWING Way to Run LLM Locally! 🤯
Mervin Praison
22 Unlock the INSANE Power of OpenAI GPT-4 with C#/.NET! 😲
Unlock the INSANE Power of OpenAI GPT-4 with C#/.NET! 😲
Mervin Praison
23 Integrate Langchain and Ollama for Local AI Power 🤯 Indeed POWERFUL!
Integrate Langchain and Ollama for Local AI Power 🤯 Indeed POWERFUL!
Mervin Praison
24 ChatDev: INSANE Virtual AI Agents! Future of Software Development 😲
ChatDev: INSANE Virtual AI Agents! Future of Software Development 😲
Mervin Praison
25 Query PDFs Using Mistral: Unlock INSANE Power! 🤯
Query PDFs Using Mistral: Unlock INSANE Power! 🤯
Mervin Praison
26 AutoGen + Open-Source LLMs: UNBELIEVABLE! Step-by-Step Tutorial You Can't Miss! 🤯
AutoGen + Open-Source LLMs: UNBELIEVABLE! Step-by-Step Tutorial You Can't Miss! 🤯
Mervin Praison
27 AutoGen + Text Generation WebUI: Unbelievable 100% Local Private Setup 🤯
AutoGen + Text Generation WebUI: Unbelievable 100% Local Private Setup 🤯
Mervin Praison
28 MemGPT: Amazing! External Context for LLM #ai #llm #memgpt  #generativeai #mem #gpt #openai #chatgpt
MemGPT: Amazing! External Context for LLM #ai #llm #memgpt #generativeai #mem #gpt #openai #chatgpt
Mervin Praison
29 GeniA: Kubernetes + AI for MIND-BLOWING Operational Efficiency! 🤯 FULL Tutorial
GeniA: Kubernetes + AI for MIND-BLOWING Operational Efficiency! 🤯 FULL Tutorial
Mervin Praison
30 VertexAI Meets LangChain for Mind-Blowing AI Conversations! 😲 Step by Step Tutorial
VertexAI Meets LangChain for Mind-Blowing AI Conversations! 😲 Step by Step Tutorial
Mervin Praison
31 Simplified ChatGPT API Setup on Node.js for Newbies! 😍 Step by Step Tutorial
Simplified ChatGPT API Setup on Node.js for Newbies! 😍 Step by Step Tutorial
Mervin Praison
32 Autogen: Ollama integration 🤯 Step by Step Tutorial. Mind-blowing!
Autogen: Ollama integration 🤯 Step by Step Tutorial. Mind-blowing!
Mervin Praison
33 LiteLLM: One-Function Call to ANY Large Language Model! 🤯 UNBELIEVABLE!
LiteLLM: One-Function Call to ANY Large Language Model! 🤯 UNBELIEVABLE!
Mervin Praison
34 ChatGPT Chatbot in Less Time Than You Think! 🚀😎 Step-by-Step Tutorial
ChatGPT Chatbot in Less Time Than You Think! 🚀😎 Step-by-Step Tutorial
Mervin Praison
35 LiteLLM Chatbot: Build Your Own in MINUTES! INSANE! 🤖🔥
LiteLLM Chatbot: Build Your Own in MINUTES! INSANE! 🤖🔥
Mervin Praison
36 Create Chatbot: Turn ANY Open-Source LLM into a Conversation Pro! 🤖
Create Chatbot: Turn ANY Open-Source LLM into a Conversation Pro! 🤖
Mervin Praison
37 Create Chatbot: Ollama Integration Made UNBELIEVABLY Easy! 🎉
Create Chatbot: Ollama Integration Made UNBELIEVABLY Easy! 🎉
Mervin Praison
38 LlamaIndex + ChatGPT: Ingest Data and Experience UNBELIEVABLE Query Results! 🌟
LlamaIndex + ChatGPT: Ingest Data and Experience UNBELIEVABLE Query Results! 🌟
Mervin Praison
39 INSANE! OpenAgents: Automated Data Analysis with Kaggle 🤯
INSANE! OpenAgents: Automated Data Analysis with Kaggle 🤯
Mervin Praison
40 React.js LLM Agent for Next-Gen Coding using ChatGPT 🚀 Mind-Blowing 🤯
React.js LLM Agent for Next-Gen Coding using ChatGPT 🚀 Mind-Blowing 🤯
Mervin Praison
41 MemGPT + Any LLM 🚀 100% Local & Private Integration Unveiled! Unlimited Memory
MemGPT + Any LLM 🚀 100% Local & Private Integration Unveiled! Unlimited Memory
Mervin Praison
42 MemGPT  + AutoGen 🧠🤖 Unlimited Memory & Autonomous AI Agents! INSANE🤯
MemGPT + AutoGen 🧠🤖 Unlimited Memory & Autonomous AI Agents! INSANE🤯
Mervin Praison
43 AutoGen + Google's Palm LLM & More! Revolutionary AI Integration 🚀
AutoGen + Google's Palm LLM & More! Revolutionary AI Integration 🚀
Mervin Praison
44 MemGPT & LM Studio Integration Revealed! 🔥 Next-Level AI
MemGPT & LM Studio Integration Revealed! 🔥 Next-Level AI
Mervin Praison
45 🚀 AutoLLM: Unlock the Power of 100+ Language Models! Step-by-Step Tutorial
🚀 AutoLLM: Unlock the Power of 100+ Language Models! Step-by-Step Tutorial
Mervin Praison
46 AutoLLM & Gradio Integration You Won't Believe! 🤯 Mind-Blowing
AutoLLM & Gradio Integration You Won't Believe! 🤯 Mind-Blowing
Mervin Praison
47 AutoLLM & FastAPI Tutorial: Query 100+ Language Models! 😱
AutoLLM & FastAPI Tutorial: Query 100+ Language Models! 😱
Mervin Praison
48 Quivr: LLM's Second Brain - Transforming Data Management & Advanced Query with AI! 🤯
Quivr: LLM's Second Brain - Transforming Data Management & Advanced Query with AI! 🤯
Mervin Praison
49 AutoGen & MemGPT with Local LLM: A Complete Setup Tutorial! 🧠 AMAZING 🤯
AutoGen & MemGPT with Local LLM: A Complete Setup Tutorial! 🧠 AMAZING 🤯
Mervin Praison
50 LocalAI: Free, Open Source OpenAI Alternative 🚀 INSANE 🤯 Step-by-Step Tutorial
LocalAI: Free, Open Source OpenAI Alternative 🚀 INSANE 🤯 Step-by-Step Tutorial
Mervin Praison
51 Yarn Mistral 7B 128k LARGE context window, Small size 🤯 INSANE 🚀 Setup Tutorial!
Yarn Mistral 7B 128k LARGE context window, Small size 🤯 INSANE 🚀 Setup Tutorial!
Mervin Praison
52 Zephyr-7B: The Small and Mighty LLM 🤯 Step by Step Tutorial! 📘
Zephyr-7B: The Small and Mighty LLM 🤯 Step by Step Tutorial! 📘
Mervin Praison
53 Promptfoo: How to Test Your LLM ? 🚀  VERY EASY!
Promptfoo: How to Test Your LLM ? 🚀 VERY EASY!
Mervin Praison
54 Pydantic: How to Validate LLM Responses? 🚀 Quality Response. VERY EASY!!!!
Pydantic: How to Validate LLM Responses? 🚀 Quality Response. VERY EASY!!!!
Mervin Praison
55 Pydantic: FORCE Your AI to Respond Back in UPPERCASE! 🤯 Step-by-Step Tutorial 🔥
Pydantic: FORCE Your AI to Respond Back in UPPERCASE! 🤯 Step-by-Step Tutorial 🔥
Mervin Praison
56 Pydantic: How to use LLM to convert unstructured data to structured data?
Pydantic: How to use LLM to convert unstructured data to structured data?
Mervin Praison
57 AutoGen Function Calling: INSANE 🚀 Custom Integrations! Step-by-Step Tutorial 🤯
AutoGen Function Calling: INSANE 🚀 Custom Integrations! Step-by-Step Tutorial 🤯
Mervin Praison
58 OpenAI Assistants API + Python 🤖 How to get started? (FULL Tutorial) 🤯 INSANE
OpenAI Assistants API + Python 🤖 How to get started? (FULL Tutorial) 🤯 INSANE
Mervin Praison
59 GPT-4 Vision API 🤯 INSANE Video Recognition Powers! Step-by-Step Tutorial 🚀
GPT-4 Vision API 🤯 INSANE Video Recognition Powers! Step-by-Step Tutorial 🚀
Mervin Praison
60 GPT-4 Vision API 🚀 The Future of Image Recognition! 🤯 Step-by-Step Tutorial
GPT-4 Vision API 🚀 The Future of Image Recognition! 🤯 Step-by-Step Tutorial
Mervin Praison

Related AI Lessons

Inside SAM 3D: how Meta turns a single image into 3D
Learn how Meta's SAM 3D technology turns a single image into 3D, revolutionizing the field of computer vision
Medium · Machine Learning
Inside SAM 3D: how Meta turns a single image into 3D
Learn how Meta's SAM 3D technology generates 3D models from single images, revolutionizing the field of computer vision
Medium · Deep Learning
Demystifying CNNs: How Convolutional Filters and Max-Pooling Actually Work
Learn how Convolutional Neural Networks (CNNs) use convolutional filters and max-pooling to recognize images
Medium · Data Science
Your "Biometric Age Check" Isn't Verifying Identity — And Defense Lawyers Know It
Biometric age checks don't verify identity, a crucial distinction for developers in computer vision and biometrics
Dev.to AI

Chapters (13)

Introduction to Llama 3.2 Vision model
0:18 Test setup using Together.ai
0:40 Image analysis and simplification suggestions
1:26 CAPTCHA text extraction
1:53 Traffic light CAPTCHA test (failed)
2:21 QR code URL extraction attempt (failed)
2:35 "Where's Waldo?" test (failed)
3:01 Person identification test (failed)
3:18 Table extraction from image
3:54 HTML/CSS code generation from image
4:45 Generated code output review
5:11 Overall performance summary
5:23 Conclusion
Up next
Best Mac Mini Alternatives for Running OpenClaw 24/7 in 2026
Tin Rovic
Watch →