Llama 3.2: Best Multimodal Model Yet? (Vision Test)
Llama 3.2 Vision Model is one of the top-performing models you can run locally on your computer for free! In this video, we put Llama 3.2's multimodal capabilities to the test. 🧑💻🌐 From image recognition, CAPTCHA solving, QR code scanning, to text extraction, we examine how well the 90-billion parameter and 11-billion parameter models perform.
🔍 What's in this video?
Testing image analysis and modification suggestions 🛋️
Attempting CAPTCHA and QR code recognition 📸
Finding Wally and identifying people in an image 🤔
Extracting tables and generating HTML/CSS code for designs 💻
🚀 The results show some impressive strengths but also reveal a few limitations! Curious about Llama 3.2's capabilities? Watch the full video to find out.
🔗 Links:
Patreon: https://patreon.com/MervinPraison
Ko-fi: https://ko-fi.com/mervinpraison
Discord: https://discord.gg/nNZu5gGT59
Twitter / X : https://twitter.com/mervinpraison
GPU for 50% of it's cost: https://bit.ly/mervin-praison Coupon: MervinPraison (A6000, A5000)
0:00 - Introduction to Llama 3.2 Vision model
0:18 - Test setup using Together.ai
0:40 - Image analysis and simplification suggestions
1:26 - CAPTCHA text extraction
1:53 - Traffic light CAPTCHA test (failed)
2:21 - QR code URL extraction attempt (failed)
2:35 - "Where's Waldo?" test (failed)
3:01 - Person identification test (failed)
3:18 - Table extraction from image
3:54 - HTML/CSS code generation from image
4:45 - Generated code output review
5:11 - Overall performance summary
5:23 - Conclusion
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from Mervin Praison · Mervin Praison · 0 of 60
← Previous
Next →
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Build GCP Infra using Pulumi in YAML format
Mervin Praison
How to Convert a Pulumi YAML File to Python Format
Mervin Praison
Speed Up AWS EKS: A Complete Guide to Performance Tuning & Debugging!
Mervin Praison
Learn GCP GKE to AWS EKS Migration in Just 5 Minutes: Quick Guide
Mervin Praison
AWS & Kubernetes: The Definitive Guide to Data Persistence with PV and PVC
Mervin Praison
ChatGPT Voice Conversation RELEASED! It's AMAZING!! (Demo)
Mervin Praison
How to Install Mistral 7B in Minutes: Quick & Easy Guide! ✅
Mervin Praison
Code Llama Install Locally: 🐍💻 Elevate Your Python Skills!
Mervin Praison
Orca Mini: Your Ultimate Guide to Install and Test on Mac & Linux 💻
Mervin Praison
Quick & Easy Vicuna Setup on Mac and Linux 💻
Mervin Praison
Quick Guide: Llama2 Local Installation and ChatGPT with pip! Python🛠️
Mervin Praison
Query PDFs Like a Pro with Local GPT: Full Setup Guide! 📜
Mervin Praison
LM Studio: EASIEST way to Run Large Language Models Locally!
Mervin Praison
AMAZING ChatGPT Vision is OUT! 🤯 14+ Examples (Step-by-Step) FULL Tutorial
Mervin Praison
Unbelievable! Build ANY App Instantly with Smol AI! 😲🔥
Mervin Praison
Amazing! AutoGen Made Easy: A Step-by-Step Beginners Guide 📚
Mervin Praison
How to Set Up LoLLMS and Run LLMs Locally! 🚀 Step-by-Step Tutorial
Mervin Praison
GPT4All: INSANE Way to Run Large Language Models Locally! 😲 Step-By-Step Tutorial
Mervin Praison
Incredible AI-Powered NPCs in Unity Game Engine: Step by Step Tutorial!🤯
Mervin Praison
MemGPT 🧠 LLM as Operating System. It's INSANE! Step-by-Step Tutorial 🤯
Mervin Praison
Text Generation Web UI: MIND-BLOWING Way to Run LLM Locally! 🤯
Mervin Praison
Unlock the INSANE Power of OpenAI GPT-4 with C#/.NET! 😲
Mervin Praison
Integrate Langchain and Ollama for Local AI Power 🤯 Indeed POWERFUL!
Mervin Praison
ChatDev: INSANE Virtual AI Agents! Future of Software Development 😲
Mervin Praison
Query PDFs Using Mistral: Unlock INSANE Power! 🤯
Mervin Praison
AutoGen + Open-Source LLMs: UNBELIEVABLE! Step-by-Step Tutorial You Can't Miss! 🤯
Mervin Praison
AutoGen + Text Generation WebUI: Unbelievable 100% Local Private Setup 🤯
Mervin Praison
MemGPT: Amazing! External Context for LLM #ai #llm #memgpt #generativeai #mem #gpt #openai #chatgpt
Mervin Praison
GeniA: Kubernetes + AI for MIND-BLOWING Operational Efficiency! 🤯 FULL Tutorial
Mervin Praison
VertexAI Meets LangChain for Mind-Blowing AI Conversations! 😲 Step by Step Tutorial
Mervin Praison
Simplified ChatGPT API Setup on Node.js for Newbies! 😍 Step by Step Tutorial
Mervin Praison
Autogen: Ollama integration 🤯 Step by Step Tutorial. Mind-blowing!
Mervin Praison
LiteLLM: One-Function Call to ANY Large Language Model! 🤯 UNBELIEVABLE!
Mervin Praison
ChatGPT Chatbot in Less Time Than You Think! 🚀😎 Step-by-Step Tutorial
Mervin Praison
LiteLLM Chatbot: Build Your Own in MINUTES! INSANE! 🤖🔥
Mervin Praison
Create Chatbot: Turn ANY Open-Source LLM into a Conversation Pro! 🤖
Mervin Praison
Create Chatbot: Ollama Integration Made UNBELIEVABLY Easy! 🎉
Mervin Praison
LlamaIndex + ChatGPT: Ingest Data and Experience UNBELIEVABLE Query Results! 🌟
Mervin Praison
INSANE! OpenAgents: Automated Data Analysis with Kaggle 🤯
Mervin Praison
React.js LLM Agent for Next-Gen Coding using ChatGPT 🚀 Mind-Blowing 🤯
Mervin Praison
MemGPT + Any LLM 🚀 100% Local & Private Integration Unveiled! Unlimited Memory
Mervin Praison
MemGPT + AutoGen 🧠🤖 Unlimited Memory & Autonomous AI Agents! INSANE🤯
Mervin Praison
AutoGen + Google's Palm LLM & More! Revolutionary AI Integration 🚀
Mervin Praison
MemGPT & LM Studio Integration Revealed! 🔥 Next-Level AI
Mervin Praison
🚀 AutoLLM: Unlock the Power of 100+ Language Models! Step-by-Step Tutorial
Mervin Praison
AutoLLM & Gradio Integration You Won't Believe! 🤯 Mind-Blowing
Mervin Praison
AutoLLM & FastAPI Tutorial: Query 100+ Language Models! 😱
Mervin Praison
Quivr: LLM's Second Brain - Transforming Data Management & Advanced Query with AI! 🤯
Mervin Praison
AutoGen & MemGPT with Local LLM: A Complete Setup Tutorial! 🧠 AMAZING 🤯
Mervin Praison
LocalAI: Free, Open Source OpenAI Alternative 🚀 INSANE 🤯 Step-by-Step Tutorial
Mervin Praison
Yarn Mistral 7B 128k LARGE context window, Small size 🤯 INSANE 🚀 Setup Tutorial!
Mervin Praison
Zephyr-7B: The Small and Mighty LLM 🤯 Step by Step Tutorial! 📘
Mervin Praison
Promptfoo: How to Test Your LLM ? 🚀 VERY EASY!
Mervin Praison
Pydantic: How to Validate LLM Responses? 🚀 Quality Response. VERY EASY!!!!
Mervin Praison
Pydantic: FORCE Your AI to Respond Back in UPPERCASE! 🤯 Step-by-Step Tutorial 🔥
Mervin Praison
Pydantic: How to use LLM to convert unstructured data to structured data?
Mervin Praison
AutoGen Function Calling: INSANE 🚀 Custom Integrations! Step-by-Step Tutorial 🤯
Mervin Praison
OpenAI Assistants API + Python 🤖 How to get started? (FULL Tutorial) 🤯 INSANE
Mervin Praison
GPT-4 Vision API 🤯 INSANE Video Recognition Powers! Step-by-Step Tutorial 🚀
Mervin Praison
GPT-4 Vision API 🚀 The Future of Image Recognition! 🤯 Step-by-Step Tutorial
Mervin Praison
More on: CV Basics
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
Inside SAM 3D: how Meta turns a single image into 3D
Medium · Machine Learning
Inside SAM 3D: how Meta turns a single image into 3D
Medium · Deep Learning
Demystifying CNNs: How Convolutional Filters and Max-Pooling Actually Work
Medium · Data Science
Your "Biometric Age Check" Isn't Verifying Identity — And Defense Lawyers Know It
Dev.to AI
Chapters (13)
Introduction to Llama 3.2 Vision model
0:18
Test setup using Together.ai
0:40
Image analysis and simplification suggestions
1:26
CAPTCHA text extraction
1:53
Traffic light CAPTCHA test (failed)
2:21
QR code URL extraction attempt (failed)
2:35
"Where's Waldo?" test (failed)
3:01
Person identification test (failed)
3:18
Table extraction from image
3:54
HTML/CSS code generation from image
4:45
Generated code output review
5:11
Overall performance summary
5:23
Conclusion
🎓
Tutor Explanation
DeepCamp AI