Image Recognition with LLaVa in Python

NeuralNine · Beginner ·👁️ Computer Vision ·2y ago

Key Takeaways

This video teaches image recognition with LLaVa in Python, utilizing the Ollama platform for local development and deployment. It covers the basics of setting up and using LLaVa for image labeling and recognition tasks.

Full Transcript

what is going on guys welcome back in this video today we're going to learn how to use lava or the large language and vision assistant locally in order to do easy image recognition in Python so let us get right into [Music] ited all right so we're going to learn how to use lava locally on our system to do image recognition in Python today and in order to have lava running on our system we're going to use a tool called ol you can get it by going to ama.com you can just download it here for Mac Linux and windows on Linux it's just a simple curl command and then all you have to do to get a model onto your system is you have to open up the command line and you have to type ol Lama pull and then the name of the model with a specification of the parameter size uh if applicable so you can go to models here you can scroll through the different models that we have here you can already see lava here if you don't find it you can just filter by name and type lava for example then you can click on it and you can see we have 7 billion 13 billion and 34 billion I'm going to go with a 13 billion because I think this one is too large for my system and this one is less capable so I'm going to go with the uh middle here and basically all you have to do is you have to say AMA pull and then lava and in my case now colon 13 billion so this will then um pull the model onto your system and then that's basically all you need to do in order to have lava on your system the rest is now just using the olama package in Python to communicate with that model um and to to basically provide it with some images and some text and then get a response from it so that's all going to be done with the python package for this we say pip 3 install o Lama and then we can go right into the coding now for this video I've prepared four copyright free images image one 2 three and four and these are the images that we're going to provide to Lava and then we're going to ask certain questions about them like how many docks do you see in this image or what do you see here or uh maybe I mean in this case probably we're not going to ask about the programming language because the code is not very readable but uh chances are we're not always going to get the perfect responses this is not like a massively powerful model but it does a quite decent job at recognizing what is in the image and describing it somewhat decently so we're going to start by saying import o Lama and to send a basic request what we're going to do is we're going to just say response equals ol Lama chat and here now we're going to provide first of all the model that we want to use in our case this is going to be Lava 13 billion uh of course if you used 7 billion you have to provide 7 billion here and then we're going to provide a message history we're going to say messages is equal to a list and we're going to say here R is going to be user then the content is going to be the text prompt something like describe this image and then the third part here is going to be images and the images are going to just be um a collection of paths to the images so in my case now I'm just going to provide here uh Point slash image one. JPEG and that is basically my prompt so we're asking the model to describe the image we're passing the image and that's literally all we have to do to get the image recognition going so we can print then response and from the response we want to get the message field or the message key value pair and then we want to get from this the content of the message so that we get the text response of the model so I can run this now and we're hopefully going to see a decent description of this image here it should be something like a field of crop or I don't know uh now we have server disconnected without response all right now there seems to be some issue with me using the 13 billion parameter model and recording at the same time it works when I'm not recording but if I start using it and then start recording the recording doesn't even start it seems to be some issue with the GPU uh capacity so I'm going to just remove the 13 billion here I'm going to just use lava which is the 7 billion model uh of course this needs to be pulled separately so AMA pull Lo Jaa and uh then this should work so let's see if this describes the image accurately it should say something like oh there you go the image shows a Serene rural scene in the foreground there's a field of golden yellow crops that have likely been harvested or are in the process of ripening okay very detailed here uh definitely it recognizes the the core of the image so let's go ahead and try something else let's go and say we want to use image 2 and uh image 2 is basically a laptop so let's see if it gives us some information about the programming language I don't think that's too easy I think this is HTML though because we can see the tags here um the image is a photograph featuring an individual working on a laptop appears to be typing on the screen there's a visual representation of a graph or chart displaying various data points okay that's not true this is now hallucination so it doesn't even recognize that this is code not even just a programming language it doesn't recognize that this is coding at all um but it recognizes that the table has a marble pattern I think that's correct above the table there's a surface uh yeah I don't know if that's true let's maybe try again and see if we get something else if it recognizes that this is coding uh laptop computer working on a laptop displaying financial data chart or dashboard no not not quite let's try a third time oh there you go uh displaying code with color synta syntax highlighting the language um uses the track pad okay this is pretty good so the third attempt was quite accurate all right so let's try now also the third one and then I want to do something uh specific for the fourth one so the third one is just the pool uh depicts an indor pool area rectangular with light blue color okay this works fine now for the fourth image I want to now count the number of docks in here and you can see we have two dogs and two cats so um uh I'm curious to see if this works so let's see if we say uh how many let me just see what the prompt is I prepared here how many dogs are in this image and then we're going to go with image 4 there are three docks in the image okay that's not quite correct there are three dogs in this image now this is now the difference I think between 7 billion and 13 billion because when I tested this before recording the video with 13 billion it almost always recognized that there are two dogs in the image so what I'm going to try to do now maybe it will crash the recording is I'm going to try to just see if I get uh if I can get the 13 billion uh model running if not we're just going to accept it but with the 30 billion parameter model it was able to recognize two docks and not more than two docks most of the time so I'm going to try again probably this is going to crash because yeah because I'm recording but you can try at home if you have enough vram and enough RAM but something that we can do besides that is we can ask it to to give us keywords okay it doesn't work so yeah unfortunately but but it did work with a 13 billion parameter model I was able to get two as the default answer now sometimes it would say one or three but most of the time it would say two um now what we can do however for any picture is we can use this to automate some process like hashtags or keywords of the image so we can say something like provide five keywords uh describing the image separated by commas now the 13 billion parameter model that de quite well let's see if we can do the same thing with the 7 billion parameter model there you go dog cat pet cute animals this is pretty good now let's see what happens if I do the same thing on image three pool swimming pool indoor pool gym recreation great let's do it for image two laptop programming Workman okay these are just four so it failed here let's try again technology computer coding workspace again just four now we get five okay and they're actually quite decent um now maybe we can try to just see how capable the model is so we can say something like come on why can I not there you go what programming language is displayed on the laptop now it's probably going to say python because python is the most default answer for everything uh I tried this a couple of times there you go it says python I think python is just a default answer every time you ask for programming language unless it really knows um not clear enough to confidently identify I mean this is a good response to be honest but yeah most of the time it will say python even though it's pretty clearly HTML I think seems so PHP HTML something like this all right but this is how you can use lava locally on your system uh with AMA and Alo on a server if you deploy it um to easily automate these procedures if you have a bunch of images that you want to annotate you want to add some labels to them you want to add keywords to them you want to use hashtags or something for Instagram you can do that easily with a local model like lava and if you have a more uh powerful Hardware than I have or if you're not recording you can probably also use more complex models and this will lead to better results so that's it for today's video I hope you enjoyed it and hope you learned something if so let me know by hitting a like button and leaving a com comment in the comment section down below and of course don't forget to subscribe to this Channel and hit the notification Bell to not miss a single future video for free other than that thank you much for watching see you in the next video and bye

Original Description

In this video we learn how to easily do image recognition and labeling in Python using LLaVa and Ollama locally. Ollama: https://ollama.com ◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾ 📚 Programming Books & Merch 📚 🐍 The Python Bible Book: https://www.neuralnine.com/books/ 💻 The Algorithm Bible Book: https://www.neuralnine.com/books/ 👕 Programming Merch: https://www.neuralnine.com/shop 💼 Services 💼 💻 Freelancing & Tutoring: https://www.neuralnine.com/services 🌐 Social Media & Contact 🌐 📱 Website: https://www.neuralnine.com/ 📷 Instagram: https://www.instagram.com/neuralnine 🐦 Twitter: https://twitter.com/neuralnine 🤵 LinkedIn: https://www.linkedin.com/company/neuralnine/ 📁 GitHub: https://github.com/NeuralNine 🎙 Discord: https://discord.gg/JU4xr8U3dm
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from NeuralNine · NeuralNine · 0 of 60

← Previous Next →
1 Visualizing Stock Data With Candlestick Charts in Python
Visualizing Stock Data With Candlestick Charts in Python
NeuralNine
2 Python Beginner Tutorial #1 - Installation and First Program
Python Beginner Tutorial #1 - Installation and First Program
NeuralNine
3 Python Beginner Tutorial #2 - Variables and Data Types
Python Beginner Tutorial #2 - Variables and Data Types
NeuralNine
4 Python Beginner Tutorial #3 - Operators and User Input
Python Beginner Tutorial #3 - Operators and User Input
NeuralNine
5 Python Beginner Tutorial #4 - If Statements and Conditions
Python Beginner Tutorial #4 - If Statements and Conditions
NeuralNine
6 Python Beginner Tutorial #5 - Loops
Python Beginner Tutorial #5 - Loops
NeuralNine
7 Python Beginner Tutorial #6 - Sequences and Collections
Python Beginner Tutorial #6 - Sequences and Collections
NeuralNine
8 Python Beginner Tutorial #7 - Functions
Python Beginner Tutorial #7 - Functions
NeuralNine
9 Python Beginner Tutorial #8 - Exception Handling
Python Beginner Tutorial #8 - Exception Handling
NeuralNine
10 Python Beginner Tutorial #9 - File Operations
Python Beginner Tutorial #9 - File Operations
NeuralNine
11 Python Beginner Tutorial #10 - String Functions
Python Beginner Tutorial #10 - String Functions
NeuralNine
12 Python Intermediate Tutorial #1 - Classes and Objects
Python Intermediate Tutorial #1 - Classes and Objects
NeuralNine
13 Python Intermediate Tutorial #2 - Inheritance
Python Intermediate Tutorial #2 - Inheritance
NeuralNine
14 Python Intermediate Tutorial #3 - Multithreading
Python Intermediate Tutorial #3 - Multithreading
NeuralNine
15 Python Intermediate Tutorial #4 - Synchronizing Threads
Python Intermediate Tutorial #4 - Synchronizing Threads
NeuralNine
16 Python Intermediate Tutorial #5 - Events and Daemon Threads
Python Intermediate Tutorial #5 - Events and Daemon Threads
NeuralNine
17 Python Intermediate Tutorial #6 - Queues
Python Intermediate Tutorial #6 - Queues
NeuralNine
18 Python Intermediate Tutorial #7 - Sockets and Network Programming
Python Intermediate Tutorial #7 - Sockets and Network Programming
NeuralNine
19 Python Intermediate Tutorial #8 - Database Programming
Python Intermediate Tutorial #8 - Database Programming
NeuralNine
20 Python Intermediate Tutorial #9 - Recursion
Python Intermediate Tutorial #9 - Recursion
NeuralNine
21 Python Intermediate Tutorial #10 - XML Processing
Python Intermediate Tutorial #10 - XML Processing
NeuralNine
22 Python Intermediate Tutorial #11 - Logging
Python Intermediate Tutorial #11 - Logging
NeuralNine
23 Python Data Science Tutorial #1 - Anaconda and PyCharm Setup
Python Data Science Tutorial #1 - Anaconda and PyCharm Setup
NeuralNine
24 Python Data Science Tutorial #2 - NumPy Arrays
Python Data Science Tutorial #2 - NumPy Arrays
NeuralNine
25 Python Data Science Tutorial #3 - Numpy Functions
Python Data Science Tutorial #3 - Numpy Functions
NeuralNine
26 Python Data Science Tutorial #4 - Plotting Functions With Matplotlib
Python Data Science Tutorial #4 - Plotting Functions With Matplotlib
NeuralNine
27 Python Data Science Tutorial #5 - Subplots and Multiple Windows
Python Data Science Tutorial #5 - Subplots and Multiple Windows
NeuralNine
28 Python Data Science Tutorial #6 - Matplotlib Styling
Python Data Science Tutorial #6 - Matplotlib Styling
NeuralNine
29 Python Data Science Tutorial #7 - Bar Charts with Matplotlib
Python Data Science Tutorial #7 - Bar Charts with Matplotlib
NeuralNine
30 Python Data Science Tutorial #8 - Pie Charts with Matplotlib
Python Data Science Tutorial #8 - Pie Charts with Matplotlib
NeuralNine
31 Python Data Science Tutorial #9 - Plotting Histograms with Matplotlib
Python Data Science Tutorial #9 - Plotting Histograms with Matplotlib
NeuralNine
32 Python Data Science Tutorial #10 - Scatter Plots with Matplotlib
Python Data Science Tutorial #10 - Scatter Plots with Matplotlib
NeuralNine
33 Python Data Science Tutorial #11 - 3D Plotting with Matplotlib
Python Data Science Tutorial #11 - 3D Plotting with Matplotlib
NeuralNine
34 Python Data Science Tutorial #12 - Pandas Series
Python Data Science Tutorial #12 - Pandas Series
NeuralNine
35 Python Data Science Tutorial #13 - Pandas Data Frames
Python Data Science Tutorial #13 - Pandas Data Frames
NeuralNine
36 Python Data Science Tutorial #14 - Pandas Statistics
Python Data Science Tutorial #14 - Pandas Statistics
NeuralNine
37 Python Data Science Tutorial #15 - Pandas Sorting and Functions
Python Data Science Tutorial #15 - Pandas Sorting and Functions
NeuralNine
38 Python Data Science Tutorial #16 - Pandas Merging Data Frames
Python Data Science Tutorial #16 - Pandas Merging Data Frames
NeuralNine
39 Python Data Science Tutorial #17 - Pandas Queries
Python Data Science Tutorial #17 - Pandas Queries
NeuralNine
40 Python Machine Learning Tutorial #1 - What is Machine Learning?
Python Machine Learning Tutorial #1 - What is Machine Learning?
NeuralNine
41 Python Machine Learning Tutorial #2 - Linear Regression
Python Machine Learning Tutorial #2 - Linear Regression
NeuralNine
42 Python Machine Learning Tutorial #3 - K-Nearest Neighbors Classification
Python Machine Learning Tutorial #3 - K-Nearest Neighbors Classification
NeuralNine
43 Python Machine Learning #4 - Support Vector Machines
Python Machine Learning #4 - Support Vector Machines
NeuralNine
44 Python Machine Learning Tutorial #5 - Decision Trees and Random Forest Classification
Python Machine Learning Tutorial #5 - Decision Trees and Random Forest Classification
NeuralNine
45 Python Machine Learning Tutorial #6 - K-Means Clustering
Python Machine Learning Tutorial #6 - K-Means Clustering
NeuralNine
46 Python Machine Learning Tutorial #7 - Neural Networks
Python Machine Learning Tutorial #7 - Neural Networks
NeuralNine
47 Python Machine Learning Tutorial #8 - Handwritten Digit Recognition with Tensorflow
Python Machine Learning Tutorial #8 - Handwritten Digit Recognition with Tensorflow
NeuralNine
48 Generating Poetic Texts with Recurrent Neural Networks in Python
Generating Poetic Texts with Recurrent Neural Networks in Python
NeuralNine
49 Stock Portfolio Visualization with Matplotlib in Python
Stock Portfolio Visualization with Matplotlib in Python
NeuralNine
50 Analyzing Coronavirus with Python (COVID-19)
Analyzing Coronavirus with Python (COVID-19)
NeuralNine
51 Making Text Images Readable Again with Python and OpenCV
Making Text Images Readable Again with Python and OpenCV
NeuralNine
52 Neural Networks Simply Explained (Theory)
Neural Networks Simply Explained (Theory)
NeuralNine
53 Motion Filtering with OpenCV in Python
Motion Filtering with OpenCV in Python
NeuralNine
54 Top 5 Programming Languages To Learn in 2020
Top 5 Programming Languages To Learn in 2020
NeuralNine
55 Simple TCP Chat Room in Python
Simple TCP Chat Room in Python
NeuralNine
56 Image Classification with Neural Networks in Python
Image Classification with Neural Networks in Python
NeuralNine
57 Edge Detection with OpenCV in Python
Edge Detection with OpenCV in Python
NeuralNine
58 S&P 500 Web Scraping with Python
S&P 500 Web Scraping with Python
NeuralNine
59 Simple Sentiment Text Analysis in Python
Simple Sentiment Text Analysis in Python
NeuralNine
60 Introduction - Algorithms & Data Structures #1
Introduction - Algorithms & Data Structures #1
NeuralNine

This video provides a beginner-friendly introduction to image recognition with LLaVa in Python, covering the setup and usage of Ollama for local development. Viewers will learn how to build and deploy image recognition models using LLaVa. The video is suitable for those with basic Python programming knowledge and an interest in computer vision and machine learning.

Key Takeaways
  1. Install required libraries and dependencies
  2. Set up Ollama for local development
  3. Import LLaVa and load image data
  4. Train and deploy image recognition models
  5. Test and evaluate model performance
💡 LLaVa can be used for image recognition tasks, and Ollama provides a convenient platform for local development and deployment.

Related AI Lessons

When the Camera Becomes an Exam Proctor: Building an AI-Powered Exam Monitoring System with…
Learn how to build an AI-powered exam monitoring system using Computer Vision and DeepFace to assist professional certification exams
Medium · Python
When the Camera Becomes an Exam Proctor: Building an AI-Powered Exam Monitoring System with…
Build an AI-powered exam monitoring system using Computer Vision and Deep Learning to enhance professional certification exams
Medium · Deep Learning
When the Camera Becomes an Exam Proctor: Building an AI-Powered Exam Monitoring System with…
Build an AI-powered exam monitoring system using Computer Vision and Deep Learning to enhance exam security and integrity
Medium · Cybersecurity
Your Face Is About to Become Your Phone Number
Indonesia's mandatory facial verification for SIM cards is a massive test for biometric identity verification at scale, with implications for developers in computer vision and biometrics
Dev.to AI
Up next
Marketing management for ugc net| Important topics of marketing management ugc net commerce dec 2023
Bhoomi Learning Centre~Dr. Muskan
Watch →