Model Tradeoffs and the Future of Computer Vision
Key Takeaways
The video discusses model trade-offs in computer vision, including considerations such as model performance, speed, and size, as well as deployment environments, and explores the future of computer vision with advancements in compute capabilities and frameworks like OpenVINO and TensorRT.
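As a rough illustration of the trade-off summarized above, here is a minimal sketch relating parameter count to raw storage size and checking a model against a deployment budget. The `ModelProfile` class, the `fits` helper, the budget thresholds, and every number below are hypothetical and chosen only for illustration; none of them come from the video.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    """Hypothetical summary of a model for deployment planning."""
    name: str
    num_params: int           # parameter count
    latency_ms: float         # time per inference on the target device
    bytes_per_param: int = 4  # 4 for float32 weights, 2 for float16, 1 for int8

    @property
    def size_mb(self) -> float:
        # Raw weight storage only; ignores file-format overhead.
        return self.num_params * self.bytes_per_param / 1e6

    @property
    def fps(self) -> float:
        # Frames per second implied by the per-inference latency.
        return 1000.0 / self.latency_ms


def fits(profile: ModelProfile, max_size_mb: float, min_fps: float) -> bool:
    """Return True if the model meets both the storage and speed budget."""
    return profile.size_mb <= max_size_mb and profile.fps >= min_fps


# Made-up numbers for illustration, not measurements of any real model.
small = ModelProfile("small-detector", num_params=4_000_000, latency_ms=20.0)
large = ModelProfile("large-detector", num_params=50_000_000, latency_ms=200.0)

for m in (small, large):
    # Assumed edge budget: at most 50 MB of weights and at least 30 FPS.
    target = "edge device" if fits(m, max_size_mb=50, min_fps=30) else "server side"
    print(f"{m.name}: {m.size_mb:.0f} MB, {m.fps:.0f} FPS -> {target}")
```

In practice the latency figure would be measured on the actual target hardware, and accuracy would be weighed alongside size and speed; this sketch covers only the storage and speed side of the decision.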
Full Transcript
Hey there, it's Joseph from Roboflow. In today's discussion, we're going to focus on evaluating different model architectures: why is it you choose one model over another, and what sort of trade-offs do you have to consider? Do you want to open this up with what it is you should be thinking about when comparing one machine learning model versus another?
Yeah, so I think there are just a few broad categories, and then we'll dive in and start to take apart each one of these categories in isolation. The main things you need to be considering are: how fast is your model going to train, how fast is it going to run inference, and how well is your model going to perform? Then, after you've weighed all of those, you need to start thinking about deployment. What's the size of your model in terms of just raw storage? And where can you deploy your model? In computer vision that becomes really relevant: do you need real-time object detection, or is some latency okay? Can you do it on the server side? Where can you actually put your model once you're done? Those, I think, are the main components you're weighing as you think about different model architectures.
Yeah, I think that makes a ton of sense. The components there that are interrelated are really model performance, in terms of its speed and its accuracy, and model size, and how that determines, as you're mentioning, the deployment environment. So if you have a large model, in terms of number of parameters and therefore in terms of the number of megabytes it takes up, you would generally expect that model to be more accurate, because it has greater granularity and can be fine-tuned to be a bit more performant. But it's also likely slower: inference takes a bit longer and it takes more computational power. On that side of things, you're probably going to be considering server-side deployment. I don't know of large models that you're going to be able to get away with deploying on device, but maybe I'm wrong. If we're thinking about that as one of the trade-offs, where is the line? Which architectures might fall into one corner versus another? Is this going to change over time? How do you think about that?
Yeah, certainly. It's definitely an exciting time to be in computer vision from that standpoint, in that the compute capabilities of different compute engines, like GPUs and Movidius VPUs and TPUs and all the different "PUs" you can now deploy to, are vastly increasing. So the size of the models we're able to actually deploy to the edge is ever increasing. And that's not only on the hardware side but also on the software side: there are frameworks like OpenVINO and NVIDIA's TensorRT which are speeding up computation on these devices. So you're actually starting to get even more performant, larger models that you can deploy at faster speeds. That's definitely changing the game and reawakening architectures that previously were kind of monolithic and couldn't be wielded. But certainly the dichotomy that Joseph has been elaborating on, the fact that a larger model is going to perform better and infer slower, is just always going to be a brute fact. And state of the art for a lot of these tasks will always be with these extremely large models that are trained on many, many TPUs.
Yeah, EfficientDet comes to mind there, right? That's why EfficientDet was released as D0 through D7. And of course, if you have the resources of a Google, it's just fundamentally different than elsewhere. I'm going to ask sort of a question: if you think about this and extend it out 10, 20, 30 years, what does vision look like as a result of these considerations? As models become more performant, as size becomes less of an issue, as we can start to do real time with higher quality... I mean, today that looks like the warring frameworks, if you will, of TensorRT and OpenVINO, from NVIDIA and Intel respectively. But ignore those details, paper over those, and just imagine that we're 10, 20, 30 years in the future. What do you think the state of vision looks like?
Yeah, so I think I'll just start by reflecting on the present day. Just in the few years that I've seen these sorts of things evolving, I think it's incredible that today there are GPU resources available on, say, Google Colab, where for only ten dollars a month you can be accessing a Tesla V100, and that's just given to you by default now. That's insane, that that kind of compute is out there and available, and I think naturally that's only going to increase as time goes by. So the size of the models that we're going to be able to deploy, the speed at which they're going to be able to train, and their performance will just rapidly increase over, say, 20 to 30 years. That means that the amount of data required to learn certain tasks, the size of the dataset, will be able to shrink, so you'll be able to bootstrap a lot quicker into new areas and new domains, which I think is exciting. And certain problems that we thought were unreachable before, like self-driving cars, will become a reality; we'll be able to achieve those things. I don't know what you think.
Yeah, I think that's entirely accurate. The cost of training is continuing to fall, and the amount of data that's collected and stored, what is it, doubles every couple of years. But the implications of that, given current circumstances surrounding the rise and inevitability of remote work, not just for office jobs but for remote sensing jobs, or sending drones forward... I mean, in ten years we'll probably have our first broad-scale augmented reality use cases. If you think about it, right now vision is kind of a separate component: it's like, okay, I've got to get my phone out and start to do vision stuff. But in a very short period of time it'll just be an ingrained, enhancing part of the way that we experience the world, and the way that businesses and products can experience the world. So the real-time understanding of the real world around us, together with the context that a computer can provide, is going to be, in a lot of ways, like trying to predict Uber when there first was, you know, a backlink in the '90s, or when you had ARPANET. Who would have thought about Uber? It's the same way with how we're talking about vision now versus what will be possible. It makes for some really, really exciting capabilities, and I'm really excited to basically be a part of building the future.
Yeah, it's really an amazing time to be part of things.
Awesome. So, from evaluating model trade-offs to the future of vision, I really enjoyed this one. Thanks so much.
Yeah, thanks for being here, guys. Don't forget to like and subscribe below for more videos like this, and we'll see you on the next video.
Original Description
Choosing a model depends on a number of factors, which are discussed by the fireside in this video. Naturally, these considerations lead one to think about the future of computer vision.
Let us know your thoughts below!
Playlist
Uploads from Roboflow · Roboflow · 42 of 60