Model Tradeoffs and the Future of Computer Vision

Roboflow · Intermediate ·👁️ Computer Vision ·5y ago

Key Takeaways

The video discusses model trade-offs in computer vision, including considerations such as model performance, speed, and size, as well as deployment environments, and explores the future of computer vision with advancements in compute capabilities and frameworks like OpenVINO and TensorRT.

Full Transcript

Hey there, it's Joseph from Roboflow. Today's discussion is going to focus on evaluating different model architectures: why is it that you choose one model over another, and what sort of trade-offs do you have to consider? Do you want to open this up with what you should be thinking about when comparing one machine learning model versus another?

Yeah, so I think there are a few broad categories, and then we'll dive in and start to take apart each one of these categories in isolation. The main things you need to be considering are: how fast is your model going to train, how fast is it going to inference, and how well is your model going to perform? Then, after you've weighed all of those, you need to start thinking about deployment. What's the size of your model in terms of raw storage? And where can you deploy your model? In computer vision that becomes really relevant: do you need real-time object detection, or is some latency okay? Can you do it on the server side? Where can you actually put your model once you're done? Those, I think, are the main components you're weighing as you think about different model architectures.

Yeah, I think that makes a ton of sense. The components there that are interrelated are really model performance, in terms of its speed and its accuracy, and model size, and how that determines, as you're mentioning, the deployment environment. If you have a large model in terms of number of parameters, and therefore in terms of the number of megabytes it takes up, you would generally expect that model to be more accurate, because it has greater granularity and can be fine-tuned to be a bit more performant. But it's also likely slower: inference takes a bit longer and takes more computational power. On that side of things you're probably going to be considering server-side deployment. I don't know of large models that you're going to be able to get away with deploying on device, but maybe I'm wrong. If we're thinking about that as one of the trade-offs, where is the line? Which architectures might fall into one corner versus another? Is this going to change over time? How do you think about that?

Yeah, certainly. It's definitely an exciting time to be in computer vision from that standpoint, in that the compute capabilities of different compute engines, like GPUs and Movidius VPUs and TPUs, all the different processing units you can now be deploying things to, are just vastly increasing. So the size of the models we're able to actually deploy to the edge is ever increasing. And not only on the hardware side but also on the software side: there are frameworks like OpenVINO and NVIDIA's TensorRT which are speeding up computation on these things. So you're actually starting to get even more performant, larger models that you can deploy at faster speeds. That's definitely changing the game and reawakening architectures that previously were kind of monolithic and unable to be wielded. But certainly the dichotomy that Joseph has been elaborating on, the fact that a larger model is going to perform better and infer slower, is just always going to be a brute fact, and state of the art for a lot of these tasks will always be with these extremely large models that are trained on many, many TPUs.

Yeah, EfficientDet comes to mind there, right? That's why EfficientDet released D0 to D7. And of course, if you have the resources of a Google, it's just fundamentally different than elsewhere. I'm going to ask sort of a question: if you think about this and extend it out 10, 20, 30 years, what does vision look like as a result of these considerations, as models become more performant, as size becomes less of an issue, as we can start to do real time with higher quality? Today that looks like the warring frameworks, if you will, of TensorRT and OpenVINO from NVIDIA and Intel respectively. But ignore those details, paper over them, and just imagine that we're 10, 20, 30 years in the future. What do you think the state of vision looks like?

Yeah, I think I'll start by reflecting on the present day. Just in the few years that I've seen these sorts of things evolving, I think it's incredible that today there are GPU resources available on, say, Google Colab. For only ten dollars a month you can be accessing a Tesla V100, and that's just given to you by default now. That's insane, that that kind of compute is out there and available, and I think naturally that's only going to increase as time goes by. So the size of the models we're going to be able to deploy, the speed at which they're going to be able to train, and their performance will just rapidly increase over, say, 20 to 30 years. That means the amount of data required to learn certain tasks, the size of the dataset, will be able to shrink. So you'll be able to bootstrap a lot quicker into new areas and new domains, which I think is exciting. And certain problems that we thought were unreachable before, like self-driving cars for example, will be a reality. We'll be able to achieve those things. I don't know what you think.

Yeah, I think that's entirely accurate. The cost of training is continuing to fall. The amount of data that's collected and stored, what is it, doubles every couple of years. But think about the implications of that, given current circumstances surrounding the rise and inevitability of remote work, not just for office jobs but for remote sensing jobs, or sending drones forward. In ten years we'll probably have our first broad-scale augmented reality use cases. Right now vision is kind of a separate component: okay, I've got to get my phone out and start to do vision stuff. But in a very short period of time it'll just be an ingrained, enhancing part of the way we experience the world, and the way that businesses and products can experience the world. So the real-time understanding of the real world around us, combined with the context that a computer can provide, is in a lot of ways like trying to predict Uber when there first was a backlink in the 90s, right? When you had ARPANET, who would have thought about Uber? It's the same way that we're talking about vision now versus what's capable. It makes for some really, really exciting capabilities, and I'm really excited to be a part of building the future.

Yeah, it's really an amazing time to be part of things.

Awesome. So, from evaluating model trade-offs to the future of vision, I really enjoyed this one. Thanks so much.

Yeah, thanks for being here, guys. Don't forget to like and subscribe below for more videos like this, and we'll see you on the next video.
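The link the speakers draw between parameter count and size in megabytes can be sketched with simple arithmetic. The snippet below is a rough illustration rather than anything shown in the video: it assumes the stored size is dominated by the weights, and the parameter counts are hypothetical placeholders, not figures for any real architecture.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def model_size_mb(num_params: int, dtype: str = "float32") -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes).

    Ignores file-format overhead and any non-weight data, so treat the
    result as a lower bound on the size of a saved model file.
    """
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


# Hypothetical parameter counts, chosen only to illustrate the scaling.
for name, params in [("small model", 4_000_000), ("large model", 50_000_000)]:
    print(f"{name}: {model_size_mb(params):.1f} MB at float32, "
          f"{model_size_mb(params, 'int8'):.1f} MB at int8")
```

Under these assumptions, storing weights at a lower precision shrinks the file proportionally, which is one reason larger models can become candidates for edge deployment.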

Original Description

Choosing a model depends on a number of factors, which are discussed by the fireside in this video. Naturally, these considerations lead one to think about the future of computer vision. Let us know your thoughts below!

The video discusses model trade-offs in computer vision and explores the future of the field with advancements in compute capabilities and frameworks. Viewers can learn to evaluate model trade-offs and understand deployment considerations for computer vision models.

Key Takeaways
  1. Evaluate model performance and size trade-offs
  2. Consider deployment environments for computer vision models
  3. Explore advancements in compute capabilities and frameworks like OpenVINO and TensorRT
  4. Optimize model performance and size for deployment
  5. Choose appropriate model architectures for computer vision tasks
💡 The size and performance of computer vision models are increasingly important considerations for deployment, and advancements in compute capabilities and frameworks are changing the game for model development and deployment.
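
One way to make the takeaways above concrete is to treat model selection as a constrained choice: keep only the candidates that fit the deployment's latency and size budget, then take the most accurate one. The sketch below is an illustration under stated assumptions, not a method from the video. The `Candidate` class, the `pick_model` helper, and every number in `CANDIDATES` are hypothetical; in practice the accuracy and latency values would be measured on your own validation set and target hardware.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    accuracy: float    # e.g. mAP on a validation set; higher is better
    latency_ms: float  # inference time per image on the target hardware
    size_mb: float     # storage footprint of the model


def pick_model(candidates: List[Candidate],
               max_latency_ms: float,
               max_size_mb: float) -> Optional[Candidate]:
    """Return the most accurate candidate that fits both budgets, or None."""
    feasible = [c for c in candidates
                if c.latency_ms <= max_latency_ms and c.size_mb <= max_size_mb]
    return max(feasible, key=lambda c: c.accuracy, default=None)


# Hypothetical measurements, for illustration only.
CANDIDATES = [
    Candidate("tiny", accuracy=0.40, latency_ms=10.0, size_mb=20.0),
    Candidate("medium", accuracy=0.55, latency_ms=40.0, size_mb=90.0),
    Candidate("large", accuracy=0.65, latency_ms=200.0, size_mb=400.0),
]

# A real-time, on-device budget rules out the larger models.
edge_choice = pick_model(CANDIDATES, max_latency_ms=33.0, max_size_mb=50.0)
# A server-side budget that tolerates latency admits the largest model.
server_choice = pick_model(CANDIDATES, max_latency_ms=500.0, max_size_mb=1000.0)
print(edge_choice.name, server_choice.name)
```

The same candidates yield different answers under different budgets, which mirrors the point made in the discussion that the deployment environment, not accuracy alone, drives the choice of architecture.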
