The 4 main tasks in the production ML lifecycle
Key Takeaways
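In this Outerbounds fireside chat clip, Shreya Shankar describes the four main tasks in the production ML lifecycle as practitioners actually perform them: data collection, experimentation, evaluation and deployment, and monitoring and response. These differ from the textbook sequence of collect, train, evaluate, deploy.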
Full Transcript
...in the chat. So I'm interested in what you discovered, at a basic level of pattern recognition: what are the main tasks that people do in the production machine learning lifecycle? What did you find out?

I think a good way to answer this is to start with what the textbook says, because we found that practice is different from what the textbook says. The textbook will tell you that for machine learning, step one is you collect data, step two is you train a model, step three is you evaluate that model on a holdout dataset to make sure there's no overfitting, and step four is you deploy. What we found is that we can still categorize the work into four steps. The data collection part is maybe similar, except that it's more of a loop: every week or so we want to collect new data, and we want to make sure there's some QA on that data. But the last three steps are totally different.

The second step, what I called model training before, is actually experimentation in general, whether that be training new models, trying to source new data, or adding new features. There are a lot of ways you can think about improving a model, and a lot of the participants actually preferred to look into finding new data that gave them signal, or making features fresher instead of the stale features they had before. So that's step two.

Stage three in the process is what we call evaluation and deployment. Evaluation is not a one-and-done thing. What happens is that evaluation is done maybe on a holdout dataset at first, and then the model is deployed to a small fraction of users. When the model shows a little promise there, it's deployed to more and more users as we learn more about what it can do, what it can't do, what failure modes exist, and how to go and patch problems, until we've gotten to the full population. So the key takeaway is that evaluation is not a one-time thing: it is a loop of evaluation and deployment, a multi-stage deployment.

Step four, the last one we found, was this overall monitoring and response stage: when you do have these models in production, what is their live performance? If you see the performance dropping, what are the bugs, where are the bugs, and how do we respond to them quickly, whether that be actually doing root cause analysis or simply retraining the model? There is a stage around making sure that there's little downtime for these services. We have those four stages shown in the first figure in the paper, and it was interesting to several of us authors that they don't match the textbook. I think that's kind of a narrative that we want people to take away.

Absolutely. And let me ask about these different steps. I mean, there's an iterative loop happening there, but are they coupled in some way? Because you could imagine monitoring and validation aren't always separable, right?

Totally. I'll say that monitoring is done across all of the stages: people monitor their training jobs, people monitor data collection. A lot of the time there are human-in-the-loop processes to collect and label data, to verify some quality, and, whenever you see a failure on the ground, to go and collect examples that look like that failure so you can go back and augment your validation sets. In that sense there's monitoring all over the place. But the one stage that we found was super iterative in itself was evaluation and deployment. Evaluation datasets never stay the same. They're always changing, they're always growing, especially in tasks or domains where failures have a high cost. Autonomous vehicles are a great example: a failure has a really high cost, so when we observe one, we need to make sure that we have no more failures like that. So how do we invest effort into our holdout datasets so that whenever you evaluate new models in the future, they're also robust to the problem?

Yeah, and I suppose drilling down even a bit more into data collection, and I think this is one thing that you're getting at: in textbook data validation you're given a holdout set, right? As opposed to actively getting that data, validating it, making sure it's the right data, and then using it as a validation set. There are feedback delays there, all of these things.

Totally, totally. I feel like I was never taught this in a machine learning class. Nobody taught me how to actually evaluate a model. It's not like I need a big checklist. I just want to know: if I were to be a machine learning engineer for a month, not just one time, what do I do about my validation set? Do I keep it the same? Do I grow it? When do I add to it? What do I add to it? I think these are in themselves super interesting problems for people to consider.
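As a rough illustration of the evaluation-and-deployment loop described in the transcript (offline check on a holdout set, then exposure to a growing fraction of users, with observed failures fed back into the validation set), here is a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the talk or the paper: the names `staged_rollout` and `RolloutResult`, the traffic fractions, the accuracy threshold, and the toy data. Live traffic is simulated by a list of already-labeled examples, which ignores the feedback delays mentioned in the conversation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, int]      # (feature, label): a toy stand-in for real data
Model = Callable[[float], int]   # hypothetical model interface


@dataclass
class RolloutResult:
    reached_fraction: float        # largest traffic fraction that passed
    validation_set: List[Example]  # holdout set, possibly grown with failures
    log: List[str] = field(default_factory=list)


def accuracy(model: Model, examples: Sequence[Example]) -> float:
    """Fraction of examples the model labels correctly (1.0 if empty)."""
    if not examples:
        return 1.0
    return sum(1 for x, y in examples if model(x) == y) / len(examples)


def staged_rollout(
    model: Model,
    validation_set: Sequence[Example],
    live_traffic: Sequence[Example],
    fractions: Sequence[float] = (0.01, 0.1, 0.5, 1.0),  # assumed stages
    min_accuracy: float = 0.9,                           # assumed threshold
) -> RolloutResult:
    validation = list(validation_set)
    result = RolloutResult(reached_fraction=0.0, validation_set=validation)

    # Stage 0: offline evaluation on the current holdout set.
    if accuracy(model, validation) < min_accuracy:
        result.log.append("failed offline evaluation")
        return result

    # Later stages: expose the model to more and more (simulated) users.
    for fraction in fractions:
        n = max(1, int(len(live_traffic) * fraction))
        sample = live_traffic[:n]

        # Failures seen in production are added to the validation set,
        # so future models are also evaluated against them.
        for example in sample:
            x, y = example
            if model(x) != y and example not in validation:
                validation.append(example)

        live_accuracy = accuracy(model, sample)
        if live_accuracy < min_accuracy:
            result.log.append(f"halted at {fraction:.0%} of traffic")
            return result
        result.reached_fraction = fraction
        result.log.append(f"passed at {fraction:.0%} of traffic")

    return result


if __name__ == "__main__":
    threshold_model = lambda x: int(x > 0.5)
    holdout = [(0.1, 0), (0.9, 1)]
    traffic = [(0.2, 0), (0.8, 1), (0.3, 0), (0.7, 1), (0.6, 0),
               (0.9, 1), (0.1, 0), (0.95, 1), (0.4, 0), (0.85, 1)]
    outcome = staged_rollout(threshold_model, holdout, traffic,
                             fractions=(0.2, 0.5, 1.0), min_accuracy=0.75)
    print(outcome.reached_fraction, len(outcome.validation_set), outcome.log)
```

The sketch keeps the validation set as a mutable list that only grows, which mirrors the point that evaluation datasets "never stay the same"; a real system would also need deduplication, labeling, and a policy for when to retire old examples, none of which is specified in the conversation.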
Original Description
A clip from our fireside chat "Operationalizing ML -- Patterns and Pain Points from MLOps Practitioners" with Shreya Shankar. You can find the full conversation here: https://youtu.be/7zB6ESFto_U
Find out more about how we think about MLOps, OSS, and human-centric data science tools here: https://outerbounds.com/
Playlist
Playlist UU5h8Ji6Lm1RyAZopnCpDq7Q · Outerbounds · 20 of 60
Metaflow GUI for monitoring machine learning workflows
Outerbounds
Metaflow Cards [no sound]
Outerbounds
Fireside chat #1: How to Produce Sustainable Business Value with Machine Learning
Outerbounds
Fireside chat #2: MadeWithML.com -- Teaching Practical Machine Learning
Outerbounds
Metaflow on Kubernetes and Argo Workflows [no sound]
Outerbounds
Fireside chat #3: Reasonable Scale Machine Learning -- You're not Google and it's totally OK
Outerbounds
Metaflow Tags: Programmatic Tagging
Outerbounds
Metaflow Tags: Basic Tagging
Outerbounds
Metaflow Tags: Tags in CI/CD
Outerbounds
Metaflow Tags: Tags and Namespaces
Outerbounds
Metaflow Tags: Tags and Continuous Training
Outerbounds
Fireside chat #4: Machine Learning and User Experience -- Building ML Products for People
Outerbounds
Fireside Chat #5: Machine Learning + Infrastructure for Humans
Outerbounds
Metaflow Sandbox Demo: Free Data Science Infrastructure In the Browser
Outerbounds
Metaflow on Azure
Outerbounds
Fireside Chat #6: Operationalizing ML -- Patterns and Pain Points from MLOps Practitioners
Outerbounds
ML engineering vs traditional software engineering: similarities and differences
Outerbounds
Why data scientists love and hate notebooks: velocity and validation
Outerbounds
What even is a 10x ML engineer?
Outerbounds
The 4 main tasks in the production ML lifecycle
Outerbounds
Is the premise of data-centric AI flawed?
Outerbounds
The 3 factors that Determine the success of ML projects
Outerbounds
Fireside Chat #7: How to Build an Enterprise Machine Learning Platform from Scratch
Outerbounds
Run Metaflow on any cloud: Google Cloud, Azure, or AWS [no sound]
Outerbounds
Metaflow on GCP
Outerbounds
Fireside Chat #8: Navigating the Full Stack of Machine Learning
Outerbounds
How to Build a Full-Stack Recommender System
Outerbounds
Modernize your Airflow deployments with Metaflow - zero-cost migration [no sound]
Outerbounds
Easy Airflow DAGs for ML and data science with Metaflow [no sound]
Outerbounds
Fireside chat #9: Language Processing: From Prototype to Production
Outerbounds
How to build end-to-end recommender systems at reasonable scale
Outerbounds
Full-Stack Machine Learning with Metaflow on CoRise
Outerbounds
Natural Language Processing meets MLOps
Outerbounds
Fireside Chat #10: Large Language Models: Beyond Proofs of Concept
Outerbounds
What even are Large Language Models?
Outerbounds
How to get started with LLMs today
Outerbounds
LLMs in production
Outerbounds
Accessing secrets securely in Metaflow [no audio]
Outerbounds
Fireside Chat #11: The Open-Source Modern Data Stack
Outerbounds
Fireside chat #12: Kubernetes for Data Scientists
Outerbounds
Behind the Screen: How Amazon Prime Video ships RecSys models 4x faster
Outerbounds
Fireside chat #13: Supply Chain Security in Machine Learning
Outerbounds
Quick Delivery, Quicker ML: DeliveryHero's Metaflow Story
Outerbounds
Crafting General Intelligence: LLM Fine-tuning with Metaflow at Adept.ai
Outerbounds
Fuelling Decisions: How DTN Powers Gas Pricing and Data Science Collaboration
Outerbounds
From Kitchen to Doorstep: Optimizing Data Science Velocity at Deliveroo
Outerbounds
Building a GenAI Ready ML Platform with Metaflow at Autodesk
Outerbounds
Media Transcoding for 10 Million users and beyond with Metaflow at Epignosis
Outerbounds
Telematics with Metaflow: How Nirvana Insurance built a large-scale Risk Estimation platform
Outerbounds
Fireside chat #14: Generative AI and Machine Learning for Film, TV, and Gaming
Outerbounds
The Past, Present, and Future of Generative AI
Outerbounds
Building Production Systems with Generative AI, Machine Learning, and Data
Outerbounds
A Custom Fine-Tuned LLM in Action (LLMs, RAG, and Fine-Tuning: An Interactive Guided Tour Part 5)
Outerbounds
Building Live Production Systems with RAG (LLMs & RAG: An Interactive Guided Tour Part 4)
Outerbounds
Better Relevancy with RAG (LLMs, RAG, and Fine-Tuning: An Interactive Guided Tour Part 3)
Outerbounds
Working with OSS LLMs (LLMs, RAG, and Fine-Tuning: An Interactive Guided Tour Part 2)
Outerbounds
Hitting OpenAI and Other Vendor APIs (LLMs, RAG, and Fine-Tuning: An Interactive Guided Tour Part 1)
Outerbounds
Production Systems with Generative AI (LLMs, RAG, & Fine-Tuning: An Interactive Guided Tour Part 0)
Outerbounds
LLMs in Practice: A Guide to Recent Trends and Techniques
Outerbounds
Metaflow for distributed high-performance computing and large-scale AI training
Outerbounds