Hybrid Data Science Teams @SurveyMonkey

MLOps.community · Intermediate · 📐 ML Fundamentals · 6y ago

Key Takeaways

The video discusses how hybrid data science teams are structured and how they collaborate at SurveyMonkey. It covers which responsibilities fall to machine learning engineers and which to data scientists when managing model artifacts, versioning, and retraining, and why the two roles benefit from working in tandem from the beginning of a project.

Full Transcript

JH has a question here about the line between the machine learning engineer and the data scientist. He's wondering: creating and managing model artifacts, using a repository for model versioning, retraining, is this the role of an ML engineer or a data scientist? How do ML engineers and data scientists work together, and when does the ML engineer take over? Or are they continuously working together, because data scientists aren't often keen on software development or versioning?

Yeah, no, it's a great question. I think each organization works differently, so I can talk specifically about SurveyMonkey. A few years ago it used to be like, "Oh yeah, data scientists do all the model development and give us the model artifact," and they'd go, "OK, here you go," and hand it off. That did not work very well for us in the long run. Since then we work together in tandem from the very beginning. That allows our data scientists to pick up on some software development, and it allows our ML engineers to understand the model at a very, very low level.

That makes sure that we can say from the get-go, "Oh, actually, we don't have this data available in our feature store in real time. It only gets populated once a day, so are you OK with that? Is that fine?" Our data scientists know the requirements or limitations up front. Or, "Oh yeah, we can't support a 15-layer neural net, we can only really support 10-layer ones at the moment, so whatever you're looking to do, we can't do that." Because those conversations are happening earlier on, our data scientists know what the restrictions are, we know what the restrictions are, and from there on we can really make sure that we don't have any blockers that we didn't foresee appearing.

As a result, to be specific, data scientists are still probably the leads on the model development, and ML engineers are more so the leads on the production aspects. But there's definitely some crossover, and more and more crossover happens over time. I'd say that over time I've gotten way more adept on the data science side of stuff, and our data scientists have become way more adept on the software engineering side. You never know how things will go, and if anything were to happen to our services, I'm sure they'd be able to chip in and make sure things are going smoothly.
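
The up-front conversations described above (feature freshness, supported model depth) could be captured as a small check that both roles run before model development starts. The following is only a sketch under stated assumptions: `PlatformLimits`, `ModelProposal`, `find_blockers`, and the feature name `responses_per_day` are hypothetical names invented for illustration, not SurveyMonkey's actual tooling. Only the 10-layer limit, the 15-layer request, and the once-a-day refresh come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformLimits:
    """Hypothetical serving constraints published by the ML engineers."""
    max_layers: int               # deepest network the serving stack supports
    feature_refresh_hours: dict   # feature name -> hours between refreshes


@dataclass(frozen=True)
class ModelProposal:
    """Hypothetical summary of what a data scientist plans to build."""
    num_layers: int
    max_feature_age_hours: dict   # feature name -> max staleness the model tolerates


def find_blockers(proposal: ModelProposal, limits: PlatformLimits) -> list:
    """Return human-readable blockers; an empty list means no known conflicts."""
    blockers = []
    if proposal.num_layers > limits.max_layers:
        blockers.append(
            f"model has {proposal.num_layers} layers, "
            f"platform supports {limits.max_layers}"
        )
    for name, max_age in proposal.max_feature_age_hours.items():
        refresh = limits.feature_refresh_hours.get(name)
        if refresh is None:
            blockers.append(f"feature '{name}' is not in the feature store")
        elif refresh > max_age:
            blockers.append(
                f"feature '{name}' refreshes every {refresh}h, "
                f"model needs <= {max_age}h"
            )
    return blockers


# Values from the talk: 10-layer limit, 15-layer request, daily feature refresh.
limits = PlatformLimits(max_layers=10, feature_refresh_hours={"responses_per_day": 24})
proposal = ModelProposal(num_layers=15, max_feature_age_hours={"responses_per_day": 1})
for blocker in find_blockers(proposal, limits):
    print(blocker)
```

The point of the sketch is the timing rather than the mechanism: the same two conflicts would surface at handoff anyway, but checking them at the start lets the data scientist change the design before any training has been done.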

Original Description

MLOps Community Meetup #4

In the 4th online meetup for our MLOps.community we spoke with Shubhi Jain, Machine Learning Engineer and all-around great guy! In this clip he talks about how data science teams are structured and why it's advantageous. This is an excerpt taken from the longer conversation that can be found here: https://youtu.be/oq1g4s2dUHE

Every organization is leveraging machine learning (ML) to provide increasing value to their customers and understand their business. You may have created models too. But how do you scale this process now? In this case study, we looked at how to pinpoint inefficiencies in your ML data flow, how SurveyMonkey tackled this, and how to make your data more usable to accelerate ML model development.

Shubhi Jain is a machine learning engineer at SurveyMonkey, where he develops and implements machine learning systems for its products and teams. Occasionally, he'll create YouTube videos about machine learning in collaboration with Springboard, an e-learning platform. He's always excited to bring his expertise and passion for data and AI systems to the rest of the industry. In his free time, Shubhi likes hiking with his dog and accelerating his hearing loss at live music shows.

This was a virtual fireside chat between Shubhi Jain, Demetrios Brinkmann and the MLOps community. Relevant links can be found below.

Join our MLOps Slack community: https://bit.ly/3aOTwgR
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Shubhi Jain on LinkedIn: https://www.linkedin.com/in/shubhankarjain/

Playlist

Uploads from MLOps.community · MLOps.community · 23 of 60

1 Our 1st MLOps Meetup // Luke Marsden // MLOps Meetup #1
2 Remote Collaboration as a Data Scientist
3 MLOps Manifesto with Luke Marsden from Dotscience
4 MLOps lifecycle description
5 What Does Best in Class AI/ML Governance Look Like in Fin Services? // Charles Radclyffe // MLOps #2
6 Life purpose and too many spreadsheets
7 Explainability, Black boxes and EU white paper on reproducibility
8 Hierarchy of Machine Learning Needs // Phil Winder // MLOps Meetup #3
9 Automatically Retrain Machine Learning Models? Are best practices worth it?
10 Building an MLOps Team? Key ideas to keep in mind
11 Hierarchy of MLOps Needs
12 Bare necessities for getting an ML model into production
13 MLOps and Monitoring
14 How Phil Winder got into Data Science and Software Engineering
15 Provenance and Reproducibility in Machine Learning; what is it and why you need it?
16 Friction Between Data Scientists and Software Engineers
17 MLOps Problems in different size companies
18 ML tooling in large companies
19 ML Platforms - The build vs buy question
20 ML Services Gateway at SurveyMonkey
21 Message buses, Async and sync architecture
22 MLOps #4: Shubhi Jain - Building an ML Platform @SurveyMonkey
23 Hybrid Data Science Teams @SurveyMonkey
24 How do you handle ML version control at SurveyMonkey
25 Doing ML with Personal Information
26 Evolution of the ML feature store @SurveyMonkey
27 Developing a Machine Learning Feature Store
28 Auto retrain ML models is not the question
29 3 key parts to Machine Learning monitoring
30 MLOps Meetup #6: Mid-Scale Production Feature Engineering with Dr. Venkata Pingali
31 MLOps meetup #5 High Stakes ML: Active Failures, Latent Factors with Flavio Clesio
32 MLOps: Airflow Pros and Cons
33 Specific challenges in Machine Learning
34 Current State Of Machine Learning
35 Humans in the Loop are a defining factor in Machine Learning
36 Learning from real life Machine Learning failures
37 Survivorship Bias in machine learning tutorials
38 Swiss Cheese model in Machine Learning
39 Resume driven development in Machine learning & software engineering
40 Who has the highest standards in ML?
41 Venkata Pingali of Scribble Data Thoughts on the Current State of Machine Learning
42 Dependable data and being able to Trust in your Data with Venkata Pengali of Scribble Data
43 Speed, Trust, Evolution and Scale in MLOps
44 More difficult transition for data scientists to become ML engineers
45 How many models in prod til I need a dedicated ML platform?
46 Deeper thinking from data scientists around platform blackholes
47 Checkpointing, metadata, and confidence in your data
48 Adjacent usecases and multistep feature engineering
49 Standardization of Machine Learning tools like in Software Engineering with Venkata Pingali
50 Reproducability flaws in end to end Machine Learning debugging
51 3rd wave of data scientists
52 MLOps meetup #7 Alex Spanos // TrueLayer 's MLOps Pipeline
53 MLOps Meetup #8 Optimizing Your ML Workflow with Kubeflow 1.0
54 Are Kubeflow and Airflow complementary?
55 Why Kubeflow gained so much traction=open community
56 Who decides the dirrection of Kubeflow
57 What do Kubeflow and Arrikto do and how do they work together?
58 Versioning your ML steps with Kubeflow
59 Machine Learning Lifecycles//Perception vs Reality
60 Kubeflow vs SageMaker in Machine Learning

The video teaches the importance of collaboration between machine learning engineers and data scientists in managing model artifacts and versioning, and how this collaboration can help identify and address potential blockers early on. It also highlights the advantages of working together in tandem from the beginning of a project. By watching this video, viewers can learn how to structure and manage hybrid data science teams effectively.

Key Takeaways
  1. Identify the roles and responsibilities of machine learning engineers and data scientists
  2. Determine the requirements for model development and deployment
  3. Set up a repository for model versioning and retraining
  4. Establish a collaboration process between machine learning engineers and data scientists
  5. Monitor and address potential blockers and limitations
💡 Collaboration between machine learning engineers and data scientists from the beginning of a project can help identify and address potential blockers early on, leading to more effective and efficient model development and deployment.
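
Takeaway 3 (a repository for model versioning) can be illustrated with a very small file-based registry. This is a minimal sketch, assuming artifacts are opaque bytes and that a content hash is an acceptable version identifier. The functions `register_model` and `load_model`, the model name `churn`, and the metadata values are all hypothetical. The video does not describe how SurveyMonkey stores its artifacts, and a real team would more likely use a dedicated registry tool.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def register_model(registry_dir: Path, name: str, artifact: bytes, metadata: dict) -> str:
    """Store an artifact under a content-derived version and return that version."""
    # Hashing the bytes means identical artifacts always map to the same version.
    version = hashlib.sha256(artifact).hexdigest()[:12]
    target = registry_dir / name / version
    target.mkdir(parents=True, exist_ok=True)
    (target / "model.bin").write_bytes(artifact)
    (target / "metadata.json").write_text(json.dumps(metadata, sort_keys=True))
    return version


def load_model(registry_dir: Path, name: str, version: str) -> tuple:
    """Return (artifact bytes, metadata dict) for a registered version."""
    target = registry_dir / name / version
    artifact = (target / "model.bin").read_bytes()
    metadata = json.loads((target / "metadata.json").read_text())
    return artifact, metadata


# Hypothetical usage: two retraining runs produce two distinct versions.
with tempfile.TemporaryDirectory() as tmp:
    registry = Path(tmp)
    v1 = register_model(registry, "churn", b"weights-v1", {"trained_on": "2020-04-01"})
    v2 = register_model(registry, "churn", b"weights-v2", {"trained_on": "2020-05-01"})
    artifact, meta = load_model(registry, "churn", v1)
    print(v1 != v2, meta["trained_on"])
```

Because the version is derived from the artifact itself, either role can register a retrained model without coordinating on a version number, and serving code can pin an exact version.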
