Reproducibility flaws in end-to-end Machine Learning debugging

MLOps.community · Beginner ·🎯 Management & AI-Era Leadership ·6y ago

Skills: ML Pipelines 90% · ML Maths Basics 60%

Key Takeaways

Venkata Pingali discusses the importance of reproducibility in end-to-end Machine Learning debugging, highlighting the need for consistency and alignment across all pieces of the puzzle, from data collection to modeling.

Full Transcript

You cannot have one piece of the puzzle being reproducible while the other pieces are not. That doesn't work. The rest of the pieces, whether it is the data collection piece or the modeling piece, have to harmonize and align to be consistent and achieve the overall end-to-end objectives. This is still a somewhat new area. Even though we don't ask for it, customers have encouraged us to be a lot more prescriptive, to tell their teams to organize their work a certain way, simply because we have the opportunity to see many companies and many groups.

Yeah, and it makes total sense. You don't want one thing to be very shiny and nice while the rest is held together by string, it's not working, and you have problems with it.

So let's see how that would work, consistent with what Flavio was saying. Although this is high-risk stuff, let me give you a concrete example to help understand the situation. There's a customer of mine that takes inventory bets on its products. They have to decide whether to keep nine units or 50 units of a certain product, and for that they have to do forecasting, understand the demand, and so on. Recently they started noticing some unusual behavior in the models. When they started investigating, the first thing they went to was the modeling code and the notebook, and from there they started tracing it back all the way. It turned out that the Java application code, which was the source of a lot of this data, had made some implicit decisions about how to handle products from one geography versus products from a different geography.

But in this end-to-end debugging process, they struggled to work out which model had actually been put into production, the precise code, because the code itself was moving very fast. Once they got to the Scribble platform, whose work ends at dataset generation, we could go back and say exactly what the sources of the data were, because as a matter of routine we keep track of the metadata, we have lineage search, all of those kinds of things. It was clearly apparent that you cannot have this end-to-end debugging ability with black holes in the middle. In this case they had locked the dataset that they had used for the modeling, so from there we were able to chase it down to the Java source and then fix it.

This happens quite frequently. So I believe that reproducibility and explainability will be driven not so much by the asks of third parties like regulatory authorities, but by the need to understand and debug the data science systems that you have built yourself. If you don't understand what bets it is taking, when, and why, you won't be able to manage the risk.

Wow, that's really thought-provoking. The change will come from the internal side as opposed to regulations.
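
The trace-back described above, from the model to the dataset used for modeling and on to the Java source, can be sketched as a walk over recorded upstream dependencies. This is a minimal illustration, not the Scribble platform's implementation; the `LINEAGE` structure and every artifact name in it are hypothetical.

```python
from collections import deque

# Hypothetical lineage records: each artifact maps to the upstream
# artifacts it was built from. All names are illustrative only.
LINEAGE = {
    "demand_model": ["training_dataset"],
    "training_dataset": ["orders_table", "product_catalog"],
    "orders_table": ["java_order_service"],
    "product_catalog": [],
    "java_order_service": [],
}


def trace_upstream(artifact, lineage):
    """Return every upstream artifact, nearest first (breadth-first walk)."""
    seen = []
    queue = deque(lineage.get(artifact, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited; also guards against cycles
        seen.append(node)
        queue.extend(lineage.get(node, []))
    return seen


def find_gaps(artifact, lineage):
    """Upstream artifacts with no lineage record: the 'black holes'."""
    return [a for a in trace_upstream(artifact, lineage) if a not in lineage]


if __name__ == "__main__":
    for step in trace_upstream("demand_model", LINEAGE):
        print(step)
```

If any stage fails to record its inputs, `find_gaps` reports it, which is the point at which end-to-end debugging would otherwise stall.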

Original Description

What is the current state of Machine Learning? In our 6th meetup, we spoke with Venkata Pingali, the CEO of Scribble Data. In this video he talks to us about his feelings about the current state of the Machine Learning ecosystem. This is taken from a longer conversation that can be found here: https://www.youtube.com/watch?v=1CcYuVVwOGg

Scribble Data helps build and operate production feature engineering platforms for sub-Fortune 1000 firms. The output of the platforms is consumed by data science and analytical teams. In this talk we discuss how we understand the problem space, the architecture of the platform that we built for preparing trusted model-ready datasets that are reproducible, auditable, and quality checked, and the lessons learned in the process. We touch upon topics like classes of consumers, disciplined data transformation code, metadata and lineage, state management, and namespaces. This system and discussion complement work done on data science platforms such as Domino and Dotscience.

Dr. Venkata Pingali is Co-Founder and CEO of Scribble Data, an ML Engineering company with offices in India and Canada. Scribble's flagship enterprise product, Enrich, enables organizations to address 10x analytics/data science use cases through trusted production datasets. Before starting Scribble Data, Dr. Pingali was VP of Analytics at a data consulting firm and CEO of an energy analytics firm. He has a BTech from IIT Mumbai and a PhD from USC in Computer Science.

This was a virtual fireside chat between Venkata Pingali, Demetrios Brinkmann and the MLOps community. Relevant links can be found below.

Join our MLOps Slack community: https://bit.ly/3aOTwgR and register for the next meetup here.
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Venkata on LinkedIn: https://www.linkedin.com/in/pingali/



Venkata Pingali emphasizes the need for reproducibility in end-to-end Machine Learning debugging, highlighting the importance of consistency and alignment across all pieces of the puzzle. He shares an example of a customer who struggled to debug their ML model due to inconsistencies in their data pipeline.

Key Takeaways

Identify the need for reproducibility in ML debugging
Align all pieces of the puzzle, from data collection to modeling
Use a prescriptive approach to organize ML workflows
Keep track of metadata and data sources
Use lineage search to trace model issues back to their data sources
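
As a rough sketch of what keeping track of metadata and data sources could look like in practice, the snippet below records a content hash of a dataset alongside its sources and the version of the code that produced it, so that a later debugging session can confirm it is looking at the same data. The record fields, function names, and sample values are assumptions made for illustration; they are not taken from the talk or from any specific platform.

```python
import hashlib
import json


def fingerprint(rows):
    """Stable SHA-256 over rows, each serialized as key-sorted JSON."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(json.dumps(row, sort_keys=True).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def make_record(name, rows, sources, code_version):
    """Build a metadata record for a dataset (hypothetical schema)."""
    return {
        "dataset": name,
        "sha256": fingerprint(rows),
        "row_count": len(rows),
        "sources": sorted(sources),
        "code_version": code_version,
    }


def verify(record, rows):
    """True if rows still match the hash stored in the record."""
    return record["sha256"] == fingerprint(rows)


if __name__ == "__main__":
    rows = [{"sku": "A", "units": 9}, {"sku": "B", "units": 50}]
    record = make_record("inventory", rows, ["orders_db"], "abc123")
    print(json.dumps(record, indent=2))
    print(verify(record, rows))
```

Storing the record next to the trained model gives a fixed reference point: if the hash no longer matches, the data changed somewhere upstream of the modeling step.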

💡 Reproducibility and explainability will be driven by the need to understand and debug internal data science systems, rather than just regulatory requirements.
