Look At Your ****ing Data // Kenny Daniel // MLOps Podcast #292
Look At Your ****ing Data // MLOps Podcast 292 with Kenny Daniel, Founder of Hyperparam.
Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter
// Abstract
In this episode, we talk with Kenny Daniel, founder of Hyperparam, to explore why actually looking at your data is the highest-leverage move you can make for building state-of-the-art models. It used to be that the first step of data science was to get familiar with your data. However, as modern LLM datasets have gotten larger, dataset exploration tools have not kept up. Kenny makes the case that user interfaces have been under-appreciated in the Python-centric world of AI, and that new tools are needed to enable advances in machine learning. Our conversation also dives into new methods of using LLMs themselves to help data engineers actually look at their data.
// Bio
Kenny has been working in AI for over 20 years, first in academia as an ML Ph.D. student at USC (before it was cool). Kenny then co-founded Algorithmia to solve the problem of hosting and distribution of ML models running on GPUs (also before it was cool). Algorithmia was an early pioneer of the MLOps space and was acquired by DataRobot in 2021. Kenny is currently the founder and CEO of Hyperparam, building new tools to make AI dataset curation orders of magnitude more efficient.
// Related Links
Website: https://hyperparam.app
Working with the Apache Parquet file format blog: https://blog.getdaft.io/p/working-with-the-apache-parquet-file
~~~~~~~~ Connect With Us ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community [https://go.mlops.community/slack]
Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
Sign up for the next meetup: [https://go.mlops.community/register]
MLOps Swag/Merch: [https://shop.mlops.community/]
Connect w