Fireside Chat #8: Navigating the Full Stack of Machine Learning
Skills:
ML Maths Basics 90%, Prompt Craft 80%, Supervised Learning 80%, Fine-tuning LLMs 70%, Unsupervised Learning 60%
Key Takeaways
Ethan Rosenthal discusses navigating the full stack of machine learning, covering topics such as recommendation systems, model deployment, and machine learning engineering, with tools like Metaflow and large language models.
Full Transcript
Thank you. All right, welcome everybody, it's Hugo Bowne-Anderson here. I am very excited to be here today with Ethan Rosenthal from Square to talk about navigating the full stack of machine learning. We're going to start in a couple of minutes, and in the meantime, if you could introduce yourself in the chat and let us know what your interest is in machine learning, full-stack data science, all of these things, maybe what industry you work in and where you're at, that would be super interesting for us. We'll get started in a couple of minutes. And yes, we've already got one comment: DP side and Spinner has said, "Please put an extra log on the fire, it's cold here." We'll definitely do so. Things are going to heat up today, DB, so get excited for that.

All right everyone, Hugo Bowne-Anderson here from Outerbounds. Welcome to our final fireside chat for the year, where I will be speaking with Ethan Rosenthal from Square about navigating the full stack of machine learning. It would be great if on YouTube you could let us know where you're calling in from, what your interest in full-stack machine learning is, what type of machine learning you work in, and in what vertical. Such things would be super interesting, and that will allow us to tailor our conversation to the audience as well. We'll get started in a couple of minutes. So we have David Wolf, looking to build a model for personal projects. Great. Theo Howard remembers Hugo's classes from DataCamp. Awesome. And Theo works at the Wisconsin School of Business, helping instructors with online course content. Fantastic.

All right everybody, Hugo Bowne-Anderson here. Boy, am I excited to be chatting with Ethan Rosenthal today, from Square and from life and from New York City. Ethan and I also have diverging yet converging trajectories, both having worked in basic science research. He may say he worked in real science, being
physics, whereas I worked in biology, although I did work in biophysics. And we have a question: will this be recorded? This will be recorded; it's currently being recorded. This will stay in perpetuity on our YouTube channel, or until YouTube gets bought by a megalomaniac billionaire who then attempts to have a different business model. I'm just kidding, of course, that would never happen. So we have GPS signed spin up, working in tech bio monitoring. Very cool. Todd Garner's a student learning ML and AI. Sonam Gupta, Bay Area, California, fantastic, works in ML and natural language processing extensively, and is here to understand the full stack to be successful in finding a job as an ML engineer. Ran from Boston, using a moon stack. Fantastic. Theo's also an artist who loves the controversy around AI and art; that's really hot right now. We have Matthew Hess, a full-stack dev considering focusing on a data science career path. Jarrod, working in ML applied to vision and remote sensing. Very cool. Lots of interesting stuff. It's two past the hour, so let's take one more minute. We've already got 40 people here, which is super exciting, particularly given it's December and the holiday season is upon us, and it has been a pretty wild year on a lot of fronts. So let's take a couple more minutes. If you could introduce yourself in the chat, let us know what your interest is, who you are, and where you're calling in from. I'm dialing in from Sydney, Australia; Ethan is in New York City. We have zolfia dito, who works in healthcare. Fantastic. And we'll just take one more minute. Oh hey, Youjesh just said "Hey Hugo." Hey Youjesh. Youjesh works at a fascinating company called Moneyline, and they're using Metaflow, which we work on of course, to power all of their machine learning work. So hey, Youjesh. And of course, everyone, if any questions come up during the conversation,
put them in the YouTube chat. We'll do our best to get to them. We're often not able to get to all of them, which is why we have an async Ask Me Anything on our community Slack afterwards, and I'll put the link in to that. But yeah, please ask as many questions as possible that are relevant, for some definition of relevance.

All right Ethan, let's do this. I think people didn't come here to listen to me wax lyrical, so why don't you turn your camera on, I'll stop sharing my screen, and we'll jump in. Hey Ethan, what's up, man?

Not much, how you doing?

I'm really well. And it is Thursday here, which means getting closer to the weekend, which feels good.

It's about there. I mean, Fridays are kind of chill days anyway, right? So it's almost the weekend.

I prefer Friday to Sunday. Sunday I'm already stressing about Monday. Because we're constantly living in the past and the future, Friday's a great day: I'm just sitting there fantasizing about the weekend, essentially.

Yeah, same thing. Friday and Saturday are the best days.

A couple more people just said hi. David Beckham is a data scientist focused on personalization and recommendation systems in LA, which is great. And actually that's interesting: you and I were introduced to each other originally by Jacopo (I will have got that name slightly wrong), and he is someone who works in recommendation systems a lot, thinking through, reasoning through, and building recommendation systems at what he calls reasonable scale, right?

Yeah, that's actually how I got my start doing data science, in recommendations. Definitely at reasonable scale, not at very large scales.

Amazing. And I am actually so excited, I'm going to put this thing in the chat: with Jacopo we've just built a tutorial on recommendation systems using Metaflow, so I've just put
that in the chat. One thing we might get to at some point is thinking through what he means by reasonable scale. For those joining, if you haven't heard this term, the idea is that a lot of the conversation around machine learning, deep learning, AI, and data science is dominated by a handful of very large companies who do it very well, such as Google, and you can think FAANG and a couple of others. But what does the long tail look like? We are the 99% of machine learners. And you're in New York City, in SoHo, so you're not so close to Wall Street, but you're closer than most. So before we actually dive in, I'm really interested to hear about your start, particularly as you do come from science science, so to speak.

My nose-up-in-the-air science, yes.

Absolutely. Physics at Columbia University, right?

Yeah.

Upper Manhattan, whatever that means. North of 110th Street, the upper Upper West Side.

Yes, you can't do that. Morningside is okay.

So I did want to say, and I mentioned this, we'll try to answer as many questions as possible now, but if any of your questions remain unanswered, as they may, please do come and join our Ask Me Anything on our Slack. I'm posting the link to that; there's a channel called AMA guests. And before we jump in, for those who don't know what we do at Outerbounds: we essentially work on infrastructure and productivity tools for data scientists that allow them to focus on the top layers of the stack, on building models and doing science, while having easy access to the bottom layers such as compute, data, versioning, and orchestration. We do most of this through the open source framework Metaflow at the moment, but we're very excited to be working on a platform product, among other things, as well. I have to do the obligatory: if you're enjoying this, hit subscribe. I don't even
know where that thing is, wherever it is. Hit subscribe and share with your friends if you think... yeah. But we're here today not to talk about any of this stuff, but to chat with Ethan, who is a data scientist at Square, has worked as a data consultant, and used to be a scientist scientist, who completed his PhD in physics at Columbia University. We're here to discuss the wild west of full-stack machine learning and how to make sense of this really large space: all the feature stores, metric layers, model monitoring, and more, with a view to deciphering how to think about this, and what mental models, tools, and abstraction layers are most helpful in delivering actual ROI to businesses using machine learning. But that's enough out of me. You, Ethan, are an AI engineering manager at Square and, like myself, used to work in basic science research. Can you tell me a bit about your journey from working in basic research to AI and ML engineering?

Sure, yeah. So I got my start in physics, majored in it in college, and then came up to Columbia to do a PhD in physics, and never really planned to program for a living. I took one programming class in college, in the math department; it was a numerical algorithms class where they taught us MATLAB, and I programmed in MATLAB in grad school. But then, you know, I wanted to get a job and wanted to stay in New York City. I graduated in 2015, which was right around the time that data science was starting to pop off, and so I spent my last year of grad school teaching myself Python, trying to decide between R and Python at the time. Back then people were worrying about Scala as well; they thought that was the future of data science, which thankfully it did not become. But yeah, I ended up doing a boot camp, or a fellowship, with this organization, Insight Data Science, which was for PhD students
who were interested in data science. The definition of data science is still a little fuzzy, I think, but it was even fuzzier back then. So I did a boot camp with them, they got my foot in the door at a couple of different companies to interview, and I ended up working at this company Birchbox in New York City back in 2015.

So, and correct me if I'm wrong, because I might actually be wrong, Insight is a fellowship, as you said, tailored towards people who've done graduate school in a STEM field, helping them transition to industry. But their model is that it's free for the student, and the companies that recruit then pay the tuition, essentially, which is an interesting incentive system. You can have boot camps that aren't incentivized to get people jobs, besides for social proof, essentially, whereas here they get their money by getting you a job. Am I right?

Yeah, it was a perfect alignment of incentives. I was a very poor grad student at the time, so there was no way that I was going to pay tens of thousands of dollars a year to do some boot camp. Their setup at the time was entirely free. It was a six- or seven-week program, and they had a 100% job placement rate, so everybody who had gone through this program had gotten a job, no longer making a grad student salary, at companies that were hiring data scientists. And sure enough, that's exactly what happened with me. I went through the program, it was very difficult, very stressful, everything else, but I ended up interviewing at some companies through them, and Birchbox hired me, and they paid some sort of recruiting fee back to Insight. It cost me nothing and got me my first job, and I got that training for free.

That's amazing. And that's where you were working on recommendation systems?

That is, yeah. So I went to Birchbox, which was one
of the first box-based retail companies. There were a whole bunch of these back then; there still are some now. The way that Birchbox worked was that you could subscribe and get a box of five beauty samples sent to you every month in the mail, so you could try out different beauty products, and then they also had an online store where you could order a full-size version of the product. It was actually fascinating work, because I did kind of half recommendation systems, where we want to make sure that we're sending people products that they're going to like, as well as online recommendations on the website, as in typical e-commerce stores: given that you've purchased these products, what other products might you like to purchase? So that was half of the work, but the other half was integer optimization problems. We had hundreds of thousands of boxes that we would be sending out each month, and a box has to have five items in it, but you have different quantities of those items in your warehouse, and certain customers can get certain items while others cannot. So if you don't have curly hair, you should not receive a curly-hair shampoo or something like that. And so this whole thing becomes this giant optimization problem: how can we assemble the optimal boxes for people, subject to all sorts of constraints on how many units we have in the warehouse, who can get what, and everything else? So it was a very interesting first job.

Yeah, amazing. And I do feel like, thinking through what you learned before going on the job, recommendation systems are things we talk about a lot, but it's highly non-trivial. You can't just learn how to build a recommendation system the way you can a computer vision algorithm in a notebook or this type of thing. So was there a pretty steep learning curve in terms of
figuring out how it actually works in practice, in a real job, in a real company?

Super steep, especially because I would say there's still no standard textbook on recommendation systems. If you want to learn deep learning, there are standard deep learning books that have been put out now. If you want to learn standard supervised learning problems, you can pick up The Elements of Statistical Learning, dig through the equations, and learn logistic regression from a super deep perspective. But with recommendation systems, there was some academic work at the time, of course, but a lot of this was driven by industry, which maybe doesn't have the same incentives to put out educational materials around how these things work. And necessarily, these recommendation systems are really a function of the environment that they operate in; you can only really test them when you actually test them in production, and things like that. So I definitely had a hard time trying to wrap my head around everything, but I spent a lot of time trying to read up on the old Netflix Prize and things like that. There were very few open source libraries back then for learning about this stuff, and only sparse blog posts that people had written. So definitely a steep learning curve. To help me understand what I had learned, I ended up writing a series of blog posts to try to solidify my understanding of the field back then. And since then the field has only grown and changed and everything else.

And all of your blog posts are on your website, ethanrosenthal.com. So great, and I'm actually going to put that in the chat as well. We already have a bunch of interesting questions. VM Boston's question is wondering how much a data scientist who does ML should invest in full-stack ML, or leave it
to the experts who can deploy the model. Now, that is what we're here to talk about, so the entire hour will be dedicated to that. But thinking about your career and your trajectory early on, as an early-stage data scientist coming from academia, presumably your skill set played to building models and feature engineering and this type of stuff, not necessarily doing cluster configurations, cron jobs, scheduled deployments, and all of these types of things. So in your first job as a data scientist, what was the focus on, and what other people did you need, maybe with more platform engineering skills and that type of stuff?

Yeah, so that first job... even when I was trying to decide where to go, which ended up being an easy decision, because I got one job offer for my first job, so I took that job. But it was high on my list already. When I was doing that boot camp, we had a bunch of companies come in and tell us what data science meant at their companies, and even back then there was this split between people doing more analytics and people doing more of what we would now call machine learning engineering. I don't feel like that was a very popular term back in 2015. Back then I found myself gravitating more towards putting models into production; I was interested in that, and this role seemed to have some of that. I would say that at my first job, for the recommendation systems, there were some things that I could do. I was definitely allowed to train models, for example. I kind of had free rein over a big cluster that existed in a data center for me to train models and things like that. But the actual model deployment process was sometimes a bit difficult and definitely outside of my wheelhouse. I would also say that back then, it kind of depends what you're
working on in terms of what production requirements are demanded of somebody to push their code out. For something like recommendation systems, if you're at a reasonable scale, it's not the end of the world if somebody gets served a bad recommendation. I mean, if Amazon puts out a bad algorithm, because they're at such a scale, then maybe that is even going to cut into their margins. But at our scale it was lower stakes, and so I was allowed to play in that world a bit, even though I did not have a background in software engineering. You should shudder whenever you look at your old code, because ideally you've learned something since then, and if I look back at the code that I wrote then, it was probably not production-quality code. I didn't write tests, or follow any of these other hallmarks of modern software development. But that was okay at the time: one, because it was kind of low stakes that I was operating in, and two, because I think nobody really expected production software engineering out of machine learning people back at that point in time. It was hard enough to find anybody who could just program up some of these algorithms to begin with, and so production-level software engineering would have just been a very nice-to-have.

Absolutely. So I want to learn more about your journey to being an AI engineering manager, but we've used the term ML engineering, for machine learning engineering, enough that we should maybe reason through what it actually means. So what does a machine learning engineer do, in your experience?

Yeah, it's a good question. I would say that maybe a definition is somebody who builds, probably, machine learning models; maybe it could be a statistical model, and the boundaries between statistics and machine learning are pretty fuzzy,
but I would say that they build this model and it gets used in some sort of an automated fashion. That's probably the best definition that I have. You might have a statistician who is building a model, and then they run inference on this model, they interrogate the model in order to try to understand some behavior that exists, but this is more of an offline process. A machine learning engineer builds the model, and maybe the model serves predictions behind an API, or maybe the model runs once a day and generates some predictions that get written into a database, and then somebody else ends up using those predictions. But the model gets used in some sort of an automated fashion; that's probably the best definition that I have.

So then in some ways it is a data scientist who specifically builds models that are deployed to production, or something along those lines?

Yeah, yeah.

But they're not necessarily... because I think maybe the term engineer is so overloaded. We have ML engineers, data engineers, platform engineers, and it's more on the scientific side, focusing on the top levels of the stack, which we'll get to.

Yeah, I think so. At Square we have the term machine learning engineer, and I think what I've said largely matches what those people do. At a bigger company like Square, there is a platform team, where they are maybe responsible for building out a platform to allow for serving of models with low latency and high concurrency and everything else that you might care about at scale. But the machine learning engineers are the ones who are building the actual model, and probably the ones who are responsible for how the model impacts the ecosystem that it operates in. So a model's generating predictions, maybe somebody's making decisions on those predictions, and so how do you kind of
track that process? That's probably the job of the person who built the model.

Great. And so Sonam Gupta actually has a fantastic question. By that definition, Sonam asks, would the ML engineer be the one handling the models in production after data scientists have built them? I think what I want to say is that it's the ML engineer who builds them, and they are the data scientists in that sense, but are they responsible for maintenance, then, I suppose is another way to reframe that question.

Yes, but I think it depends what maintenance we're talking about. When I've worked at startups, oftentimes your job encompasses more roles; you don't have as much specialization. So maybe at a startup you have to build the API that you deploy your model behind, and then you are responsible for everything from the actual API and compute layer all the way up to how the model is performing, whether the model needs to be updated with fresh data, and things like that. What I found at somewhere like Square, where you have a platform team, is that maybe the platform team is responsible for making sure that the API continues to work, that it doesn't run out of memory or something like that, but the machine learning engineer is responsible for making sure that the model is doing its job, so that it is generating high-quality predictions.

Yeah, and you may have data engineers who are responsible and accountable for the data pipeline actually working correctly to feed into the feature engineering and machine learning models.

Yep. Everything is unfortunately connected, and assigning responsibility to each of these parts is important to make sure that the whole pipeline runs.

Absolutely. So we've already jumped into some fascinating aspects of what the full stack looks like. I am now interested in returning to your journey and how you went from
your first job to being an AI engineering manager at Square.

Oh, pardon me, sorry. Yeah, so hopping along, I went from that first job to a second job, which was at a very similar type of company. It was again a box-based retail company, so similar problems. I joined there as the first data scientist and built some things, but I was very lucky at that job because we hired some software engineers that I got to work alongside, and that was the first time that I really had other people reviewing my code. It turned out I had a lot to learn. But being in that situation, where you can get very good feedback on what you're doing and learn best practices from other people, was extremely helpful. So I built out a bunch of things there, then left to do some freelancing for a while, again focused on productionizing machine learning at various companies of different sizes, and then six weeks before the pandemic hit, so January 2020, I ended up starting at Square. When I joined Square I was on the risk team, building models to detect fraud, and I can talk some about that; it had its own very fascinating machine learning engineering process behind it. Then midway through my time at Square I switched to my current team, which is called the Conversations team, where I started out as an individual contributor, and now I manage a small team of engineers.

Fantastic. And I think that's a lovely segue. I did want to talk about the types of business problems that Square uses data science and machine learning for, so if you want to jump into those, then we can discuss at a high level how Square uses the output of data science and ML, and then jump into the weeds as to what actually happens.

Yeah. So I think the oldest machine learning
team at Square is the risk team that I originally joined. Square is commonly known as a company that processes payments. If you've ever gone to a coffee shop, you might have seen our little devices that you can swipe your card at, and that's where the company started.

Those are the little white things?

They started as the little white things that you could plug into the headphone jack of your iPhone, way back in the day, back when people used to swipe cards. And now there's a whole suite of hardware that is sold. But yeah, risk is a big issue for the company. You can imagine all sorts of payment fraud that might occur: if people steal somebody's credit card and then use it at a Square business, Square can be on the hook. And there are other pretty large sources of risk that the company takes on, especially because we're focused on small businesses, which have not traditionally been served by larger banks because they tend to be riskier to deal with. So the fraud team is the oldest team at the company, which is where a lot of the platform-level tools ended up being built out. In that world you can imagine lots of classification models: is this a fraudulent payment or not? I feel like this and spam are the classic classification applications that you learn about in class. So that's where it started, but there are a lot of machine learning teams now at Square, maybe 20 or something like that, spread throughout the company. So there's everything from lifetime value prediction of the businesses that sign up with us and looking at their churn, to a team that does lending to small businesses, with a lot of work around that, all the way to my weird little team
that exists kind of in a corner of the company. We build a messaging hub for Square business owners. If you're a business owner that uses Square, all text messages and emails that you send out to your customers, or that customers send to you, exist in one place, and so you as the business owner can go in, view your conversations with all of your different customers, message with them, and things like that. The broader team works on this product; my small team that I manage builds what I guess I'll call smart features for it, and perhaps unsurprisingly it's a lot of natural language processing. We have a chatbot. Let's say you book an appointment at a hair salon: you might get a text message confirming your appointment, and that text message will say that if you need to cancel or reschedule, you can respond here to the Square Assistant. So we have a chatbot that will talk with you in natural language to update your appointment if you need to. Also, maybe you've seen that Google will suggest a response for you to send back to an email that you get; we have a similar feature that we've built for the business owners. They get a customer inquiry, and we can suggest a response for them to send back, and we're using the hot thing of the moment, large language models, in order to do that. So yeah, lots of applications here.

Yeah, there are a bunch of fascinating applications, and in fact Sonam Gupta just had a question: are there any use cases of NLP at Square, and if yes, can you share any examples? And without even seeing that, you answered it in real time. We have a bunch of other very interesting questions that I hope to get to, but before getting into some of those, I'm interested in whether you can tell us about how you measure the success of machine learning projects. If they're any of the ones you
just said, feel free to talk about them, or more generally.

Yeah. So I wanted to join the risk team when I came here, and I did, because I liked that their definition of success was very clear. I mentioned that I used to work in recommendation systems, and you can run some A/B tests with recommendation systems to convince yourself that you have proved value for the company, that you've made the company money, and things like that. But especially when you're at reasonable scale, you might need very long time periods to convince yourself that the improvements you've made to an algorithm have had a large impact on the company. So I was very interested in the risk team here because it's at a large scale and we're dealing with money, and the best way to measure success is with money, usually. On the risk team we could do things like that. We could run A/B tests where you send some payments to one model and other payments to another model, and then you can measure how much loss we incurred due to fraud under either model. That ends up being a fairly straightforward way. I mean, it's not perfectly straightforward, because with fraud your losses are technically unbounded: if you have a bad actor who figures out that they can steal your money, they're going to steal as much money as possible. So it ends up being a bit of a causal inference problem to actually do it right, but that ends up being a very nice way to do this. On the chatbot side of things, the chatbot can back out and ask the business owner for help if it's not able to solve a problem for somebody, and so we can measure basically how often it completes the goals that it has. That ends up being a very clear measure of at least that model's success.

Cool. I
do want to jump into tooling, but before that, you did pique my interest mentioning causal inference. I'm wondering if you can tell us a bit more about the role of causal inference in machine learning and, more generally, in data science.

Yeah, I will admit to not being super well-versed in this. But if I were to put on my thought-leadery futurist hat, I do think that this is going to become more and more important for data scientists and machine learning engineers. One of the things that I noticed when I worked on the risk team was that we had a lot of great tooling, which we can chat about, for building models quickly and deploying them in order to target fraud attack vectors and things like that. So that problem actually wasn't so hard. The hard part was understanding the impact of your model in the ecosystem that it operates within, especially when your model is impacting that ecosystem. For example, if you choose to block a payment from going through, then you might not get any ground truth to know if that was truly a fraudulent payment. So being smart about the actions that you choose, being able to model those actions, and inferring what would have happened if you had or had not taken those actions ends up being pretty important, and ends up being a causal inference problem.

Yeah, absolutely. Well, let's jump into tooling now. As you know, and as I've written about and we've talked about previously, we live and work in an age of feature stores, model monitoring, metrics layers, experiment trackers; the list goes on and on. How do you think about separating the wheat from the chaff, particularly with an eye towards what tools and types of tools to bring in at Square?

Yeah. So to be clear, well, actually, maybe for some background: my team ends up basically bringing in our own tools and choosing what we use. For
various historical reasons we actually don't use the platform-level tools that exist at the company. So there are tools that the company has chosen to bring in, which I think is a very different decision from the tools that an individual team needs to use. I think to start, the question is: do you actually need these tools? Is this actually a pain point for you right now? Feature stores are really hot, and I've been pretty interested in feature stores. I got interested in them when I was on the fraud team because we had an internal feature store.

And maybe you can just define what a feature store is as well.

Good call. I would say that, ideally, a feature store is kind of like a database to store historical features that you would want to use to train your model. If you think of a conventional classification model, let's think of scikit-learn, we have our X design matrix, our matrix of samples and features, and a feature store would be a place where you can store all of the features that are associated with a given sample. You can then train your model off of this. But at the same time, the feature store will also serve up those features in real time when you need to perform real-time predictions with your model. This way you can guarantee that the way your model is trained matches the way your model gets used in production. This is a classic failure mode of models: you train your model off of historical data, but then maybe you can't serve up those features in real time with low enough latency. An example at Square would be: maybe we want to understand what fraction of times this card was declined in the last 60 seconds, because maybe somebody is using an API to try to make payments with this card at lots of different merchants, and it's getting
declined at a lot of them; that ends up being a good signal that this is a risky event and might be fraudulent. But to be able to track that data historically, you need to know, at the time that somebody attempted that payment, what the trailing 60 seconds of events looked like. And then in real time you also need to be able to calculate that number at every point in time. A feature store ends up being the way to do this. In fraud it ends up being extremely important, because those types of features are hugely predictive of these sorts of events. But in other areas you don't really need a feature store. For example, my current team largely just deals with text, like what was the text of a message that was sent. You don't need a feature store for that; you just need a database. Or, on the flip side, if you don't need super low latency for the features that you're serving, if you don't need them to be really accurate at the time that you're serving them, then maybe you can just hack your own feature store. This is what I did at one of these reasonable-scale companies that I worked at: once a day I would run a giant SQL query to calculate some features, load that into memory, and use that to serve them. It was good enough for the problem at hand. So I think that's one thing: understanding what exactly you need. The other thing is that people are usually thinking about these things after they've already gotten started. I think it's rare to have a perfectly clean slate to start your entire MLOps journey. So looking at where you are and what your current pain points are, and whether they are really big pain points, is another piece to this. And then I think one other thing to talk about is
being cognizant of the fact that there's a cost to everything. I feel like you often hear people saying, "Oh, why don't you just use X?" or "Why don't you just use Y?" and that will solve your problems. But to use any new tool you're going to have to learn how to use the tool, and it's a specific tool. The field is young right now, so the probability that somebody new joining your team is also going to understand this tool might be low. And any tool ends up being an abstraction layer that you are adding into your system. So is it going to be worth it? Is this a large, generalizable problem that you see across lots and lots of different models, or is this just a specific thing that you need for this one task? If it's something specific to this one task, then maybe just hack your own solution for that task, and if you find that this is starting to become a larger problem, then maybe you need a dedicated tool for it.

I like that a lot, and I like the idea of bringing it back to what problems you're trying to solve and then considering the costs of adopting new tools: personal and professional costs, and organizational and cultural costs, I suppose. Also, when people are learning about this type of stuff, they're generally wondering, "Wow, to get a job do I need to know about all the model monitoring tools and experiment trackers and metrics layers and all of this?" So what advice would you give to people about what they need to know about the space more generally?

I would say no, they don't need to know about any of these things. I would guess that there are lots and lots of people who are not using experiment trackers. It would be helpful if there were more. I just started using one a year or two ago and it's been great, but people got by for a long time beforehand without using one. And, to pick on
experiment trackers, I don't think that that's a skill. I can't imagine somebody interviewing about that skill and wanting to know whether you know how to use Weights & Biases or Comet or something like that. There are foundational skills that you need to have, and those are not very often associated with a tool, I would argue. Programming, for example. Do you know SQL? That's great. If you know Redshift but you don't know Snowflake, I feel like that is perfectly okay. When I was first looking at job postings coming out of grad school, the problem with the job postings was that they often post "this is our stack, these are the technologies that you'd use", or they're very bad and they want you to have experience with a particular database or something like that. And I have my doubts about that: if you only have experience with MySQL and not Postgres, then unless you're getting really deep into that database, it's probably perfectly fine as long as you know SQL. There are certain areas, like maybe this is a PyTorch-only company, and if it's a very hardcore deep learning role then they want you to know PyTorch before you come in, even if you already know TensorFlow. But yeah.

Cool. Michael Ward has just written in the chat: "Ethan, you're hitting all the right notes to me about limiting the number of tools because each one has a cost." So that's cool. And someone called Joshua Ainsley says, "Hi Ethan, been a while." Josh has a question around learnings for moving from an IC to a manager. Now, I'm going to shelve that question because it's something I want to come back to later, but I just want to specifically say that we're going to come back to it, hopefully. P has a question, raised-hands emoji, very good: what are the essential ML libraries that we
need to know and use to help land a full-time data science job?

I think they're the same ones that have existed for a while. If you're on the Python stack, scikit-learn is still very, very popular. We ask interview questions here where you're allowed to use scikit-learn, and if you know how to use it then that's a lot better than having to program logistic regression from scratch. So I feel like it's basically the PyData stack: scikit-learn, pandas, and NumPy continue to be the workhorses of a lot of this. If you need to know deep learning, then TensorFlow or PyTorch are probably fine. And I think those are it. Spark is starting to get popular. I still don't know Spark. I had to use it a tiny bit at my first job, it was really painful, and I've somehow managed to avoid it this entire time, so I hope that some of you can as well.

Yeah. And I don't know about landing jobs per se, but something which allows you to do fast boosted trees. XGBoost is one example; there are others as well. I always say that if you can build random forests and boosted trees and do a bit of deep learning, that will help you answer most machine learning questions, right? So if you know some scikit-learn, some deep learning of some sort, and some boosted trees, I think that... 100%. They're big fans of XGBoost at Square. Awesome. And the stack, I mean NumPy and pandas and matplotlib and a few of these other very foundational PyData and SciPy packages, is incredibly useful. And I think just being comfortable with using those libraries helps. I know pandas can be kind of complicated; I still get confused by the API and everything else. But the more comfortable you are and the quicker you can slice and dice your data, the more that's going to pay dividends, and it will definitely be helpful during interviews to move through the interview faster.

Absolutely. And, related,
knowing your way around the Jupyter ecosystem, notebooks and Lab, can be incredibly useful. And a bunch of basic bash and terminal stuff as well.

Yeah, my team are big fans of bash and Makefiles. I'm not going to tell anybody to go out and start building your own Makefiles, but a lot of terminal stuff you will come across, and it can only be helpful to get fast and nimble with that.

Absolutely. And we're going to get to this later, but some basic software engineering skills. I would say maybe the top two or three are version control, refactoring... I don't know, what else?

It's good to know how to write tests.

Tests, yeah, absolutely. Testing code, and data testing too, can be incredibly useful. Using something like pytest can be really cool. Our testing world is much harder than the software engineers' testing world, because it involves a lot of the real world.

Yeah, data changes.

Yeah, data does change. I love that two-word sentence. Such clarity and precision. I want to move on to what the full stack of machine learning and data science looks like. To seed this conversation, what I want to do is show a figure that I show every now and then, just as a conversation starter. This is one vision of the full stack of machine learning, where we have data at the bottom; you need some compute resources; then as your projects become larger you need to think about orchestration; then versioning, and not just versioning code but versioning code, data, and models; then thinking about deployment. Maybe feature engineering comes in here somewhere, but maybe that's baked into modeling here. One of the reasons I wanted to show this figure comes back to a lot of the questions that people have asked in the chat, and that we discussed earlier, about what data scientists need to
know. Scientists generally care more about the top layers of the stack, whereas infrastructure and platform engineers really care about the bottom. Oh, someone said we've lost audio. You can still hear me, Ethan, correct?

Yeah, I can hear you.

And I can hear you. I'm just going to type: has anyone else lost audio? We'll just keep talking, and if a lot of people say they have, we'll figure it out. Okay, everyone else seems to be able to hear fine. I was going to tell Jill that it must be them, but they can't hear me. So, back to the topic at hand: scientists really want to be thinking a lot more about the top layers of this stack but have easy access to the bottom layers of the stack. That's how we tend to think about it at Metaflow and Outerbounds. I'm wondering how this resonates with you, what you'd add to it, and how you think about the full stack more generally.

Yeah, I think that data scientists are largely like that here as well: you want to focus on the modeling, but you want to have access to all of the superpowers that the cloud gives you. I don't want the scale of my data to matter. I don't want to have to wait for what I'm doing, so I want things to be fast, which might mean that I want things to work in parallel. I want to be able to store as much data as I want. And you want to focus on the modeling. I do think that, unfortunately, right now (and to be clear, I haven't used Metaflow, so maybe this solves everything) the more you try to avoid the lower parts of that stack, the more you try to avoid touching the cloud, touching compute, and things like that, the harder a time you're going to have, because inevitably our abstraction layers are not very good right now. My team is on AWS right now; when I was on the fraud team here they were
on Google Cloud, so I've worked across both clouds. We do have some platform-level tools that start to do a good job of abstracting away the compute layers and things like that, but inevitably you still run up against these nasty edges. When you start to think of things like permissions, especially at a big company where security and networking are very important, you start to bump into permissions, which ends up pulling you down into those lower layers of the stack. But anyway, in terms of how I think about it, I do think that the orchestration layer that you had is one highly important part. A big part of machine learning work is the fact that you want to write your code locally, but you can't really do your work unless you're in the cloud. Software engineers, a lot of the time, can write their application code locally, spin it up locally, run all of their tests locally, and then deploy it up into the cloud. But in our world, by training your model on test data locally you can find some bugs and write some tests and things like that, but you often can't actually train your model unless you're doing it somewhere up in the cloud. It's kind of difficult to work in the cloud, so you want to work locally, but then you need to deploy everything up into the cloud. I think minimizing that cost of switching from local to cloud is important, and is very difficult right now. That orchestration layer really is it. It's like: all right, I want to do all these things, but I want to do them up in the cloud in some agnostic way, where maybe this code runs on this computer, this code runs on this other computer, and everything else. And so that ends up
being kind of a huge part of it. Absolu
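The feature-store discussion in this conversation can be made concrete with the trailing 60-second card-decline feature that Ethan describes. What follows is a minimal Python sketch under stated assumptions, not Square's implementation: the names `PaymentEvent`, `offline_decline_fraction`, and `OnlineDeclineTracker` are hypothetical, and the sketch assumes events arrive in timestamp order with distinct timestamps. It computes the same feature two ways, point-in-time from history (as you would for training) and incrementally in memory (as you would for serving), and checks that the two agree.

```python
from collections import deque
from dataclasses import dataclass

WINDOW_SECONDS = 60  # trailing window taken from the card-decline example


@dataclass(frozen=True)
class PaymentEvent:
    """Hypothetical payment attempt; field names are illustrative only."""
    card_id: str
    timestamp: float  # seconds since an arbitrary start
    declined: bool


def offline_decline_fraction(history, card_id, as_of):
    """Training-time feature: use only events in [as_of - 60s, as_of)."""
    window = [
        e for e in history
        if e.card_id == card_id
        and as_of - WINDOW_SECONDS <= e.timestamp < as_of
    ]
    if not window:
        return 0.0
    return sum(e.declined for e in window) / len(window)


class OnlineDeclineTracker:
    """Serving-time feature: keep only the trailing window per card."""

    def __init__(self):
        self._events = {}  # card_id -> deque of (timestamp, declined)

    def feature(self, card_id, now):
        q = self._events.get(card_id)
        # Drop events that have fallen out of the trailing window.
        while q and q[0][0] < now - WINDOW_SECONDS:
            q.popleft()
        if not q:
            return 0.0
        return sum(declined for _, declined in q) / len(q)

    def record(self, event):
        self._events.setdefault(event.card_id, deque()).append(
            (event.timestamp, event.declined)
        )


# Made-up events for one card, in timestamp order.
events = [
    PaymentEvent("card_1", 0.0, True),
    PaymentEvent("card_1", 10.0, True),
    PaymentEvent("card_1", 30.0, False),
    PaymentEvent("card_1", 65.0, True),
    PaymentEvent("card_1", 200.0, False),
]

tracker = OnlineDeclineTracker()
online, offline = [], []
for e in events:
    # Read the feature before recording the current event, so the
    # feature never includes the payment it is being used to score.
    online.append(tracker.feature(e.card_id, e.timestamp))
    offline.append(offline_decline_fraction(events, e.card_id, e.timestamp))
    tracker.record(e)

print(online)   # [0.0, 1.0, 1.0, 0.5, 0.0]
print(offline)  # [0.0, 1.0, 1.0, 0.5, 0.0]
```

The point of the comparison is the failure mode mentioned in the conversation: if the training-time computation could see events at or after the payment being scored, the offline and online values would diverge and the model would be trained on features it never sees in production.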
Original Description
Ethan Rosenthal is a data scientist at Square, has worked as a data consultant, and used to be a scientist scientist with a PhD in Physics from Columbia University. In this fireside chat, Ethan joins Hugo Bowne-Anderson, Outerbounds’ Head of Developer Relations, to discuss the wild west of full stack machine learning and how to make sense of all the feature stores, metric layers, model monitoring, and more with a view to deciphering what mental models, tools, and abstraction layers are most helpful in delivering actual ROI using ML.
After attending, you’ll know
- How to think about the full stack of machine learning in a principled way;
- What the most important layers in the ML stack are for data scientists;
- How to separate the wheat from the chaff when thinking about which tools and abstraction layers to adopt for your team;
And much more! The fireside chat will be followed by an AMA with Ethan and Hugo at slack.outerbounds.co.
00:00 Prelude
06:16 The fireside chat begins!
10:43 Paths into data science, ML, and AI
15:36 Learning to build Recommendation Systems on the job
18:02 Software engineering skills for early career data scientists
21:17 What exactly is ML Engineering?
25:34 Path from IC data scientist to AI engineering manager
27:32 ML at Square -- fraud detection, risk, conversation AI, NLP, and more!
31:48 How to measure the success of ML projects
35:30 Feature stores, model monitoring, metrics layers, experiment trackers: what is happening!
44:24 Essential ML libraries for data scientists and MLEs
48:01 What is the full stack of machine learning?
56:11 Software engineering and data science? Two totally different types of programming!
1:01:47 The data science / manager pendulum