Interactive Machine Learning Systems with Alekh Agarwal - #17

The TWIML AI Podcast with Sam Charrington · Beginner ·📰 AI News & Updates ·9y ago


Key Takeaways

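The video features Alekh Agarwal of Microsoft Research on interactive machine learning: systems that learn by acting in their environment rather than from a static, hand-labeled dataset. Topics include active learning, reinforcement learning, and contextual bandits; why explore-exploit methods can evaluate and train many models from the same randomized data far more efficiently than A/B testing; and applications such as personalized news recommendation at MSN and resource allocation in computer systems. The conversation also covers practical pitfalls (poorly defined rewards, turning off randomization, pipelines that are not reproducible) and points to the open-source code on GitHub, including Vowpal Wabbit.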

Full Transcript

[Music]

Hello and welcome to another episode of TWIML Talk, the podcast where I interview interesting people doing interesting things in machine learning and artificial intelligence. I'm your host, Sam Charrington.

Once again, thanks so much to everyone who sent in their favorite quote from last week's podcast. Your stickers are on the way. We've had a blast receiving and reviewing all of your quotes. Don't forget to send us your favorite quote from today's show as well, because this contest will continue while our sticker supply lasts. You can do that via a comment on the show notes page, or a comment or post on our Facebook, Twitter, YouTube, or SoundCloud pages.

If you've been listening to the podcast for a while, you know that I really enjoy hearing from listeners, and I appreciate all of the comments and feedback I get from all of you. Now, I haven't mentioned this in a while, but one of the most important ways you can provide feedback is through an iTunes review. According to our stats, the vast majority of our listeners come from iOS and iTunes, and many of them, one can only suppose, find the podcast via the iTunes directory. That's why your reviews there are so important: the more and better they are, the more people will want to check out the podcast. So please take a moment to review the show on iTunes. We'll have our link to the iTunes page in the show notes so you can click right through from there, and you don't need to be a regular iTunes user to leave a review. Of course, if iTunes just isn't your thing and you've got other ways you prefer to spread the word about the podcast, those are appreciated as well. Thanks so much for spreading the word.

The interview you'll hear on today's show was recorded last fall at the O'Reilly AI Conference in New York City. On the subject of that conference, it's returning to New York in June, and this time around we'll be giving away passes to two lucky TWIML listeners. Stay tuned, because that giveaway is coming soon.

If you happen to be in New York City
now, I want to call your attention to another event that will be taking place, this one next week. That event is called the Future Labs AI Summit, and it'll be held at NYU on Wednesday afternoon. Speakers at the event will include Facebook and NYU's Yann LeCun and NYU's Gary Marcus. I'll drop a link to this one in the show notes. Check it out if you're in the city; it sounds like a great event.

Finally, I'd like to take a minute to remind you to check out my upcoming event, the Future of Data Summit, which will be held May 15th and 16th in Las Vegas, Nevada. If you haven't already checked out the event, I really encourage you to take a look. The person I had in mind when I created this event is someone who's in a role where they're responsible for helping to chart an organization's data strategy, or someone who wants to be in that kind of role, or someone who needs to understand how all of this will come together. This isn't the place where we'll go deep on neural nets, pun intended, or the latest research paper. Rather, this is an interdisciplinary summit where you'll get to hear from and engage with experts presenting on various aspects of our data-centric future. You'll hear from Assaf Araki, an expert in big data infrastructure at Intel, on the coming advances in hardware and what they'll allow us to do in machine learning, AI, analytics, and more. You'll hear about how cloud, IoT, and big data shift the cybersecurity threat landscape, and how we can secure these systems, from IBM's Global Executive Security Advisor, Diana Kelley. I'll be leading a discussion on data privacy and algorithmic ethics with Accenture AI's Rumman Chowdhury and Ericsson's Jonathan King. Eric Sammer, founder and CTO of Rocana, will talk about the role of AI in optimizing the enterprise data center, so-called AIOps, and EndeavorVR founder Amy Peck will talk about the emerging role of virtual and augmented reality in the enterprise. These are just a few of the speakers I've lined up for you, and I'll be announcing more shortly. To learn more about
the summit, visit twimlai.com/futureofdata.

Now, about today's show. This week my guest is Alekh Agarwal. Alekh is a researcher with Microsoft Research in New York City, where his work is focused on interactive machine learning. As I mentioned before, Alekh and I recorded this show at the O'Reilly AI Conference, where he delivered a talk called "Interactive Learning Systems: Why Now and How." Interactive learning systems are different from traditional supervised machine learning systems in that they need to explore and learn from their environments. This is an exciting area of research, and one that really interests me personally, in part because it incorporates concepts like active learning, reinforcement learning, contextual bandits, and much more. If you're interested in this topic, when you're done with this show you should listen to the show I did with Georgia Tech's Charles Isbell, if you haven't already. That was show number four. The notes for this show can be found at twimlai.com/talk/17. And now on to the show.

[Music]

Hey everyone, I'm here with Alekh Agarwal with Microsoft Research, and we're here at day two of the O'Reilly AI Conference. Alekh just did a great presentation on interactive learning systems, and he was kind enough to join us to talk a little bit about that presentation. Alekh, why don't you start off by introducing yourself?

Yeah, sure. So I'm Alekh. I'm a researcher at Microsoft Research, actually here in New York City. I've been here for four years; prior to that I was doing my PhD at UC Berkeley. My work really touches upon a lot of themes in AI and in machine learning, but one that has been of particular interest lately is what I call interactive machine learning. So think about problems where the machine learning algorithm is not just learning from a static pool of data that was hand-annotated and collected by somebody else, but think about how the algorithm has to interact within a
larger system, within a larger environment, to collect that data, to gather learning cues, and then incorporate those learning cues into the model in order to improve over time. This leads to several paradigms in machine learning: things like active learning, reinforcement learning, or subsets of reinforcement learning like contextual bandits, and so on, all of which I've worked on and which were touched upon in the talk today.

Great, great. So you mentioned the machine learning from the environment, and one of the ways you illustrated that in your talk was a demonstration of a Super Mario Brothers game. Can you talk about what you were intending to show with that demo?

Yeah. So in some sense, whenever we are thinking about interactive learning systems, one of the questions is: what are environments in which we can safely run these experiments, in which we can have these algorithmic agents interact with their environment or manipulate the environment in a safe manner? Of course, the natural or maybe canonical embodiment of such agents in everybody's thinking is robots. But robots are hard to program, it's hard to control all of their sensors and actuators, and it takes time and resources even to get one. And so a lot of people have found designing agents for various sorts of games, either computer games or even traditional board games like backgammon, chess, Go now, and other such games, to be more controlled environments in which we can still have this interaction, either with another player or with the environment of the game. And we can run these experiments over and over again with remarkably high throughput. We can overclock these games, so it allows for very fast experimentation. And so that's kind of the
reason why people have really gravitated towards using games in particular. Now there is this Atari learning environment, which is basically using Atari games, again, as a platform to test interactive learning situations. And what I was trying to show in the particular Super Mario demo... actually, I should give credit to Stéphane Ross, whose work that video comes out of. What they were trying to demonstrate was that supervised learning is not adequate, or a static pool of data is not adequate, to do well in interactive learning problems. Because when you manipulate your environment, if you act a certain way you see situations that may or may not be present in your training data. So if I know how to drive well, then I might not get into a car accident, or at least not an obvious one, or I might not get stuck behind a slow-driving car very often. And now, of course, if you don't see any data for how to recover from those errors, then your driving agent is in trouble. So that's the part I was trying to emphasize: when we try to do supervised learning in these interactive scenarios, often our algorithms tend to make mistakes that we don't have in the training data, and then they don't know how to recover from those. And so they get stuck in corners, and they fall maybe in pits, and do all sorts of silly things that we just don't do.

Right. So walk us through how the approaches to interactive learning that you talked about address that problem.

Yeah. So in some sense, here's an alternative we could think about. Let's say I was thinking about a conversational agent, like a chatbot. I have two options. I could look through many, many transcripts of how people have talked to each other, maybe even in a controlled domain like a call center or something, and I can try to train an agent from this data.
Again, this might have issues: maybe I said something that doesn't make sense to the user, maybe they respond in a way that a human call center agent would never encounter, and I don't know how to respond. Now imagine, in this situation, instead of just being stuck and floundering, I could actually fall back to a human agent and say, "Hey, I'm kind of uncertain about what to do right now. Can you bail me out, please?" Now I have learned how to recover from this mistake, so in future, even if I'm about to make this mistake, I'll be able to recover from it. And so there is this concept of learning in the situations that I encounter that interactive learning algorithms usually embody.

Another thing that often comes up... these are very ambitious tasks, of course, so let's think about something that seems much more mundane to us, like ranking search results or recommending news articles, personalizing news articles. Maybe I have data of what users have been clicking on, and I learn from this data, and I think I have found something better to choose. How do I evaluate it? Well, if this other set of choices was never displayed to the users by my previous system, I have no data that can actually back up that these choices are good.

Right, right. And so we only know the performance of the thing that the user actually clicked on; we don't have any information about how the other things would have performed.

Exactly. And so we run into this problem that we don't even know how to evaluate a system that does something different from what we have in the data. And interactive learning tries to bridge exactly this gap. It tries to come up with techniques for, essentially, how you have to learn on the job; you have to learn on the fly in order to address these situations.

Now, it seems like you've talked about two different things:
one is, you've got your machine prediction; how do we pop out of that loop and get input from the human? And the other makes me think more of multivariate testing, A/B testing, explore-exploit kinds of scenarios. How are those related?

Yeah. So in some sense, explore-exploit is one step further from this evaluation problem that I mentioned. If you think about exploration, why do you explore? You explore because, okay, sometimes you'll try one thing, sometimes you'll try another thing. So in a statistical sense, if we zoom out and think about what's happening in expectation, roughly you're trying everything, and in expectation, at least at a population level, you're getting feedback about everything. And what that enables you to do, as a result, is evaluate every choice that you could have made. So implicit in explore-exploit is this ability to evaluate in interactive settings. Explore-exploit, in my mind, consists of two parts. There is this ability to evaluate, which I think is very crucial in its own right, because people currently do evaluation by doing, like you're saying, A/B testing or multivariate testing, which is a horribly inefficient way of going about the task, and explore-exploit gives you a much more data-efficient way of doing the same thing. And then it lets you do something much more: it lets you actually refine your models in real time and update them, and do this sort of really online learning, which is great. But even if you don't want to do that, just this evaluation is something you can already get a lot of value out of. So that's why I try to present both the pieces in their own right. But in some sense, yes, explore-exploit is the general, overarching solution that addresses basically both of the issues.
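The evaluation half of explore-exploit described in this answer can be sketched with inverse propensity scoring (IPS), a standard estimator for this setting. The sketch below is an illustration only and is not taken from the interview or from the speaker's software: the contexts, story names, click rates, and function names are all hypothetical, and the logging policy is assumed to pick stories uniformly at random. Because each logged choice was randomized with a known probability, the same log can be reused to estimate the click rate of any candidate policy, including ones that were never deployed.

```python
import random

# Hypothetical click probabilities for (user type, story) pairs; invented for illustration.
CLICK_RATE = {
    ("sports_fan", "sports"): 0.6, ("sports_fan", "politics"): 0.1,
    ("politics_fan", "sports"): 0.1, ("politics_fan", "politics"): 0.5,
}
ACTIONS = ["sports", "politics"]


def simulate_logs(n, seed=0):
    """Log n rounds of uniformly randomized story choices with their probabilities."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        context = rng.choice(["sports_fan", "politics_fan"])
        action = rng.choice(ACTIONS)           # exploration: uniform random choice
        prob = 1.0 / len(ACTIONS)              # probability of the logged choice
        clicked = rng.random() < CLICK_RATE[(context, action)]
        logs.append((context, action, 1.0 if clicked else 0.0, prob))
    return logs


def ips_value(logs, policy):
    """IPS estimate of the average reward `policy` would have earned on these logs."""
    total = 0.0
    for context, action, reward, prob in logs:
        if policy(context) == action:          # only matching rounds contribute
            total += reward / prob             # reweight by the logging probability
    return total / len(logs)


def always_sports(context):
    return "sports"


def personalized(context):
    return "sports" if context == "sports_fan" else "politics"


logs = simulate_logs(20000)
# Both policies are scored from the SAME log; no separate traffic split is needed.
print("always_sports:", round(ips_value(logs, always_sports), 3))
print("personalized: ", round(ips_value(logs, personalized), 3))
```

Under the assumed click rates, the true average rewards are 0.35 for `always_sports` and 0.55 for `personalized`, and the estimates should land near those values. Every logged round contributes to every policy that agrees with the logged choice, which is the data sharing that an A/B split lacks.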
And why does it end up being so much more data-efficient than the traditional approach?

Right. So there are two important hallmarks. Let's first start from a non-contextual setting. Let's just say you're trying to find the most popular news story that works for everyone. Then the A/B testing way, so to say, of going about it would be: let's say you have 10, you give one-tenth of traffic to each one of them, and a week later you see which one did the best. The explore-exploit way of doing it is you say, okay, well, initially, yeah, I'll start giving 10% traffic to each one; the moment something starts to look better than the others, I dynamically adjust my traffic. So everything is getting only as much traffic as I need to rule it out as being inferior. And so that's already data-efficient.

However, now let's think about personalization, about contextualization. You said you're from St. Louis; you're hopefully not the only user from St. Louis who visits MSN. So you visit, I maybe give you a randomized choice of news story, and maybe another user who has more or less the same features as you visits, and I maybe try a different news story for them. Now what I've done is... let me back up. For personalization, let's again contrast with an A/B test. An A/B test would say, okay, I'll run this experiment on some percent of data to evaluate B, on some percent of data to evaluate A. So each data point is either going to A or to B; there is no data sharing happening. However, when you have a large class of available options, available models that you might want to pick out of... you've got tons of them; one set of parameters for a neural network defines one model, so you really have an infinite number of them, but let's just pretend for now there are a billion. Okay, so of course for a
given user, many of these models would make the same choice. So if you randomize the news story that you present to the user and you look at the outcome, then suddenly you have information about all the different models that made the same recommendation to this user as the story you displayed, and you can share this user's information across all of them.

So you're effectively training a billion models in parallel?

You're effectively evaluating a billion models in parallel, and then because you're evaluating them in parallel, you can just choose the best one at the end.

Right, right.

Well, you have optimization, right? So it's this data sharing that's crucial. And there's a precise mathematical sense in which you can prove that, as a result, if you think about doing the same evaluation of a billion things through A/B testing versus what we call multiworld testing, then you require exponentially fewer samples in our approach. There is a precise theory in papers backing up that claim. So that's kind of the crux.

Okay. So we've touched on this throughout the discussion, but one of the big areas that you apply this to is personalization. Can you talk a little bit about the use case and some of the background there?

Yeah. So, you know, it's 2016. I think it's great that some of the things we use actually do adapt to our tastes over time, but it's a pity that more of them don't. I feel like everything I do should learn about me, especially because they are collecting a ton of data about me, so they might as well put it to some good. And personalization is the one that's most interesting to me also because, like I was describing, if you're just trying to evaluate 10 things, fine, there is some
difference between A/B testing and multiworld testing, but it's really when you start to think about personalization. If you're only testing ten things, then in some sense a very smart person might just look at them and, by their gut feeling, pick out the best one. But once you start thinking about personalized models, then first of all it doesn't scale for humans to do it, and it doesn't scale for A/B testing to do it. You really need a different technology, and that's why I think it's a scenario very well suited to something like multiworld testing and contextual bandits. That's also where we've found the most excitement. We've talked to a broad range of product managers, and that's the aspect that usually has the most appeal to them.

And what kind of results are you seeing with that?

I have more substantive results with MSN. We have a lot of other experiments going on, but I think the MSN ones are the ones I can speak of most authoritatively. They have this web page which they logically partition into many different areas, and they've applied our system to many different areas now. Reliably, in all of those, with minimal to no tuning, they found that they were getting improvement in most of the user engagement metrics that they track. What was even more interesting, and this really reflects the personalization aspect: they had been running these experiments in the US market, and then the Olympics came around and somebody had an idea: "Hey, let's try this in Brazil." So this is Portuguese. We've not changed anything in the system except the user browsing history; they had it for those users on the Portuguese
articles, of course, and they did a Portuguese topic model, so just the feature extraction part was a little language-specific. Nothing about the machine learning changed, and we got double- to triple-digit improvements over the existing system just deploying this out of the box, and they started running it on 100% traffic. So it really does work.

Is that level of improvement on par with what you saw in English, or greater, or slightly less?

That's difficult to judge very well, because they have different amounts of data and different levels of performance of the baseline system in different markets, so it's a bit hard to compare. But I think the important thing for them was that, with very little customization, things were robustly able to transfer from one market to another.

Right. And so you've talked a little bit about this approach being more data-efficient, but how does that translate to the actual implementation and computational efficiency? Is that an issue for these kinds of problems at that scale?

Yes, you have to be mindful about computation. We are mindful about computation; we've tried to make these things as efficient as possible. With MSN, for instance, we were running in the front end of their servers and we had 5% or so performance overhead on their system, which was deemed fine for the value. So there is obviously some performance loss that is incurred, but we've implemented these algorithms very carefully, and a lot of the costs can be amortized, essentially.

So the 5% running on the front end, that's for model evaluation. Is there a training step?

The nice thing is the training step is running asynchronously in the cloud. Currently, even at MSN... first of all, one thing that's nice about doing things
online is, you're not really thinking about having to deal with a billion or a trillion records all at once; you're just streaming over them, so scale becomes nicer. With MSN we were still able to do all the training on one machine, in fact, in the background. But even if that's not the case, it's pretty easy to parallelize the training algorithms, and we support that.

Okay.

The only thing you have to be careful about is... we try to keep the system very reproducible, because one of the frustrating things we've encountered over and over again is that when something goes wrong in these complex systems, it's really hard to trace down what went wrong. With parallelization it can sometimes be a little more tricky because of the order of events and so on. So we recommend avoiding it as far as possible, but it's definitely possible to do it.

If you can do it for MSN on one machine, then a lot of people will be able to get pretty far on a single machine. And you've published papers about this. If someone wanted to try out this approach, what's the best way for them to learn more about it?

Yeah, so there is a short URL, aka.ms/mwt. That website has a ton of resources on it. It has do-it-yourself-type guides if you directly want to get your hands dirty. If you want to learn more about the science, there's a very extensive white paper that provides links to even more extensive research papers, and so on. So really, depending on how much detail you want, all of the resources are available on that website.

Awesome, awesome.

And the project itself is on GitHub, so all of the code is open source. You can play with the machine learning algorithms. The machine learning algorithms have actually been open source for several years now, so they've been tried out not just by
us but by others across the research community as well.

And what are the algorithms based on? What general classes of algorithms do these look like?

So the way we do things is, we take these interactive, or contextual bandit, learning problems, and what the learning algorithms look like is, they massage and massage and massage the data till you can essentially think of it as some sort of multiclass classification problem.

Okay.

And so now go to party with your favorite multiclass classification algorithm. We implement our own for the complete pipeline, and we support a lot of different types of models, like linear models, shallow feed-forward nets, matrix-factorization-type models, and we have a whole variety of very quick feature manipulations ingrained into the models that you can just do. All of the machine learning part is happening in a software called Vowpal Wabbit, which has been around for several years and is one of the more performant software tools for machine learning out there. It provides a lot of functionality, and if you want something that's not in it, there are ways to plug into other machine learning libraries as well.

And your use case there was personalization. Are there other use cases that you've seen this applied to?

It depends on how far you want to extend the definition of personalization. One of the things we are currently trying to work on is... we have users of Microsoft Band, and, you know, a lot of people in this country suffer from sleep disorders, or just...

Band is the health band?

Yes. Because of stress or other reasons, they're just not sleeping well. And so the Band was for a while trying to give sleep insights, basically some recommendation to change your lifestyle in
some small way that might help you sleep better. So, for instance, we are now trying to do an experiment where we would choose which recommendations to show based on how the user's sleep then responded to the recommendations. I think of this as within the realm of personalization. And again, we haven't had conversations in a more medical domain so far, but we are really hoping that we can get there in future. So that's definitely one realm. Other than that, a good chunk of the conversations we've had are, I would say, around personalization of various sorts.

But one of the interesting use cases... because it is a system, we also have some systems researchers on the team, and a system itself makes a number of resource allocation choices: a server doing load balancing, and a lot of other resource allocation problems. So what they've been curious about is whether they can apply some of these techniques even to core systems problems. We have some preliminary experiments with that, nothing I would call convincing yet, but it's actually pretty broadly applicable.

And by core systems problems, are you thinking of things like allocation of resources within a data center?

Yeah, you can apply it at several different levels. You can think about applying it at the level of a router, at the level of a NIC, at the level of an OS, at the level of a data center or a scheduler in a data center. There are many different places you can think of. In some sense, a lot of these are currently working on top of hand-designed rules, where some very smart people thought about the problem very carefully and designed them. But there's no reason why we can't make them more adaptive and more intelligent.
So yeah, I think there is definitely a lot of potential there, in the machine-learning-for-systems type of area, for these interactive learning situations.

And so, taking a step back to summarize, how would you characterize at the highest level: if you've got a problem that looks like X, this is a solution?

Actually, when I start to talk to people who are interested, I usually give them a template and ask them if their problem fits into that template. At the high level, there is this loop: you observe the world, you take an action, and you observe a reward. It's important that you face this loop over and over again. One of the things you have to be careful about, for instance, is that often when we start this conversation, people don't necessarily have a well-defined notion of a reward they can point to, and that's very important. If you don't define it well, the system will just learn some garbage.

Right.

The other thing that's important to think about is that the contextual bandit problem, as I was saying in the talk, makes this assumption that when I take an action, it does not have influence on the next context I see. So something like a conversation just does not fit this: what you say is not independent of what I said before. But for something like recommendation systems it is largely true. So that's something you have to keep in mind when you're thinking about how good a fit this might be to your problem. Now, that said, we have a lot of research expertise and research advances also in working out the situations where your actions modify the state. It's just that there the software is not quite there yet. We have some software, but it's not really a full-fledged system yet.

Okay, great. Any other considerations or things that folks should know when they're
thinking about this space?

No, I would say, again, if you're thinking about these problems, think really hard about the reward. That's the one that usually goes wrong. The other thing is, we've tried very carefully in the various materials we've prepared to outline all the usual things that go wrong, because one of the things we find is that even after we talk to people, they often fall back into those traps. So it's very important to think through those carefully and make sure you don't fall into them.

And what are some of those traps?

A lot of those traps are, essentially... it's really tempting to say, "I have some observational data collected from my system; let's just do some machine learning with it." This almost never works in a reliable manner, and there are various levels at which this manifests. One thing you might want to do is, "Oh, I ran this experiment and things are actually working quite well now. Why don't I turn off the experimentation, turn off the randomization?" No. Preferences change, or various subtle bugs arise just due to the way people are recording things. Of course, if you do everything with our system, we've designed things in a way that they shouldn't arise, but often people want to use their own custom components for parts of things. If you're thinking about doing that, it's really important that you look into the failure modes that we emphasize and make sure you don't fall into those. And the other thing is, even on top of our system, when you're building something it's important to think about the reproducibility of everything, because that's the one thing we found really was key. Even with MSN, it wasn't all a bed of roses initially. There were quite a few hiccups, and because we kept
everything reproducible, we could quickly figure out where the problem was.

Right, awesome. Why don't you repeat that URL one more time?

Yeah, it is aka.ms/mwt.

Okay. And if folks have got questions, can they contact you through that URL?

Yes, absolutely.

Awesome. Well, thanks so much, Alekh. Great to meet you, and I appreciate the talk.

Yeah, fun talking to you. Thanks.

All right everyone, that's our show for today. Once again, thanks so much for listening and for your continued support. Don't forget to share your favorite quotes for a TWIML sticker. These stickers are great, you're going to love them. You can share your favorite quote via the show notes page, via Twitter, via our Facebook page, or via a comment on YouTube or SoundCloud. And don't forget to hit that iTunes link and leave us a review. The notes for this show will be up on twimlai.com/talk/17, where you'll find links to Alekh and the various resources mentioned in the show. Catch you next time. [Music]
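The observe, act, reward loop described in the interview, and the warning about keeping randomization switched on, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the Multiworld Testing system's API: the action names, the click probabilities in `simulated_reward`, the `EPSILON` value, and every function name are hypothetical and chosen only for illustration. It uses epsilon-greedy exploration, logs the probability of each chosen action, and then uses those logged probabilities for an inverse-propensity estimate of how a different policy would have performed.

```python
import random

ACTIONS = ["sports", "politics", "tech"]  # hypothetical article categories
INTERESTS = ACTIONS                       # hypothetical user contexts
EPSILON = 0.2                             # assumed share of traffic explored uniformly


def simulated_reward(context, action, rng):
    """Hypothetical world: a click (reward 1) is likelier when the action matches the interest."""
    p_click = 0.6 if action == context["interest"] else 0.1
    return 1 if rng.random() < p_click else 0


def choose_action(values, rng):
    """Epsilon-greedy choice; also return the probability with which the action was chosen."""
    greedy = max(ACTIONS, key=lambda a: values[a])
    action = rng.choice(ACTIONS) if rng.random() < EPSILON else greedy
    prob = EPSILON / len(ACTIONS) + ((1 - EPSILON) if action == greedy else 0.0)
    return action, prob


def run_loop(rounds, seed=0):
    """Run the loop and return a log of (context, action, probability, reward) tuples."""
    rng = random.Random(seed)
    values = {i: {a: 0.0 for a in ACTIONS} for i in INTERESTS}  # running mean reward
    counts = {i: {a: 0 for a in ACTIONS} for i in INTERESTS}
    log = []
    for _ in range(rounds):
        # 1. Observe the world. The context is drawn independently of past actions,
        #    which is the contextual bandit assumption mentioned in the interview.
        context = {"interest": rng.choice(INTERESTS)}
        key = context["interest"]
        # 2. Take an action, keeping some randomization on at all times.
        action, prob = choose_action(values[key], rng)
        # 3. Observe a reward and update the running mean for this (context, action).
        reward = simulated_reward(context, action, rng)
        counts[key][action] += 1
        values[key][action] += (reward - values[key][action]) / counts[key][action]
        log.append((context, action, prob, reward))
    return log


def ips_value(log, policy):
    """Inverse-propensity estimate of the average reward `policy` would have earned."""
    total = sum(r / p for ctx, a, p, r in log if policy(ctx) == a)
    return total / len(log)


if __name__ == "__main__":
    log = run_loop(20000)
    print("match the user's interest:", round(ips_value(log, lambda c: c["interest"]), 3))
    print("always show sports:", round(ips_value(log, lambda c: "sports"), 3))
```

Because every logged action carries the probability with which it was chosen, the same log can score policies that were never deployed. If exploration were turned off, actions the greedy policy never picks would have probability zero and could not be evaluated this way, which is one reading of why the interview cautions against switching randomization off.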

Original Description

This week my guest is Alekh Agarwal. Alekh is a researcher at Microsoft Research whose work focuses on interactive machine learning. In our conversation, Alekh and I discuss various aspects of this exciting area of research, such as active learning, reinforcement learning, contextual bandits, and more. The show notes can be found at twimlai.com/talk/17. Subscribe! iTunes ➙ https://itunes.apple.com/us/podcast/this-week-in-machine-learning/id1116303051?mt=2 Soundcloud ➙ https://soundcloud.com/twiml Google Play ➙ http://bit.ly/2lrWlJZ Stitcher ➙ http://www.stitcher.com/s?fid=92079&refid=stpr RSS ➙ https://twimlai.com/feed Let's Connect! Twimlai.com ➙ https://twimlai.com/contact Twitter ➙ https://twitter.com/twimlai Facebook ➙ https://Facebook.com/Twimlai Medium ➙ https://medium.com/this-week-in-machine-learning-ai

This video teaches the fundamentals of Interactive Machine Learning Systems, including active learning, reinforcement learning, and contextual bandits, and their applications in real-world scenarios. It highlights the importance of interactive learning, its challenges, and potential solutions, showcasing tools like MSN and GitHub.

Key Takeaways

Run A/B testing
Configure traffic to each option
Adjust traffic dynamically based on performance
Implement Multiworld Testing (MWT) for personalization
Study the MSN deployment and the GitHub resources for interactive machine learning

💡 Interactive Machine Learning Systems can learn from environments and adapt to user behavior and preferences, making them crucial for personalized recommendations and resource allocation.
