Doctor AI

Data Skeptic · Beginner · 🧬 Deep Learning · 9y ago

Key Takeaways

The episode discusses the development of a temporal model that uses a recurrent neural network (RNN) to predict clinical events, with a focus on electronic health record (EHR) data and the potential of AI in medical diagnosis.

Full Transcript

Data Skeptic is the official podcast of dataskeptic.com, bringing you stories, interviews, and mini-episodes on topics in data science, machine learning, statistics, and artificial intelligence. [Music]

So Linda, do you think a machine could predict hospital outcomes? Or I guess it's patient outcomes in general; I don't know that it's specific to hospitals.

Well, that's funny you ask. I just read an article about this.

Tell me all about it.

Well, I don't remember all about it. You sound a little bit too interested there. But anyway, what it said is that once you train these neural networks, they can catch things that humans don't catch, and they do it better than humans.

Yeah, that's definitely a possibility. They could catch things humans don't catch if they're able to look at wider arrays of data, or notice things at a finer resolution that humans haven't learned, or something like that. Also, they don't get fatigued, whereas the doctor might be sick themselves, or lazy, or who knows what. So there are some advantages. But what do you think: who is scoring better, actual doctors or computers?

Computers. That's what the article said.

Well, that article is from the future then, because humans are still a little bit better at this time. But we're very much on the cusp of a changeover, it would seem.

Are you saying computers miss stuff?

They do not do as good a job as doctors in general. In some very specific cases maybe they do better. But we're making a lot of progress, and that's why I wanted to talk to Ed about Doctor
AI, because I think it's a step in the right direction, doing some interesting things in this space. Why do you think a computer would do better than a person?

The article I read suggested that computers can process the data much more rapidly, in large volumes, whereas it would take months and months for people to do it and have peers review it as well. It also seemed like the computers were able to catch things that humans weren't. Humans were catching some things and computers were prone to catch other things, so they're just separate things as well.

So maybe we need the two as complementary tools: a human who also gets to look at a model's output and synthesize the results in some way. Having not read the Doctor AI paper, what would you guess its success rate is at predicting the diagnosis and the time until they see the patient again?

Well, it's got to be more than 50%.

You think so? Actually, 50% would be quite good, because they're looking at a lot of different diagnoses. But your thoughts are on the right track: it has to be better than random. Interestingly enough, and we'll get into this a little bit, Ed points out to me that the baseline he uses is not random. It's actually what condition the patient reported last time, because a good heuristic would be that you're probably complaining about the same thing again. So you have to beat that to be more accurate. They measure recall: of the conditions people report, what percentage did the model get? And they're getting, well, it depends, there are a couple of ways they tweaked it, but more or less 79 percent recall after 30 days. So that's pretty good, in my opinion. I don't know if it's going to put doctors out of a job, but it's definitely an interesting step towards this hybrid model of doctor and computer in the doctor's office. But do you think we'll ever eliminate doctors? Will you one day just plug into some sort of USB port, or it scans you with your phone or
whatever?

No, I don't think so. I mean, that's for diagnosing specifically. You still need doctors to do the surgery.

For now. Pretty soon they're going to be in the unemployment line.

Maybe, yeah. But then that will make healthcare costs come down.

There you go. Well, this is actually a very complicated topic. We're going to cover it in a series of Data Skeptic episodes between now and the end of the year, because the types of things we talk about on the show are very relevant in healthcare right now, and we're undergoing a lot of changes. I want to kick that off with today's interview, but I'm glad you're here to help me intro, Linda.

I just thought of one more thing. If we can't even get self-driving cars passed and legalized, it's going to be a while before we get robots that do surgery passed and legalized.

Well, that's an interesting point. But hopefully when the data shows that something is better, that will win out over people's superstitions. For example, most of the data on self-driving cars shows that they're actually much safer than human-driven cars. That doesn't mean we should roll them out tomorrow, but it seems like we're going to have them. And as I've told you, my niece Noemi, who's [unclear], I predict she'll never be allowed to drive a car; it'll actually be illegal. But anyway, let's get into Doctor
AI.

Edward Choi received his master's in computer science and engineering from the Korea Advanced Institute of Science and Technology in 2009 and his bachelor's in computer science and engineering from Seoul National University in 2007. He is currently pursuing a PhD in computer science at Georgia Tech under the supervision of Professor Jimeng Sun. Edward's main research interests include predictive modeling, temporal modeling, and healthcare analytics, specifically using representation learning and interpretable deep learning for predictive healthcare. Over the past years he has held research internships at Sutter Health, DeepMind, and, more recently, Google Research. When he's not in the lab, Edward is a dilettante pianist, a low-budget traveler, a self-approved philosopher, and most of all a gamer at heart. Ed, welcome to Data Skeptic.

Hi, this is Edward. Thank you very much, Kyle. Thanks for the invitation.

You know, I think it's a great time to be a computer scientist and to be working in machine learning, and deep learning specifically. Realistically, every field and industry can probably benefit from your work. What made you specifically interested in focusing on healthcare?

To be honest, healthcare was not always my passion. When I decided to go for a PhD, I was mostly interested in machine learning rather than healthcare. My advisor, Jimeng Sun, was the one who interviewed me for Georgia Tech admission, and when I talked to him I found out that he was interested in combining machine learning techniques with healthcare. I liked Jimeng personally and was hoping I could work with him. Healthcare sounded like a very promising application, I had heard about a lot of good stuff going on around 2014, and it was going to grow and grow as time went on. So I thought it was a good time to jump into healthcare analytics. I put some effort into it for a couple of years, and this is where I am right now.

When I hear discussions about healthcare, the acronym EHR,
electronic health records, always comes up, which sounds like a good phrase to describe generic medical data. Is EHR something more formal that has a really good schema, or are you dealing with a variety of different types of data and formats?

It stands for something like a total data collection solution. It's not just a record of when a patient visited, what kind of drug he was administered, or what kind of diagnosis he received. It's more than that: a complete record of everything that went on in the hospital. So it's not just structured data. It has medical notes, like doctors' notes, discharge notes, and progress notes. It also has lab measures associated with the encounter, and demographic information and family history. It's a total information package for all patients who went to the hospital, so it's actually very, very private information.

Are there challenges around getting access to the data you need because of, you know, HIPAA or just general privacy issues?

I'm not really an expert in the legal aspects of EHR, but I did an internship at Sutter Health for two summers, and we agreed on having the data inside a Georgia Tech server. It took a long time to actually get it physically inside Georgia Tech, because there were a lot of legal issues to be handled, and the server had to be super secure, so it had to be behind something like two VPN layers and so on. We agreed to have the data inside Georgia Tech around the end of the summer, but we actually got it about six months later. So absolutely, there are a lot of challenges in getting your hands on private data.

Yeah, that's an interesting discussion in and of itself, but it sounds like everything was handled with a lot of care and thought for the legal concerns. So once you've got that dataset, can you tell me a little bit about what you have to work with?

We didn't receive all aspects of the EHR. For example, as I said, EHR also contains medical notes, but
since medical notes contain actual names of patients or doctors, or very characteristic symptoms of individual patients, they are even more of a risk than just diagnosis history or medication history. So what we had on our server at Georgia Tech was a structured dataset: when the patient visited the hospital, what kind of procedure he went through, what kind of medication was administered, what kind of diagnoses he received, and some demographic information such as ethnicity, job, age, or date of birth. That was pretty much it. I mostly used that structured data to predict future disease, or to recommend medications, or to predict future mortality, those kinds of well-known problems in healthcare.

Let's get into some of the specifics there, and the work you described in your recent paper about Doctor AI.

Doctor AI was actually extended from the first deep learning application that I did in the summer of 2015. What I did that summer at Sutter Health was this: given the patient record, meaning encounter history, medication history, and procedure history, my project was to try to predict the onset of a heart failure diagnosis in the future. I thought it was a perfect opportunity to introduce RNNs into it, because an RNN is also a sequential model. That's what I did during the summer. Starting in the fall semester of 2015, there was a postdoc in our lab named Mohammad Taha Bahadori. I should mention his name because he's a very good friend of mine and a close collaborator; a lot of the work I've done came out of discussions with him, and he's always on my papers. Taha thought it would be interesting to do sequential prediction using RNNs. So at every single encounter we try to predict what kind of medication is going to be prescribed, what kind of diagnosis is going to be received, and what kind of procedure the patient is going to go through, based on his entire history, every encounter that he's been through
before, up to the current encounter. And we do it encounter by encounter, or time step by time step if you think in terms of RNNs. That was our main objective.

I know a lot of historical work on making predictions in healthcare has focused on single outcomes: we're going to build something that just predicts the onset of one particular diagnosis or illness. Can you tell me a little bit about the motivation to generalize to the larger set of diagnoses that Doctor AI attempts to tackle?

We wanted to be ambitious. We were thrilled by all the good news we heard from the deep learning field: how it's doing so well in vision, how it's tackling problems in NLP, natural language processing, how it's done great work in audio processing. So we thought it would be exciting to test out the state-of-the-art RNN or LSTM sequential models. We thought, why don't we get ambitious, just call it Doctor AI, and try to predict everything that goes on in a patient's progress, the patient's trajectory over time. It was just an ambitious project, actually.

How do you compare your outcomes, whether today or what you think can be achieved long term, to more of a, let's say, specialized focus? I could go build 10 classifiers that each individually look at one disease, or I could go the Doctor AI route and build a meta super-classifier that tries to capture every diagnosis and maybe can even see patterns across them. Does pulling it all together present some advantages for cross-correlation and things like that?

Well, technically it's not so different whether you focus on a single disease or multiple diseases. In single-disease prediction you just have one class, whether it will occur or not, so it's a binary classification. When you go full-blown like Doctor AI, you just have a lot of diseases, maybe a thousand diagnosis classes, so it's a thousand-class, multi-label, multi-class classification. So
technically it's just adding more heads on top of the RNN, and whether you use a sigmoid function or a softmax function, that's the difference. So I think it's more a matter of your attitude when you focus on a single disease versus a lot of diseases. When you focus on a single disease in the traditional sense, you study that disease in a very deep fashion. You understand the whole trajectory: what affects the disease, what kind of medication is usually used to treat it, the whole thing. But when you go full-blown like Doctor AI, as a non-clinician myself, you can't really understand the whole one thousand diagnosis classes, so you depend on the data. It's a purely data-driven approach, and that is the route we took. Traditionally, when you focus on a single disease, you work closely with a clinician, an expert in that medical field, and then you understand and extract relevant features from the record and make a very intelligent classifier. What we did is dump a lot of data into the RNN, hoping that it would figure out the correlation, as you said, between a lot of diagnosis codes, medication codes, and procedure codes, and that it would eventually figure out the complex relationships between the codes. So that's the route we took.

I'd like to talk mostly today about your work, obviously, rather than other techniques one might apply. But since the paper has, in my opinion, such a thorough background on different techniques and a discussion of why they're not as advantageous as your approach, could you touch a little bit on some of the other techniques that have been used in the past, like continuous-time Markov chains and things like that?

Before RNNs, people used linear autoregressive models or Markov chains to deal with sequential data. And, well, I
don't want to attack those techniques, because they're still useful in their applications. But an RNN is more general, because you take the input at each time step, you also consider the history from your past time steps, and then you combine them into a continuous embedding. It's a very expressive model. In a hidden Markov model, you first need to define the states your model can be in. So you have to understand the problem and then decide, say, I'll create maybe twelve states: in this state the patient has the flu, in this state maybe the patient has cancer. You have to define the states to use a Markov model. But an RNN doesn't have predefined, human-understandable states. It has an internal representation of how things are changing as time goes on, so it's more expressive in that sense. That's why we chose the RNN.

Before jumping into deep learning, I worked with stochastic process models. Specifically, I used the Hawkes process; I'm not sure if you're familiar with it. It's a point process model, and what it does is model the probability of events happening in the future on a continuous timeline, basing its predictions on what kinds of other events have happened before. It basically considers all the events that happened before the current time step, and that makes the model kind of slow, so it's a very slow training process compared to an RNN. Since it was the first time we tried RNNs, we were focusing on what kind of benefit an RNN could bring compared to previous work. That was why we addressed a lot of previous work in the related work section.

I'd like to get into the details of the recurrent neural network you built. Maybe to start, can you describe the architecture: how many inputs you had, the nature of those, the hidden layers, and so on?

We took all the diagnosis codes, medication codes, and procedure codes
occurring in our dataset, which sums to about 30k unique codes. We don't differentiate between diagnosis codes, medication codes, and procedure codes. We just think of them as words, as in natural language processing: each word corresponds to a diagnosis, a medication, or a procedure. We can express them as 30k-dimensional one-hot vectors, so we have a 30k-dimensional vector where everything is zero except for one dimension, and that dimension represents a specific medication, procedure, or diagnosis. That would be our input at each time step. But it won't be a one-hot vector as in natural language processing. If you process a sentence word by word, you have a single word occurring at a single time step, so it's a one-hot vector. In our case we feed one encounter at a time, and in a single encounter there could be multiple diseases, multiple medications, or multiple procedures. So we call it a multi-hot vector. I'm not sure if that's technically correct.

Makes sense, though.

Yeah, because a lot of things could be one. Most of the dimensions will be 0, but about 5 or 10 of them will be turned to 1 on average, so we call them multi-hot vectors. At each time step we feed a multi-hot vector into our RNN. First we embed it, or project it into a lower-dimensional space, by applying a fully connected network and then a nonlinear activation function. We shrink it down to, typically, 256 dimensions, a huge reduction, and then we put that into the RNN. For the hidden size, I think we tried something like 128, 256, 512, or maybe 100, 200, 300. We searched the hyperparameter space, not too thoroughly, but in the range that made sense to us. I think it's around several hundred neurons in the hidden layer. We also tried stacking one or two more RNNs on top of each other, so overall there would be three layers of RNNs stacked on top of each other, or maybe only two layers.
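
A minimal sketch of the input step just described, in plain Python: it builds a multi-hot vector for one encounter and projects it to a dense vector with a fully connected layer followed by a nonlinearity. The toy sizes (12 codes, 4 dimensions), the random weights, the choice of tanh, and the names multi_hot and embed are assumptions for illustration; they are not taken from the Doctor AI source code.

```python
import math
import random


def multi_hot(codes, vocab_size):
    """Return a 0/1 list with a 1 at the index of every code in the encounter."""
    vec = [0.0] * vocab_size
    for code in codes:
        vec[code] = 1.0
    return vec


def embed(visit_vec, weights, bias):
    """Project a multi-hot vector to a dense vector: tanh(W x + b)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, visit_vec)) + b)
        for row, b in zip(weights, bias)
    ]


# Toy sizes; the interview mentions about 30k codes and about 256 dimensions.
VOCAB_SIZE, EMBED_DIM = 12, 4

# Random weights stand in for parameters that would normally be learned.
rng = random.Random(0)
W = [[rng.uniform(-0.1, 0.1) for _ in range(VOCAB_SIZE)] for _ in range(EMBED_DIM)]
b = [0.0] * EMBED_DIM

# One encounter containing three codes (hypothetical code indices).
visit = multi_hot([2, 5, 9], VOCAB_SIZE)
dense = embed(visit, W, b)
print(visit)
print(dense)
```

In a real model the dense vector for each encounter would be the input to the recurrent layer at that time step, and W and b would be trained together with the rest of the network.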
We found that two layers work pretty well compared to a single-layer RNN. So that would be the RNN inside. Then we put a softmax function on top of the hidden layer of the RNN so that we can make a prediction at each time step. If we were to do it in an ideal fashion, we would predict the entire 30k dimensions again: because the input was 30k-dimensional, we would have to predict 30k dimensions to see what happens in the current encounter. But that would make the softmax's job too hard, because it's a 30,000-class classification problem. It's going to be very hard for a softmax because a lot of codes are very similar to each other: even within hypertension there are different types of hypertension, and within diabetes there are a lot of different varieties of diabetes. So it's going to be very confusing for the softmax. What we thought was, why don't we group the fine-grained codes into slightly more abstract concepts? We used the coding scheme from ICD-9. I think an ICD-9 diagnosis code uses five digits to encode each single diagnosis, but you can use only three digits to group them into similar concepts. If you use only three digits, it's about 1,000 categories, compared to about 10,000 categories when you use five digits. So that's what we did. We grouped diagnosis codes and medication codes using existing groupers, so that we shrunk the output dimension to about a thousand dimensions. It made the softmax's job easier, and it made interpreting the result easier for people as well. So that is the entire architecture.

On the input side, something that's really fascinating to me is how you did basically no cleaning or feature engineering. You said, here are the medicine codes and procedure codes, as well as the diagnoses the doctor assigned, and there are no separations
there, just those one-hot encodings: yes, this was present, or not. And the almost magical quality of deep learning is that it does its own feature engineering and figures out which of those are important.

Mm-hmm.

So that first step, where you go from, I think you said, thirty thousand possible inputs down to a few hundred, is kind of like an embedding layer, I would guess. Is that what it is here?

Yes, exactly. My background actually lies in natural language processing, because that's what I studied in my master's. Also, between 2010 and 2014 I worked at a Korean government institute before coming to the PhD, and what I did there was also mainly text processing: part-of-speech tagging, named entity recognition, very classical NLP. So I come from NLP and I still try to follow what goes on in NLP. Back in 2014 there were sequence-to-sequence models, and using RNNs for NLP was, well, a hit. They typically embed words into a smaller embedding space, just like what we did in Doctor AI. That's where we drew the inspiration: because we have a thirty-thousand-dimensional input, we should shrink it down to a smaller latent space and then let the RNN deal with that.

When I look at the RNN you've built, if I interpret it correctly, there are really two objectives in the output. It's trying to predict the time until the next visit as well as what conditions might appear in the next visit. Is that correct?

Yes, exactly.

Can you tell me a little bit about how you establish a good loss function to balance these two objectives?

We didn't design a very intelligent loss function for this. Basically, we have a loss function for predicting the correct codes that are going to occur in the next visit, and a separate loss function for the duration between consecutive visits. That would be just a linear regression that spits
out a single scalar value, which represents the duration, or the number of days, between visits. We just add those two loss functions to make a single loss function, and then we let the optimization algorithm take care of it. So we didn't put too much effort into it. We actually found that predicting the codes worked better than expected, but predicting the duration between visits was much, much harder than we expected. It was better than random guessing, of course, but it wasn't doing well. Our afterthought was that it's obviously hard to predict when the patient is going to visit the hospital. That depends on so many factors: economic status, the neighborhood the patient is living in, whether he or she has a car, what kind of job he or she has, income. A lot of things affect the decision to visit a hospital. Just based on the past record, I don't think it's a trivial job to predict when the patient will visit the hospital.

Yeah, there are so many confounding factors, like whether or not they have to work that day, that are just not available in your dataset. Interesting. How do you judge the accuracy of a model? Obviously you can do your standard train/test split and see how well it performs with a holdout set. But what do you benchmark against? Can you compare to a domain expert, or to another system that does similar predictions?

In this paper, Doctor
AI, we just did what you described in the first place: having a holdout dataset and then doing a test run on that. That was mainly our initial setup. Actually, no, I'm not sure if you're looking at the paper right now, but there's a Table 3 for which we asked a medical student to take a look at the results, and he pointed out that some of them make sense and some of them don't. So we have that as a preliminary evaluation from a medical expert. Well, he's a medical student, but he's still a medical expert, so better than us. He took a look and gave us a preliminary, initial evaluation and said that this seems to work well. But we didn't thoroughly go through a clinical evaluation.

Sure. Let's take a break from our show to talk about our sponsor for today, Periscope Data. Periscope Data is a great tool for data teams who want to rapidly go from SQL to charts. If that's not enough, they've also made it seamless and easy to do cross-database joins. Now, I'm not just talking about writing queries between two separate databases that might be hosted on the same server. No, no. Periscope Data allows you to join tables in MySQL to a query in an entirely different Postgres database. The interoperability extends to Oracle, MS SQL Server, Redshift, Vertica, MemSQL, BigQuery, and others. So how do they do that? It all happens on their back end. They also have a great caching layer, which you can configure, that centralizes all your data so that those queries run smoothly and efficiently, as if you were joining two tables in the same database. Guys, that's tremendous. If you work with an organization that uses more than one of these types of databases, you really have to check out Periscope Data. In just a few clicks, whatever results you got from that cross-database join of two tables that were previously siloed and unconnectable, you can turn that result into a dashboard of visualizations and get it sent straight to your boss's inbox. Check them out
today at periscopedata.com/skeptics.

In terms of precision and recall, what do you look at when you describe the achievement of the model?

We look at recall, because we try to predict everything that's going to happen to the patient when he visits. We're going to predict diagnoses and medications, so we have to predict a lot of codes at the same time, and we thought recall is the measure that would best represent how well the model is doing. So we have recall at k. Because we are predicting thousands of codes at the same time, we have to take the top 10, top 20, or top 30. So we have a measure called recall at k: recall at 10, recall at 20, and recall at 30. That's what Table 2 of the paper describes.

What kind of scores did you achieve compared to what would be, maybe, random chance?

We didn't use random chance as a baseline, but we have a very weak baseline, which is repeating the codes from the last visit. If he had a cold last visit, the most naive model would say that this time he's going to have a cold again. So it's basically just repeating whatever codes were in the last visit. That doesn't even need recall at something, because you have a fixed set of codes, so you don't need to cut off from thousands of possible codes. That gave us roughly 30% recall. What we achieved, if we use recall at 30, reaches about 80% recall. So compared to that most naive model, we're doing significantly better than just repeating what happened in the past.

Yeah, so there's absolutely some knowledge gained in the learning process there. What do you think about future investment in this? Obviously you didn't have infinite money and infinite time. What could this be with a big grant and that sort of stuff? Can we get that into
the 90 range, or is there some boundary on just how predictable health conditions are?

Oh, I see. You mean how far we can push the model to be more accurate or more robust?

Yeah.

We focused on structured data only in this work, so that's basically just the codes that occurred in the past. But as I said, EHR is so much more than that: it has notes, demographic information, lab measures, and all of that. I heard from a clinical expert that the gold is hidden in the clinical notes, not the structured data. He said something like 70% or 80% of the entire information is described in the medical notes, and the structured data, like what we used in Doctor AI, represents only twenty or thirty percent of the status of the patients. So if we really want to significantly improve this type of model, we should dig into more modalities. We should use everything we can get our hands on. Notes are of course a must. Lab measures like blood pressure checks, those continuous values from labs, are also a must. Demographic information is very helpful, and family history is also very important. But we didn't take any text-based resources into account when we were building this model, so that is what we should attack in upcoming efforts.

One question this raises is how well the model generalizes, and I'm kind of unclear as to what I would expect. On one hand, people are people; we're all the same species, so it seems like we'd have similar medical records. But of course family history, genetics, these sorts of things play a role, and your training data biases what your model is able to accomplish. Can you tell me a little bit about the applications of transfer learning using Doctor
AI and moving on to new sets of patient data?

Hospitals have different patients. For example, Children's Hospital of Atlanta would mostly treat children, but Sutter Health is more focused on, like, ages 40 to 80. The data that I received from Sutter Health covered people aged between 40 and 85 or 90, I think. So they have vastly different characteristics, and it wouldn't make sense to expect Doctor AI, this single RNN model, to do well on all problems. If the RNN has been trained on one dataset, we cannot expect it to do well on all kinds of datasets. If Doctor AI was trained on a dataset of senior patients, it won't do well on a dataset of children. That's sensible. Hospitals also come in different sizes, of course. Some are small clinics that wouldn't have the huge datasets that companies like Sutter Health have. So how about we train Doctor AI on a huge dataset, so that it's a pre-trained model, and then retrain it, refine it, on another dataset that we're interested in? We would train Doctor AI on maybe hundreds of thousands of patients first, and then fine-tune it with the tens of thousands of patients we actually want to make predictions on, and see if having a pre-trained model actually helps. That's what we did in one of the experiments. It showed us that, compared to starting from scratch and training on just a small number of patients, training Doctor AI on a huge number of patients and then fine-tuning it on the small number works so much better. That's one of the potentials of transfer learning that we addressed in the paper.

Yeah, it's a very exciting area
to me, especially because I think of so many illnesses where, because of their rarity, we have a small data set. If we could benefit from transfer learning, we could probably gain a wealth of knowledge about some of these rarer conditions.

Actually, I also have another paper, which was accepted to KDD 2017. The name is GRAM: it's a graph-based attention model for predictive healthcare, and we just call it GRAM, G-R-A-M. Basically, it addresses the problem that it's hard to obtain a lot of data for people who have rare diseases. Even in large hospitals with many patients, only a very small subset of patients will have a given rare disease, so in an absolute sense it's hard to gain a huge volume of data regarding that condition. Or maybe some hospitals won't have huge data at all. So we try to incorporate a medical ontology, meaning how different diseases are related to each other, or how some diseases are children of another, more abstract disease, just like in the ICD-9 hierarchy. We use that as medical domain knowledge and try to introduce it into making predictions when using an RNN. So the RNN will use not only the past records, but it will also incorporate medical domain knowledge such as SNOMED or ICD-9 or CCS. I forgot what the abbreviation CCS stands for, but basically there are a lot of different kinds of hierarchies or ontologies. So the RNN will pull knowledge from that ontology as well as from past records and make more intelligent predictions. That was our main focus in the recent paper.

So if people want to learn more about Doctor AI and some of your other work, I'll have links to those papers in the show notes, of course. But there's an added bonus: I think people can go to your GitHub and see the source code for Doctor
AI. Can you talk a little bit about what's available there, what people can run, and what's shared?

I try to make public all the code that I've used when writing papers. I think currently I'm maintaining five or six different repositories. Most of them are written in Theano, except for one repository called medGAN, which generates medical patient records using generative adversarial networks. Except for that, everything is written in Theano. So if people are familiar with Theano, I think they can easily read the code, because it's not a lot of code, especially for Doctor AI. What I did is just write gated recurrent units, which are another form of RNN, then pre-process the data and feed it into the RNN to make sequential predictions. That's basically what I did. I think you can easily find the GitHub page: if you just search my name, Edward Choi, and Doctor AI, you'll see GitHub at the top of your Google search page. There I have the Doctor AI source code, and there's also a pre-processing script for MIMIC. MIMIC is a public data set. It's not a large data set, but it is one of the most famous public data sets. So you download MIMIC and run the pre-processing script on it, and then you'll have a data set that is ready to be fed into Doctor AI. To make life easier for the users, basically, using the source code and the pre-processing script, you can train your own Doctor
AI using MIMIC. That's it, and see how it goes.

I think those are two good steps for people interested in this field. Are there any conferences or anything like that you're headed to that you want to recommend in the near future?

I don't think I need to tell people that KDD is probably the biggest conference venue for data mining and knowledge discovery. KDD is a very famous conference, and I'll be attending to present my paper, GRAM. After that, there is a small conference called Machine Learning in Healthcare, which will be held right after KDD, in Boston. It's a small conference that started a couple of years ago, so it's a small community yet, but we are hoping that it will blow up into a full-scale, large conference, because I know that a lot of people are interested in combining healthcare and machine learning. That's the main theme of the conference, and that's why it's called Machine Learning in Healthcare. A lot of good ideas and good papers are presented there, and I'm also going to present my medGAN paper, which is about generating patient records, as I said. It's August 18th to 19th in Boston, so if people are interested, they're welcome to come. I'm hoping that I will meet a lot of interesting people and talk about a lot of interesting work there.

Excellent. Anyway, thank you so much for coming on the show. To wrap up, maybe I can ask you one more question, sort of an unbounded prediction. I'm sure you may have heard that relatively recently Geoff Hinton made the statement that we should consider not training radiologists anymore, because it's a long training period, and by the time you're out, the deep learning systems will be better. In fact, Kristine, who wrote the bio for you for the show, wrote a recent blog post on dataskeptic.com all about this. Do you have any thoughts on that? Is deep learning going to put a lot of
doctors out of business?

Well, first of all, I have to admit that I haven't heard what Dr. Hinton said about radiologists. I think maybe very simple day-to-day jobs could be handled efficiently by machine learning, specifically deep learning. And especially in the imaging business, like analyzing fMRIs or MRIs or X-rays or CAT scans or PET scans, I think neural nets can give a big helping hand to doctors. But I don't think we are quite there yet to drive doctors out of business. I don't think it's that simple, because doctors draw knowledge from biology and chemistry, and they learn a lot of things to understand what goes on behind the scenes. That's where they draw their knowledge from, but machine learning models draw their knowledge from pure data. So unless that gap somehow shrinks in the next few decades, I don't think doctors will be driven out by machines that easily. I mean, they take wholly different approaches. The doctors' approach has been proven to work for a very, very long time, for the entire human history. So I don't think just a few years of deep learning is going to change the landscape of the medical business significantly. I don't believe that's going to happen anytime soon.

Well, again, Edward, thank you so much for coming on and taking the time to share a lot of your research. This is really interesting work, and I'm eager to follow the rest of your career.

Thank you very much for the invitation. I had a great time talking here.

All right, take care.

One last thing before we end the show. I've got an article coming out in the latest issue of Skeptical Inquirer magazine. It's called "The Missing 411 Conspiracy: An Investigation." Let me give you a little quick background on this. Somewhere on the Internet I bumped into this idea called the Missing 411 that this author,
David Paulides, had started talking about. Paulides believes that people are disappearing from national parks and other wildlife areas at an alarming rate under unusual circumstances, and it's really draped in this air of mystery and kind of conspiracy and, you know, something's not right here. But it also made no specific claims. It wasn't saying, "Oh, they're being abducted by aliens," or something wacky like that. It was very open about what the claim was. I got into this thinking there would be a cool statistical project, or we could talk about how this sort of data is tracked, or where to get it, or what kind of biases have to be accounted for. I just thought there'd be an interesting stats story. So I started looking into it. It turns out I didn't see anything interesting to say that related to statistics or the data, but nonetheless I did this little skeptical investigation and tried to give a fair shake to this Missing 411 idea, to see if there was anything to it. If you want to hear my thoughts, again, those are in the latest issue of Skeptical Inquirer magazine. That's volume 41, number four. I'm looking at it right here. A bunch of other great articles in here as well.

And listen, one last thing here. I've been meaning to say something about this for a while. I get some feedback from people about the show saying, you know, what's the whole skeptical thing? Don't skeptics spend their time not believing in aliens and Bigfoot and all kinds of kooky stuff like that? What the heck does that have to do with data science? It's a distraction. Why do these things mix sometimes? And look, even the claims in AI and machine learning that we might look at and say, "I'm skeptical of this idea; can that model achieve what they really say it does?", those sorts of things are more like claims of plausibility. Someone says, "Oh, it's 80% accurate, the way we predict which customers are going to churn." Really, 80 percent? You're that accurate? You're probably more like 70, but
maybe you've got something. Okay, let's talk. And we all have that sort of skepticism. "Oh, this thing says it scales infinitely." Well, there's some exaggeration there, but I'm sure it's a great product. Let's talk about how it actually works, skip past the marketing BS, and learn the real details. And when you're skeptical in that way, it can feel like you're just fine-tuning a claim that's more or less correct. That feels about a million miles away from the people who actually claim that they believe, get this, in the modern era, that the earth is flat. The Flat Earth Society is a thing, and it's probably not entirely a joke. So here's the deal: there is a gamut. Sometimes I'm going to be too skeptical for you, sometimes not enough, but that's the Data Skeptic journey. And critical thinking and logic are, you have to admit, absent in many important areas of our society. That's not a political statement. That's not an attack on anybody. Just in your own mind, think about those words: does everything that you know of work in a logical, rational way? I'm sure you'll come up with some situations where you think it doesn't. So if there are people who will believe anything despite much evidence to the contrary, all the data in the world is not going to help them. The best models we create aren't going to matter if they're disposed of without inspection or consideration. You know, the power of data science: not only do the kinds of techniques we use need to be looked at skeptically, as in, does the model do what it says it does, but our ability to make these things useful to our society stands entirely on a foundation of a society that thinks rationally, logically, and skeptically. And when people believe anything despite all evidence to the contrary, well, skepticism at all levels. Sorry for preaching for so long, but yeah, check out my article if you're interested.

Data Skeptic is a listener-supported program. To support the show, visit dataskeptic.com and
click on the membership tab. [Music]
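
The pre-train-then-fine-tune experiment described in the interview can be illustrated with a toy example. The sketch below is not the Doctor AI code: it assumes a small logistic regression in plain Python instead of an RNN, and synthetic "patients" instead of EHR data. Every name, threshold, and data size in it (`make_patients`, `train`, the 0.9 and 1.0 thresholds, 2000 versus 20 rows) is a hypothetical choice made only to show the mechanics, which are that fine-tuning starts from the pre-trained weights while the baseline starts from zeros.

```python
import math
import random

def predict(weights, x):
    """Logistic model: weights[0] is the bias, the rest pair up with x."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(weights, data):
    """Mean negative log-likelihood over (features, label) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = predict(weights, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

def train(weights, data, epochs, lr=0.5):
    """Full-batch gradient descent; returns new weights, input is untouched."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            grad[0] += err
            for i, xi in enumerate(x):
                grad[i + 1] += err * xi
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w

def make_patients(n, threshold, rng):
    """Synthetic rows: two scores in [0, 1]; label is 1 if their sum exceeds threshold."""
    rows = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        rows.append((x, 1 if x[0] + x[1] > threshold else 0))
    return rows

rng = random.Random(0)
source = make_patients(2000, 1.0, rng)      # large data set (assumed "hospital A")
target_train = make_patients(20, 0.9, rng)  # small data set (assumed "hospital B")
target_test = make_patients(500, 0.9, rng)  # held-out rows from the small hospital

zeros = [0.0, 0.0, 0.0]
pretrained = train(zeros, source, epochs=200)            # pre-train on the large set
fine_tuned = train(pretrained, target_train, epochs=20)  # continue on the small set
from_scratch = train(zeros, target_train, epochs=20)     # same budget, no pre-training

print("from scratch:", round(log_loss(from_scratch, target_test), 3))
print("fine-tuned:  ", round(log_loss(fine_tuned, target_test), 3))
```

Comparing the two printed losses gives a rough feel for the effect; the numbers from a toy like this say nothing about the results reported in the paper.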

Original Description

When faced with medical issues, would you want to be seen by a human or a machine? In this episode, guest Edward Choi, co-author of the study titled Doctor AI: Predicting Clinical Events via Recurrent Neural Network, shares his thoughts. Edward presents his team’s efforts in developing a temporal model that can learn from human doctors based on their collective knowledge, i.e. the large amount of Electronic Health Record (EHR) data.
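
To make the idea of a temporal model over EHR data more concrete, here is a minimal sketch of a gated recurrent unit (GRU), the RNN variant mentioned in the interview, stepping through a patient's visits. It is written in plain Python rather than Theano and is not taken from the Doctor AI repository. The multi-hot visit encoding, the three-code vocabulary, the hidden size, and all function names are assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def gru_step(x, h, p):
    """One GRU update using the convention h_new = (1 - z) * h + z * candidate."""
    z = [sigmoid(a + b + c) for a, b, c in zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    r = [sigmoid(a + b + c) for a, b, c in zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    rh = [ri * hi for ri, hi in zip(r, h)]  # reset gate scales the old state
    cand = [math.tanh(a + b + c) for a, b, c in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh), p["bh"])]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def init_params(input_size, hidden_size, rng):
    """Small random weights and zero biases for the z, r, and candidate gates."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for gate in "zrh":
        p["W" + gate] = mat(hidden_size, input_size)
        p["U" + gate] = mat(hidden_size, hidden_size)
        p["b" + gate] = [0.0] * hidden_size
    return p

def encode_visit(codes, vocab):
    """Multi-hot vector: 1.0 for every code recorded at this visit."""
    return [1.0 if c in codes else 0.0 for c in vocab]

def run_patient(visits, vocab, p, hidden_size):
    """Feed a patient's visits in time order and return the final hidden state."""
    h = [0.0] * hidden_size
    for codes in visits:
        h = gru_step(encode_visit(codes, vocab), h, p)
    return h

vocab = ["hypertension", "diabetes", "asthma"]  # hypothetical code vocabulary
visits = [{"hypertension"}, {"hypertension", "diabetes"}]
params = init_params(len(vocab), hidden_size=4, rng=random.Random(0))
state = run_patient(visits, vocab, params, hidden_size=4)
print(state)
```

In a full model, the final hidden state would feed an output layer that scores the codes expected at the next visit; that layer and the training loop are left out of this sketch.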

The episode explores the potential of AI in medical diagnosis, discussing the development of a temporal model that can learn from human doctors based on EHR data. This model uses a recurrent neural network to predict clinical events, raising questions about the role of AI in healthcare.

Key Takeaways

Collect and preprocess EHR data
Develop a temporal model using a recurrent neural network
Train the model on EHR data
Evaluate the model's performance in predicting clinical events
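
The evaluation step can be sketched with the recall measure described earlier in the episode: of the conditions actually recorded at the next visit, what fraction appears among the model's top-k predictions. The baseline mentioned there, predicting whatever was reported last time, is included for comparison. The function names, the four-code vocabulary, and the scores below are hypothetical and chosen only to show the arithmetic.

```python
def recall_at_k(scores, true_codes, k):
    """Share of the true next-visit codes found among the k highest-scoring codes."""
    if not true_codes:
        raise ValueError("true_codes must not be empty")
    ranked = sorted(scores, key=scores.get, reverse=True)
    top_k = set(ranked[:k])
    return len(top_k & set(true_codes)) / len(true_codes)

def last_visit_baseline(previous_codes, vocab):
    """Baseline scores: 1.0 for codes seen at the previous visit, 0.0 otherwise."""
    return {c: (1.0 if c in previous_codes else 0.0) for c in vocab}

vocab = ["asthma", "diabetes", "hypertension", "flu"]  # hypothetical codes
model_scores = {"asthma": 0.1, "diabetes": 0.7, "hypertension": 0.6, "flu": 0.2}
next_visit = {"diabetes", "flu"}  # codes actually recorded at the next visit

print(recall_at_k(model_scores, next_visit, k=2))  # top 2 holds only "diabetes": 0.5
print(recall_at_k(model_scores, next_visit, k=3))  # top 3 adds "flu": 1.0

baseline_scores = last_visit_baseline({"hypertension"}, vocab)
print(recall_at_k(baseline_scores, next_visit, k=1))  # "hypertension" is a miss: 0.0
```

Averaging this value over all visits in a held-out set gives a single recall@k figure for a model and for the baseline.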

💡 Temporal models built on recurrent neural networks can help predict clinical events, potentially improving medical diagnosis and patient outcomes.
