Doctor AI

Data Skeptic · Beginner ·🧬 Deep Learning ·9y ago

Key Takeaways

The episode discusses the development of a temporal model that uses a recurrent neural network (RNN) to predict clinical events, with a focus on electronic health record (EHR) data and the potential of AI in medical diagnosis.

Full Transcript

Data Skeptic is the official podcast of dataskeptic.com, bringing you stories, interviews, and mini-episodes on topics in data science, machine learning, statistics, and artificial intelligence. [Music]

So Linda, do you think a machine could predict hospital outcomes? Or I guess it's patient outcomes in general; I don't know that it's specific to hospitals.

Well, that's funny you ask. I just read an article about this.

Tell me all about it.

Well, I don't remember all about it. You seem a little bit too interested there. But anyway, what it said is that once you train these neural networks, they can catch things that humans don't catch, and they do it better than humans.

Yeah, that's definitely a possibility. They could catch things humans don't catch if they're able to look at wider arrays of data, or notice things at a finer resolution that humans haven't learned, or something like that. Also, they don't get fatigued. The doctor, you know, can be sick themselves, or lazy, or who knows what. So there are some advantages. But what do you think: who is scoring better, actual doctors or computers?

Computers. That's what the article said.

Well, that article is from the future, then, because humans are still a little bit better at this time. But we're very much on the cusp of a changeover, it would seem.

Are you saying computers miss stuff?

They do not do as good a job as doctors in general. In some very specific cases maybe they do better. But we're making a lot of progress, and that's why I wanted to talk to Ed about Doctor
AI because I think it's a step in the right direction, doing some interesting things in this space. Why do you think a computer would do better than a person?

The article I read suggested that computers can process the data much more rapidly, handling large volumes, whereas it would take months and months for people to do it and have peers review it as well. It also seemed like the computers were able to catch things that humans weren't. Humans were catching some things and computers were prone to catch other things, so they're just separate things as well.

So maybe we need the complement of these two tools. We need a human who also gets to look at a model's output and maybe synthesize the results in some way. Having not read the Doctor AI paper, what would you guess its success rate is at predicting the diagnosis and the time until they see the patient again?

Well, it's got to be more than 50%.

Well, you think so? Actually, 50% would be quite good, because they're looking at a lot of different diagnoses. But your thoughts are on the right track.

Yeah, it has to be better than random.

Interestingly enough, and we'll get into this a little bit, Ed points out to me that the baseline he uses is not random. It's actually: what condition did they report last time? Because a good heuristic would be, well, you're probably complaining about the same thing again. So you have to beat that to be more accurate. But they measure recall. Recall is: of the conditions people report, what percentage did they get? And they're getting about, you know, it depends, there are a couple of ways they tweaked it, but more or less 79 percent recall after 30 days. So that's pretty good, in my opinion. I don't know if it's going to put the doctors out of a job, but it's definitely an interesting step towards this hybrid model of doctor and computer in the doctor's office. But do you think we'll ever eliminate doctors? Will one day you just plug in some sort of USB port, it scans you with your phone or
whatever?

No, I don't think so. I mean, that's for diagnosing specifically. You still need doctors, for now, to do the surgery.

For now. Pretty soon they're going to be in the unemployment line.

Maybe, yeah. But then that will make healthcare costs come down.

There you go. Well, this is actually a very complicated topic. We're going to cover this in a series of Data Skeptic episodes between now and the end of the year, because the types of things we talk about on the show are very relevant in healthcare right now. We're undergoing a lot of changes. So I want to kick that off with today's interview, but I'm glad you're here to help me intro, Linda.

I just found one more thing. I mean, if we can't even get self-driving cars passed and legalized, it's going to be a while before we get robots that do surgery passed and legalized.

Well, that's an interesting point. But hopefully, when the data shows that something is better, then, you know, that will win out over people's superstitions. For example, most of the data on self-driving cars shows that they're actually much safer than human-driven cars. That doesn't mean we should roll them out immediately tomorrow, but it seems like we're going to have those. And as I've told you, I think my niece Noemi, who's [unclear], I predict she'll never be allowed to drive a car. It'll actually be illegal. But anyway, let's get into Doctor
AI. Edward Choi received his master's in computer science and engineering from the Korea Advanced Institute of Science and Technology in 2009 and his bachelor's in computer science and engineering from Seoul National University in 2007. He is currently pursuing a PhD in computer science at Georgia Tech under the supervision of Professor Jimeng Sun. Edward's main research interests include predictive modeling, temporal modeling, and healthcare analytics, specifically using representation learning and interpretable deep learning for predictive healthcare. Over the past years he has held research internships at Sutter Health, DeepMind, and, more recently, Google Research. When he's not in the lab, Edward is a dilettante pianist, a low-budget traveler, a self-approved philosopher, and most of all a gamer at heart. Ed, welcome to Data Skeptic.

Hi, this is Ed. Thank you very much, Kyle. Thanks for the invitation.

You know, I think it's a great time to be a computer scientist and to be working in machine learning, and deep learning specifically. Realistically, every field and industry can probably benefit from your work. What made you specifically interested in focusing on healthcare?

To be honest, healthcare was not always my passion. When I decided to go for a PhD, I was mostly interested in machine learning rather than healthcare. Then I talked to my advisor, Jimeng Sun. Jimeng was the one who interviewed me for Georgia Tech admission, and I found out that he was interested in combining machine learning techniques with healthcare. I liked the guy personally, I like Jimeng, and I was hoping that I could work with him. Healthcare sounded like a very promising application. I had heard about a lot of good stuff going on by 2014, and it was going to grow and grow as time went on. So I thought it was a good time to jump into healthcare analytics. I put some effort into it for a couple of years, and this is where I am right now.

When I hear discussions about healthcare, the acronym EHR,
electronic health records, always comes up, which sounds like a good phrase to describe, you know, generic medical data. Is EHR something more formal that has a really good schema, or are you dealing with a variety of different types of data and formats?

It stands for, like, a total data collection solution. It's not just a record of when a patient visited, what kind of drug he was administered, or what kind of diagnosis he received. It's more than that. It's like a complete record of everything that went on in the hospital. So it's not just structured data; it has medical notes, like doctors' notes, discharge notes, and progress notes. It also has lab measures associated with the [unclear]. It also has demographic information and family history. It's a total information package for all patients who went to the hospital, so it's actually very, very private information.

Are there challenges around getting access to the data you need because of, you know, HIPAA or just general privacy issues?

I'm not really an expert in the legal aspects of EHR, but I did an internship at Sutter Health for two summers, and we agreed upon having the data actually inside a Georgia Tech server. But it took a long time to actually get it physically inside Georgia Tech, because there were a lot of legal issues to be handled, and also the server had to be super secure, so it had to be behind, like, two VPN layers and such. So the time that we agreed to have the data inside Georgia Tech was around the end of the summer, but the time we actually got the data was like six months later. So absolutely, there are a lot of challenges in getting your hands on private data.

Yeah, I mean, that's an interesting discussion in and of itself, but it sounds like everything was handled, you know, with a lot of care and thought for the legal concerns. So once you've got that data set, can you tell me a little bit about what you have to work with?

We didn't receive all aspects of the EHR. For example, as I said, the EHR also contains medical notes, but
since medical notes contain actual names of patients or doctors, or very characteristic symptoms of individual patients, they are even more of a risk compared to just diagnosis history or medication history. So what we had on our server at Georgia Tech was a structured data set: when the patient visited the hospital, what kind of procedure he went through, what kind of medication was administered, what kind of diagnoses, and some demographic information such as ethnicity, job, age, or date of birth, that kind of stuff. So that was pretty much it. I had that structured data, and mostly I used it to predict future disease, or maybe recommend medications, or predict future mortality, those kinds of well-known problems in healthcare.

Let's get into some of the specifics there, and the work you described in your recent paper about Doctor AI.

So Doctor AI was actually extended work from the first deep learning application that I did, in summer 2015. What I did in the summer of 2015 at Sutter Health was this: given the patient record, the encounter history, medication history, and procedure history, my project was to try to predict the onset of a heart failure diagnosis in the future. I thought it was a perfect opportunity to introduce RNNs into it, because an RNN is also a sequential model. So that's what I did during the summer. Starting in the fall semester of 2015, there was a postdoc in our lab; his name is Mohammad Taha Bahadori. I should mention his name because he's a very, very good friend of mine and a close collaborator. A lot of the work that I've done came through discussions with him, and he's always on my papers, so I should mention his name. Taha thought it would be interesting to do sequential prediction using RNNs. So at every single encounter we try to predict what kind of medication is going to be prescribed, what kind of diagnosis is going to be received, and what kind of procedure the patient is going to go through, based on his entire history, every encounter that he's been through
before the current encounter. And then we do it encounter by encounter, or time step by time step if you think about RNNs. So that was our main objective.

So I know a lot of historical work trying to make predictions in healthcare has focused on single outcomes, you know, like we're going to build something that just predicts the onset of one particular diagnosis or illness or something like that. Can you tell me a little bit about the motivation to generalize to the larger set of diagnoses that Doctor AI attempts to tackle?

Well, we wanted to be ambitious. We were thrilled by all the good news we heard from the deep learning field: how it's doing so well in vision, how it's tackling problems in NLP, natural language processing, how it's done great work in audio processing. So we thought it would be exciting to test out the state-of-the-art RNN or LSTM sequential models. We thought, why don't we get ambitious and just call it Doctor AI, and try to predict everything that goes on in the patient's progress, or the patient's trajectory over time. It was just an ambitious project, actually.

How do you compare your outcomes, whether today or what you think can be achieved long term, to more of, let's say, a specialized focus? I could go build 10 classifiers that each individually look at one disease, or I could go the Doctor AI route and build a meta super-classifier that tries to capture every diagnosis and maybe can even see patterns across them. Does pulling it all together present some advantages for cross-correlation and things like that?

Well, technically it's not so different whether you focus on a single disease or multiple diseases. Basically, in single-disease prediction you just have one class, whether it will occur or not, so it's a binary classification. When you go full-blown like Doctor AI, you just have a lot of these diseases, like maybe a thousand diagnosis classes. So it's a thousand-class multi-label, multi-class classification, so
technically it's just adding more heads at the top of the RNN. Whether you use a sigmoid function or a softmax function, that's the difference. So I think it's more a matter of your attitude when you focus on a single disease versus a lot of diseases. When you focus on a single disease in the traditional sense, you study that disease in a very, very deep fashion. You understand the whole trajectory: what affects the disease, what kind of medication is usually used to treat it, the whole thing. But when you go full-blown like Doctor AI, you can't really, as a non-clinician myself, understand all one thousand diagnosis classes. So you depend on the data; it's a purely data-driven approach. That is the route that we took. Traditionally, when you focus on a single disease, you work closely with a clinician, an expert in that medical field, and then you understand and extract relevant features from the record, and you make a very intelligent classifier. What we did is we just dumped a lot of data into the RNN, hoping that it would figure out the correlation, as you said, between a lot of diagnosis codes, medication codes, and procedure codes. It all goes in, and then it would eventually figure out the complex relationships between the codes. So that's the route that we took.

So I'd like to talk mostly today about your work, obviously, rather than other techniques one might apply. But since the paper has such an, in my opinion, thorough background of different techniques and discussion as to, you know, why they're not as advantageous as your approach, could you touch a little bit on some of the other techniques that have been used in the past, like continuous-time Markov chains and things like that?

Before RNNs, people used linear autoregressive models or Markov chains to deal with sequential data. Well, I
don't want to attack those techniques, because they're still useful in their applications, but the RNN is more general, because you take input at each time step, you also consider the history from your past time steps, and then you combine them into a continuous embedding. It's a very expressive model. In a hidden Markov model, you first need to define the states your model can be in. So you have to understand the problem and then define, oh, I think I'll create maybe twelve states, so maybe in this state the patient is having the flu, and in this state maybe the patient has cancer. You have to define the states to use a Markov model. But an RNN doesn't have predefined, human-understandable states. It has an internal representation of how things are changing as time goes on, so it's more expressive in that sense. That's why we chose the RNN. Before jumping into deep learning, I worked for a while with stochastic process models. Specifically, I used the Hawkes process; I'm not sure if you're familiar with it. It's a point process model, and what it does is model the probability of events happening in the future on a continuous timeline. It bases its predictions on what kinds of other events have happened before, so it basically considers all the events that happened before the current time step. That makes the model kind of slow, so it's a very slow training process compared to an RNN. Since it was the first time we tried RNNs, we were focusing on what kind of benefit the RNN could bring compared to previous work. That was why we addressed a lot of previous work in the related work section.

And I'd like to get into a little bit about the details of the recurrent neural network you built. Can you maybe start by describing a little bit about the architecture: you know, how many inputs you had, the nature of those, the hidden layers, and so on and so forth?

We took all the diagnosis codes, medication codes, and procedure codes
occurring in our data set, so that sums up to about 30K unique codes. We don't differentiate between diagnosis codes, medication codes, and procedure codes. We just think of them as words in natural language processing, where each word corresponds to a diagnosis, medication, or procedure. We can express each of them as a 30K-dimensional one-hot vector: a 30K-dimensional vector where everything is zero except for one dimension, and that dimension represents a specific medication, procedure, or diagnosis. So that would be our input at each time step, except it won't be one-hot. In natural language processing, if you process a sentence word by word, you have a single word occurring at a single time step, so it will be a one-hot vector. But in our case we feed one encounter at a time, and in a single encounter there could be multiple diseases, multiple medications, or procedures. So we call it a multi-hot vector. I'm not sure if that's technically correct.

Makes sense, though.

Yeah, because a lot of things could be one. Most of the dimensions will be 0, but about 5 or 10 of them will be turned to 1 on average, so we call them multi-hot vectors. At each time step we feed a multi-hot vector into our net. First we embed it, or project it into a lower-dimensional space, by applying a fully connected network and then a nonlinear activation function. We shrink it down to typically something like 256 dimensions.

Huge reduction.

And then we put that into our RNN. For the size, I think we tried like 128, 256, 512, or maybe 100, 200, 300. So we searched the hyperparameter space, not too thoroughly, but in the space that made sense to us. I think it's around several hundred neurons in the hidden layer. We also tried to stack one or two more RNNs on top of each other, so in an overview there would be three layers of RNNs stacked on top of each other, or maybe only two layers.
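
[Editor's illustration] The multi-hot input described above can be sketched in a few lines of plain Python. This is only a sketch under stated assumptions: the code strings, the tiny six-code vocabulary (instead of roughly 30K codes), the helper names `build_vocab`, `multi_hot`, and `project`, and the placeholder weights are all hypothetical and are not taken from the Doctor AI source code. In the model discussed in the interview, the projection weights are learned and an activation function follows the projection; both are left out here.

```python
from typing import Dict, List, Sequence


def build_vocab(visits: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Assign each unique code an index, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for visit in visits:
        for code in visit:
            if code not in vocab:
                vocab[code] = len(vocab)
    return vocab


def multi_hot(visit: Sequence[str], vocab: Dict[str, int]) -> List[int]:
    """Encode one encounter as a 0/1 vector with a 1 for every code present."""
    vector = [0] * len(vocab)
    for code in visit:
        vector[vocab[code]] = 1
    return vector


def project(vector: Sequence[int], weights: Sequence[Sequence[float]]) -> List[float]:
    """Linear projection to a smaller space: sum the weight rows of the active codes."""
    dim = len(weights[0])
    out = [0.0] * dim
    for index, active in enumerate(vector):
        if active:
            for j in range(dim):
                out[j] += weights[index][j]
    return out


# One hypothetical patient: three encounters, each with several made-up codes.
patient = [
    ["DX_401.9", "RX_lisinopril"],
    ["DX_401.9", "DX_250.00", "RX_metformin"],
    ["DX_428.0", "PROC_echo"],
]

vocab = build_vocab(patient)
encoded = [multi_hot(visit, vocab) for visit in patient]

# Placeholder weights (6 codes -> 2 dimensions); a real model would learn these.
weights = [[0.1 * (i + 1), -0.1 * (i + 1)] for i in range(len(vocab))]
embedded = [project(vector, weights) for vector in encoded]

for vector, dense in zip(encoded, embedded):
    print(vector, dense)
```

Each encounter becomes one vector with several ones, rather than one vector per code, which is the difference from word-by-word one-hot input that Ed describes. The sequence of projected vectors is what would then be fed, time step by time step, into the recurrent layers.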
We found that two layers work pretty well compared to a single-layer RNN. So that would be the RNN inside, and then we would put a softmax function on top of the hidden layer of the RNN so that we can make a prediction at each time step. If we were to do it in an ideal fashion, we would predict the entire 30K dimensions again: because the input was 30K-dimensional, we would have to predict 30K dimensions to see what happens in the current encounter. But that would make the job of the softmax too hard, because it's a 30,000-class classification problem. It's going to be very, very hard for the softmax, because a lot of codes are very similar to each other. Even within hypertension there are different types of hypertension, and even within diabetes there are a lot of different varieties of diabetes. So it's going to be very confusing for the softmax. So what we thought is, why don't we group the fine-grained codes into slightly more abstract concepts? We used a coding scheme from ICD-9. I think ICD-9 diagnosis codes use five digits to encode each single diagnosis, but you can use only three digits to group them into similar concepts. If you use only three digits, it'll be like 1,000 categories, compared to 10,000 categories when you use five digits. So that's what we did: we grouped diagnosis codes and medication codes using existing groupers, so that we shrank the output dimension size to something like a thousand dimensions. It made the softmax's job easier, and it made the interpretation of the results easier for people as well. So that is the entire architecture.

So on the input side, something that's really fascinating to me is how you did basically no cleaning or feature engineering. You said, here are, you know, the medicine codes as well as, you know, the doctor-assigned diagnoses, and there are no separations
there, just those one-hot encoded: yes, this was present or not. And the very, sort of, almost magical quality of deep learning is that it does its own feature engineering; it kind of figures out which of those are important.

Uh-huh.

So that first step, where you go from, I think you said, thirty thousand possible inputs down to a few hundred, is kind of like an embedding layer, I would guess. Is that what it is here?

Yes, exactly, exactly. My background actually lies in natural language processing, because that's what I studied in my master's. Also, between 2010 and 2014, I worked at a Korean government institute before coming to the PhD, and what I did there was also mainly text processing. It was part-of-speech tagging or named entity recognition, like very classical NLP. So I come from NLP, and I still try to follow what goes on in NLP. Back in 2014 there were sequence-to-sequence models, and using RNNs on NLP was a very, well, it was a hit. And they typically embed words into a smaller embedding space, just like what we did in Doctor AI. So that's where we drew the inspiration: because we have a thirty-thousand-dimensional input, we should shrink it down to a smaller latent space and then let the RNN deal with that.

When I look at the RNN you guys have built, if I interpret it correctly, there are really two objectives in the output. It's trying to predict the time until the next visit as well as what condition might be in the next visit. Is that correct?

Yes, exactly. Yeah.

And can you tell me a little bit about how you established a good loss function to kind of balance these two objectives?

We didn't design a very intelligent loss function for this. Basically, we have a loss function for predicting the correct codes that are going to happen in the next visit, and we also have a separate loss function for the size of the duration between consecutive visits. So that would be just a linear regression that spits
out a single scalar value, which represents the duration, or the number of days, between visits. We just add those two functions to make a single loss function, and then we let the optimization algorithm take care of it. So we didn't put too much effort into it. We actually found out that predicting the codes was working better than expected, but predicting the duration between visits was much, much harder than we expected. I mean, it was better than just random guessing, of course, but it wasn't doing well. Our afterthought was that it's obvious that it is hard to predict when the patient is going to visit the hospital. That depends on so many factors, like the economic status, the neighborhood the patient is living in, whether he or she has a car, what kind of job he or she has, the income. I mean, a lot of things affect the decision to visit a hospital. Just based on the past record, I don't think it's a trivial job to predict when the patient will visit the hospital.

Yeah, there are so many confounding factors, like whether or not they have to work that day, that are just not available in your data set. Interesting. How do you judge the accuracy of a model? Obviously you can do your standard train/test split and see how well it performs with a holdout set, but what do you benchmark against? Can you compare to a domain expert or to another system that does similar predictions?

In this paper, Doctor
AI, we just did what you described in the first place: having a holdout data set and then doing a test run on that. So that was mainly our initial setup. Actually, I'm not sure if you're looking at the paper right now, but there's a Table 3 where we asked a medical student to take a look at the results, and he pointed out that some of them make sense and some of them don't. So we have that as a preliminary evaluation from a medical expert. He's a medical student, but he's still a medical expert, so better than us. He took a look at it and gave us a preliminary, or initial, evaluation and said that this seems to work well. But we didn't go thoroughly through a clinical evaluation.

Sure.

Let's take a break from our show to talk about our sponsor for today, Periscope Data. Periscope Data is a great tool for data teams who want to rapidly go from SQL to charts. If that's not enough, they've also made it seamless and easy to do cross-database joins. Now, I'm not just talking about writing queries between two separate databases that might be hosted on the same server. No, no. Periscope Data allows you to join tables in MySQL to a query in an entirely different Postgres database. The interoperability extends to Oracle, MS SQL Server, Redshift, Vertica, MemSQL, BigQuery, and others. So how do they do that? It all happens on their back end. They also have this great caching layer, which you can configure, which centralizes all your data so that those queries run smoothly and efficiently, as if you were joining two tables on the same database. Guys, that's tremendous. If you work with an organization that uses more than one of these types of databases, you really have to check out Periscope Data. In just a few clicks, whatever results you got from that cross-database join of the two tables that were previously siloed and unconnectable, you can turn that result into a dashboard of visualizations and get it sent straight to your boss's inbox. Check them out
today at periscopedata.com/skeptics.

And in terms of, like, precision and recall, what do you look at when you describe the achievement of the model?

Oh, we look at the recall, because mainly we try to predict everything that's going to happen to the patient when he visits. We're going to predict diagnoses and medications, so we have to predict a lot of codes at the same time. That's why we thought recall is the measure that would best represent how well the model is doing. So we have recall at k: because we are predicting thousands of codes at the same time, we have to take the top 10, top 20, or top 30. So we have a measure called recall at k, say recall at 10, recall at 20, and recall at 30. That's what Table 2 of the paper describes.

Yeah. What kind of scores did you see achieved compared to what would be maybe random chance?

We didn't use random chance as a baseline, but we have a very weak baseline, which is repeating the codes from the last visit. So if you had a cold last visit, the most naive model will say: he had a cold last visit, so this time he's going to have a cold again. It's basically just repeating whatever codes were in the last visit. And that gave us, well, it doesn't even need recall at something, because you have a fixed set of codes, so you don't need to cut off from thousands of possible codes. Basically that gave us something like 20% or 30% recall. But what we achieved, if we do recall at 30, reaches like 80% recall. So compared to that most naive model, we're doing significantly better than just repeating what has happened in the past.

Yeah, so there's absolutely some knowledge gained in the learning process there.

Yeah.

What do you think about future investment in this? Obviously you guys didn't have infinite money and infinite time. What could that be with, you know, big grants and that sort of stuff? Can we get that into
the 90 range, or is there some boundary on just how predictable health conditions are?

Oh, I see. You mean like how far we can push the model to be more accurate or more robust, is that it? Yeah. We focused on structured data only in this work, so that's basically just the codes that occurred in the past. But as I said, the EHR is so much more than that. It has notes, it has demographic information, it has lab measures and all of those. I heard from a clinical expert that the gold is hidden in the clinical notes, not the structured data. He said something like 70% or 80% of the entire information is described in the medical notes, and the structured data, like what we used in Doctor AI, only represents like twenty or thirty percent of the status of the patients. So if we really want to significantly improve this type of model, we should dig into more modalities. We should use everything that we can get our hands on. Notes are of course a must. Lab measures like blood pressure checks, you know, continuous values from labs, are also a must. Demographic information is very helpful, and family history is also very important. But we didn't take any text-based resources into account when we were building this model, so that is what we should attack in upcoming efforts.

One question this raises is how well the model generalizes, and I'm kind of unclear as to what I would expect. You know, on one hand, people are people, and we're all the same species, so it seems like we'd have similar medical records. But of course family history, genetics, these sorts of things play a role. And of course your training data biases what your model is able to accomplish. But can you tell me a little bit about the applications of transfer learning using Doctor
AI and moving on to new sets of patient data?

Hospitals have different patients. For example, Children's Hospital of Atlanta would mostly treat children, but Sutter Health is more focused on ages from about 40 to 80. The data that I received from Sutter Health was from people aged between 40 and 85 or 90, I think. So they have vastly different characteristics. It wouldn't make sense to expect Doctor AI, this single RNN model, to do well on all problems. If the RNN has been trained on one data set, we cannot expect it to do well on all kinds of data sets. If Doctor AI was trained on a data set of senior patients, it won't do well on a data set of children. That's sensible. Hospitals also come in different sizes, of course, so some hospitals, small clinics, wouldn't have the huge data sets that companies like Sutter Health have. So how about we train Doctor AI on a huge data set, so that it's a pre-trained model, and then we retrain it, we refine it, on another data set that we are interested in? We would train Doctor AI on maybe hundreds of thousands of patients first, and then fine-tune it with tens of thousands of patients that we actually want to do prediction on, and see if having a pre-trained model actually helps. That's what we did in one of the experiments. It showed us that, compared to starting from scratch and just training on a small number of patients, when you pre-train on a huge number of patients and then just fine-tune with a small number, the latter works so much better. So that's one of the potentials of transfer learning that we addressed in the paper.

Yeah, it's a very exciting area
to me, especially because I think of so many illnesses where, because of their rarity, we have a small data set. If we could benefit from transfer learning, we could probably gain a wealth of knowledge about some of these more rare conditions.

Actually, I also have another paper. It was accepted to KDD 2017, and its name is GRAM: a graph-based attention model for predictive healthcare. We just call it GRAM, like G-R-A-M. Basically, it addresses the problem that it's hard to obtain a lot of data for people who have rare diseases. Even in large hospitals, even when there are so many patients, only a very small subset of patients will have that rare disease, so in an absolute sense it's hard to gain a huge volume of data regarding that condition. Or maybe some hospitals won't have huge data at all. So we try to incorporate medical ontologies, like how different diseases are related to each other, or how some disease is actually a child of another, more abstract disease, just like the ICD-9 hierarchy. We use that as medical domain knowledge and try to introduce it into making predictions with an RNN. So the RNN will use not only the past records, but it will also incorporate medical domain knowledge such as SNOMED or ICD-9 or CCS. I forgot what the abbreviation CCS stands for, but basically there are a lot of different kinds of hierarchies or ontologies. So the RNN will pull knowledge from that ontology as well as past records and make more intelligent predictions. That was our main focus in the recent paper.

So if people want to learn more about Doctor AI and some of your other work, I'll have links to those papers in the show notes, of course. But there's an added bonus: I think people can go to your GitHub and see the source code for Doctor
AI. Can you talk a little bit about what's available there, what people can run, and what's shared?

So I try to make public all the code that I've used when I was writing papers. I think currently I'm maintaining six different repositories. Most of them are written in Theano, except for one repository, which is called medGAN; that one is for generating medical patient records using generative adversarial networks. Except for that, everything is written in Theano. So if people are familiar with Theano, I think they can easily read the code, because it's not a lot of code, especially for Doctor AI. What I did is just write gated recurrent units, which is another form of RNN. I just wrote that, then pre-processed the data and fed it into the RNN to make sequential predictions. That's basically what I did. If you go to GitHub, I think you can easily find it: if you just search my name, Edward Choi, and Doctor AI, you'll see the GitHub page at the top of your Google search results. There I have the Doctor AI source code, and there's also a pre-processing script for MIMIC. MIMIC is a public data set. It's not a large data set, but it is one of the most famous public data sets. So you download MIMIC, then you run the pre-processing script on MIMIC, and you'll have a data set that is ready to be fed into Doctor AI. To make life easier for the users, basically, using the source code and the pre-processing script, you can train your own Doctor
AI using MIMIC. That's it, and see how it goes.

I think those are two good steps for people interested in this field. Any conferences or anything like that you're headed to that you want to recommend in the near future?

I don't think I need to tell people that KDD is probably the biggest conference venue for data mining and knowledge discovery. KDD is a very famous conference, and I'll be attending to present my paper, GRAM. After that, there is a small conference called Machine Learning in Healthcare, which will be held right after KDD, in Boston. It's a small conference that started a couple of years ago, so it's still a small community, but we are hoping that it will blow up to be a full-scale, large conference, because I know that a lot of people are interested in combining healthcare and machine learning. That's the main theme of the conference, which is why it's called Machine Learning in Healthcare. A lot of good ideas and good papers are presented there, and I'm also going to present my medGAN paper, which is about generating patient records, as I said. It's August 18th to 19th in Boston, so if people are interested, they're welcome to come. I'm hoping that I will meet a lot of interesting people and talk about a lot of interesting work there.

Excellent. Anyway, thank you so much for coming on the show. To wrap up, maybe I can ask you one more question, sort of an unbounded prediction. I'm sure you may have heard that relatively recently Geoff Hinton made this statement that we should consider not training radiologists anymore, because it's a long training period, and by the time you're out, the deep learning systems will be better. In fact Kristine, who wrote the bio for you for the show, wrote a recent blog post on dataskeptic.com all about this. Do you have any thoughts on that? Is deep learning going to put a lot of
doctors out of business?

Well, first of all, I have to admit that I haven't heard what Dr. Hinton said about radiologists. I think maybe very simple day-to-day jobs could be handled efficiently by machine learning, by something like deep learning. And I think especially in the imaging business, like analyzing fMRIs, X-rays, CAT scans, or PET scans, neural networks can give a lot of help to doctors in analyzing them. But I don't think we are quite there yet to drive doctors out of business. I don't think it's that simple, because doctors draw knowledge from biology and chemistry, and they learn a lot of things to understand what goes on behind the scenes. That's where they draw their knowledge from. Machine learning models draw their knowledge from pure data. So unless that gap somehow shrinks in the next few decades, I don't think doctors will be driven out by machines that easily. They take wholly different approaches. The doctors' approach has been proven to work for a very, very long time, for the entire human history. So I don't think just a few years of deep learning is going to change the landscape of the medical business significantly. I don't believe that's going to happen anytime soon.

Well, again, Edward, thank you so much for coming on and taking the time to share a lot of your research. This is really interesting work, and I'm eager to follow the rest of your career.

Thank you very much for the invitation. I had a great time talking here.

All right, take care.

One last thing before we end the show. I've got an article coming out in the latest issue of Skeptical Inquirer magazine. It's called the Missing 411 conspiracy, an investigation. Let me give you a little quick background on this. Somewhere on the Internet I bumped into this idea called the Missing 411 that this author,
David Paulides, had started talking about. Paulides believes that people are disappearing from national parks and other wildlife areas at an alarming rate under unusual circumstances. It's really draped in this air of mystery and kind of conspiracy and, you know, something's not right here. But it also made no specific claims. It wasn't saying, oh, they're being abducted by aliens, or something wacky like that. It was very open about what the claim was. I got into this thinking there would be a cool statistical project, or, you know, we could talk about how this sort of data is tracked, or where to get it, or what kind of biases have to be accounted for. I just thought there'd be an interesting stats story. So I started looking into it. Turns out I didn't see anything interesting to say that related to statistics or the data, but nonetheless I did this little skeptical investigation and tried to give a fair shake to this Missing 411 idea to see if there was anything to it. If you want to hear my thoughts, again, those are in the latest issue of Skeptical Inquirer magazine. That's volume 41, number four. I'm looking at it right here. A bunch of other great articles in here as well.

And listen, one last thing here. I've been meaning to say something about this for a while. I get some feedback from people about the show saying, like, you know, what's the whole skeptical thing? Don't skeptics, you know, spend their time not believing in aliens and Bigfoot and all kinds of kooky stuff like that? What the heck does that have to do with data science? It's a distraction. Why do these things mix? Sometimes, look, even the claims in AI and in machine learning that we might look at and say, I'm skeptical of this idea, can that model achieve what they really say it does? You know, those sorts of things, they're more like claims of plausibility. Someone says, oh, it's 80% accurate, the way we predict which customers are going to churn. Really? 80 percent? You're that accurate? You're probably more like 70, but
maybe you've got something. Okay, let's talk. And we all have that sort of skepticism. Oh, this thing says it scales infinitely? Well, there's some exaggeration there. Well, you know, I'm sure it's a great product. Let's talk about how it actually works, skip past the marketing BS, and learn the real details. And when you're skeptical in that way, it can feel like you're just fine-tuning a claim that's more or less correct. That feels about a million miles away from the people who actually claim that they believe the earth, get this, in the modern era, they believe the earth is flat. Still. The Flat Earth Society is a thing, and it's probably not entirely a joke. So here's the deal: there is a gamut. Sometimes I'm going to be too skeptical for you, sometimes not enough, but that's the Data Skeptic journey. Critical thinking and logic are, you have to admit, absent in many important areas of our society. That's not a political statement. That's not an attack on anybody. Just in your own mind, think about those words. Does everything that you know of work in a logical, rational way? I'm sure you'll come up with some situations where you think it doesn't. So if there are people who will believe anything despite much evidence to the contrary, all the data in the world is not going to help them. The best models we create aren't going to matter if they're disposed of without inspection or consideration. You know, the power of data science: not only do the kinds of techniques we use need to be looked at skeptically, you know, does the model do what it says it does, but our ability to make these things useful to our society stands entirely on a foundation of a society that thinks rationally, logically, and skeptically. And when people believe anything despite all evidence to the contrary, well, skepticism at all levels. Sorry for preaching for so long, but yeah, check out my article if you're interested.

Data Skeptic is a listener-supported program. To support the show, visit dataskeptic.com and
click on the membership tab. [Music]

Original Description

When faced with medical issues, would you want to be seen by a human or a machine? In this episode, guest Edward Choi, co-author of the study titled Doctor AI: Predicting Clinical Events via Recurrent Neural Network, shares his thoughts. Edward presents his team’s efforts in developing a temporal model that can learn from human doctors based on their collective knowledge, i.e. the large amount of Electronic Health Record (EHR) data.


The episode explores the potential of AI in medical diagnosis, discussing the development of a temporal model that can learn from human doctors based on EHR data. This model uses a recurrent neural network to predict clinical events, raising questions about the role of AI in healthcare.

Key Takeaways
  1. Collect and preprocess EHR data
  2. Develop a temporal model using a recurrent neural network
  3. Train the model on EHR data
  4. Evaluate the model's performance in predicting clinical events
💡 Temporal models built on recurrent neural networks can help predict clinical events, potentially improving medical diagnosis and patient outcomes.
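
As a rough illustration of steps 1 and 4, here is a minimal Python sketch. It encodes each visit as a multi-hot vector of diagnosis codes and scores a predictor with recall at k, using the "same conditions as last visit" heuristic mentioned in the episode as the baseline an RNN would need to beat. This is not the code from the Doctor AI repository: the function names, the toy patients, and the code labels are all hypothetical, and the RNN itself (steps 2 and 3) is left out.

```python
from typing import Callable, Dict, List, Sequence, Set

Visit = Set[str]  # diagnosis codes recorded at one visit (toy labels, not real ICD codes)


def build_vocab(patients: Sequence[Sequence[Visit]]) -> Dict[str, int]:
    """Map every diagnosis code seen in the data to a column index."""
    codes = sorted({code for visits in patients for visit in visits for code in visit})
    return {code: index for index, code in enumerate(codes)}


def multi_hot(visit: Visit, vocab: Dict[str, int]) -> List[int]:
    """Encode one visit as a 0/1 vector with a 1 for each code present."""
    vector = [0] * len(vocab)
    for code in visit:
        vector[vocab[code]] = 1
    return vector


def recall_at_k(ranked: Sequence[str], truth: Visit, k: int) -> float:
    """Fraction of the true next-visit codes that appear in the top-k predictions."""
    if not truth:
        return 0.0
    return len(set(ranked[:k]) & truth) / len(truth)


def last_visit_baseline(history: Sequence[Visit]) -> List[str]:
    """Heuristic baseline: predict that the next visit repeats the latest one."""
    return sorted(history[-1])


def evaluate(
    patients: Sequence[Sequence[Visit]],
    predict: Callable[[Sequence[Visit]], List[str]],
    k: int,
) -> float:
    """Average recall at k over every (history, next visit) pair in the data."""
    scores = []
    for visits in patients:
        for t in range(1, len(visits)):
            scores.append(recall_at_k(predict(visits[:t]), visits[t], k))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two patients, each a time-ordered list of visits.
patients = [
    [{"diabetes"}, {"diabetes", "hypertension"}, {"hypertension"}],
    [{"asthma"}, {"asthma"}],
]

vocab = build_vocab(patients)
encoded_visit = multi_hot(patients[0][1], vocab)
baseline_recall = evaluate(patients, last_visit_baseline, k=3)
print(encoded_visit, round(baseline_recall, 3))
```

A trained model would slot in as another `predict` function that takes a patient's visit history and returns codes ranked by predicted probability, so it can be compared with the baseline on the same footing.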
