Hospital /Clinic AI Decision Models: Performance of 12 AI LLM Systems (incl $$) Radiology, Biomed

Discover AI · Advanced ·🧠 Large Language Models ·3y ago

Key Takeaways

The video discusses the performance of 12 AI language models on three clinical tasks, including pre-training and fine-tuning on clinical notes, with a focus on radiology and biomedical applications. The models covered include Google's T5 family, GPT-3, and the Microsoft/OpenAI ChatGPT systems.

Full Transcript

Hello community. With the resounding success of general-domain large language models like ChatGPT or Bing Chat, based on GPT-3.5, quite a group of my viewers who work in hospitals asked: hey, is there a clinical AI model that our hospital could use? Is there anything you know about market-specific, hospital-specific clinical AI models? What are they, can we buy them, how do we discover them? I am recording this in mid to end of February 2023, and this is what I found for you.

So let's have an AI performance overview for clinical use only. We will look at quite a lot of systems: of course we will have the Microsoft/OpenAI ChatGPT and GPT-3.5 systems, and we will have Google's T5, T5-XL, and T5-XXL systems. But first, since we are working in a clear market segment, and this is the clinic, we have to have datasets for fine-tuning. We have to have benchmarks with a training dataset, a validation dataset, and a test dataset, because if we fine-tune those models for a specific downstream task, I want absolutely clear data on their accuracy. And those datasets have to come from the clinical environment.

So my first dataset is of course a medical NLI dataset (MedNLI); then for radiology I have a Q&A dataset (RadQA); and I will show you what CLIP means in a second. If you want to explore the datasets, paperswithcode.com lets you look at them in detail.

Just to give you a short overview: with MedNLI we have a natural language inference task, and the goal is to determine whether a hypothesis written by a doctor can be inferred from a premise taken directly from a clinical note. I have shown you this before: when we trained our BERT models, when we trained sentence transformers, when we trained cross-encoders, we had exactly this. We had a multi-class classification task, and we had labels where
we had entailment, neutral, or contradiction between two sentences. This is exactly where we are right now, only now our training data are clinical notes written by doctors. And it is important that they are not perfect: they have a lot of abbreviations, they have a lot of medical terms. So this is what we're looking for.

Second is a question answering task on radiology reports, so here we are really in a very specific clinical subtask, radiology. This is Q&A, and you know the metrics: we look at a performance measurement of how good the models are if we fine-tune them, what their accuracy is.

And CLIP here is not what you think. CLIP here is a multi-label classification task in which the goal is to identify key sentences that contain some follow-up information in, as they say in the hospital, a discharge summary, so whenever they kick you out of the hospital and you never come back. By the way, this is our final result, this is exactly what we are looking for: key sentence identification. Here we have seven possible labels that we train the system on, on real-world data, because these are real; I will show you the hospitals involved. This is real data, therefore we have to secure patient privacy. So we have seven labels: patient-specific, appointment, medication, lab, procedure, imaging, and whatever.

So look at the first dataset: we have about 11,200 training instances, 1,400 development instances, and about 1,400 test instances. Beautiful, this is the first one. For the second one you have about three thousand questions, so in total more than six thousand question-answer evidence pairs, so that you have an idea how big our dataset is. And then CLIP is based on MIMIC: the annotations cover 718 documents representing about 100,000 sentences. Continuity of care is crucial for ensuring positive health outcomes for
patients discharged from an inpatient hospital setting, and improved information sharing can help. To share information, caregivers write discharge notes containing action items to share with patients and their future caregivers, but these action items are easily lost due to the lengthiness of the documents. And so we have here a hundred thousand sentences. Beautiful, this was it.

Now that we have our fine-tuning datasets, on to the models. Which models are we going to use? Of course, as I told you, in February 2023 we have the hype around ChatGPT, Bing Chat, Bing AI, the new Bing, whatever. So we have domain-agnostic large language models by Microsoft or Microsoft/OpenAI that are currently really hyped, and they have quite impressive performance even if applied to the clinical environment. Of course, since they are everything-for-everybody, domain-agnostic large language models, they are huge models, and you have to run them more or less in the cloud, in a Microsoft supercomputer center or on other huge clusters. Quite a lot of people say this is state of the art, and everybody is currently focusing on those large LLMs by Microsoft or OpenAI.

There is, however, a niche of specialized clinical models. We do have a second option: we can create specialized clinical models. And to make it absolutely clear, this is only a clinical model. You cannot ask this model for a French poem or a Japanese haiku, or to write some Python code. No, these are absolutely dead-on focused clinical models that you use in the hospital and in the hospital only, but they are highly specialized.

So what can we do? We can pre-train them on in-house clinical notes. You know there's the privacy issue: your client, your hospital, is not allowed to give its data out. We will see there was some misbehavior later on, but okay. So we have the pre-training on in-house clinical notes, and then we can fine
tune those models on our three specific downstream tasks that I showed you, the fine-tuning part. Now the nice thing is, since these models are just dedicated, specialized models, they are much smaller. So you can maybe use your local compute infrastructure at the hospital, or maybe you can buy a small, I don't know, workstation or whatever, wherever the hospital has its compute infrastructure provider, and you do not need to run the huge everything-for-everybody models in the Microsoft cloud.

So let's have a look at this. There's a beautiful study, and they looked at these different models. From Google we have the T5-Base model, and I will tell you in a moment what exactly the difference is between a clinical and a non-clinical T5 model. T5, as you know, is an encoder-decoder stack of a transformer architecture. But we also have models with only the encoder stack, 12 or 24 encoder-only layers of our transformer architecture, and this is covered on my channel in the videos about BERT models, RoBERTa models, and advanced BERT models. There we also have non-clinical models and clinical, bioclinical BERT models. As I told you, these are BERT models, these are not LLMs: they are focused on understanding human semantics, human language, and they are not trained to generate a poem or a report or whatever.

Then of course T5-Large is much bigger than T5-Base: you see we go from a size of 220 million free, trainable parameters to 770 million trainable parameters with T5-Large. Again we have an encoder-decoder stack, and then we have the Clinical-T5-Large model, and we are going to look at the performance: how good is each model, what is the performance jump? Then of course we can go to T5-XL, which is now a 3-billion-parameter trainable model, encoder-decoder, and then maybe you have seen on my channel Flan-T5-XXL, an 11-billion-parameter model, encoder-decoder. But
of course a lot of you are going to ask: hey, what about GPT-3 or GPT-3.5 or whatever is coming up in the future? This huge monster with 175 billion parameters is of course an LLM, is of course autoregressive, is, if you want, a generative AI model, but we know that technically it is only the decoder stack of our transformer architecture. So those are the models I found data for. There is a beautiful study, and I want to show you the results: how do those models perform in only one specific environment, and this is the hospital, the clinic. Again, if you want to brush up your knowledge about BERT, so about the encoder stack of a transformer, or you want to deepen your understanding of the difference between BERT and GPT systems, where we have the decoder stack, this is the video for you to watch.

Of course, whenever we present something in a clinical context, we have to ask: are the sources reliable, who did the comparison, can we trust those sources? I know a lot of my viewers are now going to say, hey, he did the comparison. I'm so sorry to disappoint you: no, it was not me. I did not have the time, and there are people who are much more capable of doing this. First we have MIT, the Massachusetts Institute of Technology, then IBM Research, and then Harvard Medical School, and of course for the private patient data we have two hospitals: the Hospital for Sick Children and Brigham and Women's Hospital. So this is our team, and I call them the Three Musketeers: MIT, IBM, and Harvard Medical School. They are fighting here to show that there are alternatives to the currently dominant, hyped Bing Chat or ChatGPT systems.

So here we go. As I showed you, we start with the T5 model from Google as an alternative to GPT. The authors pre-trained the T5-Base model and a T5-Large model, 220 million parameters and 770 million parameters, from scratch. I have two videos on my channel where I show you how you can
train BERT from scratch, and I give you the code in PyTorch and in TensorFlow 2. The models were trained only on the clinical notes, so there's nothing else except clinical information. Those clinical notes, since it's two hospitals in the US, are of course in English. And this you should know: MIMIC is the Medical Information Mart for Intensive Care; you have the MIMIC-III and MIMIC-IV databases, and if you're not familiar with them, you can find the information here.

So what is the first large language model that we have a look at? This is the Clinical-T5-Base model. We take from Google the T5-Base model, only the architecture with its layers and construction, and we apply a pre-training on the MIMIC clinical notes, which is about 40 billion tokens. What we end up with after the pre-training is a Clinical-T5-Base model with 220 million now-trained parameters.

You might say: okay, but what if I use a checkpoint already pre-trained on general English text, where there is politics, there's finance, there's news, whatever? If I take this, I do not fine-tune it; there is the notion of further pre-training it. So you have some checkpoint after, I don't know, 10, 20, or 50 epochs, and you say: okay, I take this T5-Base model at a specific checkpoint and continue pre-training it on my clinical MIMIC data. I have a specific video on this, on domain-adaptive training, and since we are in the pre-training phase, the process is called domain-adaptive pre-training. If you search my channel, you will find I think two videos where I explain the mechanism.

Now, since a lot of you asked me what the machine is, how long it takes, and what it costs to do this: the authors are kind enough to tell us it was trained on eight A6000 GPUs, each with 48 gigabytes
of RAM, with a batch size of 32 per GPU, and each epoch took roughly 6 hours on these eight A6000 GPUs. So you get an idea of where we are and what your RAM requirements are if you want to take a checkpoint and pre-train it further; this is what you need to know.

Now of course we want to go larger, we want more performance, and we assume that a T5-Large model will give us better performance. So the authors took a T5-Large model and pre-trained it on clinical data, again a T5 model pre-trained on clinical notes from scratch: 780,000 steps, or close to 40 billion tokens. They used, as they tell you, a TPU, not a version 4 but a v3-8 cluster, with a batch size of 12 per TPU. But I want to show you what it costs if you just take a T5-Large model, so not the XL or XXL or really huge models, no, just a Large model. They said that for the university it cost about 1,800 US dollars, and they explain that the pre-training of a T5-Large model on the clinical MIMIC data took about 220 hours, and this was already on a Google TPU cluster. First, this is a price I assume is for a university, so if you are a for-profit company, maybe say 2,000-plus US dollars, and then you just have a pre-trained model. But at least you have a pre-trained model, you know, for 2,000-plus dollars and about 220 hours on this machine. Now we have version 4 and other versions upcoming, so it will be faster, but maybe the price will also go up. I wanted to show you that this is about the price and about the time you have to invest.

Now, this is from my Twitter account. Here we have OpenAI noticing that there is a need for dedicated systems. They now have a new developer product they call Foundry, and as Travis states here, it is for running OpenAI models, so the ChatGPT model, the GPT-3.5 model, and the upcoming GPT-4 model, with model inference at scale on dedicated capacity. What does it cost? He states here that for a GPT-3.5
instance with a three-month commitment it would be about $78,000, and with a one-year commitment about a quarter of a million dollars. If you go for higher performance, your price can go up, whatever. And if you think, well, not just model inference, I also want the ability to further fine-tune this model: if you look at the product preview in detail, you will see this is coming soon, so we don't know when OpenAI will offer it, whether Microsoft will be the cloud platform where OpenAI offers it, or whether OpenAI will offer it on their own machines. There are a lot of questions. But just to tell you, they say OpenAI will offer a more robust fine-tuning option for the latest models, and Foundry will be the platform for serving those models. I mean, if you pay a quarter of a million just so that this thing is up and running in the cloud, I think you will also be able to fine-tune it to your needs, but who knows. We have to wait for the official announcement. This is just, as you see, a preview; this is not the final information. But you can see here the price per unit, and if you have 100 units per instance, this is exactly how we ended up with these figures. What else? Yes, the uptime is guaranteed. So have a look at this if you want to know, if you are, I don't know, Bank of America, and you want, I have no idea for how many people, dedicated access to your dedicated Bank of America ChatGPT, whatever.

But let's look at the performance numbers: how good are those systems? Here you have the model and the model size, so we go from the base model with 220 million parameters to 3 billion parameters, and here we have our three downstream tasks, where the fine-tuning happens, and the accuracy of those models, and this is what we are looking for. So now let's have a look. As you can see, in bold we have, if you want, the winner. Not surprisingly, the T5-Base model is not bad, with 85 percent accuracy
on the medical NLI downstream task. But if you go with the T5-Large model, you see that if you do not train it on clinical data but go with the general-domain text, you are also at 84.9 percent. If you pre-train it only on clinical data, your T5-Large model gives you, with this particular pre-training, a performance of 87.2 percent. I think they did three runs, and they did everything that you need to do. So a T5-Large model trained on clinical data is better than a T5 not trained on clinical data or a T5-Base trained on clinical data. But have a look: I mean, every point counts, but the clinical Base with 85 percent is not so bad compared to the Clinical-T5-Large with 87. So yes, if you want every bit of performance that you can get. For radiology you have the numbers here, and you can see Clinical-T5-Large is the winner, and also here for CLIP. Beautiful.

Unfortunately, and I would have loved to see a Clinical-T5-XL model, to see whether it comes up to 90 percent, or how big the jump is if you go about four times larger, from 770 million to 3 billion free trainable parameters, but this is the largest they could afford. So I could not find this data. If you have other sources and you find some performance figures, please leave a link in the comments to this video.

But now of course everybody is interested: what about our general LLM, GPT-3 or 3.5, what about fine-tuning an LLM, and what is going on with ChatGPT and GPT-3.5? Here we go: we have now the 175-billion-parameter model GPT-3, and GPT-3 is unfortunately not available since it's proprietary. So what are the numbers? You see the benchmark gives us 80 percent for GPT-3, and still our best performer is the Clinical-T5-Large. Of course GPT-3 was not trained on the clinical data, so a Clinical-T5-Large has a better accuracy, and you see if
you look here, you see that the training data that was not available can of course cause problems in the performance. This is why dedicated systems like a clinical T5 have such a performance. Of course Microsoft is not stupid and has recognized this, and is now offering this for global corporations; they will take the GPT system from OpenAI. I think Microsoft is exerting such a dominant influence on OpenAI that more or less you know what's going to happen to OpenAI. For their clients, for some million dollars, they will I suppose also provide a clinical GPT model, in one year or two years, I don't know. But you have an idea of what is coming up and how this is going to be done. Now the difference is: for this you have to pay for the supercomputer and the cloud; for that you can use a local infrastructure in your hospital, for example, really behind firewalls, on premises. So these are two performance figures.

As I told you, I have a video about how Microsoft is now also going for biomedical GPTs, not purely clinical but biomedical, biopharmaceutical; this is the video you can have a look at. But let's look now at the total results of each and every model. The authors tell us: T5-XL, for the non-clinical data, is the biggest model we could fully fine-tune in the classical sense, and these are the performance characteristics of a fine-tuned T5-XL on our three downstream tasks. Now the other models that they took into consideration are even bigger models: Flan-T5-XXL, as I told you, with 11 billion parameters. They could not provide a clinical version of this, and of course they did not have the money to provide a clinical GPT-3 version. They did not even have the resources, I mean, imagine, we are talking here about IBM Research as a partner in this study, they did not have the cloud infrastructure, the compute infrastructure, to perform even a full fine-tuning on our three datasets,
so they did not have the money to spend on fine-tuning, and they had to resort to ICL, in-context learning, or what was formerly called prompt engineering. I will show you in a second a detail about the difference between fine-tuning and ICL and the performance degradation that we get.

But maybe you noticed it, because it is flashing in red: we have a winner. If we look at all the models, there is now an absolute winner with 90 percent on MedNLI, and as you can see the winner is a RoBERTa, so a BERT model. This means an encoder stack of our transformer that was trained on biomedical and clinical data. This model reached 90 percent accuracy, and also on the radiology and the CLIP tasks it has the best performance, given that this is a model with only 345 million parameters. So you could run inference for this on a local machine; you would not need the huge Microsoft cloud supercomputer structure. Is this a coincidence?

So you see, if you work in a clinical environment and you only want an AI system for clinical tasks like MedNLI, radiology, or CLIP, there are competitive solutions that are much smaller. I mean, 175 billion versus 0.3 billion parameters: imagine how much less it costs just running inference, and the training you can do on more or less your local machine. So you can see that for dedicated environments, with a specific pre-training and fine-tuning only on clinical data, those models outperform the huge everything-for-everybody models like the GPT models. But of course innovation will not stop. Microsoft will offer dedicated systems for specific, very prominent market segments too, so you can imagine that in one or two years we will have new winners. But you know what: with a BERT model you can just add new layers, you can improve its performance in a very simple way. This is open source, this is completely up to you. I showed you how to build a BERT model, and I decide if I want 12 or 24 or 36 layers. So this here is open source,
open; you can experiment with it. But if you go to GPT-3.5, you have to take a license, you are not allowed to modify it, and you have to use their dedicated infrastructure. This is the current state of things, but I guess it will change in the coming years. Good news: a clinical system at 90 percent with a good old BERT system.

So I told you about in-context learning and fine-tuning. First, fine-tuning these huge models is extremely expensive, millions and millions of dollars, so ICL was the cheap alternative. But unfortunately, as the study found out, if you apply it in the clinical context, it underperforms the classical task-specific models that you find here. So then: okay, let's now reduce this fine-tuning. What if you want to compare this to ICL, where we have only a very limited amount of training data? What if we fine-tune this RoBERTa, a clinical RoBERTa or whatever, or a Clinical-T5-Large, on only one percent of the available fine-tuning data, or on only five percent, or we use just 10 or 25 percent, and then of course all available training data? How does it perform if we compare this to our low-training-data ICL?

They did some beautiful studies here. If you want to look at the prompt engineering for these tasks, this is exactly the prompt structure that they used for their specific tasks. They experimented with approximately 5 to 10 different prompts for each task, crafting the prompts to reflect the prompts used during the instruction tuning of Flan-T5 and P3. Then they had one to three random examples, chosen from 200 examples of the validation set. Have a look at this if you're interested. But the result you can see here: in color, in the solid lines, you have RoBERTa, T5-Large, and so on, the models they could fully fine-tune; these are the solid lines. And the dotted lines are where fine-tuning was too expensive and could not
be afforded; the models were too huge for MIT, IBM, and Harvard Medical School, so they could not fine-tune them and had to use ICL. And you see the performance, you really clearly see the performance. On the x-axis you have first only one percent of your training dataset, then five percent, then ten percent, then twenty-five percent, and then the full set of sentences of your training dataset. Let's take first this RoBERTa model in yellow: you see that with only a very small amount of training data you are already above the accuracy of ICL for the two LLMs, and then, the more training data you make available to the system, your performance goes up from, I don't know, 82 to 85, 87, 88, and then 90 percent. So you see, the more training data is available, the better your performance will be. Of course, if you go on and on, you will maybe not reach 100 percent; somewhere it will saturate and become horizontal. But anyway, just to give you an idea: independent of which model you use, a real fine-tuning on dedicated labeled datasets outperforms each and every ICL, if you want, prompt approach.

Okay, let's summarize. These results suggest that smaller models specifically tailored for clinical text are more parameter-efficient; they are smaller than large language models. We find that using in-context learning, ICL, with extremely large language models like GPT-3 is not a sufficient replacement for fine-tuned, specialized clinical models. As I showed you, there is a financial limit to what you can afford if you are a university or a small hospital, but in general fine-tuning is better than just ICL. These findings highlight the importance of developing models for highly specialized domains such as clinical text. And this is more or less what we expected: if you have a small model, but it is a specialized model, only pre-trained and
fine-tuned on clinical text, it should outperform those big, monolithic, huge AIs like ChatGPT or GPT-3.5 or the upcoming GPT-4. Of course development will not stop. Microsoft and OpenAI will put more and more training data, let's say the whole internet, into the training data, but as these models get bigger and bigger, you have to run them on cloud supercomputers, and this will get really, really expensive for a normal university or a normal hospital.

What else? I just wanted to mention, on page four: OpenAI stores all inputs to be used as training data, which, for the datasets I told you about, would violate MIMIC's data use agreement. Really, OpenAI? You would have to violate the data use agreement to get the clinical data in there. Okay, I expected something different.

And here is now the study. If you are interested in clinical language models and the alternatives to clinical GPT models, to these huge models from Microsoft, I think you will find here, from MIT, IBM Research, and Harvard Medical School, some excellent ideas and some beautiful benchmarks. It's a somewhat more technical paper, but I tried to extract the most important performance data. I wanted to show you what they did and how they did it, and especially in the appendix you'll find a lot more detailed information about the systems. As I told you, it was published on the arXiv preprint server on February 16, 2023.

I say thank you for this short journey into a clinical environment: if you apply AI in a hospital environment, what are your options? You now have an idea of what it would cost you to pre-train such a system and to run inference. Thank you, and I see you in my next video.
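
To put the cost figures quoted in the transcript side by side, here is a rough back-of-the-envelope sketch. It only does arithmetic on the approximate numbers mentioned in the video (about 1,800 US dollars and 220 hours for pre-training T5-Large, about 78,000 dollars for a three-month dedicated GPT-3.5 instance). The one-year figure of 250,000 dollars is an assumption that rounds "a quarter of a million", and all variable and function names are made up for illustration. Note that the two costs are not directly comparable: one is a one-off pre-training run, the other is a recurring serving commitment.

```python
# Approximate figures as quoted in the video; rough numbers, not official pricing.
PRETRAIN_COST_USD = 1_800      # pre-training T5-Large on clinical notes
PRETRAIN_HOURS = 220           # duration of that pre-training run
FOUNDRY_3_MONTH_USD = 78_000   # dedicated GPT-3.5 instance, three-month commitment
FOUNDRY_1_YEAR_USD = 250_000   # assumption: "a quarter of a million", rounded


def per_unit(total_usd: float, units: float) -> float:
    """Average cost per unit (an hour or a month) for a given total."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_usd / units


# One-off pre-training cost, expressed per hour of accelerator time.
pretrain_usd_per_hour = per_unit(PRETRAIN_COST_USD, PRETRAIN_HOURS)

# Recurring serving cost, expressed per month for both commitment lengths.
foundry_usd_per_month_short = per_unit(FOUNDRY_3_MONTH_USD, 3)
foundry_usd_per_month_long = per_unit(FOUNDRY_1_YEAR_USD, 12)

# How many pre-training runs one year of dedicated serving would pay for.
serving_vs_pretraining = FOUNDRY_1_YEAR_USD / PRETRAIN_COST_USD

print(f"Pre-training: about ${pretrain_usd_per_hour:.2f} per hour")
print(f"Dedicated instance (3 months): about ${foundry_usd_per_month_short:,.0f} per month")
print(f"Dedicated instance (1 year): about ${foundry_usd_per_month_long:,.0f} per month")
print(f"One year of serving is roughly {serving_vs_pretraining:.0f} pre-training runs")
```

The ratio in the last line is only meant to convey the order of magnitude; hardware for local inference and fine-tuning is not included on the small-model side.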

Original Description

12 AI Language Models, ranging from 220M to 175B parameters, w/ measuring their performance on 3 different clinical tasks that test their ability to parse and reason over electronic health records of patients. Training T5-Base and T5-Large models from scratch on clinical notes from MIMIC III and IV to directly investigate the efficiency of clinical tokens. Result: small specialized clinical models substantially outperform all in-context learning approaches on LLMs like GPT-3, even when fine-tuned on limited annotated data. Latest AI in Clinical Settings: A Critical Look at the Performance of 12 Language Models and LLMs. All rights and credits to: Do We Still Need Clinical Language Models? https://arxiv.org/pdf/2302.08091.pdf #clinical #ai #generativeai #naturallanguageprocessing #datascience #hospital #sbert #machinelearning #test #clinicalbiochemistry
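
The "limited annotated data" comparison mentioned above fine-tunes on 1%, 5%, 10%, 25%, and 100% of the training set. A minimal sketch of how such subsets could be drawn reproducibly is shown below. The data is a synthetic placeholder of about 11,200 examples (the approximate MedNLI training size quoted in the video), the helper name `subsample` is hypothetical, and this is not the sampling code used in the paper.

```python
import random

# Fractions of the training set used for the learning-curve comparison.
FRACTIONS = [0.01, 0.05, 0.10, 0.25, 1.00]
LABELS = ["entailment", "neutral", "contradiction"]


def subsample(examples: list, fraction: float, seed: int = 0) -> list:
    """Draw a reproducible random subset containing `fraction` of the examples."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(examples) * fraction))
    rng = random.Random(seed)      # local RNG, so the global random state is untouched
    return rng.sample(examples, k)  # sampling without replacement


# Synthetic stand-in for the training set (placeholder, not real clinical data).
train = [{"id": i, "label": LABELS[i % 3]} for i in range(11_200)]

# Number of training examples available at each point of the learning curve.
sizes = {fraction: len(subsample(train, fraction)) for fraction in FRACTIONS}

for fraction, size in sizes.items():
    print(f"{fraction:>5.0%} of the training data -> {size} examples")
```

Each subset would then be used to fine-tune the same model, so that accuracy can be plotted against the amount of labeled data and compared with the in-context learning baselines.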

Playlist

Uploads from Discover AI · Discover AI · 52 of 60

1 Step Into the Unknown (by YouChat) - May 2023 be your best year yet
2 Wishing you all an amazing 2023 filled with Love, Laughter, and Happiness!
3 Create a Smarter Future!
4 The Art of Text to Vector Transformation: A Comprehensive Look at AI and NLP Transformers
5 Feature Vectors: The Key to Unlocking the Power of BERT and SBERT Transformer Models
6 Domain-Specific AI Models: How to Create Customized BERT and SBERT Models for Your Business
7 Achieve Unimaginable Levels of Domain Knowledge through SBERT Extreme in 3D (SBERT 48)
8 Unlocking Scientific Domain Knowledge w/ BPE Tokenizer: An Amazing Journey! (SBERT 49)
9 SBERT Extreme 3D: Train a BERT Tokenizer on your (scientific) Domain Knowledge (SBERT 50)
10 Discover Vision Transformer (ViT) Tech in 2023
11 Pre-Train BERT from scratch: Solution for Company Domain Knowledge Data | PyTorch (SBERT 51)
12 Flan-T5-XL model on a free COLAB | A free LLM - that explains itself w/ reasoning /write essay | AI
13 BERT and GPT in Language Models like ChatGPT or BLOOM | EASY Tutorial on Large Language Models LLM
14 Free Alternative to ChatGPT: Flan-T5-XL GUI (open-source) #shorts
15 From T5 to T5X: A Game-Changing Evolution with JAX & FLAX
16 How to start with ChatGPT? | Short Introduction to OpenAI API #shorts
17 The Future of Conversational AI? Google's PaLM w/ RLHF | LLM ChatGPT Competitor
18 Microsoft and ChatGPU
19 From Zero to FLAN-T5 XL Model GUI with Gradio: A Step-by-Step Guide on Free COLAB Notebook PyTorch
20 Google's 2nd Answer to "BING ChatGPT": Sparrow | after BARD w/ LaMDA | 2nd Gen Conversational AI
21 TF2: Pre-Train BERT from scratch (a Transformer), fine-tune & run inference on text | KERAS NLP
22 3D Visualization for BERT: How to Pre-Train with a New Layer & Fine-Tune with Downstream Task Layer
23 FLAN-T5-XXL on NVIDIA A100 GPU w/ HF Inference Endpoints, let's explore 11b models!
24 ChatGPT - Can it Lie to you?
25 ChatGPT Alternative: Perplexity by Perplexity.AI
26 2023 KerasNLP Tutorial: Explore Latest KERAS Toolbox & NLP Processing Library for BERT - TF2
27 Self-aware AI: You.com/chat vs Perplexity.ai | Live Demo, LLMs show Future of ChatGPT w/ BING
28 BLOOM 176B Inference on AWS | Bigger than GPT-3 for more Power!
29 Fine-tune ChatGPT? Buy Embeddings /OpenAI? What are Embeddings? My own ChatGPT? | Visual Q+A
30 Unleashing the Power of BLOOM 176B with AWS ml.p4de.24xlarge, DJL & DeepSpeed: The Ultimate Boost!
Unleashing the Power of BLOOM 176B with AWS ml.p4de.24xlarge, DJL & DeepSpeed: The Ultimate Boost!
Discover AI
31 After ChatGPT: NEW BioGPT by Microsoft | Do YOU trust Microsoft for your Medication?
After ChatGPT: NEW BioGPT by Microsoft | Do YOU trust Microsoft for your Medication?
Discover AI
32 Improve ChatGPT: Modular, Adaptive, Smart LLM | Inside ChatGPT
Improve ChatGPT: Modular, Adaptive, Smart LLM | Inside ChatGPT
Discover AI
33 Fine-tune ChatGPT w/  in-context learning ICL - Chain of Thought, AMA, reasoning & acting: ReAct
Fine-tune ChatGPT w/ in-context learning ICL - Chain of Thought, AMA, reasoning & acting: ReAct
Discover AI
34 The Intersection of Copyright Law and Human Faces: Exploring Virtual K-Pop with MAVE
The Intersection of Copyright Law and Human Faces: Exploring Virtual K-Pop with MAVE
Discover AI
35 New TECH: Vision Transformer 2023 on Image Classification | AI
New TECH: Vision Transformer 2023 on Image Classification | AI
Discover AI
36 PyTorch code Vision Transformer: Apply ViT models pre-trained and fine-tuned  | AI  Tech
PyTorch code Vision Transformer: Apply ViT models pre-trained and fine-tuned | AI Tech
Discover AI
37 New BING ChatGPT: Unlock the Power of Emotions in your Search Engine!
New BING ChatGPT: Unlock the Power of Emotions in your Search Engine!
Discover AI
38 New BING ChatGPT loses its mind
New BING ChatGPT loses its mind
Discover AI
39 Self-Attention Heads of last Layer of Vision Transformer (ViT) visualized (pre-trained with DINO)
Self-Attention Heads of last Layer of Vision Transformer (ViT) visualized (pre-trained with DINO)
Discover AI
40 Visualizing the Self-Attention Head of the Last Layer in DINO ViT: A Unique Perspective on Vision AI
Visualizing the Self-Attention Head of the Last Layer in DINO ViT: A Unique Perspective on Vision AI
Discover AI
41 Microsoft strongly restricts access to ChatGPT on new BING - WHY?
Microsoft strongly restricts access to ChatGPT on new BING - WHY?
Discover AI
42 PyTorch ViT: The Ultimate Guide to Fine-Tuning for Object Identification (COLAB)
PyTorch ViT: The Ultimate Guide to Fine-Tuning for Object Identification (COLAB)
Discover AI
43 New BING Chat AGGRESSIVE
New BING Chat AGGRESSIVE
Discover AI
44 Panoptic Image Segmentation: Mask2Former explained | Identify all objects!
Panoptic Image Segmentation: Mask2Former explained | Identify all objects!
Discover AI
45 Code Panoptic Image Segmentation w/ Vision Transformer & Mask2Former - A PyTorch tutorial
Code Panoptic Image Segmentation w/ Vision Transformer & Mask2Former - A PyTorch tutorial
Discover AI
46 Dream Job Alert: AI Prompt Engineer - $335K  |  AI Prompt Design: A Crash Course
Dream Job Alert: AI Prompt Engineer - $335K | AI Prompt Design: A Crash Course
Discover AI
47 Streamlining Similar Image Detection with ViT in PyTorch: A Step-by-Step Guide
Streamlining Similar Image Detection with ViT in PyTorch: A Step-by-Step Guide
Discover AI
48 Microsoft's CEO in Trouble   #shorts
Microsoft's CEO in Trouble #shorts
Discover AI
49 Why wait for KOSMOS-1? Code a VISION - LLM w/ ViT, Flan-T5 LLM and BLIP-2: Multimodal LLMs (MLLM)
Why wait for KOSMOS-1? Code a VISION - LLM w/ ViT, Flan-T5 LLM and BLIP-2: Multimodal LLMs (MLLM)
Discover AI
50 OpenAI's ChatGPT can NOW summarize external Sources on the Internet?
OpenAI's ChatGPT can NOW summarize external Sources on the Internet?
Discover AI
51 ChatGPT polarizes
ChatGPT polarizes
Discover AI
Hospital /Clinic AI Decision Models: Performance of 12 AI LLM Systems (incl $$) Radiology, Biomed
Hospital /Clinic AI Decision Models: Performance of 12 AI LLM Systems (incl $$) Radiology, Biomed
Discover AI
53 ChatGPT Prompt Engineering w/ in-context learning (ICL)  - 7 Examples | Tutorial
ChatGPT Prompt Engineering w/ in-context learning (ICL) - 7 Examples | Tutorial
Discover AI
54 Chat with your Image!  BLIP-2 connects Q-Former w/ VISION-LANGUAGE models (ViT & T5 LLM)
Chat with your Image! BLIP-2 connects Q-Former w/ VISION-LANGUAGE models (ViT & T5 LLM)
Discover AI
55 ChatGPT:  Multidimensional Prompts
ChatGPT: Multidimensional Prompts
Discover AI
56 ChatGPT:  In-context Retrieval-Augmented Learning (IC-RALM) | In-context Learning (ICL) Examples
ChatGPT: In-context Retrieval-Augmented Learning (IC-RALM) | In-context Learning (ICL) Examples
Discover AI
57 Code your BLIP-2 APP: VISION Transformer (ViT) + Chat LLM (Flan-T5) = MLLM
Code your BLIP-2 APP: VISION Transformer (ViT) + Chat LLM (Flan-T5) = MLLM
Discover AI
58 Buy Microsoft "Azure OpenAI Service" or buy from OpenAI its API for ChatGPT access & tuning?
Buy Microsoft "Azure OpenAI Service" or buy from OpenAI its API for ChatGPT access & tuning?
Discover AI
59 Pretraining vs Fine-tuning vs In-context Learning of LLM (GPT-x) EXPLAINED | Ultimate Guide ($)
Pretraining vs Fine-tuning vs In-context Learning of LLM (GPT-x) EXPLAINED | Ultimate Guide ($)
Discover AI
60 Reversible Transformer: ReFORMER for GPU Memory Optimization! Reversible Residual Layers?
Reversible Transformer: ReFORMER for GPU Memory Optimization! Reversible Residual Layers?
Discover AI

This video shows how LLMs are fine-tuned and pre-trained on clinical notes for radiology and biomedical applications, and how their accuracy is evaluated on specific clinical tasks. It highlights the value of specialized models for clinical text and the limits of fine-tuning general-domain large language models. It also gives an overview of the tools and techniques used in clinical LLM development, including T5, GPT-3, and OpenAI's ChatGPT.
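To make "evaluating performance on a clinical task" concrete, here is a minimal sketch of scoring predictions on a three-class natural language inference task such as MedNLI. The function names and the toy labels are assumptions for illustration; they are not taken from the video and are not real model output.

```python
from collections import Counter

# The three standard NLI labels.
LABELS = ("entailment", "contradiction", "neutral")


def accuracy(predictions, references):
    """Fraction of examples where the predicted label equals the gold label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty split")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def per_label_recall(predictions, references):
    """Recall for each gold label; labels absent from the references are skipped."""
    totals = Counter(references)
    hits = Counter(r for p, r in zip(predictions, references) if p == r)
    return {label: hits[label] / totals[label] for label in LABELS if totals[label]}


# Hypothetical toy test split, for illustration only.
gold = ["entailment", "neutral", "contradiction", "entailment"]
pred = ["entailment", "neutral", "entailment", "entailment"]

print(accuracy(pred, gold))
print(per_label_recall(pred, gold))
```

Reporting per-label recall next to overall accuracy shows whether a model is simply favoring one class, which a single accuracy number can hide.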

Key Takeaways
  1. Fine-tune LLMs on clinical data
  2. Pre-train LLMs on clinical notes
  3. Evaluate LLM performance on clinical tasks
  4. Implement retrieval augmented generation for clinical LLMs
  5. Use vector stores for efficient LLM deployment
  6. Design and develop clinical LLMs
  7. Optimize LLM architecture for clinical applications
💡 Specialized models pre-trained on clinical text are more parameter-efficient than general-domain large language models and outperform them once fine-tuned. For clinical tasks, fine-tuning also works better than in-context learning alone.
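The difference between in-context learning and fine-tuning can be sketched with plain string handling. In the hypothetical example below, the same labeled NLI example is used in two ways: as a demonstration pasted into a prompt (model weights stay frozen) and as an input/target record for a training run (model weights are updated). The prompt template, the function names, and the sentences are assumptions made up for illustration; they do not come from the video or from any clinical dataset.

```python
def format_example(premise, hypothesis, label=None):
    """Render one NLI example as text; the label is left blank for the query."""
    text = f"Premise: {premise}\nHypothesis: {hypothesis}\nLabel:"
    return f"{text} {label}" if label is not None else text


def build_few_shot_prompt(demonstrations, query):
    """In-context learning: labeled demonstrations go into the prompt itself."""
    blocks = [format_example(*demo) for demo in demonstrations]
    blocks.append(format_example(*query))
    return "\n\n".join(blocks)


def to_finetune_record(premise, hypothesis, label):
    """Fine-tuning: the same example becomes an (input, target) training pair."""
    return {"input": format_example(premise, hypothesis), "target": label}


# Made-up sentences for illustration only.
demos = [
    ("The patient has a fever of 39 C.", "The patient is febrile.", "entailment"),
    ("No fracture is seen on the X-ray.", "The X-ray shows a fracture.", "contradiction"),
]
query = ("The patient reports a mild headache.", "The patient has a migraine.")

prompt = build_few_shot_prompt(demos, query)
record = to_finetune_record(*demos[0])
print(prompt)
print(record)
```

With in-context learning the demonstrations must be sent with every request, whereas a fine-tuned model only needs the query at inference time.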
