Applications of Machine Learning to COVID-19 Research with Isaac Godfried

Weights & Biases · Beginner ·📐 ML Fundamentals ·6y ago

Key Takeaways

The video discusses applying machine learning to COVID-19 research. It focuses on using transformers, BERT-style embeddings, and semantic search to analyze a dataset of around 45,000 scholarly articles released for the Allen AI research challenge on Kaggle. It also covers the CoronaWhy volunteer community, SciBERT, BM25 indexing, and UMAP for visualizing the embedding space.

Full Transcript

Hi everyone. I hope everyone's doing well in quarantine or social distancing right now and not going too stir crazy.

Probably a lot of you are already familiar at this point with the Allen AI research challenge on coronavirus. If you aren't, I'll just very briefly say that they released this large dataset of around 45,000 scholarly articles, and the idea is to look and see which ones are the most useful for answering specific questions about the virus. You can read this later if you haven't already seen it; I won't go into too much more detail.

One of the really cool things that came out of this, unlike a lot of other Kaggle challenges, was that there was immediately a cooperative effort instead of a really competitive effort from the get-go. A group called CoronaWhy formed, and now we have over 500 members, I believe, on our Slack, and a lot of people are working in teams in a very organized way to try to address some of the real problems and the progress we can make on this issue. I found that very interesting. If you're interested in joining the group, we're always looking for new people, and I can send you the Slack link later.

Specifically, what I'm working on, and for some of you this may be review, is forming good sentence embeddings, or good general-purpose embeddings, to do semantic search on this corpus. One of my interests for a while has been using transformers to do effective representation learning. Some of that has actually been on the time series side, but returning to the NLP side for the moment, I really wanted to see if I could find some good, useful representations.

One of the really challenging things about this task is that we have no real evaluation metrics. All evaluation is qualitative, and we really have to rely on intuition. Moreover, many of us don't even really know what good results would be, so we have to rely on experts to evaluate what the results are and whether they make sense. I just wanted to develop, using these embeddings and my knowledge of clustering, a good way to quickly cluster things so that the experts could see whether those things make sense.

So I wrote out this notebook very quickly. The top part you're probably all used to: just downloading and installing. My main questions with this notebook are: are embeddings useful? How can we construct an efficient semantic search using these embeddings? There's always a trade-off between doing a full semantic search and the memory required, which I found out all too well when this started repeatedly crashing due to lack of RAM. And the other big thing is, as I said, what does the embedding space look like? If we display the embedding space and have experts look at it, can they tell us whether it makes sense?

And since Lavanya did ask me to say this earlier, I will say it right now: there's a lot of machine learning going around with people not really understanding the problem space and not always understanding how it impacts things in a clinical or medical sense. I know I don't know that myself, so I always want to rely on medical experts to evaluate my results. I think you should follow all good machine learning best practices, but in addition really try to collaborate and form these cross-team collaborations, because we can't solve it on our own. As machine learning experts, we need that expert advice.

So, without further ado, I'll quickly run through some of this. At this stage I just wanted to see how these vanilla SciBERT embeddings perform, so I essentially loaded the model and did a very naive embedding method: I basically just aggregated across all the word embeddings. Later I'll show you how I refined this a bit. Then I did some basic cosine similarity scores. Some of these seem to actually carry meaning. Here, though, we see a high correlation, for instance, between "compliance" and "MERS coronavirus", and with a random word there's still a high correlation, so obviously that's not great.

Moving through, I just did some helper functions, and then what I really wanted to do, as I said, is plot the embedding space. So I used UMAP, which I find really useful; it's one of my go-to dimensionality reduction techniques. With that I plotted the article title embeddings, just using this naive method. It was nice to see, at least in my own non-expert opinion, having just said that, that certain things do seem to form distinct patterns in the clusters. Here we can see "health capacity management"; I know that's going off screen, but that's pretty much the only thing in this area. Then if we look at something like the top of the cluster, you see there are similar kinds of article titles grouped together in this part. Obviously we'd want to get an epidemiologist or a biochemist to thoroughly evaluate whether these make a lot of sense.

Moving down, I made a couple more clusters, and then I did a semantic search on the various titles. One of the problems, as I said from the get-go, is that these are 768-dimensional embeddings returned by the BERT model, so they take up a lot of space, and it's just not practical to do a full search of the corpus. Because I was limited to embedding only 200 articles, I think some of the results weren't that great to begin with; I could only embed the titles in 200-article chunks. So that was definitely a limitation.

I did try a possible kind of unsupervised evaluation metric. Specifically, I thought: suppose we have two different queries, where one is "coronavirus person-to-person transmission mechanics" and the other is "coronavirus infection origin and transmission from animals". These are actually two fairly different questions from a research standpoint. So while a normal search engine might return similar results for those, ideally you'd want them to return very different results. What I did is I took those two queries and embedded the ten returned results for each. You can see that these aren't very good, because ideally we'd see distinct areas in the embedding space for the different search results. Qualitatively, they're mixed; there are even some overlapping ones. But again, this is just on the partial corpus and not the full one, with only about 200 articles in the search, so that's a little understandable.

Later on I went to embedding full abstracts, which was even more RAM intensive, unfortunately. What I finally did was combine it with what's called a BM25 index, which is a more vanilla search algorithm, similar to TF-IDF with a few slight variations. One of the things I found is that when I combined the two on the abstracts, having BM25 return a list of twenty results from the full 45,000 articles and then re-weighting those results with semantic search, I did actually get more distinct clusters. For instance, here's "coronavirus human to bat transmission" (Kaggle cut off some of the edge there), and here's "COVID-19 person-to-person transmission". Though these aren't perfect, you can see these abstracts do form their own distinct pattern in the embedding space, and there is some differentiation between the two, unlike the other one, where they were just overlapping.

So that was my first attempt, and I came up with these conclusions and next steps. One of the things I've looked at most recently was fine-tuning a sentence transformer model on MedNLI, which is a natural language inference dataset but can be used to gauge how similar sentences are, based on its labels. So I fine-tuned that as a full sentence transformer model, and this model is nice because it produces full sentence embeddings. I haven't done the full clustering analysis on it yet, but from what I've seen from the initial results, at least qualitatively, on a few things: for instance, for "bat-to-human transmission" and "camel-to-human transmission mechanism" it gives a fairly high similarity score, which I think is good. And if you're looking at "treatment efficiency of chloroquine on COVID-19 patients" and "bat-to-human transmission of coronavirus", it gives a fairly low similarity score, which is what we want, because those are essentially two queries asking very different questions. So what I've seen, qualitatively and just on this basic analysis, is that it seems to be performing a lot better.

Okay, I think that covers most of what I was going to go over. As I said, it is definitely an interesting project. That was a bit informal, but I was only asked, I think, two days ago or a day ago to prepare this, so hopefully it makes sense to people. Happy to answer any questions.

This is great, thank you so much. A few questions are coming in already in the chat; I'll pull it up. This is the hardest part to figure out. Do you mind stopping the share?
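The two-stage search described in the talk (a BM25 shortlist over the whole corpus, then re-ranking that shortlist by embedding similarity) could be sketched roughly as follows. This is a minimal, dependency-free illustration and not the notebook's code: the function names, the `k1` and `b` defaults, the particular IDF variant, and above all `toy_embed` (a hashed bag-of-words stand-in for a real BERT or sentence-transformer embedding) are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a real pipeline would do more."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def toy_embed(text, dim=16):
    """Hypothetical stand-in for a sentence embedding: hashed bag of words."""
    vec = [0.0] * dim
    for word in tokenize(text):
        vec[sum(ord(c) for c in word) % dim] += 1.0
    return vec


def search(query, docs, embed, top_k=20):
    """Take the BM25 top_k, then re-rank them by embedding similarity."""
    lexical = bm25_scores(query, docs)
    shortlist = sorted(range(len(docs)), key=lambda i: lexical[i], reverse=True)[:top_k]
    q_vec = embed(query)
    # Only the shortlist is embedded, which keeps memory use bounded.
    reranked = sorted(shortlist, key=lambda i: cosine(q_vec, embed(docs[i])), reverse=True)
    return [docs[i] for i in reranked]


docs = [
    "bat to human transmission of coronavirus",
    "hospital capacity management",
    "camel to human transmission of MERS",
    "person to person transmission mechanics",
]
for hit in search("bat transmission", docs, toy_embed, top_k=2):
    print(hit)
```

The design point is that the expensive embedding step only ever touches `top_k` documents per query, so the 768-dimensional vectors for the full corpus never need to sit in memory at once. With a real model, `embed` would be replaced by the model's encoding call, and the quality of the final ordering would depend on that model rather than on this toy hash.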
Okay, so: can we use UMAP for other things than you have used it for?

Yeah, I think you can use UMAP for any type of clustering. Anytime you have embeddings or you want to do dimensionality reduction, you can use UMAP. I was also thinking of using it to reduce the dimensionality of those 768-dimensional vectors so they take up a little less memory, but it's a good dimensionality reduction technique in general. And yeah, I can definitely add some article links to it; I think I already linked to it in a couple of places in my notebook.

That's a really good question about the RAM intensiveness. These models are hard at scale, so that's why I think most people use some kind of initial search index, where you return an initial list of results before computing the similarity scores, which is what I was looking at. There are ways, as I said, to maybe use UMAP to reduce the dimensions of the embedded text, so that can help too; I haven't really studied it fully at this point. But it is definitely a question of how much RAM and how many resources are available versus how good you want the search results to be. That's one of those real-world trade-offs you have to weigh.

If you have any more questions, feel free to ask them verbally or send them in the chat. Can you tell us a little bit more about some of the stuff that the CoronaWhy group is doing?

Yeah, sure. CoronaWhy has created four different main tasks; you can see them all on the CoronaWhy page. One is focused on geography and how geographic factors influence the virus, another is focused on transmission specifically, another is focused on vaccines and other therapeutics, and the fourth is on various risk factors associated with it. So there are those four core tasks, on which people are doing very specific NLP efforts. For instance, on the geo section, they're extracting named entities from the medical literature, such as locations and sub-level information about countries, and then combining that with geographic data to look at how geography impacts the spread, at least according to the literature. On the vaccines and therapeutics side, they're looking specifically at extracting the vaccine and therapeutics information. So there are multiple efforts going on right now. I've been more focused on the common effort, which is to find general models that could work across all tasks; that's where those sentence embeddings come in. It's definitely an interesting group, with a lot of cool things going on.

Do you know the link to the Slack? I can drop it in the chat if you give me the link.

Yeah, I can get you the link.

Gotcha. And I also posted a link to our Slack community. I see two more questions, one from [unclear] and one from Jonathan.

Sure. So, Jonathan: have you considered visualizing any attention components for your transformers? Yeah, I think that could definitely be useful. I didn't do it much in that notebook, but I think it would be useful to see which words are being weighted in the language model when creating the embedding. It would be good to see which tokens in particular, and whether it's attending more to something like "coronavirus" or "COVID-19". That would be helpful to know, and it would be a good next step too.

I have a question that piggybacks off of that. I'm actually building this into Weights & Biases right now: attention mechanisms and a way to visualize them. What are you using right now to visualize your attention, for other projects, since you haven't used it in this one?

For attention right now I try to use heat maps between whatever the input is and whatever the output sequence is, so I think that's the big one right now. I guess you could also look at specific context vectors, and visualizing those could definitely be helpful.

So are you using [unclear] right now for the heat maps?

I actually haven't heard of them specifically. Right now I've done some of my own visualizations of the activations, but I might look into them. I haven't done too much on the actual visualizations, but I think that could definitely help with interpretability.

There was another question on dimensionality reduction: would you translate it to feature selection, is that right?

Yeah, it's kind of related to that. UMAP, PCA, t-SNE: they all take a very high-dimensional vector and try to find the parts of it that really stick out and define it in the embedding space, in simple terms, and then use those to map to the low-dimensional embedding space.
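To make the last answer concrete, here is a small sketch of one of the techniques named there, PCA, written in plain Python with power iteration. It is a stand-in chosen because it needs no extra packages; it is not UMAP, which the talk actually used and which is nonlinear, and the function names and the tiny 3-dimensional data are assumptions for illustration only.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(rows):
    """Subtract the per-dimension mean from every row."""
    dim = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    return [[r[j] - means[j] for j in range(dim)] for r in rows]


def top_component(rows, iters=200, seed=0):
    """Leading principal direction of already-centered rows (power iteration)."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(len(rows[0]))]
    start_norm = math.sqrt(dot(v, v))
    v = [x / start_norm for x in v]
    for _ in range(iters):
        # Apply X^T X to v without building the covariance matrix.
        proj = [dot(r, v) for r in rows]
        w = [sum(p * r[j] for p, r in zip(proj, rows)) for j in range(len(v))]
        norm = math.sqrt(dot(w, w))
        if norm == 0:  # no variance left in the data
            break
        v = [x / norm for x in w]
    return v


def pca_project(rows, n_components=2):
    """Map each row to its coordinates along the top principal directions."""
    centered = center(rows)
    residual = [r[:] for r in centered]
    components = []
    for _ in range(n_components):
        v = top_component(residual)
        components.append(v)
        # Deflate: remove the direction just found before finding the next.
        deflated = []
        for r in residual:
            coef = dot(r, v)
            deflated.append([x - coef * vj for x, vj in zip(r, v)])
        residual = deflated
    return [[dot(r, c) for c in components] for r in centered]


# Toy data: points that mostly vary along the direction (1, 2, 0).
rows = [[float(t), 2.0 * t, 0.1 * ((i % 3) - 1)] for i, t in enumerate(range(-5, 6))]
low_dim = pca_project(rows, n_components=2)
print(low_dim[0], low_dim[-1])
```

The first output coordinate captures the direction along which the points "stick out" most, and the second captures what is left. Pure Python like this would be slow on thousands of 768-dimensional vectors; in practice a numerical library or the UMAP package would do the same job far faster.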

Original Description

Isaac Godfried is a machine learning engineer at Monster, where his main focus is removing barriers to the use of deep learning in industry. As part of our Virtual Deep Learning Salon, he shared how he's applying machine learning to the COVID-19 dataset and how we can do this responsibly.

The video shows how to apply machine learning to COVID-19 research using transformers, BERT-style embeddings, and semantic search, and why collaborating with medical experts matters for evaluating the results. It covers the Allen AI research challenge dataset, the CoronaWhy community, SciBERT, BM25, and UMAP.

Key Takeaways

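Download and install the necessary tools and datasets
Load the SciBERT model and build title embeddings with a naive method that aggregates the word embeddings
Compute cosine similarity scores and plot the article title embeddings in a low-dimensional space with UMAP
Run semantic search in 200-article chunks, then combine a BM25 shortlist with semantic re-ranking to cover the full corpus
Fine-tune a sentence transformer model on MedNLI to get full sentence embeddings
Visualize attention weights to see which tokens drive the embedding (suggested as a next step)
Use dimensionality reduction (UMAP, PCA, t-SNE) to map high-dimensional embeddings to a low-dimensional space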
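
As a rough illustration of the naive embedding and cosine similarity steps listed above, the sketch below averages per-token vectors into one vector per title and compares titles pairwise. Mean pooling is an assumption here (the talk only calls the method "naive"), and the 4-dimensional token vectors are made up; a BERT model would return 768 values per token.

```python
import math


def mean_pool(token_vectors):
    """Average per-token vectors into one fixed-size vector."""
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[j] for vec in token_vectors) / count for j in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up token vectors for three hypothetical article titles.
title_a = [[0.9, 0.1, 0.0, 0.2], [0.8, 0.2, 0.1, 0.1]]
title_b = [[0.7, 0.3, 0.0, 0.3], [0.9, 0.0, 0.2, 0.2], [0.8, 0.1, 0.1, 0.1]]
title_c = [[0.0, 0.9, 0.8, 0.0], [0.1, 0.8, 0.9, 0.1]]

emb_a, emb_b, emb_c = (mean_pool(t) for t in (title_a, title_b, title_c))
print("a vs b:", cosine_similarity(emb_a, emb_b))
print("a vs c:", cosine_similarity(emb_a, emb_c))
```

Because pooling discards word order and weights every token equally, unrelated titles can still score high, which is consistent with the weak results the talk reports for the naive method before fine-tuning.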

💡 The use of machine learning and natural language processing techniques can aid in the analysis of large datasets and provide valuable insights for COVID-19 research, and collaboration between machine learning experts and medical experts is crucial for achieving better results.
