Build a sarcasm classifier using NLP and TensorFlow | Machine Learning Foundations

Google for Developers · Beginner ·📐 ML Fundamentals ·6y ago

Key Takeaways

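Loads the Sarcasm in News Headlines dataset, tokenizes and pads the headlines, and trains an embedding-based binary classifier in TensorFlow, then looks at the overfitting the model shows and at a visualization of the learned embeddings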

Full Transcript

Hi, and welcome back to Machine Learning Foundations, where you can learn the basics of machine learning with computer vision, natural language processing, and more. I'm Laurence Moroney from the Google AI team, and I'm here to be your guide.

Over the last few episodes, we haven't done much machine learning. Instead, we looked at how you can preprocess text data to get it ready for training machine learning models. In this episode, you're going to put that knowledge to use in training a text classifier, a model which, when given a piece of text, will classify it based on its contents.

You'll be working with the Sarcasm in News Headlines dataset by Rishabh Misra, which is available on his website. This is a really fun dataset, which collects news headlines from normal news sources, as well as some more comedic ones from spoof news sites. The dataset is a JSON file with three columns. The is_sarcastic one is 1 if the record is sarcastic, otherwise it's 0. The headline is the headline of the article, and the article link is a URL to the text of the article. We're just going to deal with the headlines here, so we have a super easy dataset to work with. The headline is our feature, and is_sarcastic is our label.

The data in JSON looks a bit like this. Each entry is a JSON object with name-value pairs showing the column and associated data. To make it easier to use in Python, I've edited the data just a little by wrapping the whole thing in square brackets and putting a comma between each line. I can then write some super simple code to load it in, and then we can start tokenizing and sequencing.

And here's the code to load it in Python when it's structured like that. I'll go through this piece by piece. First, we'll import json so that we can use the JSON parsers in Python. Then, we'll open the sarcasm.json file. I've hosted a version with my edits that you can use. You'll see the URL for that later in the notebooks.
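The loading step described here can be sketched in a few lines. This is a minimal illustration, not the notebook's code: the two records are made up, the key name article_link is an assumption about how the "article link" column is spelled, and an inline string stands in for the hosted sarcasm.json file.

```python
import json

# Hypothetical stand-in for the edited sarcasm.json: one JSON array of objects.
# The records are invented; the key names follow the three columns described.
sample_json = """
[
  {"article_link": "https://example.com/a",
   "headline": "local man wins award for napping",
   "is_sarcastic": 1},
  {"article_link": "https://example.com/b",
   "headline": "city council approves new budget",
   "is_sarcastic": 0}
]
"""

# With a real file you would call json.load(f) on the open file object.
datastore = json.loads(sample_json)

sentences, labels, urls = [], [], []
for item in datastore:
    sentences.append(item["headline"])    # feature
    labels.append(item["is_sarcastic"])   # label: 1 = sarcastic, 0 = not
    urls.append(item["article_link"])     # kept for reference, not used in training

print(len(sentences), labels)
```

Wrapping the records in square brackets is what makes the whole file a single JSON array, so one parse call returns a list that can be iterated directly.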
Using json.load, we load and parse the entire thing. I'll initialize arrays for the sentences, labels, and URLs. And I can now simply iterate through the data store, and for each item, I can append its headline, the sarcasm label, and URL to the appropriate array. And that's it for loading the data.

In previous videos, you may recall that we had hardcoded sentences into an array of strings. We now have exactly the same data structure for the headline sentences, although there are now over 25,000 of them. So, for the next code, despite us using this real dataset, it will look very familiar. So, let's dive in.

So, here's the code to tokenize and sequence the sarcasm dataset. We create a tokenizer and fit it on the sentences. In this case, the sentences are the large array of 25,000 plus sentences that we read from the sarcasm dataset. We can use the tokenizer to show us the word index, so we can see what words it learned from the dataset. And here's an example of some of the words. Things like "Schilling's ball", whatever that is, was tokenized at 23,055. Remember from earlier that the words with the lower numbered tokens are the ones that are more common, and the ones with the higher numbers, like our friend Schilling's ball, were less commonly used in the dataset. So, of all of the words here, "hurting" is the one that was found most often.

We can now turn all of our sentences into sequences, where instead of words, we have the tokens representing those words. We'll pad them post, which means that all of the sentences will be the length of whatever the longest one is, and anything shorter than that will be padded with zeros at the end of the sentence in order to keep them all the same length. If we want to inspect them, we can then print out one of them, and we can print out the shape of the entire padded data structure. You'll see output like this. This is the first sentence in our corpus after tokenizing and padding.
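To make the three steps concrete (build a word index, turn sentences into sequences, pad at the end), here is a plain-Python sketch. It is not the Keras Tokenizer or pad_sequences implementation; the helper names are hypothetical stand-ins, the sentences are invented, and unlike the real tokenizer it does not strip punctuation.

```python
from collections import Counter

def build_word_index(sentences):
    """Map each word to an integer; more frequent words get lower numbers."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    # most_common() sorts by descending count; ties keep first-seen order.
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

def to_sequences(sentences, word_index):
    """Replace every known word with its token; unknown words are skipped."""
    return [[word_index[w] for w in s.lower().split() if w in word_index]
            for s in sentences]

def pad_post(sequences):
    """Append zeros so every row is as long as the longest sequence."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [0] * (max_len - len(seq)) for seq in sequences]

sentences = ["the cat sat on the mat", "the dog sat", "a cat"]
word_index = build_word_index(sentences)
padded = pad_post(to_sequences(sentences, word_index))

print(word_index)
print(padded)
print((len(padded), len(padded[0])))  # (number of sentences, padded length)
```

Index 0 is never assigned to a word, which is why it is free to serve as the padding value.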
It's a shorter sentence, so it ends with a bunch of zeros. And this is the shape of the padded data structure. This tells us that we have 26,709 padded sentences, and each of these is 40 values long. With just a few lines of code, you've loaded the data from sarcasm into sentence arrays, then tokenized and padded them. Before we get to feeding them into an ML model, we'll take a step through a code lab where you can see all of this code running. After that, you'll be able to try it for yourself.

Okay, so here's the code. The first thing that we're going to do is just download the sarcasm.json dataset. And once we have that, we're going to start slicing and dicing it. We'll see that we import json, and then sarcasm gets downloaded to /tmp. And we'll set the data store to json.load of this file. Then we have arrays for sentences, labels, and URLs. And for each item in this data store, the headline will be a sentence, is_sarcastic will be the label, and the article link will be the URL. So this, because it's in JSON, is divided line by line into these three things, and we'll be able to see those within the data store.

Then finally, we can start importing things like the tokenizer and pad sequences. We can set the tokenizer to have the out-of-vocabulary token OOV, and then we're just going to call fit_on_texts with the sentences, the headlines that we loaded. We can set the word index to be the tokenizer's word index, we can print out the length of that, and print out what that looks like, and we'll show that in just a moment. The sequences are just going to be the result of tokenizer.texts_to_sequences with the sentences passed to it. And then we'll pad those sequences using padding post, and we can print out the first couple. So for example, our first print here was the length of the word index, and we can see that it's 29,657. That's how many unique words there are in this word index.
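The role of the out-of-vocabulary token can be illustrated with a tiny lookup. This is a sketch under assumptions: the literal string "<OOV>", the small word index, and the encode helper are all hypothetical, and the real tokenizer does more (lowercasing rules, punctuation filtering).

```python
OOV_TOKEN = "<OOV>"  # assumed spelling of the out-of-vocabulary token

# Hypothetical word index: the OOV token takes index 1, real words follow.
word_index = {OOV_TOKEN: 1, "the": 2, "to": 3, "of": 4, "budget": 5}

def encode(sentence, word_index, oov_token=OOV_TOKEN):
    """Look up each word; anything unseen maps to the OOV token's index."""
    oov_id = word_index[oov_token]
    return [word_index.get(word, oov_id) for word in sentence.lower().split()]

print(encode("the budget of narwhals", word_index))
```

Without an OOV token, an unseen word would simply vanish from the sequence, shortening it and shifting the position of the words that follow.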
We print out the word index itself, and we can see OOV is the first token because we specified it that way. And then the words are in frequency order. So, as you can imagine, things like to, of, the, in, and for are the most common words that will be printed. We then convert the texts to sequences and pad them to create the padded sequences. So, we now have sequences that are padded to the length of the longest one. And here is an example of our first one that is actually padded, and you can see there are 40 values in it. And the shape of our padded array is 26,709 by 40. That means we have 26,709 headlines, and the maximum length of each one is 40. The first one we can see has been encoded like this, and because our padding is post, it gets padded with zeros after the text. So, that's a quick look at how you can do this type of text preprocessing. Next, you'll see the URL so you can try this out for yourself.

Well, that was pretty straightforward, I hope. And here's the URL for that code lab. So, pause the video and give it a try for yourself. After that, when you come back, we'll take a look at the next step, and what you've really been waiting for: building a model that understands this text enough to classify future sentences as sarcastic or not.

First, in the code, you'll see a number of commonly used variables. Each of these will be used throughout the code. You've seen many of them so far, but others like the embedding dimension will be clear later. The training size of 20,000 will be used next. We have a corpus of many thousands of sentences and labels, and a moment ago we specified 20,000 as the training size. So, that many sentences and labels will be the training set, and we'll hold back the other 6,000 or so as a validation set. So, our training sentences will be the complete corpus from zero to the training size, and our testing sentences will be from the training size to the end of the set. We can do similar with the labels.
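The split described here is ordinary list slicing. A minimal sketch, using the corpus size of 26,709 and training size of 20,000 quoted in the transcript, with integers standing in for the real headlines and labels:

```python
training_size = 20000
corpus_size = 26709

# Integer stand-ins for the real headlines and their 0/1 labels.
sentences = list(range(corpus_size))
labels = [i % 2 for i in range(corpus_size)]

# First 20,000 items train the model; the remainder is held back.
training_sentences = sentences[:training_size]
testing_sentences = sentences[training_size:]
training_labels = labels[:training_size]
testing_labels = labels[training_size:]

print(len(training_sentences), len(testing_sentences))
```

Slicing sentences and labels with the same boundary keeps each headline aligned with its label. Note that this split takes the records in file order; whether shuffling first would help is a separate question the transcript does not address.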
The training will be the first batch, and the testing will be the last ones. As we've split the data into training and testing sets, we should do the same for the padded sets, instead of having that one large master one that we had earlier on. First, we'll create a tokenizer, and we'll specify the number of words that we want, and what the out-of-vocabulary token should be. We'll fit the tokenizer to just the training sentence corpus. This will help us accurately reflect any real-world usage. Our testing sentences can be tested against the vocab that was learned from the training set. Now, we can create a set of training sequences from just the training sentences, and we can pad these to get a set of padded training sentences. And then we can just do the same thing for the testing sentences and for all the labels.

Okay. So, before we can train a model with this, let's take a look at the concept of embeddings, which help us turn the sentiment of a word into a number, in much the same way as we tokenized words earlier. In this case, an embedding is a vector pointing in a direction, and we can use those directions to establish meanings in words. I know this is all very vague, so let me explain it visually.

For example, consider the words bad and good. Now, we know they have opposite meanings, so we could draw them as arrows pointing in opposite directions. We could then describe the word meh as being sort of bad, but not really that bad. So, it might be an arrow like this. And then the phrase not bad, it's not as strong as good, but it's more or less in the same direction as good. So, we could draw it with an arrow like this. If we then plot these on a chart, we could then get coordinates for these arrows. These coordinates could then be seen as embeddings for the sentiment of those words. There's no absolute meaning, but relative to each other, we can establish sentiment. To do this in code, we can simply use a Keras layer called an embedding.
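The arrow picture can be put into numbers with a small sketch. The 2-D coordinates below are invented purely to mirror the drawing described (they are not learned values), and cosine similarity is one common way, chosen here as an assumption, to compare the directions of two vectors.

```python
import math

# Hypothetical 2-D coordinates for the arrows described above.
embeddings = {
    "good": (1.0, 0.0),
    "bad": (-1.0, 0.0),
    "meh": (-0.4, 0.1),      # sort of bad, but a shorter arrow
    "not bad": (0.5, 0.2),   # roughly the direction of good, but weaker
}

def cosine_similarity(a, b):
    """+1 means same direction, -1 means opposite, 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

for word in ("bad", "meh", "not bad"):
    score = cosine_similarity(embeddings["good"], embeddings[word])
    print(f"good vs {word}: {score:+.2f}")
```

Only the relative directions carry meaning here, which matches the point that there is no absolute meaning in the coordinates themselves.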
Our embedding should be defined as a vector for every word. So, we're going to take vocab size words and then specify how many dimensions we want the arrow direction to use. In this case, it's the 16 that we created earlier. So, the embedding layer will learn 10,000 16-dimensional vectors, where the direction of the vector establishes the sentiment of the word. By matching the words to the labels, it'll have a direction that it can then start learning from. Once we've defined the model, we can then train it like this. We simply specify the training padded features and labels, as well as the validation ones.

Okay, now that you've seen the full thing, let's take a look at a screencast of the sarcasm model being trained and tested. After that, I'll give you the URL so you can try it for yourself.

Okay, let's start looking at the code. First of all, I just want to make sure TensorFlow 2 is being used. Then I'm going to do the imports. We're importing json, we're importing TensorFlow, and of course, the tokenizer and pad sequences so we can preprocess the text. Next, we'll just set up some of the variables that we're going to use, like the embedding dimensions and the vocab size, etc. And next up, here I've stored a version of the sarcasm dataset where I've just made it a little bit more Python friendly, as we saw in the slides. I've downloaded that, and it's now ready to go. So, I'm going to open that into the data store and then iterate through the data store, adding the headlines to the sentences and is_sarcastic to the labels. This can then be split into training sentences, testing sentences, training labels, and testing labels using the training size variable that we created earlier on. That was 20,000. So, our training set's going to be 20,000 and the remainder will be used for testing. I'll then use the tokenizer to fit it on the text for the training sentences only, and then get the word index out of that.
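One way to picture what the embedding layer holds is as a lookup table with one row per token. The sketch below is plain Python rather than the Keras layer: the random initial values, the embed helper, and the token IDs are hypothetical, and a real layer would adjust the rows during training.

```python
import random

vocab_size = 10000     # number of words kept, as quoted in the transcript
embedding_dim = 16     # dimensions per word vector

rng = random.Random(0)
# One row per token ID; these are placeholder values, not learned weights.
embedding_table = [[rng.uniform(-0.05, 0.05) for _ in range(embedding_dim)]
                   for _ in range(vocab_size)]

def embed(sequence):
    """Replace each token ID in a padded sequence with its vector."""
    return [embedding_table[token] for token in sequence]

padded_headline = [308, 15, 1, 0, 0]   # hypothetical token IDs, post-padded
vectors = embed(padded_headline)

print(len(vectors), len(vectors[0]))             # one 16-value vector per position
print("trainable values:", vocab_size * embedding_dim)
```

The table size is simply the vocabulary size times the embedding dimension, which is why shrinking the vocabulary directly reduces what the model has to learn.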
Our sequences and our padding can then be created off of that tokenizer for both the training and testing sets. So, note that the testing sets are going to have a lot of out-of-vocabulary tokens in them because they'll have words that aren't in the training dataset. And that's okay because that's the kind of thing that we want to use to emulate a real-world scenario. I'm just going to convert them into NumPy arrays here, and now I'll define the model. We can say our model is an embedding followed by a global average pooling, followed by a dense layer which has 24 neurons in it, followed by a dense layer for the output with a single neuron. This is a binary classifier, so we have one neuron that's activated by sigmoid. I can summarize this as we can see here. And now I'm just going to train it for 30 epochs.

Now, one of the things you'll see when it's training is that models like this tend to overfit a lot. So, you'll notice as you're looking at the training that the accuracy will start climbing very high, and it'll go to like 99% maybe in about 10 epochs. The validation accuracy, as you see, is also doing very well, but it's going to top out at about 85%. But, take a look at the validation loss as you're training it. It's initially going down, but then it will eventually turn around and go up. And by the time we've done 30 epochs, our loss on the training set's going to be tiny, and our loss on the validation set's going to be relatively large. And that's indicating overfitting.

There are a number of techniques you can use here to avoid overfitting, and one of the best ones to use is actually the vocabulary. If your vocabulary is much too large, then you're going to be overfitting to relatively few words in the vocabulary. Because if you think about it, the words aren't going to be distributed evenly. The word frequencies follow a hockey stick curve. The most common words will be used a lot. The least common words are used very seldom.
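The validation-loss pattern described above (falling at first, then turning around while training loss keeps shrinking) can be spotted programmatically. This sketch uses invented per-epoch numbers shaped like that description; the best_epoch helper is hypothetical and is not part of the notebook.

```python
# Hypothetical per-epoch losses shaped like the curves described above.
train_loss = [0.65, 0.42, 0.30, 0.22, 0.16, 0.11, 0.08, 0.05]
val_loss = [0.60, 0.44, 0.38, 0.36, 0.37, 0.41, 0.48, 0.57]

def best_epoch(losses):
    """Return the 1-based epoch with the lowest loss."""
    return min(range(len(losses)), key=losses.__getitem__) + 1

turnaround = best_epoch(val_loss)
gap = val_loss[-1] - train_loss[-1]

print(f"validation loss bottoms out at epoch {turnaround}")
print(f"final gap between validation and training loss: {gap:.2f}")
```

Training past the turnaround epoch keeps improving the training numbers while the model gets worse on data it has not seen, which is the practical meaning of overfitting here.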
And if your training set is training on many of the least commonly used words, you could end up in that overfitting scenario. So, it's good for you to experiment with your vocabulary size in order to avoid overfitting. But just to keep it simple, I'm training a pretty naive model like this one, and we can see now that we're beginning to overfit greatly. Our loss on the training set is 0.03, and our validation loss is 0.95. So, even though the validation accuracy looks pretty good, it is leading us somewhat into a false sense of security. So, by the time we're done, the accuracy is 99.3% on the training set and 81% on the validation set, but keep an eye on the loss. The loss on the validation set is enormous, whereas the loss on the training set is very small, a clear indication of overfitting.

But let's start taking a look now at some of the data in there. So, if we take a look at training padded number two, we can see that the words in it have been tokenized, and it's been padded with a ton of zeros because it's much shorter than the maximum length sentence. And another way to avoid overfitting is to take a look at the lengths of your sentences and try to pick something optimal, so you don't end up with situations where you've got tons of zeros like this. But I'm just using the default behavior. If we want to see what the original sentence was, we can do it by just looking at the training sentences, and we can see it's this one. And then we can look at the label for that. It's labeled as sarcastic, and it does look like a sarcastic sentence.

So, let's test with a couple of sentences ourselves. I have a couple of sentences: "granny starting to fear spiders in the garden might be real", and "Game of Thrones season finale showing this Sunday night". And we can see that the first sentence scores 9.7 × 10 to the minus one, so that's 0.97. So, there's a very high indication that this is sarcastic, which is true.
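The model's sigmoid output is printed in scientific notation, which is easy to misread aloud. A short sketch of reading the two scores quoted in this part of the walkthrough (the second is read out next in the transcript); the 0.5 cut-off and the label helper are assumptions for illustration, not something the transcript specifies.

```python
# Scores as quoted in the walkthrough, written in scientific notation.
scores = {
    "granny starting to fear spiders in the garden might be real": 9.7e-1,
    "game of thrones season finale showing this sunday night": 5.48e-5,
}

THRESHOLD = 0.5  # assumed cut-off for a sigmoid output

def label(score, threshold=THRESHOLD):
    """Scores near 1 lean sarcastic, scores near 0 lean not sarcastic."""
    return "sarcastic" if score >= threshold else "not sarcastic"

for headline, score in scores.items():
    print(f"{score:.5f}  {label(score):<13}  {headline}")
```

Formatting with a fixed number of decimals makes it obvious that 9.7e-1 is 0.97 and that 5.48e-5 is very close to zero.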
Whereas this one is 5.48 × 10 to the minus five, a very small number, so it indicates that this probably is not sarcastic. Maybe this is dated a little bit, and maybe people will think that it is sarcastic now, but at the time this was created, obviously it wasn't. That's just a little joke.

Another little thing that we can do is to visualize the words and the embeddings that have been learned. I've added this code to allow us to do that. So, what I want to do here is create two files called vecs.tsv and meta.tsv, containing the words with the embeddings that were learned for them and the metadata around that. And then I can download vecs.tsv and meta.tsv from Colab. Once I have them, I can go to the embedding projector at projector.tensorflow.org and actually load them in. So, if I click load, you see it asks for a TSV file of vectors and a TSV file of metadata. Make sure you choose the right ones. So, vecs.tsv is the vectors, and meta.tsv is the metadata.

Once they've loaded and I've clicked outside, we can see a visualization of the words. Now, one of the neat things about this is that if we spherize it, because this is a binary classifier, we can actually see the words clustered according to the classification that they were driving or the classification that they were labeled with. So, a lot of these words here belong to one class, sarcastic or non-sarcastic, and these ones over here belong to the other. It's up to you: by looking through some of the words, maybe you can find out which ones they are. Also, when you click on a word, you can actually see the nearest words in that vector space. So, they are words that were established to have a similar meaning in the context of sarcastic or not sarcastic. It's a really neat visualization to show how it has learned the embeddings. Then, based on the embeddings, if it finds a lot of words on this side, it's most likely that those are classified as sarcastic, and if it finds them on this side, then they're not.
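The two files the projector asks for have a simple shape: one tab-separated vector per line, and one word per line in the same order. A minimal sketch of writing them, assuming a tiny made-up weight matrix and word lookup; in-memory buffers stand in for the real files so nothing is written to disk.

```python
import io

# Hypothetical stand-ins for the learned weights and the token-to-word lookup.
reverse_word_index = {1: "<OOV>", 2: "to", 3: "of"}
weights = [
    [0.0, 0.0, 0.0],        # row 0 is the padding token, skipped below
    [0.01, -0.02, 0.03],
    [0.12, 0.40, -0.31],
    [-0.22, 0.05, 0.17],
]

out_v = io.StringIO()  # stands in for open("vecs.tsv", "w", encoding="utf-8")
out_m = io.StringIO()  # stands in for open("meta.tsv", "w", encoding="utf-8")

for token_id in range(1, len(weights)):
    out_m.write(reverse_word_index[token_id] + "\n")                   # the word
    out_v.write("\t".join(str(x) for x in weights[token_id]) + "\n")   # its vector

print(out_m.getvalue())
print(out_v.getvalue())
```

Row order is what links the two files, so line N of the metadata must describe line N of the vectors.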
That type of thing. So, have fun playing with it. Take a little look around and see what you can discover. Great. Now, you can try it for yourself. Here's the URL. Pause the video, take a look at training the model, and then come back when you're done. Hopefully, that was an interesting exploration for you into the beginnings of NLP with TensorFlow. And that brings us to the end of this 10-part series on foundations of machine learning. I hope you've enjoyed these videos and I hope you've been able to learn from them. If you have any questions, please leave them in the comments below. And if you want more videos like this, just let us know. Thank you.

Original Description

Grow your Natural Language Processing skills by creating a sentence to sequence model using simple tools in TensorFlow. This episode is #9 of our Machine Learning Foundations free training course, teaching you the fundamentals of building machine learned models. In this lesson, you'll get a step-by-step tutorial on how to turn sentences into sequences of tokens, using sequencing APIs in TensorFlow. Sentence array example → https://goo.gle/2ThBlbJ TensorFlow is Google’s end-to-end open source machine learning platform. For more videos about TensorFlow, subscribe to the TF YouTube channel → https://goo.gle/TensorFlow Machine Learning Foundations playlist → https://goo.gle/ML-Foundations Subscribe to Google Developers → https://goo.gle/developers
