Prepare your data for ML | Text Classification Tutorial Pt. 1 (Coding TensorFlow)
Key Takeaways
This video tutorial covers preparing data for machine learning, specifically text classification, using TensorFlow. It focuses on getting the data ready to train a neural network and explains the unique challenges associated with text classification.
Full Transcript
Hi everybody, I'm Laurence Moroney from the TensorFlow team at Google, and today we're going to talk about text classification. It's part one of a two-part series where we'll focus on the data and getting it ready to train a neural network. You will do this hands-on using a workbook that you can find at the link in the description below, and I'll step you through it.

Text classification has some unique challenges, so before you get coding, let me step you through some of these. First of all, neural networks typically deal with numbers, not text, when learning patterns that can be used for prediction or classification. In this case, we're looking at learning from movie reviews to see if those reviews are positive or negative, and the first step, of course, is to change the words into numbers that represent them. There'll be a little bit more processing of these words into vectors determining their sentiments, and we'll cover that in the next video.

So let's get coding. First things first: I'll have to check the licenses before I begin, and now I'll import TensorFlow and NumPy. I'll also use Keras and print out the version of TensorFlow that I'm using.

OK, now it's time to get the dataset. The IMDB set is included with Keras, so let's download it and take a look at what's in there. Note that in this case the nice folks at Keras have done the work for us of converting the words into integers. They've also sorted them into a dictionary so that lower numbers are the most common words and higher numbers are the least common words. So when we load the data and specify 10,000 words, this gives us the top 10,000 words that are used across all of the reviews.

OK, now we've loaded the data, and we have our training data and labels as well as our test data and labels. It's also nicely sorted into integers for us, which is a great first step for learning. Let's see what the data looks like next.

First, we'll look at our training data. You'll see that we have a total of 25,000 items of data and 25,000 labels describing them. The labels are very simple: it's 0 for a negative review and 1 for a positive one. A review looks like this: it's just a long set of numbers, and these are the indexes into the array of words. The review will start with a 1, indicating the start of the review, so the first word in the review is word number 14, which translates to the word "this", followed by the value 22, which translates to the word "film".

The next bit of code is a handy-dandy way of decoding the review. Note that the values 0 through 3 are reserved, with 1 being the start of the review, as we mentioned a moment ago, and 0 for padding. Now this is important, and you'll see that in a moment. I can now decode the review and see that 1, 14, 22 are the start character, then "this", and then "film". It's pretty cool, right?

Now, earlier I skipped over this piece of code showing me the length of the reviews. For example, the first review was 218 words long and the second was 189 words long. That's really awkward, and it's confusing to train a neural network if all of the training data is of different lengths. So let's pick a standard length for every review: if it's longer, we'll trim it to that length, and if it's shorter, we'll pad it to that length. The Keras preprocessing APIs make this really easy. Here you can see I'm taking the training and test data and making sure it's 256 words long. If I need to pad it, then I'll pad it with the pad character, which is the 0 that we saw earlier. A quick look will now show that it worked: they're all 256 words long. And if I now look at my first set of training data, you'll see that it's padded by zeros. Remember, it had been 218 words long, so the extras get padded out to make it 256.

Great, our training and test data is now ready. In the next episode, you'll take a look at how to design a neural network to accept this data and to train a model to determine the sentiment of movie reviews. I'll see you there.
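The decoding step described in the transcript can be sketched without downloading anything. This is a minimal illustration, not the workbook's code. The three-word vocabulary `toy_word_index` is hypothetical, with ranks chosen so that "this" ends up at 14 and "film" at 22 as in the video. Shifting every rank up by 3 to free the reserved values, and the names `<PAD>`, `<START>`, `<UNK>` and `<UNUSED>`, are assumptions; the video only states that 0 through 3 are reserved, that 0 is padding and that 1 marks the start of a review.

```python
# Toy vocabulary (hypothetical): word -> frequency rank, 1 = most common.
toy_word_index = {"the": 1, "this": 11, "film": 19}

# Assumption: ranks are shifted by 3 so that the values 0-3 stay reserved.
RESERVED = {"<PAD>": 0, "<START>": 1, "<UNK>": 2, "<UNUSED>": 3}
word_index = {word: rank + 3 for word, rank in toy_word_index.items()}
word_index.update(RESERVED)

# Invert the mapping so integers can be turned back into words.
reverse_word_index = {value: word for word, value in word_index.items()}


def decode_review(encoded):
    """Map each integer back to its word; unknown integers become '?'."""
    return " ".join(reverse_word_index.get(i, "?") for i in encoded)


decoded = decode_review([1, 14, 22])
print(decoded)  # <START> this film
```

With the real dataset, the full word-to-rank dictionary would take the place of `toy_word_index`; the rest of the sketch would stay the same under the assumptions above.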
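The trim-or-pad step can be illustrated in plain Python. This is a sketch of the idea only, not the Keras preprocessing API that the video uses. The helper name `pad_or_trim` is hypothetical, and the choices to append zeros at the end and to keep the first 256 values when trimming are assumptions; the video only says that shorter reviews are padded with 0 and longer ones are trimmed to 256.

```python
def pad_or_trim(sequence, maxlen=256, pad_value=0):
    """Return a copy of `sequence` that is exactly `maxlen` items long."""
    trimmed = list(sequence[:maxlen])            # assumption: keep the start
    padding = [pad_value] * (maxlen - len(trimmed))
    return trimmed + padding                     # assumption: pad at the end


# Stand-ins for encoded reviews, using the lengths mentioned in the video.
first_review = [1, 14, 22] + [5] * 215           # 218 values
second_review = [1] + [7] * 188                  # 189 values
long_review = [1] + [9] * 299                    # 300 values, needs trimming

padded = [pad_or_trim(r) for r in (first_review, second_review, long_review)]
lengths = [len(p) for p in padded]
print(lengths)                                   # [256, 256, 256]
print(padded[0][218:] == [0] * 38)               # True: 38 trailing zeros
```

Once every review has the same length, the reviews can be stacked into a single rectangular array, which is the shape a neural network's input layer expects.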
Original Description
@lmoroney is back with another episode of Coding TensorFlow! In this episode, we discuss Text Classification, which assigns categories to text documents. This is part 1 of a 2 part sub series that focuses on the data and gets it ready to train a neural network. Laurence also explains the unique challenges associated with Text Classification. Watch to follow along and stay tuned for part 2 of this episode where we’ll look at how to design a neural network to accept the data we prepared.
Hands on tutorial → http://bit.ly/2CNVMbi
Watch Part 2 https://www.youtube.com/watch?v=vPrSca-YjFg
Subscribe to TensorFlow → http://bit.ly/TensorFlow1
Watch more Coding TensorFlow → http://bit.ly/2zoZfvt