TensorFlow high-level APIs: Part 1 - loading data
Key Takeaways
This video covers the basics of TensorFlow high-level APIs, focusing on loading and preparing data for machine learning models using the TensorFlow CSV dataset and eager execution.
Full Transcript
[Music] Hi, and welcome to Coding TensorFlow. I'm Karmel Allison, and I'm here to guide you through a scenario using TensorFlow's high-level APIs. This video is the first in a three-part series. In this one, we'll look at data, and in particular how to prepare and load your data for machine learning. The rest of the series is available on this channel, so don't forget to hit that subscribe button.
Building a machine learning model is a multi-stage process. You have to collect, clean, and process your data; prototype and iterate on your model architecture; train and evaluate results; prepare your model for production serving; and then you have to do it all over again, because the model is a living thing that will have to be updated and improved. TensorFlow's high-level APIs aim to help you at each stage of your model's lifecycle, from the beginning of an idea to training and serving large-scale applications. In this series, I will walk through the key steps in developing a machine learning model and show you what TensorFlow provides for you at each step, and I'll also cover some of the new developments that we are working on to continue to improve your workflow.
We start with the problem and associated dataset. We will use the Covertype dataset from the US Forestry Service and Colorado State University, which has about 500,000 rows of geophysical data collected from particular regions in national forest areas. We are going to use the features in this dataset to try to predict the soil type that was found in each region, and there are a mix of features that we'll be working with: some are real values (elevation, slope, aspect, and so on), some are real values that have been binned onto an 8-bit scale, and some are categorical values that assign integers to soil types and wilderness area names. If we inspect the first couple of rows of our data, this is what we see: integers, no header, so we have to work from the info file. OK, so here we can see that we have some of our real values, and it looks like some of the categorical values are one-hot encoded and some are just categories. Some features span multiple cells, so we'll have to handle that.
Where do we start? What's the first thing we should do here? I'm going to suggest to you that when you're prototyping a new model in TensorFlow, the very first thing you should do is enable eager execution. It's simple: you just add a single line after importing TensorFlow and you're good to go. The way it works is that, rather than deferring execution of your TensorFlow graph, it runs ops immediately. The result is that you can write your models in eager while you're experimenting and iterating, but you still get the full benefit of TensorFlow graph execution when it comes time to train and deploy your model at scale.
The first thing we're going to want to do is load our data in and process the data in columns so that we can feed it into a model. The data is a CSV file with 55 columns of integers. We'll go over each of those in detail in a bit, but first we will use the TensorFlow CSV dataset to load our data from disk. This particular dataset doesn't have a header, but if it did, we could process that as well with the CSV dataset. Now, a TensorFlow dataset is similar to a NumPy array or a pandas DataFrame in that it reads and processes data, but instead of being optimized for in-memory analysis, it is designed to take data and run the set of operations that are necessary to process and consume that data for training. Here we are telling TensorFlow to read our data from disk, parse the CSV, and process the incoming data as a vector of 55 integers. Because we are running with eager execution enabled, our dataset here does already represent our data, and we can even check to see what each row currently looks like. If we take the first row, we can see that right now each row is a tuple of 55 integer tensors, not yet processed, batched, or even split into features and labels.
So we have tuples of 55 integers, but we want our data to reflect the structure of the data we know is in there. For that, we can write a function to apply to our dataset row by row. This function will take in the tuple of 55 integers in each row. A dataset is expected to return tuples of features and labels, so our goal with each row is to parse the row and return the set of features we care about plus a class label. So what needs to go in between here? This function is going to be applied at runtime to each row of data, but it will be applied efficiently by TensorFlow datasets, so this is a good place to put things like image processing, or adding random noise, or other special transformations. In our case, we handle most of our transformations using feature columns, which I will explain more in a bit, so our main goal in the parsing function is to make sure we correctly separate and group our columns of features.
So, for example, if you read over the details of the dataset, you will see that the soil type is a categorical feature that is one-hot encoded. It is spread out over 40 of our integers. We combine those here into a single length-40 tensor so that we can learn soil type as a single feature rather than 40 independent features. Then we can combine the soil type tensor with the other features, which are spread out over the set of 55 columns in the original dataset. We can slice the tuple of incoming values to make sure we get everything we need, and then we zip those up with human-readable column names to get a dictionary of features that we can process further later. Finally, we convert our one-hot encoded wilderness area class into a class label that is in the range 0 to 3. We could leave them one-hot encoded as well, and for some model architectures or loss calculations that might be preferable. And that gives us features and a label for each row.
We then map this function to our data row-wise, and then we batch the rows into sets of 64 examples. Using TensorFlow datasets here allows us to take advantage of many built-in performance optimizations that datasets provide for this type of mapping and batching, to help remove I/O bottlenecks. There are many other tricks for I/O performance optimization, depending on your system, that we won't cover here, but a guide is included in the description below.
Because we are using eager execution, we can check to see what our data looks like after this, and you can see that now we have parsed dictionaries of ints with nice human-readable names. Each feature has been batched, so a feature that is a single number is a length-64 tensor, and we can see that our conversion of soil type results in a tensor with a shape of 64 by 40. We can also see that we have a single tensor for the class labels, which has the category indices as expected.
Just to keep our eyes on the big picture here, let's see where we are. We've taken our raw data and put it into a TensorFlow dataset that generates dictionaries of feature tensors and labels. But something is still wrong with the integers we have as features here. Anyone care to venture a guess? We have lots of feature types: some are continuous, some are categorical, some are one-hot encoded. We need to represent these in a way that is meaningful to an ML model. You'll see how to fix that using feature columns in part two of this series, right here on YouTube, so don't forget to hit that subscribe button, and I'll see you there. [Music] [Applause]
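Editor's sketch: the row-parsing and batching steps described in the transcript can be illustrated in plain Python, with lists standing in for tensors. This is not the code shown in the video. The function names and feature names are hypothetical, and the column positions (10 numeric columns, then 4 wilderness-area columns, then 40 soil-type columns, then one final column kept as a feature) are an assumption about how the 55 columns are laid out.

```python
# Plain-Python illustration of the parsing step; lists stand in for tensors.
# All names and column offsets below are assumptions, not the video's code.

NUMERIC_NAMES = [
    "elevation", "aspect", "slope",
    "horizontal_distance_to_hydrology", "vertical_distance_to_hydrology",
    "horizontal_distance_to_roadways",
    "hillshade_9am", "hillshade_noon", "hillshade_3pm",
    "horizontal_distance_to_fire_points",
]
FEATURE_NAMES = NUMERIC_NAMES + ["soil_type", "cover_type"]


def parse_row(*vals):
    """Turn one flat row of 55 integers into a (features, label) pair."""
    if len(vals) != 55:
        raise ValueError(f"expected 55 values, got {len(vals)}")
    # Group the 40 one-hot soil-type columns into a single length-40 feature.
    soil_type = list(vals[14:54])
    # Slice out the remaining features and pair them with readable names.
    feature_vals = list(vals[:10]) + [soil_type, vals[54]]
    features = dict(zip(FEATURE_NAMES, feature_vals))
    # Convert the one-hot wilderness-area columns into an index in 0..3.
    wilderness = vals[10:14]
    label = wilderness.index(max(wilderness))
    return features, label


def batch(rows, batch_size):
    """Group parsed rows into lists of at most batch_size examples."""
    current = []
    for row in rows:
        current.append(row)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current
```

In the pipeline the transcript describes, the same two steps are applied with the dataset's own `map` and `batch` methods rather than hand-written loops, which is what lets TensorFlow optimize them.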
Original Description
Welcome to Part 1 of our mini-series on TensorFlow high-level APIs! In this 3 part mini-series, TensorFlow Engineering Manager Karmel Allison runs us through different scenarios using TensorFlow’s high-level APIs. Building a ML model takes a lot of time, effort, and often involves multiple stages. Luckily, TensorFlow high-level APIs aim to help you along with each stage, from the start of your idea, to training and serving large scale applications. Watch to discover the key steps in developing machine learning models, where TensorFlow comes in for each step, and lastly how to prepare and load your data!
Learn more about TensorFlow high-level APIs → http://bit.ly/2zETMOK
Want to watch more? → http://bit.ly/Coding-TensorFlow
Subscribe to the channel to catch new episodes of Coding TensorFlow → https://goo.gl/ht3WGe
And...stay tuned for Part 2 & 3!