TensorFlow high-level APIs: Part 1 - loading data

TensorFlow · Intermediate · 📐 ML Fundamentals · 7y ago

Key Takeaways

This video covers the basics of TensorFlow high-level APIs, focusing on loading and preparing data for machine learning models using the TensorFlow CSV dataset and eager execution.

Full Transcript

[Music]

Hi, and welcome to Coding TensorFlow. I'm Karmel Allison, and I'm here to guide you through a scenario using TensorFlow's high-level APIs. This video is the first in a three-part series. In this one, we'll look at data, and in particular how to prepare and load your data for machine learning. The rest of the series is available on this channel, so don't forget to hit that subscribe button.

Building a machine learning model is a multi-stage process. You have to collect, clean, and process your data; prototype and iterate on your model architecture; train and evaluate results; prepare your model for production serving; and then you have to do it all over again, because the model is a living thing that will have to be updated and improved. TensorFlow's high-level APIs aim to help you at each stage of your model's lifecycle, from the beginning of an idea to training and serving large-scale applications. In this series, I will walk through the key steps in developing a machine learning model and show you what TensorFlow provides for you at each step. I'll also cover some of the new developments that we are working on to continue to improve your workflow.

We start with the problem and associated dataset. We will use the Covertype dataset from the US Forestry Service and Colorado State University, which has about 500,000 rows of geophysical data collected from particular regions in national forest areas. We are going to use the features in this dataset to try to predict the soil type that was found in each region. There is a mix of features that we'll be working with. Some are real values: elevation, slope, aspect, and so on. Some are real values that have been binned on an 8-bit scale. And some are categorical values that assign integers to soil types and wilderness area names.

If we inspect the first couple of rows of our data, this is what we see: integers, no header, so we have to work from the info file. OK, so here we can see that we have some of our real values, and it looks like some of the categorical values are one-hot encoded and some are just categories. Some features span multiple cells, so we'll have to handle that.

Where do we start? What's the first thing we should do here? I'm going to suggest to you that when you're prototyping a new model in TensorFlow, the very first thing you should do is enable eager execution. It's simple: you just add a single line after importing TensorFlow and you're good to go. The way it does that is, rather than deferring execution of your TensorFlow graph, it runs ops immediately. The result is that you can write your models in eager while you're experimenting and iterating, but you still get the full benefit of TensorFlow graph execution when it comes time to train and deploy your model at scale.

The first thing we're going to want to do is load our data in and process the data in columns so that we can feed it into a model. The data is a CSV file with 55 columns of integers. We'll go over each of those in detail in a bit, but first we will use the TensorFlow CSV dataset to load our data from disk. This particular dataset doesn't have a header, but if it did, we could process that as well with the CSV dataset.

Now, a TensorFlow dataset is similar to a NumPy array or a pandas DataFrame in that it reads and processes data, but instead of being optimized for in-memory analysis, it is designed to take data and run the set of operations that are necessary to process and consume that data for training. Here we are telling TensorFlow to read our data from disk, parse the CSV, and process the incoming data as a vector of 55 integers. Because we are running with eager execution enabled, our dataset here does already represent our data, and we can even check to see what each row currently looks like. If we take the first row, we can see that right now each row is a tuple of 55 integer tensors, not yet processed, batched, or even split into features and labels.

So we have tuples of 55 integers, but we want our data to reflect the structure of the data we know is in there. For that, we can write a function to apply to our dataset row by row. This function will take in the tuple of 55 integers in each row. A dataset is expected to return tuples of features and labels, so our goal with each row is to parse the row and return the set of features we care about plus a class label.

So what needs to go in between here? This function is going to be applied at runtime to each row of data, but it will be applied efficiently by TensorFlow datasets, so this is a good place to put things like image processing, or adding random noise, or other special transformations. In our case, we handle most of our transformations using feature columns, which I will explain more in a bit, so our main goal in the parsing function is to make sure we correctly separate and group our columns of features.

So, for example, if you read over the details of the dataset, you will see that the soil type is a categorical feature that is one-hot encoded. It is spread out over 40 of our integers. We combine those here into a single length-40 tensor so that we can learn soil type as a single feature rather than 40 independent features. Then we can combine the soil type tensor with the other features, which are spread out over the set of 55 columns in the original dataset. We can slice the tuple of incoming values to make sure we get everything we need, and then we zip those up with human-readable column names to get a dictionary of features that we can process further later. Finally, we convert our one-hot encoded wilderness area class into a class label that is in the range 0 to 3. We could leave them one-hot encoded as well, and for some model architectures or loss calculations that might be preferable. And that gives us features and a label for each row.

We then map this function to our data row-wise, and then we batch the rows into sets of 64 examples. Using TensorFlow datasets here allows us to take advantage of many built-in performance optimizations that datasets provide for this type of mapping and batching to help remove I/O bottlenecks. There are many other tricks for I/O performance optimization, depending on your system, that we won't cover here, but a guide is included in the description below.

Because we are using eager execution, we can check to see what our data looks like after this, and you can see that now we have parsed dictionaries of ints with nice human-readable names. Each feature has been batched, so a feature that is a single number is a length-64 tensor, and we can see that our conversion of soil type results in a tensor with a shape of 64 by 40. We can also see that we have a single tensor for the class labels, which has the category indices as expected.

Just to keep our eyes on the big picture here, let's see where we are. We've taken our raw data and put it into a TensorFlow dataset that generates dictionaries of feature tensors and labels. But something is still wrong with the integers we have as features here. Anyone care to venture a guess? We have lots of feature types: some are continuous, some are categorical, some are one-hot encoded. We need to represent these in a way that is meaningful to an ML model. You'll see how to fix that using feature columns in part two of this series, right here on YouTube. So don't forget to hit that subscribe button, and I'll see you there.

[Music] [Applause]
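
The grouping logic of the parsing function described in the transcript can be sketched in plain Python, independent of TensorFlow. This is an illustration under assumptions, not the code shown in the video: the column layout (10 numeric columns, then 4 one-hot wilderness area columns, then 40 one-hot soil type columns, then 1 cover type column) is assumed from the dataset's info file, and the names `parse_row` and `NUMERIC_NAMES` are hypothetical.

```python
# Hypothetical names for the 10 leading numeric columns (assumed layout).
NUMERIC_NAMES = [
    "elevation", "aspect", "slope",
    "horizontal_distance_to_hydrology", "vertical_distance_to_hydrology",
    "horizontal_distance_to_roadways",
    "hillshade_9am", "hillshade_noon", "hillshade_3pm",
    "horizontal_distance_to_fire_points",
]


def parse_row(row):
    """Split one 55-integer row into a feature dict and a class label."""
    if len(row) != 55:
        raise ValueError(f"expected 55 values, got {len(row)}")

    numeric = row[:10]            # columns 0-9: numeric features
    wilderness = row[10:14]       # columns 10-13: one-hot wilderness area
    soil_type = list(row[14:54])  # columns 14-53: one-hot soil type, kept as one group
    cover_type = row[54]          # column 54: cover type category

    # Zip the numeric values with human-readable names.
    features = dict(zip(NUMERIC_NAMES, numeric))
    features["soil_type"] = soil_type
    features["cover_type"] = cover_type

    # Convert the one-hot wilderness area into an index in the range 0 to 3.
    label = wilderness.index(1)
    return features, label


# A synthetic row for illustration only (not real dataset values).
example_row = (
    [2500, 45, 5, 200, 10, 500, 220, 230, 140, 6000]  # numeric features
    + [0, 0, 1, 0]                                    # wilderness area index 2
    + [0] * 28 + [1] + [0] * 11                       # soil type index 28
    + [5]                                             # cover type
)
features, label = parse_row(example_row)
print(label)                       # 2
print(len(features["soil_type"]))  # 40
print(features["elevation"])       # 2500
```

Keeping the 40 soil type columns together as one value mirrors the point made in the transcript: the model can then treat soil type as a single feature rather than 40 independent ones.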

Original Description

Welcome to Part 1 of our mini-series on TensorFlow high-level APIs! In this 3 part mini-series, TensorFlow Engineering Manager Karmel Allison runs us through different scenarios using TensorFlow’s high-level APIs. Building a ML model takes a lot of time, effort, and often involves multiple stages. Luckily, TensorFlow high-level APIs aim to help you along with each stage, from the start of your idea, to training and serving large scale applications. Watch to discover the key steps in developing machine learning models, where TensorFlow comes in for each step, and lastly how to prepare and load your data!

Learn more about TensorFlow high-level APIs → http://bit.ly/2zETMOK
Want to watch more? → http://bit.ly/Coding-TensorFlow
Subscribe to the channel to catch new episodes of Coding TensorFlow → https://goo.gl/ht3WGe

And...stay tuned for Part 2 & 3!

This video teaches how to load and prepare data for machine learning models using TensorFlow high-level APIs, covering eager execution, the CSV dataset, and a row-wise parsing function. Feature columns are introduced here and covered in Part 2.

Key Takeaways
  1. Enable eager execution
  2. Load data using TensorFlow CSV dataset
  3. Parse and process data using a function
  4. Map the function to the data row-wise
  5. Batch the data
  6. Check the resulting data structure
💡 Using eager execution and feature columns can simplify the process of loading and preparing data for machine learning models in TensorFlow.
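
The steps above can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the code from the video: it assumes TensorFlow 2.x, where eager execution is on by default (in the TensorFlow 1.x releases current when the video was made, it had to be switched on with `tf.enable_eager_execution()`), and it uses `tf.data.experimental.CsvDataset`. The column layout (10 numeric, 4 wilderness area, 40 soil type, 1 cover type), the feature names, the helper names `parse_row`, `make_dataset`, and `synthetic_row`, and the file name are all hypothetical, and the rows are synthetic.

```python
import os
import tempfile

import tensorflow as tf  # assumes TensorFlow 2.x (eager execution enabled by default)

# Hypothetical names for the 10 leading numeric columns (assumed layout).
NUMERIC_NAMES = [
    "elevation", "aspect", "slope",
    "horizontal_distance_to_hydrology", "vertical_distance_to_hydrology",
    "horizontal_distance_to_roadways",
    "hillshade_9am", "hillshade_noon", "hillshade_3pm",
    "horizontal_distance_to_fire_points",
]


def parse_row(*row):
    """Turn a tuple of 55 scalar tensors into (features, label)."""
    features = dict(zip(NUMERIC_NAMES, row[:10]))
    # Group the 40 one-hot soil type columns into one length-40 tensor.
    features["soil_type"] = tf.stack(row[14:54])
    features["cover_type"] = row[54]
    # Convert the one-hot wilderness area into a class index from 0 to 3.
    label = tf.argmax(tf.stack(row[10:14]))
    return features, label


def make_dataset(csv_path, batch_size=64):
    """Load a headerless 55-column integer CSV, parse each row, and batch."""
    record_defaults = [tf.int32] * 55  # every column is a required int32
    dataset = tf.data.experimental.CsvDataset(
        csv_path, record_defaults, header=False)
    return dataset.map(parse_row).batch(batch_size)


def synthetic_row(wilderness_index, soil_index):
    """Build one made-up row in the assumed layout (illustration only)."""
    wilderness = [int(i == wilderness_index) for i in range(4)]
    soil = [int(i == soil_index) for i in range(40)]
    return [2500, 45, 5, 200, 10, 500, 220, 230, 140, 6000] + wilderness + soil + [5]


rows = [synthetic_row(2, 28), synthetic_row(0, 3), synthetic_row(1, 39)]

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "covtype_sample.csv")
    with open(csv_path, "w") as f:
        for r in rows:
            f.write(",".join(str(v) for v in r) + "\n")

    # A small batch size is used here only because the sample has three rows.
    dataset = make_dataset(csv_path, batch_size=2)
    # With eager execution, the first batch can be inspected directly.
    features, labels = next(iter(dataset))

print(features["elevation"].shape)
print(features["soil_type"].shape)
print(labels.numpy())
```

With the default `batch_size=64` and the full dataset, each single-number feature would come out as a length-64 tensor and `soil_type` as a 64 by 40 tensor, matching the shapes described in the transcript.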
