TensorFlow in production: TF Extended, TF Hub, and TF Serving (Google I/O '18)

TensorFlow · Beginner ·📰 AI News & Updates ·8y ago

Key Takeaways

This video introduces TensorFlow Extended (TFX), TensorFlow Hub, and TensorFlow Serving, providing a comprehensive overview of TensorFlow in production, including model serving, deployment, and inference. The session covers the key features and innovations of these tools, including reusable machine learning modules, distributed serving, and model evaluation.

Full Transcript

[Music]

Welcome, everyone. I am Jeremiah, and this is TensorFlow in production. I'm excited that you're all here, because that means you're excited about production, and that means you're building things that people actually use.

Our talk today has three parts, and I want to start by quickly drawing a thread that connects all of them. The first thread is the origin of these projects. These projects really come from our teams that are on the front line of machine learning. These are real problems that we've come across doing machine learning at Google's scale, and these are the real solutions that let us do machine learning at Google.

The second thing I want to talk about is this observation: if we look at software engineering over the years, we see this growth. As we discover new tools and best practices, we're really getting more effective at doing software engineering, and we're getting more efficient. We're seeing the same kind of growth on the machine learning side: we're discovering new best practices and new tools. The catch is that this growth is maybe 10 or 15 years behind software engineering, and we're also rediscovering a lot of the same things that exist in software engineering, but in a machine learning context. So we're doing things like discovering version control for machine learning, or continuous integration for machine learning. I think it's worth keeping that in mind as we move through the talks.

The first one up is going to be TensorFlow Hub, and this is something that lets you share reusable pieces of machine learning, much the same way we share code. Then we'll talk a little bit about deploying machine learning models with TensorFlow Serving, and we'll finish up with TensorFlow Extended, which wraps a lot of these things together in a platform to increase your velocity as a machine learning practitioner. So with that, I'd like to hand it over to Andrew to talk about TensorFlow Hub.

Thanks, Jeremiah. Hi everybody, I'm Andrew
Gasparovic, and I'd like to talk to you a little bit about TensorFlow Hub, which is a new library that's designed to bring reusability to machine learning.

Software repositories have been a real benefit to developer productivity over the past 10 or 15 years. They're great, first of all, because when you're writing something new and you have a repository, you think, "Maybe I'll check whether there's something that already exists and reuse that rather than starting from scratch." But a second thing that happens is you start thinking, "Maybe I'll write my code in a way that's specifically designed for reuse," which is great because it makes your code more modular, but it also has the potential to benefit a whole community if you share that code.

What we are doing with TensorFlow Hub is bringing that idea of a repository to machine learning. TensorFlow Hub is designed so that you can create, share and reuse components of ML models. And if you think about it, it's even more important to have a repository for machine learning than for software development, because in the case of machine learning, not only are you reusing the algorithm and the expertise, but you're also reusing the potentially enormous amount of compute power that went into training the model, and all of the training data as well. So all four of those, the algorithm, the training data, the compute and the expertise, go into a module, which is shareable via TensorFlow Hub, and then you can import those into your model.

Those modules are pre-trained, so they have the weights and the TensorFlow graph inside. And unlike a model, they're designed to be composable, which means that you can put them together like building blocks and add your own stuff on top. They're reusable, which means that they have common signatures so that you can swap one for another. And they're retrainable, which means that you can actually backpropagate through a module that you've inserted into your graph.

So let's take a
quick look at an example. In this case we'll do a little bit of image classification. Say that we want to make an app to classify rabbit breeds from photos, but we only have a few hundred example photos, probably not enough to build a whole image classifier from scratch. What we could do is start from a general-purpose model: we could take the reusable part of it, the architecture and the weights there, take off the classification, and then add our own classifier on top and train it with our own examples. We'll keep that reusable part fixed, and we'll train our own classifier on top.

If you're using TensorFlow Hub, you start at tensorflow.org/hub, where you can find a whole bunch of newly released, state-of-the-art, research-oriented and well-known image modules. Some of them include classification, and some of them chop off the classification layers and just output feature vectors. That's what we want in this case, because we're going to add classification on top. So maybe we'll choose NASNet, which is an image module that was created by a neural architecture search. We'll choose NASNet-A, the large version, with the feature vectors. We just paste the URL for the module into our TF Hub code, and then we're ready to use that module just like a function. In between, the module gets downloaded and instantiated into your graph, so all you have to do is get those feature vectors, add your own classification on top, and output the new categories.

Specifically, what we're doing is training just the classification part while keeping all of the module's weights fixed. But the great thing about reusing a module is that you get all of the training and compute that's gone into that reusable portion. In the case of NASNet, it was over 62,000 GPU hours that went into finding the architecture and training the model, plus all of the expertise, the testing and the research that went into NASNet. You're reusing all of that in that one line of
code.

And as I mentioned before, those modules are trainable, so if you have enough data you can do fine-tuning with the module. If you set that trainable parameter to true and you select that you want to use the training graph, what you'll end up doing is training the entire thing along with your classification. The caveat is that, of course, you have to lower the learning rate so that you don't ruin the weights inside the module. But if you have enough training data, it's something that you can do to get even better accuracy.

In general, we have lots of image modules on TF Hub. We have ones that are straight out of research papers, like NASNet; we have ones that are great for production, even ones made for on-device usage, like MobileNet; plus all of the industry-standard ones that people are familiar with, like Inception and ResNet.

So let's look at one more example, in this case doing a little bit of text classification. We'll look at some restaurant reviews and decide whether they're positive or negative sentiment. One of the great things about TF Hub is that, because all of those modules are TensorFlow graphs, you can include things like pre-processing. So the text modules that are available on TF Hub take whole sentences and phrases, not just individual words, because they have all of the tokenization and pre-processing stored in the graph itself. We'll use one of those, and it's basically the same idea: we're going to select a sentence embedding module, we'll add our own classification on top, and we'll train it with our own data, but we'll keep the module itself fixed.

Just like before, we'll start by going to tensorflow.org/hub and take a look at the text modules that are available. In this case, maybe we'll choose the Universal Sentence Encoder, which was just recently released, based on a research paper from last month. The idea is that it was trained on a variety of tasks and is specifically meant to support using it with a variety of tasks, and it also takes just
a very small amount of training data to use it in your model, which is perfect for our example case. So we'll use that Universal Sentence Encoder and, just like before, we'll paste the URL into our code. The difference here is we're using it with a text embedding column; that way we can feed it into one of the high-level TensorFlow estimators, in this case the DNN classifier. But you could also use that module like I showed in the earlier example, calling it just as a function. If you are using the text embedding column, that also, just like in the other example, can be trained as well; and just like in the other example, it's something that you can do with a lower learning rate if you have a lot of training data, and it may give you better accuracy.

We have a lot of text modules available on TF Hub. We actually just added three new languages to the NNLM modules: Chinese, Korean and Indonesian. Those are all trained on Google News training data. We also have a really great module called ELMo, from some recent research, which understands words in context, and of course the Universal Sentence Encoder, as I talked about.

Just to show you for a minute some of those URLs that we've been looking at, maybe we'll take apart the pieces here. tfhub.dev is our new source for Google and selected partner published modules. In this case, Google is the publisher, and universal-sentence-encoder is the name of the module. The 1 at the end is a version number. TensorFlow Hub considers modules to be immutable, and so the version number is there so that, if you're doing one training run and then another, you don't have a situation where the module changes unexpectedly. All modules on tfhub.dev are versioned that way. And one of the nice things about those URLs is that if you paste them into a browser, you get the module documentation. The idea is that maybe you read a new paper, you see there's a URL for a TF Hub module, you paste it into your browser, you see the
documentation, you paste it into some code, and in one line you're able to use that module and try out the new research.

Speaking of the Universal Sentence Encoder, the team just released a new lite version, which is a much smaller size. It's about 25 megabytes, and it's specifically designed for cases where the full text module wouldn't work, for doing things like on-device classification. Also today, we released a new module from DeepMind. This one you can feed in video, and it will classify and detect the actions in that video. In this case, it correctly guesses the video is of people playing cricket. And of course we also have a number of other interesting modules. There's a generative image module, which is trained on CelebA and has a Progressive GAN inside, and also the deep local features module, which can identify the key points of landmark images. Those are all available now on TF Hub.

Last but not least, I wanted to mention that we just announced our support for TensorFlow.js. Using the TensorFlow.js converter, you can directly convert a TF Hub module into a format that can be used on the web. It's a really simple integration to be able to take a module and use it in the web browser with TensorFlow.js, and we're really excited to see what you build with it.

So just to summarize: TensorFlow Hub is designed to be a starting point for reusable machine learning. The idea is that, just like with a software repository, before you start from scratch, check out what's available on TensorFlow Hub, and you may find that it's better to start with a module and import that into your model rather than starting the task completely from scratch. We have a lot of modules available, and we're adding more all the time, and we're really excited to see what you build. So thanks. Next up is Jeremiah to talk about TF Serving.

All right, thank you, Andrew. So next, TensorFlow Serving. This is going to be how we deploy modules, or rather, deploy models. Just to get a sense for where this falls in the machine learning process:
We start with our data, we use TensorFlow to train a model, and the output, our artifact there, is these models. These are SavedModels; it's a graphical representation of the data flow. Once we have those, we want to share them with the world, and that's where TensorFlow Serving comes in. It's this big orange box. This is something that takes our models and exposes them to the world through a service, so clients can make requests; TensorFlow Serving will take them, run the inference, run the model, come up with an answer, and return that in a response.

So TensorFlow Serving is actually the libraries and binaries you need to do this production-grade inference over trained TensorFlow models. It's written in C++, supports things like gRPC, and plays nicely with Kubernetes.

To do this well, it has a couple of features. The first and most important is that it supports multiple models. On one TensorFlow model server you can load multiple models. And just like most folks probably wouldn't push a new binary right to production, you don't want to push a new model right to production either. Having these multiple models in memory lets you be serving one model on production traffic, and load a new one, and maybe send it some canary requests, send it some QA requests, make sure everything's all right, and then move the traffic over to that new model. This also supports doing things like reloading: if you have a stream of models you're producing, TensorFlow Serving will transparently load the new ones and unload the old ones.

We've built in a lot of isolation. If you have a model that's serving a lot of traffic in one thread and it's time to load a new model, we make sure to do that in a separate thread. That way we don't cause any hiccups in the thread that's serving production traffic. And again, this entire system has been built from the ground up to be very high throughput. Things like selecting those different models based on the name, or selecting different versions, that's
very, very efficient. Similarly, it has some advanced batching. This way we can make use of accelerators, and we also see improvements on standard CPUs with this batching. And then there are lots of other enhancements, everything from protocol buffer magic to lots more.

This is really what we use inside Google to serve TensorFlow. I think there are over 1,500 projects that use it. It serves somewhere in the neighborhood of 10 million QPS, which ends up being about a hundred million items predicted per second. And we're also seeing some adoption outside of Google.

One of the new things I'd like to share today is distributed serving. Looking inside Google, we've seen a couple of trends. One is that models are getting bigger and bigger; some of the ones inside Google are over a terabyte in size. The other thing we're seeing is this sharing of subgraphs: TF Hub is producing these common pieces of models. And we're also seeing more and more specialization in these models as they get bigger and bigger. If you look at some of these model structures, they look less like a model that would belong on one machine and more like an entire system. So this is exactly what distributed serving is meant for: it lets us take the single model and basically break it up into microservices.

To get a better feel for that, we'll say that Andrew has taken his rabbit classifier and is serving it on a model server. And we'll say that I want to create a similar system to classify cat breeds, and so I've done the same thing: I've started from TensorFlow Hub. You can see I've got the TensorFlow Hub module in the center there. And you'll notice that, since we both started from the same module, the same bits of code, we have the same core to our machine learning models. So what we can do is start a third server and put the TensorFlow Hub module on that server, and we can remove it from the servers on the outside and leave in its place this placeholder we call a remote op. You can think of this
as a portal. It's kind of a forwarding op: when we run the inference, it forwards at the appropriate point in the processing to the model server. There the computation is done, the result gets sent back, and the computation continues on our classifiers on the outside.

There are a few reasons we might want to do this. We can get rid of some duplication: now we only have one model server loading all these weights. We also get the benefit that it can batch requests that are coming from both sides. And we can set up different configurations: you can imagine we might have this model server just loaded with TPUs, our Tensor Processing Units, so that it can do what are most likely convolutional operations, and things like that, very efficiently.

Another place where we use this is with large sharded models. If you're familiar with deep learning, there's this technique of embedding things like words or YouTube video IDs as a string of numbers; we represent them as a vector of numbers. If you have a lot of words, or a lot of YouTube videos, you're going to have a lot of data, so much that it won't fit on one machine. So we use a system like this to split up those embeddings for the words into shards, and we can distribute them. And of course the main model, when it needs something, can reach out, get it, and then do the computation.

Another example is what we call triggering models. We'll say we're building a spam detector, and we have a full model, which is a very, very powerful spam detector. Maybe it looks at the words and understands the context. It's very powerful, but it's very expensive, and we can't afford to run it on every single email message we get. So what we do instead is put this triggering model in front of it. As you can imagine, there are a lot of cases where we're in a position to very quickly say "yes, this is spam" or "no, it's not". For instance, if we get an email that's from within our own domain, maybe we can just say that's
not spam, and the triggering model can quickly return that. If it's something that's difficult, it can go ahead and forward that on to the full model, which will process it.

A similar concept is this mixture of experts. In this case, let's say we want to build a system where we're going to classify the breed of either a rabbit or a cat. What we're going to do is have two models that we're going to call expert models: one that's an expert at rabbits and another that's an expert at cats. Here we're going to use a gating model that gets a picture of either a rabbit or a cat, and the only thing it's going to do is decide whether it's a rabbit or a cat and forward it on to the appropriate expert, who will process it, and we'll send that result back.

There are lots of use cases, and we're excited to see what people start to build with these remote ops.

The next thing I'll quickly mention is a REST API. This was one of the top requests on GitHub, so we're happy to be releasing this soon. This will make it much easier to integrate things with existing services. And it's nice because you don't actually have to choose: on one model server, with one TensorFlow model, you can serve either the RESTful endpoint or the gRPC one. There are three APIs. There are some higher-level ones for classification and regression, and there's also a lower-level predict, which is more of a tensor-in, tensor-out API for the things that don't fit into classify and regress.

Looking at this quickly, you can see the URI here. We can specify the model; this may be something like "rabbit" or "cat". We can optionally specify a version, and our verbs are classify, regress and predict. We have two examples. In the first one, you can see we're asking the iris model to classify something. Here we aren't giving it a model version, so it'll just use the most recent, or the highest, version automatically. The bottom example is one where we're using the MNIST model, and we're specifying the version to
be 314 and asking it to do a prediction. So this lets you easily integrate things, easily version models, and switch between them.

I'll quickly mention the API. If you're familiar with tf.Example, you know that representing it in JSON is a little bit cumbersome. You can see it's pretty verbose here, and there are some other warts, like needing to encode things in base64. Instead, with TensorFlow Serving, the RESTful API uses a more idiomatic JSON, which is much more pleasant and much more succinct. And this last example just pulls it all together: you can use curl to actually make predictions from the command line.

So I encourage you to check out the TensorFlow Serving project. There's lots of great documentation and things like that, and we also welcome contributions, code, discussion and ideas on our GitHub project page. I'd like to finish with James, to talk about TensorFlow Extended.

All right. I'm going to start with a single non-controversial statement. This has been shown true many times by many people. In short, TFX is our answer to that statement.

I'll start with a simple diagram. This core box represents your machine learning code. This is the magic bits of algorithms that actually take the data in and produce reasonable results. The blue boxes represent everything else you need to actually use machine learning reliably and scalably in a real production setting. The blue boxes are going to be where you're spending most of your time, and they comprise most of the lines of code. They're also going to be the source of most of the things that are setting off your pagers in the middle of the night. In our case, if we squint at this just about correctly, the core ML box looks like TensorFlow, and all of the blue boxes together comprise TFX.

We're going to quickly run through four of the key principles that TFX was built on. First, expressibility. TFX is going to be flexible in three ways. First of all, we're going to take advantage of the
flexibility built into TensorFlow. Using it as our trainer means that we can do anything TensorFlow can do at the model level, which means you can have wide models, deep models, supervised models, unsupervised models, tree models, anything that we can whip up together. Second, we're flexible with regard to input data. We can handle images, text, sparse data, and multimodal models where you might want to train on images and surrounding text, or something like videos plus captions. Third, there are multiple ways you might go about actually training a model. If your goal is to build a kitten detector, you may have all of your data up front, and your goal may be to build one model of sufficiently high quality, and then you're done. In contrast, if your goal is to build a viral kitten video detector or a personalized kitten recommender, then you're not going to have all of your data up front. So typically you'll train a model, get it into production, and then, as data comes in, you'll throw away that model and train a new model, and then throw away that model and train a new model. We're actually throwing out some good data along with these models, though, so we can try a warm-starting strategy instead, where we continuously train the same model, but as data comes in we warm-start based on the previous state of the model and just add the additional new data. This will result in higher-quality models with faster convergence.

Next, let's talk about portability. Each of the TFX modules represented by the blue boxes doesn't need to do all of the heavy lifting itself. They're part of an open source ecosystem, which means we can lean on things like TensorFlow and take advantage of its native portability. This means we can run locally, we can scale up and run in a cloud environment, and we can scale to devices that you're thinking about today and to devices that you might be thinking about tomorrow. A large portion of machine learning is data processing, so we rely on Apache Beam, which is built for this task.
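
As a rough illustration of the warm-starting idea described above, here is a toy sketch in plain Python. It is not TFX or TensorFlow code: the one-parameter model, the made-up data, the step counts and the learning rate are all assumptions chosen only to show that, with the same small training budget, starting from the previous model's state tends to land closer to the target than starting from scratch.

```python
# Hypothetical toy example (not TFX code): a one-parameter model y = w * x
# trained by gradient descent on mean squared error.

def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(data, w_init, steps, lr=0.05):
    """Run `steps` gradient-descent updates starting from `w_init`."""
    w = w_init
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


# Made-up data following y = 3x; the second batch "arrives later".
first_batch = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
later_batch = [(1.5, 4.5), (2.5, 7.5), (3.5, 10.5)]
all_data = first_batch + later_batch

# Initial model, trained on the data that was available up front.
w_previous = train(first_batch, w_init=0.0, steps=5)

# New data arrives and we only have a small training budget (2 steps).
w_cold = train(all_data, w_init=0.0, steps=2)         # throw the old model away
w_warm = train(all_data, w_init=w_previous, steps=2)  # warm start from its state

cold_loss = mse(w_cold, all_data)
warm_loss = mse(w_warm, all_data)
print(f"cold start loss: {cold_loss:.4f}")
print(f"warm start loss: {warm_loss:.4f}")
```

In a real pipeline the "previous state" would be a model checkpoint rather than a single number, but the trade-off is the same: the warm-started run reuses what earlier training already learned.
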
And again, we can take advantage of Beam's portability as our own, which means we can use the direct runner locally, where you might be starting out with a small piece of data, building small models to affirm that your approaches are actually correct, and then scale up into the cloud with a Dataflow runner. We could also utilize something like the Flink runner, or things that are in progress right now, like a Spark runner. We'll see the same story again with Kubernetes, where we can start with Minikube running locally, scale up into the cloud or to clusters that we have for other purposes, and eventually scale to things that don't yet exist but are still in progress.

Portability is only part of the scalability story. Traditionally, we've seen two very different roles involved in machine learning: you'll have the data scientists on one side and the production infrastructure engineers on the other side. The differences between these are not just amounts of data; there are key concerns that each has as they go about their daily business. With TFX, we can specifically target use cases that are in common between the two, as well as things that are specific to each. This will allow us to have one unified system that can scale up to the cloud and down to smaller environments, and actually unlock collaboration between these two roles.

Finally, we believe heavily in interactivity. You should be able to get quick, iterative results with responsive tooling and fast debugging, and this interactivity should remain such even at scale, with large sets of data or large models. This is a fairly ambitious goal.

So where are we now? Today we've open-sourced a few key areas of responsibility: we have Transform, Model Analysis, Serving and Facets. Each one of these is useful on its own, but is much more so when used in concert with the others. So let's walk through what this might look like in practice. Our goal here is to take a bunch of data we've accumulated and do something useful for our users of
our product. These are the steps we want to take along the way.

Let's start with step one, the data. We're going to pull this up in Facets and use it to actually analyze what features might be useful predictors, and to look for any anomalies, so outliers in our data or missing features, to try to avoid the classic garbage-in, garbage-out problem, and to try to inform what data we're going to need to further pre-process before it's useful for our ML training.

That leads into our next step, which is to actually use Transform to transform our features. TF Transform will let you do full-pass analysis and transforms of your base data, and it's also very firmly attached to the TF graph itself, which will ensure that you're applying the same transforms in training as in serving. From the code, you can see that we're taking advantage of a few ops built into Transform, and we can do things like scale, generate vocabularies, or bucketize our base data. This code will look the same regardless of our execution environment. And of course, if you need to define your own operations, you can do so.

This puts us at the point where we strongly suspect that we have data we can actually use to generate a model, so let's look at doing that. We're going to use a TensorFlow Estimator, which is a high-level API that will let us quickly define, train and export our model. There's a small set of estimators that are present in core TensorFlow; there are a lot more available, and you can also create your own. We're going to look ahead to some future steps, and we're going to purposefully export two graphs into our SavedModel: one specific to serving and one specific to model evaluation. And again, from the code you can see that, in this case, we're going to use a wide and deep model. We're going to define it, we're going to train it, and we're going to do our exports.

So now we have a model. We could just push this directly to production, but that would probably be a very bad idea. So let's try to gain a little more
confidence in what would happen if we actually did so for our end users. We're going to step into TF Model Analysis. We're going to utilize this to evaluate our model over a large data set, and then we're going to define, in this case one, but you could possibly use many, slices of this data that we want to analyze independently from the others. This will allow us to actually look at subsets of our data that may be representative of subsets of our users, and at how our metrics actually track between these groups. For example, you may have sets of users in different languages, maybe accessing from different devices, or maybe you have a very small but passionate community of rabbit aficionados mixed in with your larger community of kitten fanatics, and you want to make sure that your model will actually give a positive experience to both groups equally.

So now we have a model that we're confident in, and we want to push it to serving. Let's get this up and throw some queries at it. This is quick: now we have a model up, and we have a server listening on port 9000 for gRPC requests. Now we're going to back out into our actual product's code. We can assemble individual prediction requests, and then we can send them out to our server. And if this slide doesn't look like your actual code, and this one looks more similar, then you'll be happy to see that this is coming soon. I'm cheating a little by showing you this now as current state, but we're super excited about this, and this is one of those "real soon now" scenarios.

So that's today. What's coming next? First, please contribute and join the tensorflow.org community. We don't want the only time that we're talking back and forth here to be at summits and conferences. Secondly, some of you may have seen the TFX paper at KDD last year. This specifies what we believe an end-to-end platform actually looks like. Here it is. And when we say we believe this is what it looks like, this is what it looks like: this is actually what's powering some of the pretty awesome AI
first products that you've been seeing at I/O and that you've probably been using yourselves. But again, this is where we are now. Right now, this is not the full platform, but you can see what we're aiming for, and we'll get there eventually. So again, please download this software, use it to make good things, and send us feedback. And thank you from all of us for being current and future users, and for choosing to spend your time with us today.

[Applause] [Music]
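
The REST URI scheme described in the TensorFlow Serving part of the talk (a model name, an optional version, and one of the verbs classify, regress or predict) can be sketched with a small helper. This is a minimal sketch, not code from the talk: the function name `serving_url`, the host, the port 8501 and the sample input values are assumptions for illustration, and the `/v1/models/...` layout and the `{"instances": [...]}` request body follow the publicly documented TensorFlow Serving REST API as best understood here. Nothing is sent over the network.

```python
import json

# Verbs named in the talk: two higher-level APIs and a lower-level predict.
VERBS = ("classify", "regress", "predict")


def serving_url(model, verb, version=None, host="localhost", port=8501):
    """Build a REST URL for a served model (host and port are assumptions).

    If `version` is omitted, the server picks the highest available version.
    """
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb!r}")
    path = f"/v1/models/{model}"
    if version is not None:
        path += f"/versions/{version}"
    return f"http://{host}:{port}{path}:{verb}"


# The two examples mentioned in the talk: iris without a version,
# and MNIST pinned to version 314.
iris_url = serving_url("iris", "classify")
mnist_url = serving_url("mnist", "predict", version=314)
print(iris_url)
print(mnist_url)

# A predict request body in the row ("instances") format; the numbers
# are made-up placeholder inputs, not real model features.
predict_body = json.dumps({"instances": [[0.0, 0.5, 1.0]]})
print(predict_body)
```

A body like `predict_body` would then be POSTed to `mnist_url`, for example with curl as mentioned in the talk.
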

Original Description

This session will introduce TensorFlow Extended (TFX), TensorFlow Hub, and announce new innovations and features in TensorFlow Serving. As machine learning is evolving from experimentation to serve production workloads, so does the need to effectively manage the end-to-end training and production workflow including model management, versioning, and serving. TFX provides this solution to Google and you'll hear about the release plans to deliver it to the community. TensorFlow Hub is a central repository of reusable parts of TensorFlow models. With its libraries, you can incorporate these parts in your models for transfer learning and package them up to be served with TensorFlow Serving.

Rate this session by signing in on the I/O website here → https://goo.gl/g6f9KH
Watch more TensorFlow sessions from I/O '18 here → https://goo.gl/GaAnBR
See all the sessions from Google I/O '18 here → https://goo.gl/q1Tr8x
Subscribe to the TensorFlow channel → https://goo.gl/ht3WGe

#io18
Event: Google I/O 2018. Product: TensorFlow Extended, TensorFlow Hub. Speakers: Jeremiah Harmsen, Andrew Gasparovic, James Pine.

This video provides an overview of TensorFlow in production, covering TensorFlow Extended, TensorFlow Hub, and TensorFlow Serving. It introduces the key features and innovations of these tools, including reusable machine learning modules, distributed serving, and model evaluation. The session is designed for beginners and provides a comprehensive understanding of TensorFlow in production.

Key Takeaways

Use pre-trained models from TensorFlow Hub
Compose modules together like building blocks
Add custom classification on top of pre-trained models
Train custom models with custom data
Fine-tune pre-trained models for better accuracy
Deploy models using TensorFlow Serving
Evaluate model performance using TF Model Analysis
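
As a rough illustration of the "deploy models using TensorFlow Serving" takeaway, the sketch below builds a prediction request for TensorFlow Serving's REST predict endpoint. The talk itself demonstrates gRPC on port 9000; the REST route is used here only because it needs no generated client code. The host, the model name `my_model`, the input values, and the port 8501 default are assumptions for this example, and the request is only constructed, not sent.

```python
import json


def build_predict_request(host, model_name, instances, port=8501):
    """Build the URL and JSON body for a TensorFlow Serving REST predict call.

    host, model_name, and port are placeholders; adjust them to match
    how the model server was actually started.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # Row format: one entry in "instances" per input example.
    body = json.dumps({"instances": instances})
    return url, body


# Hypothetical single-example request with three numeric features.
url, body = build_predict_request("localhost", "my_model", [[1.0, 2.0, 5.0]])
```

The returned `url` and `body` can be passed to any HTTP client as a POST request; keeping construction separate from sending makes the payload easy to inspect and test without a running server.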

💡 TensorFlow Extended, TensorFlow Hub, and TensorFlow Serving provide a comprehensive platform for building, deploying, and evaluating machine learning models in production.
