PyTorch Front-End Features: Named Tensors and Type Promotion - Gregory Chanan
Key Takeaways
The video discusses PyTorch's new front-end features, specifically named tensors and type promotion, which aim to simplify the user experience and improve productivity in deep learning tasks. Named tensors allow users to name dimensions, making code more readable and maintainable, while type promotion enables automatic handling of data type mismatches in tensor operations.
Full Transcript
[Music] Hi everyone, my name is Greg Chanan and I am one of the tech leads on PyTorch. I'm going to be talking about the new front-end features available in PyTorch 1.3, or how named tensors and type promotion can simplify your PyTorch life.

On the PyTorch front end we're really guided by having the best user experience, by focusing on expressivity and productivity. You see this from the beginning of PyTorch, where we focused on building a framework around writing programs, not manually building graphs. If we apply this thinking to PyTorch today and ask how we can make PyTorch even more expressive and make using PyTorch even more productive, where can we make improvements? I think one area is around retaining semantic information. Today users start with images, with text, with video, but as soon as you start doing PyTorch operations we kind of force you to throw away that information and use an abstract mathematical object called the tensor.

So the idea behind the named tensor is super simple: it's to name dimensions. If we take our image and turn it into the typical 3D tensor format, instead of erasing what the dimensions mean we're going to name them explicitly: height, width, channel, or HWC in this example. This idea was proposed in a blog post called "Tensor Considered Harmful" by Professor Sasha Rush, who's now at Cornell Tech, and we worked closely with Professor Rush on developing this named tensor feature.

Okay, so what does this look like in code? Today you'll often see tensor dimensions being named by comments. Here we set up a tensor and name it NCHW in a comment, and then when we access the dimensions we access them by position. So here we are summing over the channels, and I know that because I can refer back to the comment. This is trivial in a two-line slide example, but in a real program with complex shape manipulation operations this can be very onerous to track.

At its core, named tensors is a simple extension to the tensor API, which is just passing names explicitly. Here we pass the names of the dimensions as N, C, H, W, and then I can access dimensions by name instead of by position. So here we're summing over the channels, just like we did on the left-hand side. This is more readable because it's closer to our intent: I want to sum over the channels, not over a dimension at some position. It's also more maintainable, because if my shape manipulation code changes, the names are automatically propagated, so if my channels end up in a different position this code is resilient to those changes.

Okay, so that's a very high-level view of the API. I'm just going to go through a quick case study on image normalization. Here's a function from torchvision which is normalizing the channels over a batch of images. If you look at the implementation in torchvision, you'll see something like this, which is using None indexing. If you're familiar with None indexing, essentially it's shifting the dimension positions by one. This is pretty hard to read and pretty unintuitive, because you can't reason from first principles what indexing a dimension by None means. It also has this other issue where there are many different formats. If you manipulate dimensions by position and you have different formats like NCHW, NHWC, etc., you have to have multiple normalization functions, and then you also have to be very careful, because we've essentially type-erased what the dimensions mean, not to call the wrong normalization function for your format.

So how can we improve this situation? We just use this nifty new function called align_as. Same setup, but now we're going to say that all the tensors are named, and we're going to again subtract the mean and divide by the standard deviation, but instead of using None indexing we're going to call align_as. Essentially what this does is shift over the channel dimension of the mean to match wherever the channels are in the batch of images. This is more readable than the None indexing, and it also works for all the different formats. No matter where the channels dimension is positionally in the image, the channels will just be shifted by name, so I only need one normalization function and I no longer need to worry about calling the right normalization function for my format.

Okay, that's a super brief introduction to named tensors. It's considered experimental in version 1.3, but we would love it if you tried it out and gave us feedback. The core functionality is in eager mode. This basically means that the top-level torch operations are supported and have name propagation rules. You can also mix named and unnamed tensors. This is a useful property so that you can add named tensors incrementally to your program; you don't have to do it all in one go. If this is interesting to you, there's a more in-depth tutorial online about supporting named tensors in multi-headed attention that goes into a lot more detail.

In the future we'll be expanding coverage to more of PyTorch. This means most of the nn package will be supported. We will propagate names through autograd: today you can run autograd on named tensors, but the gradients that come out are unnamed. And we'll do similar things for serialization, multiprocessing, distributed, and JIT.

Okay, that was named tensors. I'm going to talk briefly about type promotion, which is just a nice quality-of-life improvement that we've added in version 1.3 around mixed-dtype operations. You may have seen or written PyTorch code like this in the past: this is just adding a Python number to a tensor, and this just works even though the number is an integer and the tensor is floating point; PyTorch handles this automatically for you. But in previous versions of PyTorch, if we tried to generalize this to tensors and replace the integer number with an integer tensor, you would get an error that complained about a dtype mismatch. So essentially we've just generalized this feature to tensors. Now this will just work in 1.3, and it also works for all dtypes. I showed an example of floating-point tensors and integer tensors, but you can also mix float32 and float64 tensors, and essentially the type promotion system will just pick the minimal dtype that retains the fidelity of the data.

So type promotion is available in 1.3. Arithmetic operations, comparison operations, and a number of other operations are supported. There's full documentation on the website about all the rules. For you NumPy fans out there, the rules are very similar to NumPy's; we just made some slight tweaks to support our use case. In the future we'll be expanding coverage to the long tail of operators.

That was a brief intro to named tensors and type promotion. We'd love for you to try them and give us feedback so we can tailor them for your use case. Our hope here is really that incorporating these features into your program will make PyTorch programs more readable, more maintainable, and less error-prone, and ultimately that makes writing PyTorch even more enjoyable than it is today. Thank you. [Applause] [Music]
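The slides are not part of the transcript, so the following is only a minimal sketch of the comment-versus-names comparison the talk describes. The tensor shape and variable names are illustrative assumptions; `names=`, `refine_names`, and summing by dimension name are part of PyTorch's experimental named tensor API.

```python
import torch

# Unnamed version: the layout lives only in a comment.
imgs = torch.randn(2, 3, 4, 5)  # N, C, H, W
summed_by_position = imgs.sum(1)  # position 1 happens to hold the channels

# Named version: attach the layout to the tensor itself.
# refine_names adds names to the existing data (assumed shape, for illustration).
named_imgs = imgs.refine_names("N", "C", "H", "W")
summed_by_name = named_imgs.sum("C")  # reduce over channels by name

# Names propagate through the reduction: the "C" dimension is gone.
print(summed_by_name.names)
print(summed_by_name.shape)
```

A tensor can also be created with names directly, for example `torch.randn(2, 3, 4, 5, names=("N", "C", "H", "W"))`. PyTorch emits a warning that the feature is experimental.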
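The normalization case study can be sketched along the following lines. This is not the torchvision implementation shown in the talk; the function name `normalize`, the mean and standard deviation values, and the image shape are assumptions chosen for illustration. Only `align_as`, `align_to`, and the `names=` argument come from the named tensor API.

```python
import torch


def normalize(imgs, mean, std):
    """Normalize channels by name, independent of where "C" sits in imgs."""
    # align_as reorders and inserts size-one dimensions so that the "C"
    # dimension of mean/std lines up with the "C" dimension of imgs.
    return (imgs - mean.align_as(imgs)) / std.align_as(imgs)


# Per-channel statistics with a single named dimension (illustrative values).
mean = torch.tensor([0.5, 0.4, 0.3], names=("C",))
std = torch.tensor([0.2, 0.2, 0.2], names=("C",))

# The same batch in two layouts: NCHW, and NHWC obtained by permuting by name.
nchw = torch.rand(2, 3, 4, 5, names=("N", "C", "H", "W"))
nhwc = nchw.align_to("N", "H", "W", "C")

out_nchw = normalize(nchw, mean, std)
out_nhwc = normalize(nhwc, mean, std)

print(out_nchw.names)
print(out_nhwc.names)
```

The same function handles both layouts, which is the point made in the talk: no separate normalization function per format is needed.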
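The type promotion behavior described in the talk can be checked with a short sketch like this one. The tensor values are arbitrary; `torch.result_type` reports the dtype that an operation on the two operands would produce.

```python
import torch

float_t = torch.tensor([1.0, 2.0, 3.0])                         # float32
int_t = torch.tensor([1, 2, 3])                                 # int64
double_t = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float64)   # float64

# Python integer with a float tensor: worked before 1.3 as well.
scalar_sum = float_t + 1

# Integer tensor with a float tensor: a dtype mismatch error before 1.3.
mixed_sum = float_t + int_t

# float32 with float64: the result is promoted to the wider dtype.
wide_sum = float_t + double_t

print(scalar_sum.dtype)                      # torch.float32
print(mixed_sum.dtype)                       # torch.float32
print(wide_sum.dtype)                        # torch.float64
print(torch.result_type(float_t, int_t))     # torch.float32
```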
Original Description
Cornell University’s Sasha Rush has argued that, despite its ubiquity in deep learning, the traditional implementation of tensors has significant shortcomings, such as exposing private dimensions, broadcasting based on absolute position, and keeping type information in documentation. He proposed named tensors as an alternative approach. PyTorch now supports the ability to name tensors, allowing for clearer code with less need for inline comments.