3D Deep Learning with PyTorch3D

PyTorch · Beginner · 📰 AI News & Updates · 5y ago

Key Takeaways

PyTorch3D is a library of optimized, efficient, reusable components in PyTorch for state-of-the-art 3D deep learning tasks, providing methods for loading meshes, applying transforms, and using differentiable rendering to learn scene properties.

Full Transcript

Hi everyone, welcome to this tutorial on 3D deep learning with PyTorch3D. My name is Nikhila Ravi and I'm a research engineer in the Facebook AI Research team, working on computer vision and 3D understanding. In this tutorial I'll give you an overview of the PyTorch3D library and then walk you through how to use several components, including code examples. In particular, I'll cover the data structures, common operations such as data loading and transformations, loss functions, and differentiable rendering.

Firstly, what is PyTorch3D? It's a library of reusable components for state-of-the-art 3D deep learning research tasks. The goals of PyTorch3D are to combine the features of a good deep learning library with the features needed for working with 3D data. A key focus throughout is efficiency, modularity, and differentiability. Several components have custom CUDA implementations for fast performance. In addition, most operators natively support heterogeneous batching of 3D data, such as batching meshes of different sizes.

PyTorch3D has pre-built packages for Anaconda and can be easily installed with a few commands. It has few external dependencies. You can find detailed installation instructions on the PyTorch3D GitHub repository.

This is an overview of the main components in the code base. The foundation layer consists of data structures for 3D data, data loading utilities, and composable transforms. The data structures in particular enable the operators and loss functions in the second layer to efficiently support heterogeneous batching.

To start, let's look at the data structures for 3D data. We found that batching meshes and point clouds requires different batching strategies and the flexibility to be able to move from one representation to the other. Meshes takes as input the vertices and faces for a batch of meshes. You can start by defining a batch of meshes as a list of tensors. We can then easily switch to a packed representation, which is just a different view on the same data. With this
representation we need some auxiliary information, for example the first indices into the packed tensor for each batch element. The packed representation is useful for operations like graph convolution. We might then need to reshape the vertices to add back in the batch dimension, and this involves padding the vertices based on the number of vertices of the largest mesh in the batch. The padded representation is useful for other operators like vertex alignment (vert_align).

We can see why this flexibility is important by looking at the architecture diagram of Mesh R-CNN, a paper from ICCV 2019 which is built using PyTorch3D. The Meshes data structure is used throughout, and the representation of the vertices and faces in the batch is interchanged multiple times during the end-to-end loop.

Here's a quick code example of how you can use the Meshes data structure, easily switch between different views, and also access other properties of the mesh. We start by importing Meshes from the structures module. We can initialize the vertices and faces of all the meshes in the batch as a list of tensors. We can then initialize the Meshes class by calling the constructor with the list representations. We can switch to a different representation, such as the packed representation, by calling the appropriate method, and we can access the auxiliary tensors by calling their respective methods. Finally, we can access other computed properties of the mesh, such as the edges.

Another set of common functions are loading utilities for 3D data and composable 3D transforms. A common task for almost all projects is loading data from file, for example loading meshes. PyTorch3D provides methods for loading meshes from OBJ files. Here we load the vertices and faces and auxiliary information. The faces and aux variables are in fact named tuples which contain a number of different variables. We can get the face indices using the verts_idx key. The normals and texture information can be retrieved from the aux tuple. In many
cases you will use the data from load_obj to construct a Meshes object. In this case you can use the load_objs_as_meshes function to directly load a mesh from file into a Meshes object. The batched mesh is of type Meshes, and in this example contains a batch of three meshes.

Transforming 3D data is another common task. PyTorch3D has a general-purpose Transform3d class with subclasses to support different types of transforms. We can create separate translate and rotate transforms, both of which can be independently applied to a tensor of x, y, z points, or they can also be composed to create one combined transform. You can also use the transform methods directly on the Transform3d class. For example, here we have an xyz scaling followed by an xyz translation.

Next, let's look at some of the optimized operators in PyTorch3D. K nearest neighbors is a function that's used frequently with point clouds. Here we have two point clouds, P and Q. For a given point in cloud P, the goal is to find the K closest points in cloud Q, for example K equals five. In PyTorch3D we implement exact KNN with custom CUDA kernels that natively handle heterogeneous batches. Here's a quick code example. We import knn_points from the ops module. We can then initialize two random tensors and then call the knn_points function with the points and the desired value of K.

Another operator which is used frequently with meshes is graph convolution. Each vertex in the mesh can have an associated feature vector f_i. Graph convolution computes new feature vectors for each vertex, propagating information along edges of the mesh. For one particular node this involves two steps: one, gathering the features of all the adjacent nodes and summing them; and two, adding them back to the node's own feature vector. The GraphConv class is available in the ops module of PyTorch3D. This can be initialized using the input and output dimensions, as well as the method of initialization for the weight tensors, and whether the graph is assumed to be
directed or undirected. The GraphConv module is then called with the verts and edges of the mesh.

Next, let's look at some of the loss functions available in PyTorch3D. Chamfer loss is a method of comparing two point clouds. For example, these points might be samples from the surface of a mesh. Chamfer loss is used as a loss function in many 3D deep learning research tasks. For each point in set 1 we need to find the nearest neighbors in set 2, and then vice versa. Here is a quick example. We first import the chamfer distance function along with two helper functions: one to create a sphere mesh, and another to differentiably sample a point cloud from the surface of the mesh. We then initialize two spheres of different topologies and sample 5000 points from the surface of each of these meshes. Finally, we use these points to calculate the chamfer loss.

Lastly, let's look at the differentiable rendering module in PyTorch3D. What does having a differentiable rendering step in a training loop mean? A 3D scene can be composed of a number of different components, including a mesh with textures, light sources, and a camera, which is the viewpoint from which the image is generated. Now, how do all these scene properties come into play in differentiable rendering? Each of these properties could be a variable which we want to learn, for example the position of the camera, the intensity of the light, or the position of the mesh vertices. In the forward pass we transform a mesh and pass it through a renderer to generate an image. The image might then be used as part of a loss function. We then want to propagate gradients back through the whole system and update the scene properties. This is where the renderer needs to be differentiable, so we can learn the scene properties in an end-to-end way.

The PyTorch3D renderer is split into two parts: a rasterizer and a shader. It can take as input a heterogeneous batch of meshes and associated textures. The first step inside the rasterizer is to use a
camera to transform and project the input batch of meshes onto the 2D plane. The next step is the rasterization, from which we output four intermediate variables for each pixel, which we call the fragment data. This includes the z-buffer, 2D Euclidean distance, barycentric coordinates, and the face indices. We also output not just the closest value but the top K values for each of these variables. In the shader we continue to keep the top K values while applying shading and texturing, and finally in the blending step we aggregate across the top K values. The rasterization step is in CUDA for efficiency, but the rest of the pipeline is in PyTorch for increased modularity and ease of experimentation.

Here is a quick example of how to set up a renderer with PyTorch3D. We have more detailed examples in the tutorial section of the PyTorch3D GitHub code base. First, import the necessary components from the renderer module. Next, we need to initialize a camera, and here we use a perspective camera and the look-at transform to determine the rotate and translate transforms. Next, we can initialize the rasterization settings, which include the faces per pixel, which corresponds to the K parameter, so this determines the top K values which are returned from the rasterizer. For a full explanation of the parameters, please refer to the PyTorch3D documentation. Next, we initialize a renderer by composing a rasterizer and a shader. There can be many different types of shaders, and it's also very easy to create your own.

If the mesh or any of the scene properties had tensors with requires_grad=True, i.e. we want to learn this parameter, we can easily backpropagate through the entire system. For example, given a ground truth output image, we can calculate the loss and then directly call backward on the loss. The tutorials have more detailed examples of learning using the renderer.

In the blending step, while we aggregate across the top K values, it's very easy to try different blending functions in PyTorch. The
blending for this cube uses a softmax blending formulation from Soft Rasterizer, which can be written in a few lines of code in PyTorch.

We have three different types of mesh texturing options: vertex textures; vertex UV coordinates and a texture map; and a texture atlas, where each face has its own small R × R texture map. The texture type can be chosen based on your use case. Vertex textures are the simplest to implement. UV coordinates and texture maps enable more detailed textures but are limited to one texture map per mesh. And finally, texture atlas allows representation of complex mesh textures, such as ShapeNet meshes, which have multiple texture maps per mesh.

I want to conclude by highlighting how you can get started with PyTorch3D. On the GitHub repository we have several tutorials which take you step by step through some example use cases. These tutorials can also be run with Google Colab, so you can try the code without having to download or install anything. The tutorials include 3D shape prediction, bundle adjustment, pose optimization, and textured mesh rendering from multiple viewpoints.

Thanks a lot for listening. You can find the code on GitHub or also via the PyTorch3D website, and there you can also find links to the documentation and tutorials. We hope you found this tutorial useful, and we look forward to seeing the projects you build for the hackathon.
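
The list, packed, and padded views of a batch described in the transcript can be illustrated with a minimal sketch in plain PyTorch. This is not PyTorch3D's Meshes implementation; the variable names (verts_list, verts_packed, first_idx, verts_padded) are illustrative assumptions, and only vertices are shown, not faces.

```python
import torch

# A heterogeneous batch: two meshes with different vertex counts (3 and 5).
verts_list = [torch.rand(3, 3), torch.rand(5, 3)]

# Packed view: all vertices concatenated into one (sum_V, 3) tensor.
verts_packed = torch.cat(verts_list, dim=0)

# Auxiliary information: index of each mesh's first vertex in the packed tensor.
num_verts = torch.tensor([v.shape[0] for v in verts_list])
first_idx = torch.cumsum(num_verts, dim=0) - num_verts

# Padded view: (batch, max_V, 3), zero-padded up to the largest mesh.
verts_padded = torch.nn.utils.rnn.pad_sequence(verts_list, batch_first=True)

print(verts_packed.shape)  # torch.Size([8, 3])
print(first_idx.tolist())  # [0, 3]
print(verts_padded.shape)  # torch.Size([2, 5, 3])
```

The packed tensor wastes no memory on padding, which suits per-vertex operators, while the padded tensor restores a batch dimension at the cost of storing zeros for the smaller meshes.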

Original Description

Facebook AI Research Engineer Nikhila Ravi presents an informative overview of PyTorch3D, a library of optimized, efficient, reusable components in PyTorch for state-of-the-art 3D deep learning tasks. Efficiency, modularity, and differentiability are the key elements of PyTorch3D that bring faster performance to any 3D Deep Learning project. Subscribe to this page to get the latest news, updates, and weekly tutorials planned for the full duration of the Hackathon. Haven't signed up yet? Get involved, and learn how you could build with the community and also have a chance to win up to $25,000: https://bit.ly/2ZwLYKX

PyTorch3D is a library for state-of-the-art 3D deep learning tasks, providing methods for loading meshes, applying transforms, and using differentiable rendering to learn scene properties. This library is useful for tasks such as 3D shape prediction, bundle adjustment, and pose optimization.
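
To make "applying transforms" concrete, here is a rough sketch of composing a scale with a translation using 4x4 homogeneous matrices in plain PyTorch. It does not use PyTorch3D's transform classes; the helper names (scale_matrix, translate_matrix, apply_transform) are hypothetical, and the row-vector convention (points multiplied on the left) is an assumption of this example.

```python
import torch

def scale_matrix(sx, sy, sz):
    # Diagonal 4x4 matrix that scales x, y, z independently.
    return torch.diag(torch.tensor([sx, sy, sz, 1.0]))

def translate_matrix(tx, ty, tz):
    # Row-vector convention: the translation sits in the last row.
    m = torch.eye(4)
    m[3, :3] = torch.tensor([tx, ty, tz])
    return m

def apply_transform(matrix, points):
    # points: (N, 3) -> homogeneous (N, 4) -> transformed (N, 3).
    ones = torch.ones(points.shape[0], 1)
    homogeneous = torch.cat([points, ones], dim=1)
    return (homogeneous @ matrix)[:, :3]

points = torch.tensor([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])

# With row vectors, the left-most matrix is applied first: scale, then translate.
composed = scale_matrix(2.0, 2.0, 2.0) @ translate_matrix(1.0, 0.0, -1.0)
out = apply_transform(composed, points)
print(out.tolist())  # [[3.0, 4.0, 5.0], [1.0, 0.0, -1.0]]
```

Composing the matrices once and applying the result is equivalent to applying the scale and then the translation separately, which is the idea behind composable transforms.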

Key Takeaways
  1. Load a mesh from an OBJ file using the load_obj function, or load_objs_as_meshes to load directly into a Meshes object
  2. Create a Meshes object from a batch of meshes
  3. Apply a transform to a tensor of x, y, z points
  4. Use the k-nearest neighbors function to find the k closest points in a point cloud
  5. Compute new feature vectors for each vertex using graph convolution
  6. Initialize a camera and a rasterizer
  7. Transform and project meshes onto a 2D plane
  8. Output intermediate variables for each pixel
  9. Apply shading and texturing
  10. Aggregate across top k values
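
As an illustration of the k-nearest-neighbors step in the list above, here is a brute-force sketch in plain PyTorch. It is not the custom CUDA implementation mentioned in the talk; the function name knn_brute_force is hypothetical, it handles a single pair of clouds rather than a heterogeneous batch, and returning squared distances is a choice made for this example.

```python
import torch

def knn_brute_force(p, q, k):
    """For each point in p (N, 3), find the k nearest points in q (M, 3).

    Returns squared distances (N, k) and indices into q (N, k),
    both sorted from nearest to farthest.
    """
    sq_dists = torch.cdist(p, q) ** 2  # all pairwise squared distances, (N, M)
    return torch.topk(sq_dists, k, dim=1, largest=False)

p = torch.tensor([[0.0, 0.0, 0.0]])
q = torch.tensor([
    [3.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [10.0, 0.0, 0.0],
])

dists, idx = knn_brute_force(p, q, k=2)
print(idx.tolist())  # [[1, 2]]
```

Computing every pairwise distance costs O(N·M) memory, so this approach only suits small clouds; it is meant to show what the operator returns, not how to make it fast.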
💡 PyTorch3D provides a differentiable renderer to learn scene properties in an end-to-end way, which is useful for tasks such as 3D shape prediction, bundle adjustment, and pose optimization.
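
The end-to-end learning loop can be sketched without a renderer. In the toy example below, a simple differentiable translation stands in for the rendering step, and the "scene property" being learned is a 3D offset. Everything here (the stand-in forward pass, the mean squared error loss, the SGD settings) is an assumption chosen for illustration, not PyTorch3D's API.

```python
import torch

torch.manual_seed(0)

# Observed "ground truth": the source points shifted by an unknown offset.
source = torch.rand(100, 3)
true_offset = torch.tensor([0.5, -0.3, 0.2])
target = source + true_offset

# The scene property we want to learn, flagged for gradient tracking.
offset = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.SGD([offset], lr=0.5)

for step in range(100):
    optimizer.zero_grad()
    predicted = source + offset                # differentiable forward pass (renderer stand-in)
    loss = ((predicted - target) ** 2).mean()  # compare against the ground truth
    loss.backward()                            # gradients flow back to `offset`
    optimizer.step()

print(offset.detach())  # close to true_offset
```

With a differentiable renderer the structure is the same: the forward pass produces an image, the loss compares it with a ground truth image, and backward() updates camera, lighting, or vertex parameters.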
