How to use edge features in Graph Neural Networks (and PyTorch Geometric)

DeepFindr · Beginner ·🧬 Deep Learning ·5y ago


Key Takeaways

The video discusses how to use edge features in Graph Neural Networks (GNNs) and PyTorch Geometric, covering topics such as edge weights, edge types, and edge features, with references to research papers and implementations in PyTorch Geometric.

Full Transcript

Hello everyone. Today we want to have a closer look at how we can include edge features in graph neural networks. I assume in the following that you are familiar with GNNs or have watched the corresponding series I recently uploaded. At the end, I'll also quickly show how edge features can be used in PyTorch Geometric.

Let's start with the question of why edge features are even important. Isn't the information in a node sufficient to create meaningful embeddings? A typical graph can be a social network like the one shown here. Node features in our graph are, for instance, the age of the people, their weight, or whether they smoke. Additionally, we know for each of our nodes if they like the movie The Hobbit. These are the labels in this example. Now let's assume we have a new person joining our graph, and of course we are immediately interested in whether the person is also a Hobbit fan. To answer this question, we can build a graph neural network that combines the node features and the connections of the nodes in order to classify this new member of the network. Doing so, all we use is whether two people are connected or not. But there is so much more information we can get out of this relationship. If we had edge features that describe the type of connection, such as since when the people are friends or if they live together, we would have a valuable additional source of information. And this is the case for many applications of graph neural networks: by adding further properties to the edges, so not just the binary information, we can empower the GNN to get much better.

In the following, I want to show a couple of ways edge features are typically utilized in the literature, to help you get started. Using edge features in graph neural networks is still a hot research topic, and there are different ways we can do this. As you might know, edge features are, just like node features, nothing else but a vector of values. Let's start with the most basic form of this vector: a single binary value. This simply means either we
have a connection or not. If we have a look at our simple graph, we can easily represent the connections in a matrix. Numerically, this can be converted to either one or zero, and voilà, that's the adjacency matrix of our graph. It's symmetric about the diagonal, as we have bidirectional edges in our social network.

To make sure that we are on the same page, let's quickly have a look at how this basic edge information is utilized in a regular GNN. I'll try to generalize the overall process in graph neural networks to make sure we have the same thing in mind. Let me tell you that it's not straightforward to generalize all of the different GNN variants into one summary, so please forgive me if there's an approach that doesn't fit perfectly into that pattern.

Say we want to generate a node embedding for Alice. What we always do is collect the neighbor nodes, in our case those two gentlemen, with the node feature vectors in blue. Next, we prepare the messages for the message passing step. Most GNNs therefore apply some sort of differentiable transformation to these node features in order to get a higher-level representation. This can be simply a multi-layer perceptron, but also things like ReLU. These transformed representations are then aggregated in some way. The important thing here is that this aggregation is permutation invariant, which means the order of our nodes is not relevant. These aggregations are often also normalized according to the degree of the node, that is, how many neighbors a node has. What we retrieve is a summarized representation of Alice's neighborhood in the graph. Finally, we combine the original node features with the aggregated neighbor embedding, and this can again be any differentiable function, such as another MLP, a gated recurrent unit, or just a sum. We obtain a new embedding for Alice that contains information about her and her neighbors. This embedding can be used to perform a prediction for our Hobbit classification by using another fully connected layer. Then
we can calculate the loss, so how far we are away from the correct prediction, and then we adjust all the learnable matrices in our layers, such as transform and update. That's exactly the reason why they need to be differentiable: we want to be able to calculate gradients. So that's how we perform representation learning in a nutshell. We can summarize this procedure in the following formula. Again, there exist many different variants, so this might deviate from approach to approach. For instance, we can add self-loops and simplify the formula like this, as Alice herself is now part of her neighborhood.

Okay, so now back to the original question: where is the edge information used in this process? The basic binary edge information is used directly when we select the neighbor nodes. For this selection we of course don't loop over all nodes. Instead, in a GNN layer, matrix multiplications are performed. When we multiply the adjacency matrix with the feature matrix, this neighborhood aggregation is implicitly performed. All non-adjacent nodes are basically zeroed out, and we only share information between the nodes that are directly connected. So in our formula, this part stands for the multiplication with the adjacency matrix. So far so good.

Now the first trivial option to utilize edge features is by using edge weights. That simply means that instead of ones and zeros, we have weights in the adjacency matrix. For instance, we could encode how happy the people are with the other person. Okay, this is a stupid example, but for instance Alice likes it a lot to spend time with her boyfriend, but not vice versa. That's why we put a 0.9 and a 0.4 here. Let's have a look at this propagation formula in matrix form from the GCN paper. The first part is the normalized adjacency matrix, X is the current node feature matrix, and the last part is the multiplication with the learnable weight matrix. X prime is the new embedding. It's straightforward to now replace the adjacency matrix with the weighted adjacency
matrix, and as a result, people Alice is close to are more emphasized in the propagation. This usage of edge weights can be easily added in most graph neural network implementations.

Now imagine we not only have a weight for the connection but also use different types of connections. In our social network, we would for instance differentiate between relationship types such as friends, couple, or colleagues. If we have such a setup, our edge features are simply one-dimensional vectors with integer values. This typically occurs, for instance, when working with molecule data, as you have single, double, or triple bonds. There exist several papers on how we can include such discrete edge types in a GNN.

The first approach we want to have a look at is called the relational graph convolutional network, from the paper referenced below. Let's quickly investigate this propagation formula. We calculate Alice's new embedding by summing over the neighbor nodes, so this is our aggregation, and applying an MLP transformation to each of these node feature vectors. Finally, a non-linear function such as ReLU is applied to generate a new embedding. The green section is just a normalization, and the last part of this formula is another transformation applied to Alice's original node features, which doesn't really fit into my structure here. The new part here is that we have the sum over r, and this sum simply represents the different relations we have, so the edge types. You see that the weight matrix is indexed with this r as well. That simply means that, depending on the type of edge, we apply different transformations to the nodes. This is sometimes also called an edge-conditioned GNN. If we visualize this, we quickly see that, depending on the type of Alice's neighbor (friend, couple, or colleague), we pass the node vector through the corresponding weight matrix. Doing so, we can include the edge information, as we have different transformations applied based on the type of connection. As a consequence, we will of course also
have different adjacency matrices: one that holds the information for friends, one for couple connections, and finally another one for colleagues. Also note how the embedding of Alice's partner is yellow and the embedding of her friend is blue, as they went through different transformations. So as you see, this first approach is pretty intuitive.

Let's have a look at the next paper, which is called Graph Neural Networks with Feature-wise Linear Modulation. The propagation formula looks slightly different, but regarding edges we have exactly the same concept. Here we again sum over all neighbors but differentiate between the types of connection l, and this l is also the index of our transformation matrix W. So the transformation we apply to each neighbor node vector depends on the relationship with Alice. There are a couple of other things happening in this formula, but we can ignore them, as we just want to look at edge-related things here. For including different edge types, other similar papers exist, but I think you get the point of how this can be handled. I found this overview in the GNN-FiLM paper, which provides a nice summary. It shows how different node features a, b, c, d are multiplied with separate weight matrices. The little arrow that appears in the index of some weight matrices stands for self-loops.

Now let's have a look at the most interesting and also most general case: what if we have multi-dimensional vectors for each of our edges? This is basically what we had in the introductory example, when we added since when people are friends or if they live together. One way to handle these edge features is to directly integrate them into the transformation of the neighborhood states. Let's have a look at the general propagation formula presented in the message passing neural network paper. Here we see that we can include the edge features e between node w and node v in the transformation step when we calculate the embedding. You see that we have two indices here, so that's the edge
information from node w to node v. Another way to think of these edge features is as an adjacency matrix that has vectors instead of ones and zeros, so the zero here stands for an edge feature vector filled with zeros. The shape of our adjacency matrix is then number of nodes times number of nodes times the dimension of the edge features. This is just a side note and happens mostly internally when multiplying the different matrices.

Again, let's have a look at a couple of papers to understand how we can include these multi-dimensional edge features. In the paper Neural Message Passing for Quantum Chemistry, the authors simply input both the node features and the edge features into the message function. This transformation is typically a multi-layer perceptron, so we can visualize it like this. In the case of Alice, we always take Alice's embedding, her direct neighbor, and in between the edge features for that connection. This way we include the edge features in our transformed representation. The other things in this propagation formula are already familiar to us: we perform some sort of aggregation and combine the representation through an update function with Alice's original node features. Pretty intuitive, right?

A similar idea can be found in the paper Principal Neighbourhood Aggregation for Graph Nets. Here we also simply include the edge features in the transformation step, as shown here. The paper about crystal graph convolutional nets uses the same approach, and we can easily see how the edge features, denoted with u here, are concatenated with the node features v in order to obtain a vector for each node-edge-node triple. This combined vector is then again transformed by multiplying it with a learnable weight matrix. Again, there exist other papers that share similar approaches for multi-dimensional edge features, and I'll link a couple of them here for completeness. Most of them are also implemented in PyTorch Geometric.

So now we've already seen a lot of ways how
we can include edge features in graph neural networks. Finally, another way to use them is to create edge embeddings. That's like creating node embeddings but using the edge features instead. This is the last approach we will quickly investigate in this video.

One recent paper, displayed on the right, uses a so-called hierarchical dual-level attention mechanism. That simply means they have alternating layers: one that updates node embeddings, then one that creates edge embeddings, and so on. The propagation formulas look like this. We can see that the edge features are used in both layers to generate new embeddings. The left layer generates node embeddings and the right layer edge embeddings. Additionally, they use the attention mechanism and thus learn how important specific nodes or edges are for the new embedding. The importance coefficients are alpha and beta here. So to summarize, this approach iteratively updates node and edge embeddings in order to merge both sources of information.

Similar to the previous paper, this next approach also incorporates the edge features when calculating the attention coefficients. Here only one layer is required, as both the node and edge embeddings are updated simultaneously. The edge embeddings are simply set to the calculated attention coefficients alpha. So instead of using the adjacency matrix and calculating the node embeddings as on the left here, we now use both the edge and node features to update the embeddings.

Again, there exist a couple of other papers that go in a similar direction, and I display some of them here on this page. CensNet, for instance, also alternates node and edge embedding layers, but without using the attention mechanism. The co-embedding of nodes and edges is basically the same paper, as it comes from the same group of researchers.

So now we've seen many different ways we can use edge features in GNNs. Finally, let's quickly talk about how we can use these approaches in the popular GNN library PyTorch Geometric. All
you have to do is navigate to the documentation and scan the different layers for the following arguments. If you find edge_weight as an argument for a layer, that simply means you can pass values other than 0 or 1 to the adjacency matrix. edge_type means that the implementation can work with different edge types, as we've seen before. Finally, if you find edge_attr, that means the layer can handle multi-dimensional edge features. For more recent papers with edge embeddings, there's currently not so much available, but I can imagine that the implementations will follow soon. Otherwise, you can always create a pull request with your own implementation of a paper and help the deep learning community with this contribution.

Let's quickly have a look at two examples in PyTorch Geometric. Okay, so here we are on the documentation page, and you can see we have this RGCNConv layer. On GitHub, the PyTorch Geometric repository also has an examples folder where you find different examples, and here you can see there's one example for this RGCN paper. Here we import this layer, and we can directly use it in our model definition. We can now specify the number of relations, so the number of edge types we have, and down here in the usage you can see we pass the edge types of our dataset to our model.

The second example is for this NNConv layer. Again, if I click on this, you can see the propagation formula here, and down here you find edge_attr, which, as I said, stands for multi-dimensional edge features. If we go to GitHub again and look at the examples folder, we find another example for this NNConv layer. It's simply imported here, and you can see that in this function the edge attributes are calculated in some way. The NNConv layer is defined as conv1 here and another as conv2 here, and in the forward function we now pass edge_attr, so our multi-dimensional edge features, to this layer and simply include it as described in the paper.

So now that's it for
this video. We've seen different possibilities for using edge features in graph neural networks. I hope this helped you as a starting point, and I'm pretty sure we will see many new approaches in the next years.

But wait, there's one more thing: who are these people? Well, they were of course created with a generative adversarial network, and I really wondered whether I need to cite them or not. That's actually also an interesting thought in my opinion: who do you cite if an AI creates things like text or images? Leave a comment with what you think, and I'll see you soon in the next video.

Original Description

In this video I talk about edge weights, edge types and edge features and how to include them in Graph Neural Networks. :)

▬▬ Papers ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Edge types:
- Modeling Relational Data with Graph Convolutional Networks (https://arxiv.org/pdf/1703.06103.pdf)
- GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation (https://arxiv.org/pdf/1906.12192.pdf)
Multidim. edge features:
- Neural Message Passing for Quantum Chemistry (https://arxiv.org/pdf/1704.01212.pdf)
- Principal Neighbourhood Aggregation for Graph Nets (https://arxiv.org/pdf/2004.05718.pdf)
- Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties (https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.120.145301)
Edge feature embeddings:
- NENN: Incorporate Node and Edge Features in Graph Neural Networks (http://proceedings.mlr.press/v129/yang20a.html)
- Exploiting Edge Features in Graph Neural Networks (https://arxiv.org/pdf/1809.02709.pdf)

▬▬ Timestamps ▬▬▬▬▬▬▬▬▬▬▬
00:00 Introduction
05:10 Edge weights
06:10 Edge types / relations
09:21 Multidim. edge features
12:04 Edge feature embeddings
13:52 Pytorch Geometric

▬▬ Support me if you like 🌟
►Link to this channel: https://bit.ly/3zEqL1W
►Support me on Patreon: https://bit.ly/2Wed242
►Buy me a coffee on Ko-Fi: https://bit.ly/3kJYEdl

Key Takeaways

Understand the basics of Graph Neural Networks (GNNs)
Learn how to represent edge features in GNNs
Implement edge features in PyTorch Geometric
Apply edge features to improve GNN models
Read and reproduce research papers on edge features in GNNs
Use PyTorch Geometric's RGCNConv and NNConv layers for edge features
Create edge embeddings using edge features and a hierarchical dual-level attention mechanism

💡 Edge features can be used to improve the performance of Graph Neural Networks (GNNs) by providing additional information about the relationships between nodes in the graph.
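To make this takeaway concrete, the simplest case from the video (edge weights in place of a binary adjacency matrix) can be sketched in plain Python. This is only an illustration under stated assumptions: the function name aggregate_neighbors, the three-node graph, and all feature values are hypothetical, and a weighted mean is used here as the normalization rather than the symmetric normalization from the GCN paper. Only the 0.9 and 0.4 weights echo the example in the talk.

```python
def aggregate_neighbors(features, adjacency):
    """Return, for every node, the weighted mean of its neighbors' feature vectors.

    adjacency[i][j] is the weight of the message sent from node j to node i;
    0 means "no edge", so non-neighbors drop out of the sum.
    """
    dim = len(features[0])
    aggregated = []
    for row in adjacency:
        total = sum(row)
        summed = [0.0] * dim
        for weight, neighbor in zip(row, features):
            for k in range(dim):
                summed[k] += weight * neighbor[k]
        # Normalize by the total incoming weight (the degree in the binary case).
        aggregated.append([s / total if total else 0.0 for s in summed])
    return aggregated


# Hypothetical toy graph: node 0 = Alice, node 1 = her partner, node 2 = a friend.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Binary adjacency: only "connected or not".
binary = [[0, 1, 1],
          [1, 0, 0],
          [1, 0, 0]]

# Weighted adjacency: Alice weights her partner with 0.9, he weights her with 0.4.
weighted = [[0.0, 0.9, 0.1],
            [0.4, 0.0, 0.0],
            [0.1, 0.0, 0.0]]

alice_binary = aggregate_neighbors(features, binary)[0]
alice_weighted = aggregate_neighbors(features, weighted)[0]
print(alice_binary, alice_weighted)
```

With the binary matrix, both neighbors contribute equally to Alice's aggregated neighborhood. With the weighted matrix, the result is pulled toward the partner's feature vector, which is the "more emphasized in the propagation" effect described in the transcript.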

Chapters (6)

0:00 Introduction

5:10 Edge weights

6:10 Edge types / relations

9:21 Multidim. edge features

12:04 Edge feature embeddings

13:52 Pytorch Geometric
