Relational Inductive Biases, Deep Learning, and Graph Networks by Battaglia et al. (2018) Overview

Omar Hussein
Jul 10, 2023


The paper aims to provide a unifying framework that bridges graph-based methods and deep learning. The authors argue that graph networks (GNs) offer a flexible way to combine the two: they impose a strong relational inductive bias while remaining applicable to a wide range of problems and data types.

Here are the main concepts:

  1. Relational Inductive Biases: An inductive bias is an assumption built into a learning algorithm that steers it toward one solution over others when multiple solutions are consistent with the data. A relational inductive bias is the assumption that the relationships between entities (i.e., the graph structure) matter for learning.
  2. Graph Networks (GNs): The authors introduce a class of models called Graph Networks, which reason over graph-structured data. GNs can be seen as a generalization of architectures such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to graph-structured inputs. The authors present a general GN block, which takes a graph as input and returns a graph as output. The block comprises three stages: edge update, node update, and global update. In each stage, information is aggregated from the relevant elements (edges, nodes, or the whole graph), transformed by a learnable update function, and combined with the existing attributes (a minimal sketch of this block follows this list).
  3. Graph Representations: The authors use a graph representation that includes global attributes (features of the entire graph), node attributes (features of individual nodes), and edge attributes (features of individual edges). This is more general than the standard graph-theoretic representation, which typically includes only nodes and edges.
  4. Learning and Generalization: The authors also explore how GNs support learning and generalization. They argue that the relational inductive bias gives GNs strong combinatorial generalization: a GN can be trained on graphs of one size or topology and applied to other graphs with similar structural properties.
  5. Applications: The paper shows how GNs can be applied to a wide range of problems, including physical systems, combinatorial optimization, multi-agent systems, and social and biological networks.
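
As a rough illustration of the three-stage update described in point 2, here is a sketch of a single GN block in NumPy. The names gn_block, phi_e, phi_v, and phi_u, the argument layout, and the choice of sum/mean as aggregators are assumptions made for illustration; the paper treats the update and aggregation functions as configurable components rather than fixing one implementation.

```python
# Minimal sketch of one Graph Network (GN) block, following the
# edge -> node -> global update order from Battaglia et al. (2018).
# phi_e, phi_v, phi_u stand in for learnable update models.
import numpy as np

def gn_block(V, E, senders, receivers, u, phi_e, phi_v, phi_u):
    """One GN block pass.

    V: (num_nodes, d_v) node attributes
    E: (num_edges, d_e) edge attributes
    senders, receivers: (num_edges,) integer arrays of node indices
    u: (d_u,) global attribute
    phi_e, phi_v, phi_u: per-edge, per-node, and global update functions
    """
    # 1. Edge update: each edge sees its own attributes, its two
    #    endpoint nodes, and the global attribute.
    E_new = np.stack([
        phi_e(E[k], V[senders[k]], V[receivers[k]], u)
        for k in range(E.shape[0])
    ])

    # 2. Node update: aggregate (here: sum) the updated incoming edges
    #    per node, then combine with the node and global attributes.
    V_new = []
    for i in range(V.shape[0]):
        incoming = E_new[receivers == i]
        e_agg = incoming.sum(axis=0) if len(incoming) else np.zeros(E_new.shape[1])
        V_new.append(phi_v(e_agg, V[i], u))
    V_new = np.stack(V_new)

    # 3. Global update: aggregate all updated edges and nodes
    #    (here: mean), then combine with the existing global attribute.
    u_new = phi_u(E_new.mean(axis=0), V_new.mean(axis=0), u)
    return V_new, E_new, u_new
```

In the paper's framework, the three update functions would typically be small neural networks trained end to end, and the sum/mean aggregations can be swapped for any permutation-invariant function.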

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Global features in a Graph Network:

  1. The global features, denoted by u in the paper, capture information about the entire graph. These are attributes or characteristics that represent the graph as a whole, not just individual nodes or edges. For example, in a social network graph, a global feature could be the overall density of the graph, the average number of friends per user, or the total number of communities.
  2. The calculation of these global features depends on the specific problem and the data you’re dealing with. In many cases, they are computed as a function of node and edge attributes. In the Graph Network (GN) block described in the paper, the global features are updated in the “global update” stage. Here, information from the nodes and edges is aggregated (for example, by summing or averaging), and then combined with the existing global features using a learnable function. This update process allows the global features to capture complex characteristics of the graph structure and the node and edge features.
  3. Process of a Graph Network with an example: Let's take a social network graph where nodes represent people, edges represent friendships, and a global attribute represents the overall density of the network (a small numeric trace of this example follows the list).
  • Edge Update: In this stage, the attributes of each edge are updated based on the edge's own attributes, the attributes of the two nodes it connects (the sender and receiver), and the global attribute. For example, if an edge attribute represents the strength of a friendship, it could be updated based on the attributes of the two people it connects (such as common interests) and the overall density of the network.
  • Node Update: Here, the attributes of each node are updated based on the node’s own attributes, the updated attributes of the edges connected to it, and the global attribute. For instance, a person’s “influence score” could be updated based on their individual characteristics, the strength of their friendships, and the overall network density.
  • Global Update: Finally, the global attributes are updated based on the updated node and edge attributes. In our example, the overall network density could be updated based on the updated friendship strengths and individual influence scores.
  • The output of the GN block is a new graph where the nodes, edges, and global attributes have all been updated.
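
The small trace below walks through this example numerically. The feature meanings (activity level, friendship strength, influence score) and the hand-written update rules are illustrative assumptions chosen so the arithmetic is easy to follow; in a real GN these rules would be learnable functions as in the sketch earlier.

```python
# Self-contained numeric trace of the social-network example above.
import numpy as np

# Nodes: three people, each with one attribute (activity level).
activity = np.array([0.9, 0.4, 0.7])

# Edges: directed friendships sender -> receiver, each with one
# attribute (friendship strength).
senders = np.array([0, 1, 2])
receivers = np.array([1, 2, 0])
strength = np.array([0.8, 0.3, 0.5])

# Global attribute: density = edges / possible directed edges.
n = len(activity)
density = len(senders) / (n * (n - 1))          # 3 / 6 = 0.5

# 1. Edge update: each friendship strength is nudged by how active its
#    two endpoints are, scaled by the overall density.
strength_new = strength + 0.1 * density * (activity[senders] + activity[receivers])

# 2. Node update: each person's "influence score" combines their own
#    activity with the mean updated strength of their incoming
#    friendships, again scaled by the density.
influence = np.array([
    activity[i] + density * strength_new[receivers == i].mean()
    for i in range(n)
])

# 3. Global update: summarize the whole graph, e.g. the old density
#    blended with the mean updated friendship strength.
global_new = 0.5 * density + 0.5 * strength_new.mean()

print("updated friendship strengths:", strength_new)
print("influence scores:", influence)
print("updated global attribute:", global_new)
```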
