Graph ML in 2023: The State of Affairs | by Michael Galkin | Jan, 2023

By Jessie Hobb On Jan 2, 2023

STATE OF THE ART DIGEST

Hot trends and major advancements

2022 comes to an end and it is about time to sit down and reflect upon the achievements made in Graph ML as well as to hypothesize about possible breakthroughs in 2023. Tune in 🎄☕

Background image generated by DALL-E 2, text added by Author.

The article is written together with Hongyu Ren (Stanford University), Zhaocheng Zhu (Mila & University of Montreal). We thank Christopher Morris and Johannes Brandstetter for the feedback and helping with the Theory and PDE sections, respectively. Follow Michael, Hongyu, Zhaocheng, Christopher, and Johannes here on Medium and Twitter for more graph ml-related discussions.

Generative diffusion models in the vision-language domain were the headline topic in the Deep Learning world in 2022. While generating images and videos is definitely a cool playground to try out different models and sampling techniques, we’d argue that

the most useful applications of diffusion models in 2022 were actually created in the Geometric Deep Learning area focusing on molecules and proteins

In our recent article, we were pondering whether “Denoising Diffusion Is All You Need?”.

There, we reviewed newest generative models for graph generation (DiGress), molecular conformer generation (EDM, GeoDiff, Torsional Diffusion), molecular docking (DiffDock), molecular linking (DiffLinker), and ligand generation (DiffSBDD). As soon as the post went public, several amazing protein generation models were released:

Chroma from Generate Biomedicines allows to impose functional and geometric constraints, and even use natural language queries like “Generate a protein with CHAD domain” thanks to a small GPT-Neo trained on protein captioning;

*Chroma protein generation. Source:* *Generate Biomedicines*

RoseTTaFold Diffusion (RF Diffusion) from the Baker Lab and MIT is packed with the similar functionality also allowing for text prompts like “Generate a protein that binds to X” as well as being capable of functional motif scaffolding, scaffolding enzyme active sites, and de novo protein design. Strong point: 1000 designs generated with RF Diffusion were experimentally synthesized and tested in the lab!

*RF Diffusion. Source:* *Watson et al.* *BakerLab*

The Meta AI FAIR team made amazing progress in protein design purely with language models: mid-2022, ESM-2 was released, a protein LM trained solely on protein sequences that outperforms ESM-1 and other baselines by a huge margin. Moreover, it was then shown that encoded LM representations are a very good starting point for obtaining the actual geometric configuration of a protein without the need for Multiple Sequence Alignments (MSAs) — this is done via ESMFold. A big shoutout to Meta AI and FAIR for publishing the model and the weights: it is available in the official GitHub repo and on HuggingFace as well!

Scaling ESM-2 leads to better folding prediction. Source: Lin, Akin, Rao, Hie et al

🍭 Later on, even more goodies arrived from the ESM team: Verkuil et al. find that ESM-2 can generate de novo protein sequences that can actually be synthesized in the lab and, more importantly, do not have any match among known natural proteins. Hie et al. propose pretty much a new programming language for protein designers (think of it as a query language for ESMFold) — production rules organized in a syntax tree with constraint functions. Then, each program is “compiled” into an energy function that governs the generative process. Meta AI also released the biggest Metagenomic Atlas, but more on that in the Datasets section of this article.

In the antibody design area, a similar LM-based approach is taken by IgLM by Shuai, Ruffolo, and Gray. IGLM generates antibody sequences conditioned on chain and species id tags.

Finally, we’d highlight a few works from Jian Tang’s lab at Mila. MoleculeSTM by Liu et al. is a CLIP-like text-to-molecule model (plus a new large pre-training dataset). MoleculeSTM can do 2 impressive things: (1) retrieve molecules by text description like “triazole derivatives” and retrieve text description from a given molecule in SMILES, (2) molecule editing from text prompts like “make the molecule soluble in water with low permeability” — and the model edits the molecular graph according to the description, mindblowing 🤯

Then, ProtSEED by Shi et al. is a generative model for protein sequence and structure simultaneously (for example, most existing diffusion models for proteins can do only one of those at a time). ProtSEED can be conditioned on residue features or pairs of residues. Model-wise, it is an equivariant iterative model with improved triangular attention. ProtSEED was evaluated on Antibody CDR co-design, Protein sequence-structure co-design, and Fixed backbone sequence design.

Molecule editing from text inputs. Source: Liu et al.

Besides generating the protein structures, there are also some works for generating protein sequences from structures, known as inverse folding. Don’t forget to check out the ESM-IF1 from Meta and the ProteinMPNN from the Baker Lab.

What to expect in 2023: (1) performance improvements of diffusion models such as faster sampling and more efficient solvers; (2) more powerful conditional protein generation models; (3) more successful applications of Generative Flow Networks (GFlowNets, check out the tutorial) to molecules and proteins.

AI4Science becomes the frontier of equivariant GNN research and its applications. Pairing GNNs with PDEs, we can now tackle much more complex prediction tasks.

In 2022, this frontier expanded to ML-based Density Functional Theory (DFT) and Force fields approximations used for molecular dynamics and material discovery. The other growing field is Weather simulations.

We would recommend the talk by Max Welling for a broader overview of AI4Science and what is now enabled by using Deep Learning in science.

Starting with models, 2022 has seen a surge in equivariant GNNs for molecular dynamics and simulations, e.g., building upon NequIP, Allegro by Musaelian, Batzner, et al. or MACE by Batatia et al. The design space for such models is very large, so refer to the recent survey by Batatia, Batzner, et al. for an overview. A crucial component for most of them is the e3nn library (paper by Geiger and Smidt) and the notion of tensor product. We highly recommend a great new course by Erik Bekkers on Group Equivariant Deep Learning to understand the mathematical foundations and catch up with the recent papers.

⚛️ Density Functional Theory (DFT) calculations are one of the main workhorses of molecular dynamics (and account for a great deal of computing time in big clusters). DFT is O(n³) to the input size though, so can ML help here? In Learned Force Fields Are Ready For Ground State Catalyst Discovery, Schaarschmidt et al. present the experimental study of models of learned potentials — turns out GNNs can do a very good job in linear O(n) time! The Easy Potentials approach (trained on Open Catalyst data) turns out to be quite a good predictor especially when paired with a postprocessing step. Model-wise, it is an MPNN with the Noisy Nodes self-supervised objective.

In Forces are not Enough, Fu et al. introduce a new benchmark for molecular dynamics — in addition to MD17, the authors add datasets on modeling liquids (Water), peptides (Alanine dipeptide), and solid-state materials (LiPS). More importantly, the authors consider a wide range of physical properties like stability of simulations, diffusivity, and radial distribution functions. Most SOTA molecular dynamics models were probed including SchNet, ForceNet, DimeNet, GemNet (-T and -dT), and NequIP.

In crystal structure modeling, we’d highlight Equivariant Crystal Networks by Kaba and Ravanbakhsh — a neat way to build representations of periodic structures with crystalline symmetries. Crystals can be described with lattices and unit cells with basis vectors that are subject to group transformations. Conceptually, ECN creates edge index masks corresponding to symmetry groups, performs message passing over this masked index, and aggregates the results of many symmetry groups.

Even more news on material discovery can found in the proceedings of the recent AI4Mat NeurIPS workshop!

☂️ ML-based weather forecasting made a huge progress as well. In particular, GraphCast by DeepMind and Pangu-Weather by Huawei demonstrated exceptionally good results outperforming traditional models by a large margin. While Pangu-Weather leverages 3D/visual inputs and Visual Transformers, GraphCast employs a mesh MPNN where Earth is split into several hierarchy levels of meshes. The deepest level has about 40K nodes with 474 input features and the model outputs 227 predicted variables. The MPNN follows the “encoder-processor-decoder” and has 16 layers. GraphCast is autoregressive model w.r.t. the next timestep prediction, that is, it takes previous two states and predicts the next one. GraphCast can build a 10-day forecast in <60 seconds on a single TPUv4 and is much more accurate than non-ML forecasting models. 👏

Encoder-Processor-Decoder mesh MPNN in GraphCast. Source: Lam, Sanchez-Gonzalez, Willson, Wirnsberger, Fortunato, Pritzel, et al.

What to expect in 2023: We expect to see a lot more focus on computational efficiency and scalability of GNNs. Current GNN-based force-fields are obtaining remarkable accuracy, but are still 2–3 orders of magnitude slower than classical force-fields and are typically only deployed on a few hundred atoms. For GNNs to truly have a transformative impact on materials science and drug discovery, we will see many folks tackling this issue, be it through architectural advances or smarter sampling.

In 2022, 1️⃣ we got a better understanding of oversmoothing and oversquashing phenomena in GNNs and their connections to algebraic topology; 2️⃣ using GNNs for PDE modeling is now mainstream.

1️⃣ Michael Bronstein’s lab made huge contributions to this problem — check those excellent posts on Neural Sheaf Diffusion and framing GNNs as gradient flows

And on GNNs as gradient flows:

2️⃣ Using GNNs for PDE modeling became a mainstream topic. Some papers require the 🤯 math alert 🤯 warning, but if you are familiar with the basics of ODEs and PDEs it should be much easier.

Message Passing Neural PDE Solvers by Brandstetter, Worrall, and Welling describe how message passing can help solving PDEs, generalize better, and get rid of manual heuristics. Furthermore, MP-PDEs representationally contain classic solvers like finite differences.

Source: Brandstetter, Worrall, and Welling

The topic was developed further by many recent works including continuous forecasting with implicit neural representations (Yin et al.), supporting mixed boundary conditions (Horie and Mitsume), or latent evolution of PDEs (Wu et al.)

What to expect in 2023: Neural PDEs and their applications are likely to expand to more physics-related AI4Science subfields, where especially computational fluid dynamics (CFD) will potentially be influenced by GNN based surrogates in the coming months. Classical CFD is applied to a wide range of research and engineering problems in many fields of study, including aerodynamics, hypersonic and environmental engineering, fluid flows, visual effects in video games, or weather simulations as discussed above. GNN based surrogates might augment/replace traditional well-tried techniques such as finite element methods (Lienen et al.), remeshing algorithms (Song et al.), boundary value problems (Loetsch et al.), or interactions with triangularized boundary geometries (Mayr et al.).

The neural PDE community is starting to build strong and commonly used baselines and frameworks, which will in return help to accelerate the progress, e.g. PDEBench (Takamoto et al.) or PDEArena (Gupta et al.)

Definitely one of the main community drivers in 2022, graph transformers (GTs) evolved a lot towards higher effectiveness and better scalability. Several outstanding models published in 2022:

👑 GraphGPS by Rampášek et al. takes the title of “GT of 2022” thanks to combining local message passing, global attention (optionally, linear for higher efficiency), and positional encodings that led to setting a new SOTA on ZINC and many other benchmarks. Check out a dedicated article on GraphGPS

GraphGPS served as a backbone of GPS++, the winning OGB Large Scale Challenge 2022 model on PCQM4M v2 (graph regression). GPS++, created by Graphcore, Valence Discovery, and Mila, incorporates more features including 3D coordinates and leverages sparse-optimized IPU hardware (more on that in the following section). GPS++ weights are already available on GitHub!

GraphGPS intuition. Source: Rampášek et al

Transformer-M by Luo et al. inspired many top OGB LSC models as well. Transformer-M adds 3D coordinates to the neat mix of joint 2D-3D pre-training. At inference time, when 3D info is not known, the model would infer a glimpse of 3D knowledge which improves the performance on PCQM4Mv2 by a good margin. Code is available either.

*Transformer-M joint 2D-3D pre-training scheme. Source:* *Luo et al.*

TokenGT by Kim et al goes even more explicit and adds all edges of the input graph (in addition to all nodes) to the sequence fed to the Transformer. With those inputs, encoder needs additional token types to distinguish nodes from edges. The authors prove several nice theoretical properties (although at the cost of higher computational complexity O((V+E)²) that can get to the 4th power in the worst case of a fully-connected graph). Code is available.

TokenGT adds both nodes and edges to the input sequence. Source: Kim et al

What to expect in 2023: for the coming year, we’d expect 1️⃣ GTs to scale up along the axes of both data and model parameters, from molecules of <50 nodes to graphs of millions of nodes, in order to witness the emergent behavior as in text & vision foundation models 2️⃣ similar to BLOOM by the BigScience Initiative, a big open-source pre-trained equivariant GT for molecular data, perhaps within the Open Drug Discovery project.

🔥 One of our favorites in 2022 is “Graph Neural Networks for Link Prediction with Subgraph Sketching” by Chamberlain, Shirobokov et al. — this is a neat combination of algorithms + ML techniques. It is known that SEAL-like labeling tricks dramatically improve link prediction performance compared to standard GNN encoders but suffer from big computation/memory overhead. In this work, the authors find that obtaining distances from two nodes of a query edge can be efficiently done with hashing (MinHashing) and cardinality estimation (HyperLogLog) algorithms. Essentially, message passing is done over minhashing and hyperloglog initial sketches of single nodes (min aggregation for minhash, max for hyperloglog sketches) — this is the core of the ELPH link prediction model (with a simple MLP decoder). The authors then design a more scalable BUDDY model where k-hop hash propagation can be precomputed before training. Experimentally, ELPH and BUDDY scale to large graphs that were previously way too large or resource hungry for labeling trick approaches. Great work and definitely a solid baseline for all future link prediction models! 👏

The motivation behind computing subgraph hashes to estimate cardinalities of neighborhoods and intersections. Source: Chamberlain, Shirobokov et al.

On the graph sampling and minibatching side, Gasteiger, Qian, and Günnemann design Influence-based Mini-Batching (IBMB), a good example how Personalized PageRank (PPR) can solve even graph batching! IBMB aims at creating the smallest minibatches whose nodes have the maximum influence on the node classification task. In fact, the influence score is equivalent to PPR. Practically, given a set of target nodes, IBMB (1) partitions the graph into permanent clusters, (2) runs PPR within each batch to select top-PPR nodes that would form a final subgraph minibatch. The resulting minibatches can be sent to any GNN encoder. IBMB is pretty much constant O(1) to the graph size where partitioning and PPRs can be precomputed at the pre-processing stage.

Although the resulting batches are fixed and do not change over training (not stochastic enough), the authors design momentum-like optimization terms to mitigate this non-stochasticity. IBMB can be used both in training and inference with massive speedups — up to 17x and 130x, respectively 🚀

Influence-based mini-batching. Source: Gasteiger, Qian, and Günnemann

The subtitle of this subsection could be “brought to you by Google” since the majority of the papers have authors from Google 😉

Carey et al. created Stars, a method for building sparse similarity graphs at the scale of tens of trillions of edges 🤯. Pairwise N² comparisons would obviously not work here — Stars employs two-hop spanner graphs (those are the graphs where similar points are connected with at most two hops) and SortingLSH that together enable almost linear time complexity and high sparsity.

Dhulipala et al. created ParHAC, an approximate (1+𝝐) parallel algorithm for hierarchical agglomerative clustering (HAC) on very large graphs and extensive theoretical foundations of the algorithm. ParHAC has O(V+E) complexity and poly-log depth and runs up to 60x faster than baselines on graphs with hundreds of billions of edges (here it is the Hyperlink graph with 1.7B nodes and 125B edges).

Devvrit et al. created S³GC, a scalable self-supervised graph clustering algorithm with one-layer GNN and constrastive training objective. S³GC uses both graph structure and node features and scales to graphs of up to 1.6B edges.

Finally, Epasto et al. created a differentially-private modification of PageRank!

LoG 2022 featured two tutorials on large-scale GNNs: Scaling GNNs in Production by Da Zheng, Vassilis N. Ioannidis, and Soji Adeshina and Parallel and Distributed GNNs by Torsten Hoefler and Maciej Besta.

What to expect in 2023: further reduction in compute costs and inference time for very large graphs. Perhaps models for OGB LSC graphs could run on commodity machines instead of huge clusters?

Tourists of the year! Source of the original portraits: Towards Geometric Deep Learning IV: Chemical Precursors of GNNs by Michael Bronstein

🏖 🌄 Weisfeiler and Leman, grandfathers of Graph ML and GNN theory, had a very prolific traveling year! After visiting Neural, Sparse, Topological, and Cellular places in previous years, in 2022 we have seen them in several new places:

WL Go Machine Learning — a comprehensive survey by Morris et al on the basics of the WL test, terminology, and various applications;
WL Go Relational — the first attempt by Barcelo et al to study expressiveness of relational GNNs used in multi-relational graphs and KGs. Turns out R-GCN and CompGCN are equally expressive and are bounded by the Relational 1-WL test, and the most expressive message function (aggregating entity-relation representations) is a Hadamard product;
WL Go Walking by Niels M. Kriege studies expressiveness of random walk kernels and finds that the RW kernel (with a small modification) is as expressive as a WL subtree kernel;
WL Go Geometric: Joshi, Bodnar et al propose Geometric WL test (GWL) to study expressiveness of equivariant and invariant GNNs (to ceratin symmetries: translation, rotation, reflection, permutation). Turns out, equivariant GNNs (such as E-GNN, NequIP or MACE) are provably more powerful than invariant GNNs (such as SchNet or DimeNet);
WL Go Temporal: Souza et al propose Temporal WL test to study expressiveness of temporal GNNs. The authors then propose a novel injective aggregation function (and the PINT model) that should be most expressive;
WL Go Gradual: Bause and Kriege propose to modify the original WL color refinement with a non-injective function where different multi-sets might get assigned the same color (under certain conditions). It thus enables more gradual color refinement and slower convergence to stable coloring that eventually retains expressiveness of 1-WL but gets a few distinguishing properties on the way.
WL Go Infinite: Feldman et al propose to change the initial node coloring with spectral features derived from the heat kernel of the Laplacian or with k-smallest eigenvectors of the Laplacian (for large graphs) which is quite close to Laplacian Positional Encodings (LPEs).
WL Go Hyperbolic: Nikolentzos et al note that the color refinement procedure of the WL test produces a tree hierarchy of colors. In order to preserve relative distances of nodes encoded by those colors, the authors propose to map output states of each layer/iteration into a hyperbolic space and update it after each next layer. The final embeddings are supposed to retain the notion of node distances.

📈 In the realm of more expressive (than 1-WL) architectures, subgraph GNNs are the biggest trend. Among those, three approaches stand out: 1️⃣ Subgraph Union Networks (SUN) by Frasca, Bevilacqua, et al. that provide a comprehensive analysis of subgraph GNNs design space and expressiveness showing they are bounded by 3-WL; 2️⃣ Ordered Subgraph Aggregation Networks (OSAN) by Qian, Rattan, et al devise a hierarchy of subgraph-enhanced GNNs (k-OSAN) and find that k-OSAN are incomparable to k-WL but are strictly limited by (k+1)-WL. One particularly cool part of OSAN is using Implicit MLE (NeurIPS’21), a differentiable discrete sampling technique, for sampling ordered subgraphs. ️3️⃣ SpeqNets by Morris et al. devise a permutation-equivariant hierarchy of graph networks that balances between scalability and expressivity. 4️⃣ GraphSNN by Wijesinghe and Wang derives expressive models based on the overlap of subgraph isomorphisms and subtree isomorpishms.

🤔 A few works rethink the WL framework as a general means for GNN expressiveness. Geerts and Reutter define k-order MPNNs that can be characterized with Tensor Languages (with a mapping between WL and Tensor Languages). A new anonymous ICLR’23 submission proposes to leverage graph biconnectivity and defines a Generalized Distance WL algorithm.

If you’d like to study the topic even deeper, check out a wonderful LOG 2022 tutorial by Fabrizio Frasca, Beatrice Bevilacqua, and Haggai Maron with practical examples!

What to expect in 2023: 1️⃣ More efforts on creating time- and memory-efficient subgraph GNNs. 2️⃣ Better understanding of generalization of GNNs. 3️⃣ Weisfeiler and Leman visit 10 new places!

Last year, we observed a major shift in KG representation learning: transductive-only approaches are being actively retired in favor of inductive models that can build meaningful representation for new, unseen nodes and perform node classification and link prediction.

In 2022, the field was expanding along two main axes: 1️⃣ inductive link prediction (LP) 2️⃣ and inductive (multi-hop) query answering that extends link prediction to much more complex prediction tasks.

1️⃣ In link prediction, the majority of inductive models (like NBFNet or NodePiece) transfer to unseen nodes at inference by assuming that the set of relation types is fixed during training and does not change over time so they can learn relation embeddings. What happens when the set of relations changes as well? In the hardest case, we’d want to transfer to KGs with completely different nodes and relation types.

So far, all such models supporting unseen relations resort to meta-learning which is slow and resource-hungry. In 2022, for the first time, Huang, Ren, and Leskovec proposed the Connected Subgraph Reasoner (CSR) framework that is inductive along both entities and relation types and does not need any meta-learning! 👀 Generally, for new relations at inference, models see at least k example triples with this relation (hence, a k-shot learning scenario). Conceptually, CSR extracts subgraphs around each example trying to learn common relational patterns (i.e., optimizing edge masks) and then apply the mask to the query subgraph (with the missing target link to predict).

Inductive CSR that supports KGs with unseen entities and relation types. Source: Huang, Ren, and Leskovec

ReFactor GNNs by Chen et al. is another insightful work on inductive qualities of shallow KG embedding models — particularly, the authors find that shallow factorization models like DistMult resemble infinitely deep GNNs when looking through the lens of backpropagation and how nodes update their representations from neighboring and non-neighboring nodes. Turns out that, theoretically, any factorization model can be turned into an inductive model!

2️⃣ Inductive representation learning arrived in the area of complex logical query answering as well. (shameless plug) In fact, it was one of the focuses of our team this year 😊 First, in Zhu et al., we found that Neural Bellman-Ford nets generalize well from simple link prediction to complex query answering tasks in a new GNN Query Executor (GNN-QE) model where a GNN based on NBF-Net performs relation projections while other logical operators are performed via fuzzy logic t-norms. Then, in Inductive Logical Query Answering in Knowledge Graphs we studied ⚗️ the essence of inductiveness ⚗️ and proposed two ways to answer logical queries over unseen entities at inference time, that is, via (1) inductive node representations obtained with NodePiece encoder paired with the inference-only decoder (less performant but scalable) or via (2) inductive relational structure representations akin to the one in GNN-QE (better quality but more resource-hungry and hard to scale). Overall we are able to scale to an inductive query setting on graphs with millions of nodes and 500k unseen nodes and 5m unseen edges during inference.

Inductive logical query answering approaches: via node representations (NodePiece-QE) and relational structure representations (GNN-QE). Source: Galkin et al.

The other cool work in the area is SMORE by Ren, Dai, et al. — it is a large-scale (transductive-only yet) system for complex query answering over very large graphs scaling up to the full Freebase with about 90M nodes and 300M edges 👀. In addition to CUDA, training, and pipeline optimizations, SMORE implements a bidirectional query sampler such that training queries can be generated on-the-fly right in the data loader instead of creating and storing huge datasets. Don’t forget to check out a fresh hands-on tutorial on large-scale graph reasoning from LOG 2022!

Last but not the least, Yang, Lin and Zhang brought up an interesting paper rethinking the evaluation of knowledge graph completion. They point out knowledge graphs tend to be open-world (i.e., there are facts not encoded by the knowledge graph) rather close-world assumed by most works. As a result, metrics observed under the close-world assumption exhibit a log trend w.r.t. the true metric — this means if you get 0.4 MRR for your model, chances are that the test knowledge graph is incomplete and your model has already done a good job👍. Maybe we can design some new dataset and evaluation to mitigate such an issue?

What to expect in 2023: an inductive model fully transferable to different KGs with new sets of entities and relations, e.g., training on Wikidata, and running inference on DBpedia or Freebase.

2022 was a year of major breakthroughs and milestones for algorithmic reasoning.

1️⃣ First, the CLRS benchmark by Veličković et al. is now available as the main playground to design and benchmark algorithmic reasoning models and tasks. CLRS already includes 30 tasks (such as classical sorting algorithms, string algorithms, and graph algorithms) but still allows you to bring your own formulations or modify existing ones.

2️⃣ Then, a Generalist Neural Algorithmic Learner by Ibarz et al. and DeepMind has shown that it is possible to train a single processor network that can be trained in the multi-task mode on different algorithms — previously, you’d train a single model for a single task repeating that for all 30 CLRS problems. The paper also describes several modifications and tricks to the model architecture side and training procedure to let the model generalize better and prevent forgetting, e.g., triplet reasoning similar to triangular attention (common for molecular models) and edge transformers. Overall, a new model brings a massive 25% absolute gain over baselines and solves 24 out of 30 CLRS tasks with 60%+ micro-F1.

3️⃣ Last year, we discussed the works on algorithmic alignment and saw the signs that GNNs can probably align well with dynamic programming. In 2022, Dudzik and Veličković prove that GNNs are Dynamic Programmers using category theory, abstract algebra, and notion of pushforward and pullback operations. This is a wonderful example of applying category theory that many people consider “abstract nonsense” 😉. Category theory is likely to have more impact in GNN theory and Graph ML in general, so check out a fresh course Cats4AI for a gentle introduction to the field.

4️⃣ Finally, the work of Beurer-Kellner et al. is one of the first practical application of the neural algorithmic reasoning framework, here it is applied to configuring computer networks, i.e., routing protocols like BGP that are at the core of the internet. There, the authors show that representing a routing config as a graph allows to frame the routing problem as node property prediction. This approach brings whopping 👀 490x 👀 speedups compared to traditional rule-based routing methods and stil maintain 90+% specification consistency.

If you want to follow algorithmic reasoning more closely, don’t miss a fresh LoG 2022 tutorial by Petar Veličković, Andreea Deac and Andrew Dudzik.

What to expect in 2023: 1️⃣ Algorithmic reasoning tasks are likely to scale to graphs of thousands of nodes and practical applications like in code analysis or databases, 2️⃣ even more algorithms in the benchmark, 3️⃣ most unlikely — there will appear a model capable of solving quickselect 😅

👃 Learning to Smell with GNNs. Back in 2019, Google AI started a project on learning representations of smells. From basic chemistry we know that aromaticity depends on the molecular structure, e.g., cyclic compounds. In fact, the whole group of ”aromatic hydrocarbons” was named aromatic because they actually has some smell (compared to many non-organic molecules). If we have a molecular structure, we can employ a GNN on top of it and learn some representations!

Recently, Google AI released a new blogpost and paper by Qian et al. describing the next phase of the project — the Principal Odor Map that is able to group molecules in “odor clusters”. The authors conducted 3 cool experiments: classifying 400 new molecules never smelled before and comparison to the averaged rating of a group of human panelists; linking odor quality to fundamental biology; and probing aromatic molecules on their mosquito repelling qualities. The GNN-based model shows very good results — now we can finally claim that GNNs can smell! Looking forward for GNNs transforming the perfume industry.

Embedding of odors. Source: Google AI blog

⚽ GNNs + Football. If you thought that sophisticated GNNs for modelling trajectories are only used for molecular dynamics and arcane quantum simulations, fear not! Here is a cool practical application with a very high potential outreach: Graph Imputer by Omidshafiei et al., DeepMind, and FC Liverpool predicts trajectories of football players (and the ball). Each game graph consists of 23 nodes, gets updated with a standard message passing encoder and a special time-dependent LSTM. The dataset is quite novel, too — it consists of 105 English Premier League matches (avg 90 min each), all players and the ball were tracked at 25 fps, and the resulting training trajectory sequences encode about 9.6 seconds of gameplay.

The paper is easy to read and has numerous football illustrations, check it out! Sports tech is actively growing those days, and football analysts now could go even deeper in studying their competitors. Will EPL clubs compete for GNN researchers in the upcoming transfer windows? Time to create transfermarkt for GNN researchers 😉

Football match simulation is like molecular dynamics simulation! Source: DeepMind

🪐 Galaxies and Astrophysics. For astrophysics aficionados: Mangrove by Jespersen et al. applies GraphSAGE to merger trees of dark matter to predict a variety of galactic properties like stellar mass, cold gas mass, star formation rate, and even black hole mass. The paper is a bit heavy on the terminology of astrophysics but pretty easy in terms of GNN parameterization and training. Mangrove works 4–9 orders of magnitude faster than standard models. Experimental charts are pieces of art that you can hang on a wall 🖼️.

Mangrove approach to present dark matter halos as merger trees and graphs. Source: Jespersen et al.

🤖 GNNs for code. Code generation models like AlphaCode and Codex have mindblowing capabilities. Although LLMs are at the core of those models, GNNs do help in a few neat ways: Instruction Pointer Attention GNNs (IPA-GNNs) first proposed by Bieber et al have been used to predict runtime errors in competitive programming tasks — so it is almost like a virtual code interpreter! CodeTrek by Pashakhanloo et al. proposes to model a program as a relational graph and embed it via random walks and Transformer encoder. Downstream applications include variable misuse, prediction exceptions, predicting shadowed variables.

🥇 2022 brought a huge success to Graphcore and IPUs — the hardware optimized for sparse operations that are so needed when working with graphs. The first success story was optimizing Temporal Graph Nets (TGN) for IPUs with massive performance gains (check the article in Michael Bronstein’s blog).

Later on, Graphcore stormed the leaderboards of OGB LSC’22 by winning 2 out of 3 tracks: link prediction on the WikiKG90M v2 knowledge graph and graph regression on the PCQM4M v2 molecular dataset. In addition to the sheer compute power, the authors took several clever model decisions: for link prediction it was Balanced Entity Sampling and Sharing (BESS) for training an ensemble of shallow LP models (check the blog post by Daniel Justus for more details), and GPS++ for the graph regression task (we covered GPS++ above in the GT section). You can try out the pre-trained models using IPUs-powered virtual machines on Paperspace. Congratulations to Graphcore and their team! 👏

PyG partnered with NVIDIA (post) and Intel (post) to increase the performance of core operations on GPUs and CPUs, respectively. Similarly, DGL incorporated new GPU optimizations in the recent 0.9 version. Massive gains for sparse matmuls and sampling procedures, so we’d encourage you to update your environments with the most recent versions!.

What to expect in 2023: major GNN libraries are likely to increase the breadth of supported hardware backends such as IPUs or upcoming Intel Max Series GPUs.

This year we witnessed the inauguration of two graph and geometric ML conferences: the Learning on Graphs Conference (LoG) and the Molecular ML Conference (MoML).

LoG is a more general all-around GraphML venue (held virtually this year) while MoML (held at MIT) has a broader mission and influence over the AI4Science community where graphs and geometry still plays a major role. Both conferences were received extremely well. MoML attracted 7 top speakers and 38 posters, LoG had ~3000 registrations, 266 submissions, 71 posters, 12 orals, and 7 awesome tutorials (all recordings of oral talks and tutorials are already on YouTube). Besides, LoG introduced a great monetary incentive for reviewers, resulting in a well-recognized improvement of the review quality! From our point of view, quality of LoG reviews was often better than those at NeurIPS or ICML.

This is a huge win and carnival for the graph ML community, and congrats to everyone working in the field of graph and geometric machine learning with a new “home” venue!

What to expect in 2023: LOG and MoML become main Graph ML venues to include into your submission calendar along with ICLR / NeurIPS / ICML

OGB Large-Scale Challenge 2022: The second large scale challenge held at NeurIPS2022 with large and realistic graph ML tasks covering node-, edge-, graph-level predictions.
Open Catalyst 2022 Challenge: the second edition of the challenge held at NeurIPS2022 with the task to design new machine learning models to predict the outcome of catalyst simulations used to understand activity
CASP 15: the protein structure prediction challenge disrupted by AlphaFold a few years ago at CASP 14. Detailed analysis is yet to come, but it seems that MSAs strike back and best performing models still rely on MSAs.
Long Range Graph Benchmark: for measuring GNNs and GTs capabilities of capturing long range interactions in graphs.
Taxonomy of Graph Benchmarks, Graph Learning Indexer: deeper studies of the dataset landscape in Graph ML outlining open challenges in benchmarking and trustworthiness of results.
GraphWorld: a framework for analyzing the performance of GNN architectures on millions of synthetic benchmark datasets
Chartalist — a collection of blockchain graph datasets
PEER protein learning benchmark: a multi-task benchmark for protein sequence understanding with 17 tasks of protein understanding lying in 5 task categories.
ESM Metagenomic Atlas: acomprehensive database of over 600 million predicted protein structures with nice visualizations and search UI.

Mainstream graph ML libraries: PyG 2.2 (PyTorch), DGL 0.9 (PyTorch, TensorFlow, MXNet), TF GNN (TensorFlow) and Jraph (Jax)
TorchDrug and TorchProtein: machine learning library for drug discovery and protein science
PyKEEN: the best platform for training and evaluating knowledge graph embeddings
Graphein: a package that provides a number of types of graph-based representations of proteins
GRAPE and Marius: scalable graph processing and embedding libraries over billion-scale graphs
MatSci ML Toolkit: a flexible framework for deep learning on the opencatalyst dataset
E3nn: the go-to library for E(3) equivariant neural networks

Created by Michael Galkin and Michael Bronstein