Graph Coarsening.
Graph problems are often too large to solve explicitly. Consider clustering the entire Facebook graph, or running a numerical simulation on a billion-node mesh. Problems at this scale exceed what most computers can handle directly, and even with abundant computational power we may still want the analysis to run faster. One technique we may consider is graph coarsening.
Graph coarsening describes a broad class of algorithms that aim to simplify computational graphs by reducing the number of nodes [1]. There are many methods for graph coarsening, including maximal-matching methods [2], Galerkin methods, and metric methods such as algebraic distance [3] and adjacency similarity [4]. Note that these are not necessarily exclusive; for instance, Galerkin projections are used alongside many metric methods. The fundamental idea is to reduce the size of the graph by either merging nodes or restricting analysis to a subgraph. Coarsening is often used as a step in more sophisticated methods broadly called multilevel techniques, of which algebraic multigrid [5] is a classic example. Multilevel techniques aim to solve the initial problem in levels, obtained by some form of simplification, in our case coarsening. These methods tend to follow a recipe like the one below (a code sketch follows the list).
1. Perform several coarsenings, storing each intermediate graph.
2. Solve our problem on the simplest (coarsest) representation.
3. Use this solution as a “first guess” on the second-simplest graph, and refine the solution there.
4. Repeat step three at each level until you have a solution on the original graph.
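To make this recipe concrete, below is a minimal Python sketch of the coarsen-solve-refine loop, assuming dense numpy matrices. The helpers coarsen, interpolate, and refine are hypothetical placeholders for whichever coarsening scheme, prolongation operator, and smoother a particular method uses.

```python
import numpy as np

def multilevel_solve(A, b, num_levels, coarsen, interpolate, refine):
    """Generic multilevel solve of A x = b (a sketch, not a production solver).

    coarsen(A)        -> (A_coarse, P), with P interpolating coarse -> fine
    interpolate(P, x) -> lift a coarse solution to the next finer level
    refine(A, b, x0)  -> improve an initial guess x0 (e.g., a few Jacobi sweeps)
    """
    # Step 1: coarsen repeatedly, storing each level's operator and prolongator.
    levels = [(A, b)]
    prolongators = []
    for _ in range(num_levels - 1):
        A_c, P = coarsen(levels[-1][0])
        prolongators.append(P)
        levels.append((A_c, P.T @ levels[-1][1]))  # Galerkin-restricted right-hand side

    # Step 2: solve exactly on the coarsest level, where this is cheap.
    A_c, b_c = levels[-1]
    x = np.linalg.solve(A_c, b_c)

    # Steps 3-4: walk back up, using each solution as a first guess one level above.
    for (A_f, b_f), P in zip(reversed(levels[:-1]), reversed(prolongators)):
        x = interpolate(P, x)    # lift the coarse solution to the finer graph
        x = refine(A_f, b_f, x)  # inexpensive local refinement of the guess
    return x
```

In a full multigrid solver this pattern is applied recursively, and refine is typically a handful of Jacobi or Gauss-Seidel sweeps; algebraic multigrid [5] is the canonical instance of this design.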
A simple visualization of what this may look like is shown in Figure 1 for a simple PDE problem on a grid mesh. The idea behind multilevel analysis is that the solution at each level approximates the solution at the level above, so it makes a good guess. Because it is a good guess, refining the solution at that level is comparatively inexpensive. For more on this topic, feel free to explore the citations below.
Depending on the task at hand, there are many objectives we may wish to optimize for. When solving elliptic numerical PDEs such as Poisson’s equation, we may wish to coarsen so that the heat equation acts similarly on the original and coarsened graphs. Alternatively, when coarsening with clustering in mind, we may wish to closely preserve the smallest cut values. In my research we found the spectral properties of the graph Laplacian [6] to be important.
Without getting too technical, the spectral properties (eigenvalues and eigenvectors) of the graph Laplacian encode a significant amount of information about clustering and about PDEs such as the wave and heat equations. Furthermore, the graph Laplacian is closely related to the Kipf and Welling GNN architecture, which suggests that spectrum-consistent coarsening may yield speedups in GNN training. In my research, I collaborated on several papers on this problem, some of which can be found on my Google Scholar page.
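As a concrete illustration (a toy example of my own, not code from the papers), the snippet below builds the combinatorial Laplacian L = D - A of a small graph and inspects its spectrum; the second-smallest eigenvalue and its eigenvector (the Fiedler value and vector) govern both the cluster structure and the slowest mode of heat diffusion du/dt = -Lu on the graph.

```python
import numpy as np

# Adjacency matrix of a toy graph: two triangles joined by a single bridge edge.
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

D = np.diag(A.sum(axis=1))  # diagonal degree matrix
L = D - A                   # combinatorial graph Laplacian

# L is symmetric positive semidefinite, so eigh returns real, ascending eigenvalues.
eigvals, eigvecs = np.linalg.eigh(L)
print(eigvals)        # the smallest eigenvalue is ~0 since the graph is connected
print(eigvecs[:, 1])  # Fiedler vector: its sign pattern separates the two triangles
```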
In Spectrum Consistent Coarsening Approximates Edge Weights [7], I developed a novel analysis of graph coarsening which argues that, in some cases, preserving the eigenvalues of the Laplacian is equivalent to preserving the structure of the graph exactly. This suggests that the Laplacian spectrum is a compelling metric to preserve while coarsening. A bound is derived for the general case as well, though it is less optimistic. In the same vein of work, in Parallel Coarsening of Graph Data with Spectral Guarantees [8] I worked on parallelizing spectrum-preserving coarsening methods on GPUs. To this end, a greedy algorithm was developed, and spectral approximation bounds were derived for it.
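The sketch below shows the general flavor of matching-based, spectrum-aware coarsening; it is a simplification for illustration, not the algorithm of [7] or [8]. It greedily contracts a matching of edges, builds the piecewise-constant interpolation matrix P, and forms the Galerkin coarse Laplacian L_c = P^T L P, whose bottom eigenvalues can then be compared against those of L.

```python
import numpy as np

def greedy_matching_coarsen(A):
    """Contract a greedy maximal matching of A; return (L_coarse, P).

    P is the n x n_coarse interpolation matrix whose column j is the
    indicator vector of the fine nodes merged into coarse node j.
    """
    n = A.shape[0]
    group = np.full(n, -1)  # group[u] = coarse node containing fine node u
    groups = []
    for u in range(n):
        if group[u] >= 0:
            continue
        # Greedily match u with its heaviest still-unmatched neighbor, if any.
        nbrs = [v for v in range(n) if v != u and A[u, v] > 0 and group[v] < 0]
        if nbrs:
            v = max(nbrs, key=lambda w: A[u, w])
            group[u] = group[v] = len(groups)
            groups.append([u, v])
        else:
            group[u] = len(groups)
            groups.append([u])
    P = np.zeros((n, len(groups)))
    for j, g in enumerate(groups):
        P[g, j] = 1.0
    L = np.diag(A.sum(axis=1)) - A
    return P.T @ L @ P, P  # Galerkin coarse Laplacian and the prolongator
```

Comparing np.linalg.eigvalsh(L)[:k] against np.linalg.eigvalsh(L_c)[:k] then shows how well the bottom of the spectrum survives contraction; the papers above are concerned with when, and how tightly, such agreement can be guaranteed.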
In Greedy Fiedler Spectral Partitioning for Data-driven Discrete Exterior Calculus [9], I worked as part of a team developing an iterative spectral bisection method for constructing reduced-order models of physics simulations. The general idea is to iteratively perform spectral clustering on the portions of the computational mesh with the largest derivative sum, and to derive a simplified model from the resulting partition. It should be noted that this method is slow in general: spectral clustering is expensive, so it would not be useful inside a multilevel algorithm, for instance. It is instead aimed at accurately reproducing the output of physical simulations on future test instances, which can then be run faster.
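At its core, one step of spectral bisection looks like the following generic sketch (not the specific method of [9]): compute the Fiedler vector of the region's Laplacian and split its nodes by sign.

```python
import numpy as np

def fiedler_bisect(A):
    """Split a connected graph into two parts by the sign of its Fiedler vector."""
    L = np.diag(A.sum(axis=1)) - A
    _, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]  # eigenvector of the second-smallest eigenvalue
    mask = fiedler >= 0
    return np.where(mask)[0], np.where(~mask)[0]
```

The eigensolve is exactly what makes this expensive: each bisection costs an eigenvector computation on the current submesh, which is why it is better suited to offline model reduction than to the inner loop of a multilevel algorithm.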
[1] Chen, Jie, Yousef Saad, and Zechen Zhang. "Graph coarsening: from scientific computing to machine learning." SeMA Journal (2022): 1-37.
[2] Gilbert, Michael S., et al. "Performance-portable graph coarsening for efficient multilevel graph analysis." 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 2021.
[3] Chen, Jie, and Ilya Safro. "Algebraic distance on graphs." SIAM Journal on Scientific Computing 33.6 (2011): 3468-3490.
[4] Jin, Yu, Andreas Loukas, and Joseph JaJa. "Graph coarsening with preserved spectral properties." International Conference on Artificial Intelligence and Statistics. PMLR, 2020.
[5] Ruge, John W., and Klaus Stüben. "Algebraic multigrid." Multigrid methods. Society for Industrial and Applied Mathematics, 1987. 73-130.
[6] Chung, Fan RK. Spectral graph theory. Vol. 92. American Mathematical Soc., 1997.
[7] Brissette, Christopher, Andy Huang, and George Slota. "Spectrum Consistent Coarsening Approximates Edge Weights." SIAM Journal on Matrix Analysis and Applications 44.3 (2023): 1032-1046.
[8] Brissette, Christopher, Andy Huang, and George Slota. "Parallel coarsening of graph data with spectral guarantees." arXiv preprint arXiv:2204.11757 (2022).
[9] Huang, Andy, et al. "Greedy Fiedler Spectral Partitioning for Data-driven Discrete Exterior Calculus." AAAI Spring Symposium: MLPS. 2021.