Null Graph Generation.
When determining if observations are significant it is common practice to employ the use of null-models. Null-models make an assumption about the population our sample comes from, and they allow us to reason about the statistical likelihood of certain properties. For example, we may observe that the number of car crashes in a state has decreased since a new governor was elected. If our null-model assumes no change in the number of crashes elsewhere, we may be able to then connect this decrease with the election of the new governor. However, if our null-model assumes that crashes have decreased across the entire country, it is unlikely that our new governor had much to do with the state-level decrease. The same thing can be done for graphs!
Say that we have a graph āGā that represents relationships between one hundred individuals. Each node in this graph is a person, and each edge is a friendship. Also say we observe that ten people in the graph share far more connections between them than any other ten people in the graph. Within our single observation, it seems that this is unexpected. After all, no other group of ten individuals are as closely connected. Often, this is not what we care about. Instead, we may care about the likelihood of finding such a grouping of individuals for a population with a similar number of friendships. It is possible that a closely knit-community of ten friends arises in most instances of these graphs. In this case, this is an unremarkable result, contrary to our expectations.
As discussed, null model choice is incredibly important. For graphs, your null-model choice can drastically effect clustering tasks [1,2] in particular. Because of this, I have spent a considerable amount of time researching how the Chung-Lu [3] null model may be altered to better represent degree distributions. This work is spread out over several publications presented at IEEE and Complex Networks conferences. Since this is meant to be a very soft introduction to the importance of the problem, I will leave the details of the work out here. However, the papers can be found on my Google Scholar page.
Parallel Generation of Simple Null Graph Models [4] focuses on fast parallel techniques for constructing null models of edge-connection probabilities given an observed graph. Limitations of Chung-Lu Random Graph Generation [5] presents novel theoretical analysis of the Chung-Lu algorithm, and uses this analysis to generate more accurate graphs. Correcting Output Degree Sequences in Chung-Lu Random Graph Generation [6] builds upon the previously developed theory, and uses optimization methods to alter edge-connection probabilities. I am currently investigating the effects of null-model choice on modularity maximization [1].
Newman, Mark EJ. "Modularity and community structure in networks." Proceedings of the national academy of sciences 103.23 (2006): 8577-8582.
Fosdick, Bailey K., et al. "Configuring random graph models with fixed degree sequences." Siam Review 60.2 (2018): 315-355.
Winlaw, M., H. DeSterck, and G. Sanders. An in-depth analysis of the chung-lu model. No. LLNL-TR-678729. Lawrence Livermore National Lab.(LLNL), Livermore, CA (United States), 2015.
Garbus, Jack, Christopher Brissette, and George M. Slota. "Parallel generation of simple null graph models." 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2020.
Brissette, Christopher, and George Slota. "Limitations of Chung Lu random graph generation." Complex Networks & Their Applications X: Volume 1, Proceedings of the Tenth International Conference on Complex Networks and Their Applications COMPLEX NETWORKS 2021 10. Springer International Publishing, 2022.
Brissette, Christopher, David Liu, and George M. Slota. "Correcting Output Degree Sequences in Chung-Lu Random Graph Generation." International Conference on Complex Networks and Their Applications. Cham: Springer International Publishing, 2022.