April 29, 2010
Communities are clusters of nodes with a higher than average density of internal connections. Their detection is of great relevance to better understand the structure and hierarchies present in a network. Modularity has become a standard tool in the area of community detection, providing at the same time a way to evaluate partitions and, by maximizing it, a method to find communities. In this work, we study the modularity from a combinatorial point of view. Our analysis (as t...
May 6, 2014
Random graph models have played a dominant role in the theoretical study of networked systems. The Poisson random graph of Erdos and Renyi, in particular, as well as the so-called configuration model, have served as the starting point for numerous calculations. In this paper we describe another large class of random graph models, which we call equitable random graphs and which are flexible enough to represent networks with diverse degree distributions and many nontrivial type...
September 8, 2014
We prove the existence of an asymptotic phase transition threshold on community detectability for the spectral modularity method [M. E. J. Newman, Phys. Rev. E 74, 036104 (2006) and Proc. National Academy of Sciences. 103, 8577 (2006)] under a stochastic block model. The phase transition on community detectability occurs as the inter-community edge connection probability $p$ grows. This phase transition separates a sub-critical regime of small $p$, where modularity-based comm...
July 3, 2015
The stochastic block model is a natural model for studying community detection in random networks. Its clustering properties have been extensively studied in the statistics, physics and computer science literature. Recently this area has experienced major mathematical breakthroughs, particularly for the binary (two-community) version, see Mossel, Neeman, Sly (2012, 2013) and Massoulie (2013). In this paper, we introduce a variant of the binary model which we call the regular ...
May 21, 2013
Many methods have been proposed for community detection in networks. Some of the most promising are methods based on statistical inference, which rest on solid mathematical foundations and return excellent results in practice. In this paper we show that two of the most widely used inference methods can be mapped directly onto versions of the standard minimum-cut graph partitioning problem, which allows us to apply any of the many well-understood partitioning algorithms to the...
October 12, 2022
Spectral algorithms are some of the main tools in optimization and inference problems on graphs. Typically, the graph is encoded as a matrix and eigenvectors and eigenvalues of the matrix are then used to solve the given graph problem. Spectral algorithms have been successfully used for graph partitioning, hidden clique recovery and graph coloring. In this paper, we study the power of spectral algorithms using two matrices in a graph partitioning problem. We use two different...
June 5, 2013
Communities are fundamental entities for the characterization of the structure of real networks. The standard approach to the identification of communities in networks is based on the optimization of a quality function known as "modularity". Although modularity has been at the center of an intense research activity and many methods for its maximization have been proposed, not much it is yet known about the necessary conditions that communities need to satisfy in order to be d...
March 12, 2018
We consider spectral clustering algorithms for community detection under a general bipartite stochastic block model (SBM). A modern spectral clustering algorithm consists of three steps: (1) regularization of an appropriate adjacency or Laplacian matrix (2) a form of spectral truncation and (3) a k-means type algorithm in the reduced spectral domain. We focus on the adjacency-based spectral clustering and for the first step, propose a new data-driven regularization that can r...
May 10, 2006
We consider the problem of detecting communities or modules in networks, groups of vertices with a higher-than-average density of edges connecting them. Previous work indicates that a robust approach to this problem is the maximization of the benefit function known as "modularity" over possible divisions of a network. Here we show that this maximization process can be written in terms of the eigenspectrum of a matrix we call the modularity matrix, which plays a role in commun...
March 29, 2017
The stochastic block model (SBM) is a random graph model with different group of vertices connecting differently. It is widely employed as a canonical model to study clustering and community detection, and provides a fertile ground to study the information-theoretic and computational tradeoffs that arise in combinatorial statistics and more generally data science. This monograph surveys the recent developments that establish the fundamental limits for community detection in...