September 15, 2023
Similar papers 2
January 12, 2022
Network-based clustering methods frequently require the number of communities to be specified \emph{a priori}. Moreover, most of the existing methods for estimating the number of communities assume the number of communities to be fixed and not scale with the network size $n$. The few methods that assume the number of communities to increase with the network size $n$ are only valid when the average degree $d$ of a network grows at least as fast as $O(n)$ (i.e., the dense case)...
November 5, 2019
We propose a new hierarchy of semidefinite programming relaxations for inference problems. As test cases, we consider the problem of community detection in block models. The vertices are partitioned into $k$ communities, and a graph is sampled conditional on a prescribed number of inter- and intra-community edges. The problem of detection, where we are to decide with high probability whether a graph was drawn from this model or the uniform distribution on regular graphs, is c...
September 4, 2018
We propose to estimate the number of communities in degree-corrected stochastic block models based on a pseudo likelihood ratio statistic. To this end, we introduce a method that combines spectral clustering with binary segmentation. This approach guarantees an upper bound for the pseudo likelihood ratio statistic when the model is over-fitted. We also derive its limiting distribution when the model is under-fitted. Based on these properties, we establish the consistency of o...
May 8, 2012
We study networks that display community structure -- groups of nodes within which connections are unusually dense. Using methods from random matrix theory, we calculate the spectra of such networks in the limit of large size, and hence demonstrate the presence of a phase transition in matrix methods for community detection, such as the popular modularity maximization method. The transition separates a regime in which such methods successfully detect the community structure f...
December 7, 2013
We analyze the performance of spectral clustering for community extraction in stochastic block models. We show that, under mild conditions, spectral clustering applied to the adjacency matrix of the network can consistently recover hidden communities even when the order of the maximum expected degree is as small as $\log n$, with $n$ the number of nodes. This result applies to some popular polynomial time spectral clustering algorithms and is further extended to degree correc...
March 12, 2018
We consider spectral clustering algorithms for community detection under a general bipartite stochastic block model (SBM). A modern spectral clustering algorithm consists of three steps: (1) regularization of an appropriate adjacency or Laplacian matrix (2) a form of spectral truncation and (3) a k-means type algorithm in the reduced spectral domain. We focus on the adjacency-based spectral clustering and for the first step, propose a new data-driven regularization that can r...
December 23, 2014
We consider the problem of community detection in the Stochastic Block Model with a finite number $K$ of communities of sizes linearly growing with the network size $n$. This model consists in a random graph such that each pair of vertices is connected independently with probability $p$ within communities and $q$ across communities. One observes a realization of this random graph, and the objective is to reconstruct the communities from this observation. We show that under sp...
July 8, 2018
We study the problem of community detection in a random hypergraph model which we call the stochastic block model for $k$-uniform hypergraphs ($k$-SBM). We investigate the exact recovery problem in $k$-SBM and show that a sharp phase transition occurs around a threshold: below the threshold it is impossible to recover the communities with non-vanishing probability, yet above the threshold there is an estimator which recovers the communities almost asymptotically surely. We al...
October 12, 2016
We consider the sparse stochastic block model in the case where the degrees are uninformative. The case where the two communities have approximately the same size has been extensively studied and we concentrate here on the community detection problem in the case of unbalanced communities. In this setting, spectral algorithms based on the non-backtracking matrix are known to solve the community detection problem (i.e. do strictly better than a random guess) when the signal is ...
February 10, 2015
We study random graphs with possibly different edge probabilities in the challenging sparse regime of bounded expected degrees. Unlike in the dense case, neither the graph adjacency matrix nor its Laplacian concentrate around their expectations due to the highly irregular distribution of node degrees. It has been empirically observed that simply adding a constant of order $1/n$ to each entry of the adjacency matrix substantially improves the behavior of Laplacian. Here we pro...