January 20, 2015
Similar papers 4
March 2, 2015
New phase transition phenomena have recently been discovered for the stochastic block model, for the special case of two non-overlapping symmetric communities. This gives raise in particular to new algorithmic challenges driven by the thresholds. This paper investigates whether a general phenomenon takes place for multiple communities, without imposing symmetry. In the general stochastic block model $\text{SBM}(n,p,Q)$, $n$ vertices are split into $k$ communities of relativ...
May 25, 2018
This article explores and analyzes the unsupervised clustering of large partially observed graphs. We propose a scalable and provable randomized framework for clustering graphs generated from the stochastic block model. The clustering is first applied to a sub-matrix of the graph's adjacency matrix associated with a reduced graph sketch constructed using random sampling. Then, the clusters of the full graph are inferred based on the clusters extracted from the sketch using a ...
November 6, 2014
We propose and analyze a generic method for community recovery in stochastic block models and degree corrected block models. This approach can exactly recover the hidden communities with high probability when the expected node degrees are of order $\log n$ or higher. Starting from a roughly correct community partition given by some conventional community recovery algorithm, this method refines the partition in a cross clustering step. Our results simplify and extend some of t...
July 8, 2018
We study the problem of community detection in a random hypergraph model which we call the stochastic block model for $k$-uniform hypergraphs ($k$-SBM). We investigate the exact recovery problem in $k$-SBM and show that a sharp phase transition occurs around a threshold: below the threshold it is impossible to recover the communities with non-vanishing probability, yet above the threshold there is an estimator which recovers the communities almost asymptotically surely. We al...
November 16, 2018
We consider the exact recovery problem in the hypergraph stochastic block model (HSBM) with $k$ blocks of equal size. More precisely, we consider a random $d$-uniform hypergraph $H$ with $n$ vertices partitioned into $k$ clusters of size $s = n / k$. Hyperedges $e$ are added independently with probability $p$ if $e$ is contained within a single cluster and $q$ otherwise, where $0 \leq q < p \leq 1$. We present a spectral algorithm which recovers the clusters exactly with high...
August 26, 2018
Real-world networks usually have community structure, that is, nodes are grouped into densely connected communities. Community detection is one of the most popular and best-studied research topics in network science and has attracted attention in many different fields, including computer science, statistics, social sciences, among others. Numerous approaches for community detection have been proposed in literature, from ad-hoc algorithms to systematic model-based approaches. ...
February 12, 2017
Spectral clustering is one of the most popular methods for community detection in graphs. A key step in spectral clustering algorithms is the eigen decomposition of the $n{\times}n$ graph Laplacian matrix to extract its $k$ leading eigenvectors, where $k$ is the desired number of clusters among $n$ objects. This is prohibitively complex to implement for very large datasets. However, it has recently been shown that it is possible to bypass the eigen decomposition by computing ...
September 15, 2023
We consider the spectral properties of balanced stochastic block models of which the average degree grows slower than the number of nodes (sparse regime) or proportional to it (dense regime). For both regimes, we prove a phase transition of the extreme eigenvalues of SBM at the Kesten--Stigum threshold. We also prove the central limit theorem for the linear spectral statistics for both regimes. We propose a hypothesis test for determining the presence of communities of the gr...
September 30, 2017
We propose an efficient meta-algorithm for Bayesian estimation problems that is based on low-degree polynomials, semidefinite programming, and tensor decomposition. The algorithm is inspired by recent lower bound constructions for sum-of-squares and related to the method of moments. Our focus is on sample complexity bounds that are as tight as possible (up to additive lower-order terms) and often achieve statistical thresholds or conjectured computational thresholds. Our al...
February 26, 2015
Resolving a conjecture of Abbe, Bandeira and Hall, the authors have recently shown that the semidefinite programming (SDP) relaxation of the maximum likelihood estimator achieves the sharp threshold for exactly recovering the community structure under the binary stochastic block model of two equal-sized clusters. The same was shown for the case of a single cluster and outliers. Extending the proof techniques, in this paper it is shown that SDP relaxations also achieve the sha...