November 12, 2013
Similar papers 4
November 16, 2022
The stochastic block model is a popular tool for detecting community structures in network data. Detecting the difference between two community structures is an important issue for stochastic block models. However, the two-sample test has been a largely under-explored domain, and too little work has been devoted to it. In this article, based on the maximum entry--wise deviation of the two centered and rescaled adjacency matrices, we propose a novel test statistic to test two ...
March 20, 2017
The stochastic block model is widely used for detecting community structures in network data. How to test the goodness-of-fit of the model is one of the fundamental problems and has gained growing interests in recent years. In this article, we propose a novel goodness-of-fit test based on the maximum entry of the centered and re-scaled adjacency matrix for the stochastic block model. One noticeable advantage of the proposed test is that the number of communities can be allowe...
May 10, 2023
This paper provides a selective review of the statistical network analysis literature focused on clustering and inference problems for stochastic blockmodels and their variants. We survey asymptotic normality results for stochastic blockmodels as a means of thematically linking classical statistical concepts to contemporary research in network data analysis. Of note, multiple different forms of asymptotically Gaussian behavior arise in stochastic blockmodels and are useful fo...
December 17, 2014
For a given graph, $G$, let $A$ be the adjacency matrix, $D$ is the diagonal matrix of degrees, $L' = D - A$ is the combinatorial Laplacian, and $L = D^{-1/2}L'D^{-1/2}$ is the normalized Laplacian. Recently, the eigenvectors corresponding to the smallest eigenvalues of $L$ and $L'$ have been of great interest because of their application to community detection, which is a nebulously defined problem that essentially seeks to find a vertex set $S$ such that there are few edges...
March 12, 2022
Community detection for large networks is a challenging task due to the high computational cost as well as the heterogeneous community structure. Stochastic block model (SBM) is a popular model to analyze community structure where nodes belonging to the same communities are connected with equal probability. Modularity optimization methods provide a fast and effective way for community detection under SBM with assortative community structure, where nodes within communities are...
March 29, 2017
The stochastic block model (SBM) is a random graph model with different group of vertices connecting differently. It is widely employed as a canonical model to study clustering and community detection, and provides a fertile ground to study the information-theoretic and computational tradeoffs that arise in combinatorial statistics and more generally data science. This monograph surveys the recent developments that establish the fundamental limits for community detection in...
November 5, 2019
Community detection tasks have received a lot of attention across statistics, machine learning, and information theory with a large body of work concentrating on theoretical guarantees for the stochastic block model. One line of recent work has focused on modeling the spectral embedding of a network using Gaussian mixture models (GMMs) in scaling regimes where the ability to detect community memberships improves with the size of the network. However, these regimes are not ver...
January 22, 2020
In bipartite networks, community structures are restricted to being disassortative, in that nodes of one type are grouped according to common patterns of connection with nodes of the other type. This makes the stochastic block model (SBM), a highly flexible generative model for networks with block structure, an intuitive choice for bipartite community detection. However, typical formulations of the SBM do not make use of the special structure of bipartite networks. Here we in...
July 10, 2010
Networks or graphs can easily represent a diverse set of data sources that are characterized by interacting units or actors. Social networks, representing people who communicate with each other, are one example. Communities or clusters of highly connected actors form an essential feature in the structure of several empirical networks. Spectral clustering is a popular and computationally feasible method to discover these communities. The stochastic blockmodel [Social Networks ...
February 13, 2014
In this paper, we consider networks consisting of a finite number of non-overlapping communities. To extract these communities, the interaction between pairs of nodes may be sampled from a large available data set, which allows a given node pair to be sampled several times. When a node pair is sampled, the observed outcome is a binary random variable, equal to 1 if nodes interact and to 0 otherwise. The outcome is more likely to be positive if nodes belong to the same communi...