December 27, 2011
Stochastic blockmodels are generative network models where the vertices are separated into discrete groups, and the probability of an edge existing between two vertices is determined solely by their group membership. In this paper, we derive expressions for the entropy of stochastic blockmodel ensembles. We consider several ensemble variants, including the traditional model as well as the newly introduced degree-corrected version [Karrer et al. Phys. Rev. E 83, 016107 (2011)], which imposes a degree sequence on the vertices, in addition to the block structure. The imposed degree sequence is implemented both as "soft" constraints, where only the expected degrees are imposed, and as "hard" constraints, where they are required to be the same on all samples of the ensemble. We also consider generalizations to multigraphs and directed graphs. We illustrate one of many applications of this measure by directly deriving a log-likelihood function from the entropy expression, and using it to infer latent block structure in observed data. Due to the general nature of the ensembles considered, the method works well for ensembles with intrinsic degree correlations (i.e. with entropic origin) as well as extrinsic degree correlations, which go beyond the block structure.
Similar papers 1
February 20, 2008
In this paper we generalize the concept of random networks to describe networks with non trivial features by a statistical mechanics approach. This framework is able to describe ensembles of undirected, directed as well as weighted networks. These networks might have not trivial community structure or, in the case of networks embedded in a given space, non trivial distance dependence of the link probability. These ensembles are characterized by their entropy which evaluate th...
August 1, 2007
Randomized network ensembles are the null models of real networks and are extensivelly used to compare a real system to a null hypothesis. In this paper we study network ensembles with the same degree distribution, the same degree-correlations or the same community structure of any given real network. We characterize these randomized network ensembles by their entropy, i.e. the normalized logarithm of the total number of networks which are part of these ensembles. We estima...
October 17, 2019
The statistical inference of stochastic block models as emerged as a mathematicaly principled method for identifying communities inside networks. Its objective is to find the node partition and the block-to-block adjacency matrix of maximum likelihood i.e. the one which has most probably generated the observed network. In practice, in the so-called microcanonical ensemble, it is frequently assumed that when comparing two models which have the same number and sizes of communit...
August 23, 2010
Stochastic blockmodels have been proposed as a tool for detecting community structure in networks as well as for generating synthetic networks for use as benchmarks. Most blockmodels, however, ignore variation in vertex degree, making them unsuitable for applications to real-world networks, which typically display broad degree distributions that can significantly distort the results. Here we demonstrate how the generalization of blockmodels to incorporate this missing element...
January 31, 2011
We generate new mathematical tools with which to quantify the macroscopic topological structure of large directed networks. This is achieved via a statistical mechanical analysis of constrained maximum entropy ensembles of directed random graphs with prescribed joint distributions for in- and outdegrees and prescribed degree-degree correlation functions. We calculate exact and explicit formulae for the leading orders in the system size of the Shannon entropies and complexitie...
July 17, 2012
The proliferation of models for networks raises challenging problems of model selection: the data are sparse and globally dependent, and models are typically high-dimensional and have large numbers of latent variables. Together, these issues mean that the usual model-selection criteria do not work properly for networks. We illustrate these challenges, and show one way to resolve them, by considering the key network-analysis problem of dividing a graph into communities or bloc...
November 11, 2013
In Stochastic blockmodels, which are among the most prominent statistical models for cluster analysis of complex networks, clusters are defined as groups of nodes with statistically similar link probabilities within and between groups. A recent extension by Karrer and Newman incorporates a node degree correction to model degree heterogeneity within each group. Although this demonstrably leads to better performance on several networks it is not obvious whether modelling node d...
May 25, 2004
We study the family of network models derived by requiring the expected properties of a graph ensemble to match a given set of measurements of a real-world network, while maximizing the entropy of the ensemble. Models of this type play the same role in the study of networks as is played by the Boltzmann distribution in classical statistical mechanics; they offer the best prediction of network properties subject to the constraints imposed by a given set of observations. We giv...
November 12, 2018
We provide a novel family of generative block-models for random graphs that naturally incorporates degree distributions: the block-constrained configuration model. Block-constrained configuration models build on the generalised hypergeometric ensemble of random graphs and extend the well-known configuration model by enforcing block-constraints on the edge generation process. The resulting models are analytically tractable and practical to fit even to large networks. These mod...
September 14, 2013
Generalised degrees provide a natural bridge between local and global topological properties of networks. We define the generalised degree to be the number of neighbours of a node within one and two steps respectively. Tailored random graph ensembles are used to quantify and compare topological properties of networks in a systematic and precise manner, using concepts from information theory. We calculate the Shannon entropy of random graph ensembles constrained with a specifi...