When Does Bottom-up Beat Top-down in Hierarchical Community Detection?

June 1, 2023

View on ArXiv

Maximilien Dreveton, Daichi Kuroda, Matthias Grossglauser, Patrick Thiran

Computer Science

Mathematics

Statistics

cs.SI

cs.LG

math.ST

stat.ME

stat.ML

stat.TH

Hierarchical clustering of networks consists in finding a tree of communities, such that lower levels of the hierarchy reveal finer-grained community structures. There are two main classes of algorithms tackling this problem. Divisive ($\textit{top-down}$) algorithms recursively partition the nodes into two communities, until a stopping rule indicates that no further split is needed. In contrast, agglomerative ($\textit{bottom-up}$) algorithms first identify the smallest community structure and then repeatedly merge the communities using a $\textit{linkage}$ method. In this article, we establish theoretical guarantees for the recovery of the hierarchical tree and community structure of a Hierarchical Stochastic Block Model by a bottom-up algorithm. We also establish that this bottom-up algorithm attains the information-theoretic threshold for exact recovery at intermediate levels of the hierarchy. Notably, these recovery conditions are less restrictive compared to those existing for top-down algorithms. This shows that bottom-up algorithms extend the feasible region for achieving exact recovery at intermediate levels. Numerical experiments on both synthetic and real data sets confirm the superiority of bottom-up algorithms over top-down algorithms. We also observe that top-down algorithms can produce dendrograms with inversions. These findings contribute to a better understanding of hierarchical clustering techniques and their applications in network analysis.

Hierarchical community detection by recursive partitioning

October 2, 2018

92% Match

Tianxi Li, Lihua Lei, Sharmodeep Bhattacharyya, Koen Van den Berge, Purnamrita Sarkar, ... , Levina Elizaveta

Methodology

Statistics Theory

Machine Learning

Statistics Theory

The problem of community detection in networks is usually formulated as finding a single partition of the network into some "correct" number of communities. We argue that it is more interpretable and in some regimes more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive partitioning algorithm, starting with a single community and separating the nodes into two communities by spectral clustering repeatedly, until...

Find SimilarView on arXiv

Community Detection and Classification in Hierarchical Stochastic Blockmodels

March 7, 2015

92% Match

Vince Lyzinski, Minh Tang, Avanti Athreya, ... , Priebe Carey E.

Machine Learning

Applications

We propose a robust, scalable, integrated methodology for community detection and community comparison in graphs. In our procedure, we first embed a graph into an appropriate Euclidean space to obtain a low-dimensional representation, and then cluster the vertices into communities. We next employ nonparametric graph inference techniques to identify structural similarity among these communities. These two steps are then applied recursively on the communities, allowing us to de...

Find SimilarView on arXiv

Hierarchical community structure in networks

September 15, 2020

92% Match

Michael T. Schaub, Jiaze Li, Leto Peel

Social and Information Netwo...

Machine Learning

Modular and hierarchical community structures are pervasive in real-world complex systems. A great deal of effort has gone into trying to detect and study these structures. Important theoretical advances in the detection of modular have included identifying fundamental limits of detectability by formally defining community structure using probabilistic generative models. Detecting hierarchical community structure introduces additional challenges alongside those inherited from...

Find SimilarView on arXiv

The art of community detection

July 11, 2008

92% Match

Natali Gulbahce, Sune Lehmann

Physics and Society

Statistical Mechanics

Data Analysis, Statistics an...

Quantitative Methods

Networks in nature possess a remarkable amount of structure. Via a series of data-driven discoveries, the cutting edge of network science has recently progressed from positing that the random graphs of mathematical graph theory might accurately describe real networks to the current viewpoint that networks in nature are highly complex and structured entities. The identification of high order structures in networks unveils insights into their functional organization. Recently, ...

Find SimilarView on arXiv

Fast and reliable inference algorithm for hierarchical stochastic block models

November 14, 2017

90% Match

Yongjin Park, Joel S. Bader

Machine Learning

Network clustering reveals the organization of a network or corresponding complex system with elements represented as vertices and interactions as edges in a (directed, weighted) graph. Although the notion of clustering can be somewhat loose, network clusters or groups are generally considered as nodes with enriched interactions and edges sharing common patterns. Statistical inference often treats groups as latent variables, with observed networks generated from latent group ...

Find SimilarView on arXiv

Community Detection and Stochastic Block Models

March 29, 2017

90% Match

Emmanuel Abbe

math.PR

cs.CC

cs.IT

cs.SI

math.IT

stat.ML

The stochastic block model (SBM) is a random graph model with different group of vertices connecting differently. It is widely employed as a canonical model to study clustering and community detection, and provides a fertile ground to study the information-theoretic and computational tradeoffs that arise in combinatorial statistics and more generally data science. This monograph surveys the recent developments that establish the fundamental limits for community detection in...

Find SimilarView on arXiv

A Method to Find Community Structures Based on Information Centrality

February 20, 2004

90% Match

Santo Fortunato, Vito Latora, Massimo Marchiori

Statistical Mechanics

Disordered Systems and Neura...

Community structures are an important feature of many social, biological and technological networks. Here we study a variation on the method for detecting such communities proposed by Girvan and Newman and based on the idea of using centrality measures to define the community boundaries (M. Girvan and M. E. J. Newman, Community structure in social and biological networks Proc. Natl. Acad. Sci. USA 99, 7821-7826 (2002)). We develop an algorithm of hierarchical clustering that ...

Find SimilarView on arXiv

Structural Inference of Hierarchies in Networks

October 9, 2006

90% Match

Aaron Clauset, Cristopher Moore, M. E. J. Newman

Physics and Society

Machine Learning

Data Analysis, Statistics an...

One property of networks that has received comparatively little attention is hierarchy, i.e., the property of having vertices that cluster together in groups, which then join to form groups of groups, and so forth, up through all levels of organization in the network. Here, we give a precise definition of hierarchical structure, give a generic model for generating arbitrary hierarchical structure in a random graph, and describe a statistically principled way to learn the set ...

Find SimilarView on arXiv

Hypothesis Testing for Automated Community Detection in Networks

November 12, 2013

90% Match

Peter J. Bickel, Purnamrita Sarkar

stat.ML

cs.LG

cs.SI

math.ST

physics.soc-ph

stat.TH

Community detection in networks is a key exploratory tool with applications in a diverse set of areas, ranging from finding communities in social and biological networks to identifying link farms in the World Wide Web. The problem of finding communities or clusters in a network has received much attention from statistics, physics and computer science. However, most clustering algorithms assume knowledge of the number of clusters k. In this paper we propose to automatically de...

Find SimilarView on arXiv

A Survey on Theoretical Advances of Community Detection in Networks

August 26, 2018

90% Match

Yunpeng Zhao

Social and Information Netwo...

Machine Learning

Real-world networks usually have community structure, that is, nodes are grouped into densely connected communities. Community detection is one of the most popular and best-studied research topics in network science and has attracted attention in many different fields, including computer science, statistics, social sciences, among others. Numerous approaches for community detection have been proposed in literature, from ad-hoc algorithms to systematic model-based approaches. ...

Find SimilarView on arXiv