Bayesian cluster analysis: Point estimation and credible balls

Search Algorithms and Loss Functions for Bayesian Clustering

May 10, 2021

David B. Dahl, Devin J. Johnson, Peter Mueller

Methodology

We propose a randomized greedy search algorithm to find a point estimate for a random partition based on a loss function and posterior Monte Carlo samples. Given the large size and awkward discrete nature of the search space, the minimization of the posterior expected loss is challenging. Our approach is a stochastic search based on a series of greedy optimizations performed in a random order and is embarrassingly parallel. We consider several loss functions, including Binder...

Dirichlet Process Parsimonious Mixtures for clustering

January 14, 2015

Faicel Chamroukhi, Marius Bartcus, Hervé Glotin

Machine Learning

Methodology

The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric formulation of these parsimoniou...

Bayesian contiguity constrained clustering, spanning trees and dendrograms

February 24, 2023

Etienne Côme

Computation

Methodology

Clustering is a well-known and well-studied problem. One of its variants, called contiguity-constrained clustering, accepts as a second input a graph used to encode prior information about cluster structure by means of contiguity constraints, i.e. clusters must form connected subgraphs of this graph. This paper discusses the interest of such a setting and proposes a new way to formalise it in a Bayesian setting, using results on spanning trees to compute exactly a posteriori probab...

Spying on the prior of the number of data clusters and the partition distribution in Bayesian cluster analysis

December 22, 2020

Jan Greve, Bettina Grün, ..., Sylvia Frühwirth-Schnatter

Methodology

Cluster analysis aims at partitioning data into groups or clusters. In applications, it is common to deal with problems where the number of clusters is unknown. Bayesian mixture models employed in such applications usually specify a flexible prior that takes into account the uncertainty with respect to the number of clusters. However, a major empirical challenge involving the use of these models is in the characterisation of the induced prior on the partitions. This work intr...

On Bayesian "central clustering": Application to landscape classification of Western Ghats

November 30, 2011

Sabyasachi Mukhopadhyay, Sourabh Bhattacharya, Kajal Dihidar

Applications

Landscape classification of the well-known biodiversity hotspot, Western Ghats (mountains), on the west coast of India, is an important part of a world-wide program of monitoring biodiversity. To this end, a massive vegetation data set, consisting of 51,834 4-variate observations, has been clustered into different landscapes by Nagendra and Gadgil [Current Sci. 75 (1998) 264–271]. But a study of such importance may be affected by nonuniqueness of cluster analysis and the lack...

Interactive Bayesian Hierarchical Clustering

February 10, 2016

Sharad Vikram, Sanjoy Dasgupta

Machine Learning

Clustering is a powerful tool in data analysis, but it is often difficult to find a grouping that aligns with a user's needs. To address this, several methods incorporate constraints obtained from users into clustering algorithms, but unfortunately do not apply to hierarchical clustering. We design an interactive Bayesian algorithm that incorporates user interaction into hierarchical clustering while still utilizing the geometry of the data by sampling a constrained posterior...

Decision Making Using Probabilistic Inference Methods

March 13, 2013

Ross D. Shachter, Mark Alan Peot

Artificial Intelligence

The analysis of decision making under uncertainty is closely related to the analysis of probabilistic inference. Indeed, much of the research into efficient methods for probabilistic inference in expert systems has been motivated by the fundamental normative arguments of decision theory. In this paper we show how the developments underlying those efficient methods can be applied immediately to decision problems. In addition to general approaches which need know nothing about ...

Bayesian information criteria for clustering normally distributed data

August 10, 2020

Anthony J. Webster

Statistics Theory

Applications

Methodology

Maximum likelihood estimates (MLEs) are asymptotically normally distributed, and this property is used in meta-analyses to test the heterogeneity of estimates, either for a single cluster or for several sub-groups. More recently, MLEs for associations between risk factors and diseases have been hierarchically clustered to search for diseases with shared underlying causes, but an objective statistical criterion is needed to determine the number and composition of clusters. To ...

clusterBMA: Bayesian model averaging for clustering

September 9, 2022

Owen Forbes, Edgar Santos-Fernandez, Paul Pao-Yen Wu, Hong-Bo Xie, Paul E. Schwenn, Jim Lagopoulos, Lia Mills, Dashiell D. Sacks, ..., Kerrie Mengersen

Methodology

Applications

Machine Learning

Various methods have been developed to combine inference across multiple sets of results for unsupervised clustering, within the ensemble clustering literature. The approach of reporting results from one 'best' model out of several candidate clustering models generally ignores the uncertainty that arises from model selection, and results in inferences that are sensitive to the particular model and parameters chosen. Bayesian model averaging (BMA) is a popular approach for com...

Distributed Bayesian clustering using finite mixture of mixtures

March 31, 2020

Hanyu Song, Yingjian Wang, David B. Dunson

Computation

Methodology

In many modern applications, there is interest in analyzing enormous data sets that cannot be easily moved across computers or loaded into memory on a single computer. In such settings, it is very common to be interested in clustering. Existing distributed clustering algorithms are mostly distance or density based without a likelihood specification, precluding the possibility of formal statistical inference. Model-based clustering allows statistical inference, yet research on...
