ID: 1505.03339

Bayesian cluster analysis: Point estimation and credible balls

May 13, 2015

Similar papers 5

Bayesian model averaging in model-based clustering and density estimation

June 30, 2015

87% Match
Niamh Russell, Thomas Brendan Murphy, Adrian E Raftery
Computation

We propose Bayesian model averaging (BMA) as a method for postprocessing the results of model-based clustering. Given a number of competing models, appropriate model summaries are averaged, using the posterior model probabilities, instead of being taken from a single "best" model. We demonstrate the use of BMA in model-based clustering for a number of datasets. We show that BMA provides a useful summary of the clustering of observations while taking model uncertainty into acc...

Robust and Automatic Data Clustering: Dirichlet Process meets Median-of-Means

November 26, 2023

87% Match
Supratik Basu, Jyotishka Ray Choudhury, ..., Swagatam Das
Machine Learning
Methodology

Clustering stands as one of the most prominent challenges within the realm of unsupervised machine learning. Among the array of centroid-based clustering algorithms, the classic $k$-means algorithm, rooted in Lloyd's heuristic, takes center stage as one of the most extensively employed techniques in the literature. Nonetheless, both $k$-means and its variants grapple with noteworthy limitations. These encompass a heavy reliance on initial cluster centroids, susceptibility to conve...

Information based clustering

November 26, 2005

87% Match
Noam Slonim, Gurinder Singh Atwal, ..., William Bialek
Quantitative Methods

In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial assumptions about the structure of data. Here we reformulate the clustering problem from an information theoretic perspective which avoids many of these assumptions. In particular, our formulation obviates the need for defining a cluster "protot...

A survey on Bayesian inference for Gaussian mixture model

August 20, 2021

87% Match
Jun Lu
Machine Learning
Artificial Intelligence

Clustering has become a core technology in machine learning, largely due to its applications in unsupervised learning, classification, and density estimation. A frequentist approach to clustering with mixture models is the EM algorithm, in which the parameters of the mixture model are usually estimated within a maximum likelihood framework. The Bayesian approach for finite and infinite Gaussian mixture models generates poi...

Optimal Bayesian clustering using non-negative matrix factorization

September 20, 2018

87% Match
Ketong Wang, Michael D. Porter
Methodology

Bayesian model-based clustering is a widely applied procedure for discovering groups of related observations in a dataset. These approaches use Bayesian mixture models, estimated with MCMC, which provide posterior samples of the model parameters and clustering partition. While inference on model parameters is well established, inference on the clustering partition is less developed. A new method is developed for estimating the optimal partition from the pairwise posterior sim...

Flexible clustering via hidden hierarchical Dirichlet priors

January 18, 2022

86% Match
Antonio Lijoi, Igor Prünster, Giovanni Rebaudo
Methodology
Statistics Theory

The Bayesian approach to inference stands out for naturally allowing borrowing information across heterogeneous populations, with different samples possibly sharing the same distribution. A popular Bayesian nonparametric model for clustering probability distributions is the nested Dirichlet process, which however has the drawback of grouping distributions in a single cluster when ties are observed across samples. With the goal of achieving a flexible and effective clustering ...

Demystifying Information-Theoretic Clustering

October 15, 2013

86% Match
Greg Ver Steeg, Aram Galstyan, ..., Simon DeDeo
Machine Learning
Information Theory
Data Analysis, Statistics an...

We propose a novel method for clustering data which is grounded in information-theoretic principles and requires no parametric assumptions. Previous attempts to use information theory to define clusters in an assumption-free way are based on maximizing mutual information between data and cluster labels. We demonstrate that this intuition suffers from a fundamental conceptual flaw that causes clustering performance to deteriorate as the amount of data increases. Instead, we re...

Bayesian Inference

February 10, 2010

86% Match
Christian P. Robert, Jean-Michel Marin, Judith Rousseau
Methodology
Applications

This chapter provides an overview of Bayesian inference, mostly emphasising that it is a universal method for summarising uncertainty and making estimates and predictions using probability statements conditional on observed data and an assumed model (Gelman 2008). The Bayesian perspective is thus applicable to all aspects of statistical inference, while being open to the incorporation of information items resulting from earlier experiments and from expert opinions. We provide ...

Bayesian Clustering via Fusing of Localized Densities

March 31, 2023

86% Match
Alexander Dombowsky, David B. Dunson
Methodology

Bayesian clustering typically relies on mixture models, with each component interpreted as a different cluster. After defining a prior for the component parameters and weights, Markov chain Monte Carlo (MCMC) algorithms are commonly used to produce samples from the posterior distribution of the component labels. The data are then clustered by minimizing the expectation of a clustering loss function that favours similarity to the component labels. Unfortunately, although these...

Bayesian approach to clustering real value, categorical and network data: solution via variational methods

May 17, 2008

86% Match
Alexei Vazquez (Institute for Advanced Study)
Data Analysis, Statistics an...

Data clustering, including problems such as finding network communities, can be put into a systematic framework by means of a Bayesian approach. The application of Bayesian approaches to real problems can be, however, quite challenging. In most cases the solution is explored via Monte Carlo sampling or variational methods. Here we work further on the application of variational methods to clustering problems. We introduce generative models based on a hidden group structure and...
