Repulsion, Chaos and Equilibrium in Mixt...

Revisiting k-means: New Algorithms via Bayesian Nonparametrics

November 2, 2011

88% Match

Brian Kulis, Michael I. Jordan

Machine Learning

Bayesian models offer great flexibility for clustering applications: Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For the most part, such flexibility is lacking in classical clustering methods such as k-means. In this paper, we revisit the k-means clustering algorithm from a Bayesian nonparametric viewpoint. Inspired by the asymptotic connection between k-m...

Minimum Message Length Clustering Using Gibbs Sampling

January 16, 2013

88% Match

Ian Davidson

Machine Learning

The K-means and EM algorithms are popular in clustering and mixture modeling, due to their simplicity and ease of implementation. However, they have several significant limitations. Both converge to a local optimum of their respective objective functions (ignoring the uncertainty in the model space), require the a priori specification of the number of classes/clusters, and are inconsistent. In this work we overcome these limitations by using the Minimum Message Length (MML) pri...

Bayesian Finite Mixture Models

July 7, 2024

87% Match

Bettina Grün, Gertraud Malsiner-Walli

Methodology

Finite mixture models are a useful statistical model class for clustering and density approximation. In the Bayesian framework, finite mixture models require the specification of suitable priors in addition to the data model. These priors help avoid spurious results and provide a principled way to define cluster shapes and a preference for specific cluster solutions. A generic model estimation scheme for finite mixtures with a fixed number of components is available using ...

A survey on Bayesian inference for Gaussian mixture model

August 20, 2021

87% Match

Jun Lu

Machine Learning

Artificial Intelligence

Machine Learning

Clustering has become a core technology in machine learning, largely due to its application in the field of unsupervised learning, clustering, classification, and density estimation. A frequentist approach to clustering based on mixture models exists in the form of the EM algorithm, where the parameters of the mixture model are usually estimated within a maximum likelihood framework. The Bayesian approach for finite and infinite Gaussian mixture models generates poi...

Model-based clustering based on sparse finite Gaussian mixtures

June 22, 2016

87% Match

Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter, Bettina Grün

Methodology

In the framework of Bayesian model-based clustering based on a finite mixture of Gaussian distributions, we present a joint approach to estimate the number of mixture components and identify cluster-relevant variables simultaneously as well as to obtain an identified model. Our approach consists in specifying sparse hierarchical priors on the mixture weights and component means. In a deliberately overfitting mixture model the sparse prior on the weights empties superfluous co...

Model based clustering of multinomial count data

July 28, 2022

87% Match

Panagiotis Papastamoulis

Methodology

Computation

We consider the problem of inferring an unknown number of clusters in replicated multinomial data. From a model-based clustering point of view, this task can be treated by estimating finite mixtures of multinomial distributions with or without covariates. Both Maximum Likelihood (ML) and Bayesian estimation are taken into account. Under a Maximum Likelihood approach, we provide an Expectation-Maximization (EM) algorithm which exploits a careful initialization procedu...

Identifying Mixtures of Mixtures Using Bayesian Estimation

February 23, 2015

87% Match

Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter, Bettina Grün

Methodology

The use of a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is in general achieved either by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior where the hyperpar...

Determinantal Clustering Processes - A Nonparametric Bayesian Approach to Kernel Based Semi-Supervised Clustering

September 26, 2013

87% Match

Amar Shah, Zoubin Ghahramani

Machine Learning

Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input. Dirichlet process mixture models are appealing as they can infer the number of clusters from the data. However, these models do not deal with high dimensional data well and can encounter difficulties in inference. We present a novel nonparame...

From here to infinity - sparse finite versus Dirichlet process mixtures in model-based clustering

June 22, 2017

87% Match

Sylvia Frühwirth-Schnatter, Gertraud Malsiner-Walli

Methodology

In model-based clustering, mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner-Walli et al. (2016) is that of sparse finite mixtures, where the prior distribution on the weight distribution of a mixture with $K$ components is chosen in such a way that a priori the number of clusters in the data is random and is allowed to be smaller than $K$ with high probability. The number of clusters is then inferred a posteriori ...

Bayesian clustering of high-dimensional data via latent repulsive mixtures

March 4, 2023

86% Match

Lorenzo Ghilotti, Mario Beraha, Alessandra Guglielmi

Methodology

Model-based clustering of moderate or large dimensional data is notoriously difficult. We propose a model for simultaneous dimensionality reduction and clustering by assuming a mixture model for a set of latent scores, which are then linked to the observations via a Gaussian latent factor model. This approach was recently investigated by Chandra et al. (2020). The authors use a factor-analytic representation and assume a mixture model for the latent factors. However, performa...

Repulsion, Chaos and Equilibrium in Mixture Models
