ID: 2306.10669

Repulsion, Chaos and Equilibrium in Mixture Models

June 19, 2023


Similar papers 2

Revisiting k-means: New Algorithms via Bayesian Nonparametrics

November 2, 2011

88% Match
Brian Kulis, Michael I. Jordan
Machine Learning

Bayesian models offer great flexibility for clustering applications---Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For the most part, such flexibility is lacking in classical clustering methods such as k-means. In this paper, we revisit the k-means clustering algorithm from a Bayesian nonparametric viewpoint. Inspired by the asymptotic connection between k-m...
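
The asymptotic connection referenced in this abstract is the basis of the DP-means algorithm: taking the small-variance limit of a Dirichlet process mixture turns Gibbs sampling into a k-means-like procedure that opens a new cluster whenever a point lies farther than a penalty lambda from every existing centroid. A minimal sketch of that idea (the function name, lambda value, and synthetic data are illustrative):

```python
import numpy as np

def dp_means(X, lam, n_iter=50):
    """k-means-style hard clustering that opens a new cluster whenever a
    point is farther (in squared distance) than lam from every centroid."""
    centroids = [X.mean(axis=0)]              # start with one global cluster
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # assignment step
        for i, x in enumerate(X):
            d2 = np.array([np.sum((x - c) ** 2) for c in centroids])
            k = int(np.argmin(d2))
            if d2[k] > lam:                   # too far: open a new cluster
                centroids.append(x.copy())
                k = len(centroids) - 1
            labels[i] = k
        # update step: recompute centroids of non-empty clusters
        for k in range(len(centroids)):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    return np.array(centroids), labels

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
centroids, labels = dp_means(X, lam=4.0)
print(len(centroids), "clusters found")
```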


Minimum Message Length Clustering Using Gibbs Sampling

January 16, 2013

88% Match
Ian Davidson
Machine Learning

The K-Means and EM algorithms are popular in clustering and mixture modeling, due to their simplicity and ease of implementation. However, they have several significant limitations. Both converge to a local optimum of their respective objective functions (ignoring the uncertainty in the model space), require the a priori specification of the number of classes/clusters, and are inconsistent. In this work we overcome these limitations by using the Minimum Message Length (MML) pri...
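
The Gibbs-sampling alternative alluded to here samples cluster allocations from their full conditionals rather than committing to a single local optimum as K-means and EM do. A minimal sketch for a univariate Gaussian mixture with fixed equal weights, known variance, and a conjugate normal prior on the means (all hyperparameters are illustrative, not the MML construction of the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
# synthetic data from two well-separated Gaussians
x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
K, sigma2, tau2 = 2, 1.0, 10.0         # components, known variance, prior var
w = np.full(K, 1.0 / K)                # fixed, equal mixture weights
mu = rng.normal(0, 1, K)               # initial component means

for it in range(200):                  # Gibbs sweeps
    # 1) sample allocations z_i | mu from their full conditionals
    logp = -0.5 * (x[:, None] - mu[None, :]) ** 2 / sigma2 + np.log(w)
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    z = np.array([rng.choice(K, p=pi) for pi in p])
    # 2) sample means mu_k | z under a N(0, tau2) prior (conjugate update)
    for k in range(K):
        xk = x[z == k]
        prec = len(xk) / sigma2 + 1.0 / tau2
        mean = (xk.sum() / sigma2) / prec
        mu[k] = rng.normal(mean, np.sqrt(1.0 / prec))

print("posterior draw of means:", np.sort(mu))
```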


Bayesian Finite Mixture Models

July 7, 2024

87% Match
Bettina Grün, Gertraud Malsiner-Walli
Methodology

Finite mixture models are a useful statistical model class for clustering and density approximation. In the Bayesian framework, finite mixture models require the specification of suitable priors in addition to the data model. These priors make it possible to avoid spurious results and provide a principled way to define cluster shapes and a preference for specific cluster solutions. A generic model estimation scheme for finite mixtures with a fixed number of components is available using ...
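
As a concrete illustration of "suitable priors in addition to the data model", a common default for a K-component univariate Gaussian mixture puts a symmetric Dirichlet prior on the weights and normal/inverse-gamma priors on the component parameters; simulating from the prior predictive shows what cluster configurations such a choice favors. The hyperparameter values below are illustrative, not recommendations from the paper:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
K = 5

# prior for a K-component univariate Gaussian mixture:
#   weights  ~ Dirichlet(alpha, ..., alpha)
#   mu_k     ~ Normal(0, b0)
#   sigma2_k ~ InverseGamma(a, scale=s)
alpha, b0, a, s = 1.0, 4.0, 2.0, 1.0

weights = rng.dirichlet(np.full(K, alpha))
mus = rng.normal(0.0, np.sqrt(b0), K)
sigma2s = stats.invgamma.rvs(a, scale=s, size=K, random_state=rng)

# draw from the implied prior predictive: pick a component, then a point
z = rng.choice(K, size=1000, p=weights)
y = rng.normal(mus[z], np.sqrt(sigma2s[z]))
print("prior weight draw:", np.round(weights, 3))
```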


A survey on Bayesian inference for Gaussian mixture model

August 20, 2021

87% Match
Jun Lu
Machine Learning
Artificial Intelligence

Clustering has become a core technique in machine learning, largely due to its role in unsupervised learning, classification, and density estimation. A frequentist approach to clustering based on mixture models is the EM algorithm, in which the parameters of the mixture model are estimated within a maximum likelihood framework. The Bayesian approach for finite and infinite Gaussian mixture models generates poi...
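
The EM algorithm mentioned as the frequentist approach alternates a soft-assignment E-step with closed-form maximum-likelihood M-step updates. A minimal univariate sketch (the initialization and fixed iteration count are simplified for brevity):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2, 0.8, 150), rng.normal(2, 1.2, 150)])
K, n = 2, len(x)
w = np.full(K, 1.0 / K)                          # mixture weights
mu = rng.choice(x, K, replace=False)             # init means from data
var = np.full(K, x.var())                        # init variances

for _ in range(100):
    # E-step: responsibilities r[i, k] = P(z_i = k | x_i, theta)
    logp = (-0.5 * np.log(2 * np.pi * var)
            - 0.5 * (x[:, None] - mu) ** 2 / var + np.log(w))
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: closed-form maximum-likelihood updates
    nk = r.sum(axis=0)
    w = nk / n
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print("weights:", np.round(w, 2), "means:", np.round(mu, 2))
```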


Model-based clustering based on sparse finite Gaussian mixtures

June 22, 2016

87% Match
Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter, Bettina Grün
Methodology

In the framework of Bayesian model-based clustering based on a finite mixture of Gaussian distributions, we present a joint approach to estimate the number of mixture components and identify cluster-relevant variables simultaneously as well as to obtain an identified model. Our approach consists in specifying sparse hierarchical priors on the mixture weights and component means. In a deliberately overfitting mixture model the sparse prior on the weights empties superfluous co...
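
The way "the sparse prior on the weights empties superfluous components" can be seen directly by simulation: a symmetric Dirichlet prior with a very small concentration parameter e0 puts most of its mass on a few components, so most of the K components in a deliberately overfitting mixture receive no allocations a priori. A small Monte Carlo check (values illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
K, n = 10, 500                      # deliberately overfitting mixture

def expected_occupied(e0, reps=2000):
    """Average number of non-empty components when n points are
    allocated according to Dirichlet(e0,...,e0) weight draws."""
    occupied = []
    for _ in range(reps):
        w = rng.dirichlet(np.full(K, e0))
        counts = rng.multinomial(n, w)
        occupied.append(np.sum(counts > 0))
    return np.mean(occupied)

for e0 in [0.01, 0.1, 1.0]:
    print(f"e0 = {e0:>4}: ~{expected_occupied(e0):.1f} of {K} occupied")
```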


Model based clustering of multinomial count data

July 28, 2022

87% Match
Panagiotis Papastamoulis
Methodology
Computation

We consider the problem of inferring an unknown number of clusters in replicated multinomial data. From a model-based clustering point of view, this task can be treated by estimating finite mixtures of multinomial distributions, with or without covariates. Both Maximum Likelihood (ML) and Bayesian estimation are taken into account. Under a Maximum Likelihood approach, we provide an Expectation--Maximization (EM) algorithm which exploits a careful initialization procedu...
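
For mixtures of multinomials without covariates, the E-step evaluates each row's multinomial log-likelihood under every component (the factorial terms cancel in the responsibilities) and the M-step re-estimates category probabilities from responsibility-weighted counts. A minimal EM sketch on synthetic counts; the paper's careful initialization procedure is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(5)
# synthetic counts: 200 rows, 6 categories, 2 latent clusters
true_p = np.array([[.5, .2, .1, .1, .05, .05],
                   [.05, .05, .1, .1, .2, .5]])
z_true = rng.integers(0, 2, 200)
X = np.vstack([rng.multinomial(50, true_p[z]) for z in z_true])

K, (n, d) = 2, X.shape
w = np.full(K, 1.0 / K)
p = rng.dirichlet(np.ones(d), K)            # component category probs

for _ in range(100):
    # E-step: log-multinomial up to a constant (the x! terms cancel)
    logr = X @ np.log(p).T + np.log(w)
    r = np.exp(logr - logr.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted counts, renormalized per component
    w = r.mean(axis=0)
    p = r.T @ X
    p /= p.sum(axis=1, keepdims=True)

print("estimated weights:", np.round(w, 2))
```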


Identifying Mixtures of Mixtures Using Bayesian Estimation

February 23, 2015

87% Match
Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter, Bettina Grün
Methodology

The use of a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is in general achieved either by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior where the hyperpar...
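
A "mixture of mixtures" models each cluster density as a small mixture of normals, so non-Gaussian cluster shapes are captured without leaving the normal family. A sketch of evaluating such a two-level density (all weights, means, and variances are illustrative):

```python
import numpy as np

def normal_pdf(x, mu, sigma2):
    return np.exp(-0.5 * (x - mu) ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)

# two clusters on the upper level; each cluster is itself a mixture of
# two normal subcomponents on the lower level
cluster_w = np.array([0.6, 0.4])                 # upper-level weights
sub_w  = np.array([[0.5, 0.5], [0.7, 0.3]])      # lower-level weights
sub_mu = np.array([[-3.0, -1.5], [2.0, 3.5]])    # subcomponent means
sub_s2 = np.array([[0.5, 0.5], [0.4, 0.8]])      # subcomponent variances

def density(x):
    """p(x) = sum_k cluster_w[k] * sum_l sub_w[k,l] * N(x; mu_kl, s2_kl)."""
    inner = (sub_w * normal_pdf(x, sub_mu, sub_s2)).sum(axis=1)
    return float(cluster_w @ inner)

print(density(-2.0), density(3.0))
```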


Determinantal Clustering Processes - A Nonparametric Bayesian Approach to Kernel Based Semi-Supervised Clustering

September 26, 2013

87% Match
Amar Shah, Zoubin Ghahramani
Machine Learning

Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input. Dirichlet process mixture models are appealing as they can infer the number of clusters from the data. However, these models do not deal with high dimensional data well and can encounter difficulties in inference. We present a novel nonparame...
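
A determinantal point process assigns a subset A probability proportional to det(L_A), which shrinks toward zero as the points in A become similar; this is the repulsion the model uses to spread cluster representatives apart. A numpy sketch of L-ensemble subset probabilities under an RBF kernel (the kernel choice is illustrative, not necessarily the construction used in this paper):

```python
import numpy as np

# three points: 0 and 1 are near-duplicates, 2 is far away
X = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 0.0]])

# L-ensemble kernel: RBF similarities between the points
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
L = np.exp(-sq / 2.0)

def dpp_prob(A):
    """P(A) = det(L_A) / det(L + I) for an L-ensemble DPP."""
    LA = L[np.ix_(A, A)]
    return np.linalg.det(LA) / np.linalg.det(L + np.eye(len(X)))

print(dpp_prob([0, 1]))   # similar pair: probability near zero
print(dpp_prob([0, 2]))   # diverse pair: much larger probability
```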


From here to infinity - sparse finite versus Dirichlet process mixtures in model-based clustering

June 22, 2017

87% Match
Sylvia Frühwirth-Schnatter, Gertraud Malsiner-Walli
Methodology

In model-based clustering, mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner-Walli et al. (2016) is that of sparse finite mixtures, where the prior distribution on the weight distribution of a mixture with $K$ components is chosen in such a way that a priori the number of clusters in the data is random and is allowed to be smaller than $K$ with high probability. The number of clusters is then inferred a posteriori ...
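
The contrast between the two priors can be simulated directly: under a sparse finite mixture the prior number of occupied clusters concentrates well below K, whereas under a Dirichlet process it grows roughly logarithmically with the sample size. A small Monte Carlo comparison (the hyperparameters e0 and alpha are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
n, K, e0, alpha = 200, 10, 0.05, 1.0

def sparse_finite_clusters():
    """Occupied clusters when n points follow Dirichlet(e0,...,e0) weights."""
    w = rng.dirichlet(np.full(K, e0))
    return np.sum(rng.multinomial(n, w) > 0)

def dp_clusters():
    """Occupied clusters under DP(alpha) via the Chinese restaurant process."""
    counts = []
    for _ in range(n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)        # seat at a new table
        else:
            counts[k] += 1          # join an existing table
    return len(counts)

reps = 500
print("sparse finite:", np.mean([sparse_finite_clusters() for _ in range(reps)]))
print("DP (CRP):     ", np.mean([dp_clusters() for _ in range(reps)]))
```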


Bayesian clustering of high-dimensional data via latent repulsive mixtures

March 4, 2023

86% Match
Lorenzo Ghilotti, Mario Beraha, Alessandra Guglielmi
Methodology

Model-based clustering of moderate or large dimensional data is notoriously difficult. We propose a model for simultaneous dimensionality reduction and clustering by assuming a mixture model for a set of latent scores, which are then linked to the observations via a Gaussian latent factor model. This approach was recently investigated by Chandra et al. (2020). The authors use a factor-analytic representation and assume a mixture model for the latent factors. However, performa...
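
A repulsive mixture replaces independent priors on the component locations with a joint prior that down-weights configurations in which locations are close, typically by multiplying an independent base prior by a pairwise repulsion factor. A sketch of one common unnormalized form (the specific repulsion function and hyperparameters are illustrative, not necessarily those of this paper):

```python
import numpy as np
from scipy import stats

def log_repulsive_prior(mus, tau=1.0, prior_sd=5.0):
    """Unnormalized log density: independent N(0, prior_sd^2) base prior
    times a pairwise repulsion factor that vanishes as locations coincide."""
    mus = np.asarray(mus, dtype=float)
    log_base = stats.norm.logpdf(mus, scale=prior_sd).sum()
    log_rep = 0.0
    K = len(mus)
    for i in range(K):
        for j in range(i + 1, K):
            d2 = np.sum((mus[i] - mus[j]) ** 2)
            # repulsion factor in (0, 1): tends to 0 as mus coincide
            log_rep += np.log1p(-np.exp(-d2 / (2 * tau ** 2)))
    return log_base + log_rep

print(log_repulsive_prior(np.array([[-2.0], [2.0]])))   # well separated
print(log_repulsive_prior(np.array([[0.0], [0.1]])))    # nearly coincident
```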
