ID: 2306.10669

Repulsion, Chaos and Equilibrium in Mixture Models

June 19, 2023

Andrea Cremaschi, Timothy M. Wertz, Maria De Iorio
Statistics
Mathematics
Methodology
Statistics Theory

Mixture models are commonly used in applications with heterogeneity and overdispersion in the population, as they allow the identification of subpopulations. In the Bayesian framework, this entails the specification of suitable prior distributions for the weights and location parameters of the mixture. Widely used are Bayesian semi-parametric models based on mixtures with an infinite or a random number of components, such as Dirichlet process mixtures and mixtures with a random number of components. Key in this context is the choice of the kernel for cluster identification. Despite their popularity, the flexibility of these models and prior distributions often does not translate into interpretability of the identified clusters. To overcome this issue, clustering methods based on repulsive mixtures have recently been proposed. The basic idea is to include a repulsive term in the prior distribution of the atoms of the mixture, which favours mixture locations that are far apart. This approach is increasingly popular and produces well-separated clusters, thus facilitating the interpretation of the results. However, the resulting models are usually not easy to handle, due to the introduction of unknown normalising constants. Exploiting results from statistical mechanics, we propose in this work a novel class of repulsive prior distributions based on Gibbs measures. Specifically, we use Gibbs measures associated with the joint distributions of eigenvalues of random matrices, which naturally possess a repulsive property. The proposed framework greatly simplifies the computations needed for the use of repulsive mixtures, because the normalising constant is available in closed form. We investigate theoretical properties of this class of prior distributions, and we illustrate the new priors, their properties and their clustering performance on benchmark datasets.
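
As a hedged illustration of the construction described above: the joint eigenvalue density of the Gaussian Unitary Ensemble (GUE) is proportional to prod_{i<j} (l_i - l_j)^2 * exp(-sum_i l_i^2 / 2), repulsive through the Vandermonde factor and with a normalising constant known in closed form (Mehta's integral). The abstract does not say which ensemble or scaling the paper adopts, so the GUE below is only an assumed example; sampling such locations then reduces to diagonalising a random Hermitian matrix.

```python
import numpy as np

def sample_gue_locations(k, rng):
    """Draw k candidate mixture locations from the GUE eigenvalue
    distribution, whose joint density is repulsive (Vandermonde factor)
    and whose normalising constant is known in closed form."""
    A = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    H = (A + A.conj().T) / 2.0        # random Hermitian matrix
    return np.linalg.eigvalsh(H)      # real eigenvalues, naturally well separated

rng = np.random.default_rng(0)
print(sample_gue_locations(5, rng))   # sorted, pairwise-distinct locations
```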

Similar papers

Parsimonious Hierarchical Modeling Using Repulsive Distributions

January 16, 2017

91% Match
J. J. Quinlan, F. A. Quintana, G. L. Page
Methodology

Employing nonparametric methods for density estimation has become routine in Bayesian statistical practice. Models based on discrete nonparametric priors such as Dirichlet Process Mixture (DPM) models are very attractive choices due to their flexibility and tractability. However, a common problem in fitting DPMs or other discrete models to data is that they tend to produce a large number of (sometimes) redundant clusters. In this work we propose a method that produces parsimo...

Repulsive Mixtures

April 24, 2012

91% Match
Francesca Petralia, Vinayak Rao, David B. Dunson
Methodology

Discrete mixture models are routinely used for density estimation and clustering. While conducting inferences on the cluster-specific parameters, current frequentist and Bayesian methods often encounter problems when clusters are placed too close together to be scientifically meaningful. Current Bayesian practice generates component-specific parameters independently from a common prior, which tends to favor similar components and often leads to substantial probability assigne...
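
The contrast drawn in this snippet can be made concrete with an unnormalised density: keep an independent Gaussian base prior on the locations and multiply in a pairwise factor that vanishes whenever two locations coincide. The repulsion function h(d) = d / (d + nu) below is one simple choice from this literature, used purely for illustration; the truncated abstract does not specify the exact form used by the authors.

```python
import numpy as np

def log_repulsive_prior(mu, tau=3.0, nu=1.0):
    """Unnormalised log-density: independent N(0, tau^2) base prior times
    a pairwise repulsion term h(d) = d / (d + nu) over all location pairs.
    The log-density tends to -infinity as any two locations coincide."""
    mu = np.asarray(mu, dtype=float)
    log_base = -0.5 * np.sum(mu**2) / tau**2
    d = np.abs(mu[:, None] - mu[None, :])[np.triu_indices(len(mu), k=1)]
    return log_base + np.sum(np.log(d / (d + nu)))

print(log_repulsive_prior([-2.0, 0.0, 2.0]))    # well separated: moderate value
print(log_repulsive_prior([0.0, 1e-9, 2.0]))    # near-coincident: strongly penalised
```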

Bayesian Repulsive Mixture Modeling with Matérn Point Processes

October 9, 2022

90% Match
Hanxi Sun, Boqian Zhang, Vinayak Rao
Methodology

Mixture models are a standard tool in statistical analysis, widely used for density modeling and model-based clustering. Current approaches typically model the parameters of the mixture components as independent variables. This can result in overlapping or poorly separated clusters when either the number of clusters or the form of the mixture components is misspecified. Such model misspecification can undermine the interpretability and simplicity of these mixture models. To a...
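
For intuition about how a Matérn point process enforces separation, here is a minimal sketch of Matérn type-I thinning on an interval (an assumption for illustration; the paper may use a different Matérn variant or embed the process in a hierarchical model): simulate a homogeneous Poisson process, then delete every point with a neighbour closer than the hard-core radius r, so all survivors are at least r apart.

```python
import numpy as np

def matern_type1(intensity, r, rng, window=(0.0, 10.0)):
    """Matérn type-I hard-core process on an interval: Poisson points,
    then delete every point whose nearest neighbour is closer than r."""
    lo, hi = window
    n = rng.poisson(intensity * (hi - lo))
    pts = rng.uniform(lo, hi, size=n)
    if n < 2:
        return np.sort(pts)
    d = np.abs(pts[:, None] - pts[None, :])
    np.fill_diagonal(d, np.inf)                # ignore self-distances
    return np.sort(pts[d.min(axis=1) >= r])    # survivors are >= r apart

rng = np.random.default_rng(1)
print(matern_type1(intensity=1.0, r=0.8, rng=rng))
```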

Bayesian Repulsive Gaussian Mixture Model

March 27, 2017

90% Match
Fangzheng Xie, Yanxun Xu
Methodology

We develop a general class of Bayesian repulsive Gaussian mixture models that encourage well-separated clusters, aiming at reducing potentially redundant components produced by independent priors for locations (such as the Dirichlet process). The asymptotic results for the posterior distribution of the proposed models are derived, including posterior consistency and posterior contraction rate in the context of nonparametric density estimation. More importantly, we show that c...

Dirichlet Process Parsimonious Mixtures for clustering

January 14, 2015

89% Match
Faicel Chamroukhi, Marius Bartcus, Hervé Glotin
Machine Learning
Methodology

The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric formulation of these parsimoniou...

MCMC computations for Bayesian mixture models using repulsive point processes

November 12, 2020

89% Match
Mario Beraha, Raffaele Argiento, ..., Alessandra Guglielmi
Methodology
Computation

Repulsive mixture models have recently gained popularity for Bayesian cluster detection. Compared to more traditional mixture models, repulsive mixture models produce a smaller number of well-separated clusters. The most commonly used methods for posterior inference either require fixing the number of components a priori or are based on reversible-jump MCMC computation. We present a general framework for mixture models, when the prior of the 'cluster centres' is a finite repu...
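
The computational point behind such samplers, echoed in the main abstract above, is that with a fixed number of components the intractable normalising constant of a repulsive prior does not depend on the current locations, so it cancels from the Metropolis-Hastings acceptance ratio. Below is a generic random-walk sketch of this idea, not the algorithm of the paper; the toy target combines a Gaussian mixture likelihood with a repulsive log-prior of the form sketched earlier.

```python
import numpy as np

def mh_sweep(mu, log_target, step, rng):
    """One random-walk Metropolis sweep over the cluster centres.
    log_target is the UNNORMALISED log posterior: the repulsive prior's
    constant is identical for current and proposed states, so it cancels."""
    mu = np.asarray(mu, dtype=float).copy()
    for j in range(len(mu)):
        prop = mu.copy()
        prop[j] += step * rng.standard_normal()
        if np.log(rng.uniform()) < log_target(prop) - log_target(mu):
            mu = prop
    return mu

# Toy demo: two well-separated Gaussian clusters, repulsive prior on centres.
rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(-3.0, 1.0, 50), rng.normal(3.0, 1.0, 50)])

def log_target(mu):
    comp = -0.5 * (data[:, None] - mu[None, :]) ** 2       # unit-variance kernels
    log_lik = np.sum(np.logaddexp.reduce(comp, axis=1))    # equal weights, up to constants
    d = np.abs(mu[:, None] - mu[None, :])[np.triu_indices(len(mu), k=1)]
    return log_lik - np.sum(mu**2) / 18.0 + np.sum(np.log(d / (d + 1.0)))

mu = np.array([-0.5, 0.5])
for _ in range(500):
    mu = mh_sweep(mu, log_target, step=0.3, rng=rng)
print(np.sort(mu))   # centres should drift towards roughly -3 and 3
```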

Normalized Random Measures with Interacting Atoms for Bayesian Nonparametric Mixtures

February 17, 2023

88% Match
Mario Beraha, Raffaele Argiento, ..., Alessandra Guglielmi
Statistics Theory
Probability
Methodology

The study of almost surely discrete random probability measures is an active line of research in Bayesian nonparametrics. The idea of assuming interaction across the atoms of the random probability measure has recently spurred significant interest in the context of Bayesian mixture models. This allows the definition of priors that encourage well separated and interpretable clusters. In this work, we provide a unified framework for the construction and the Bayesian analysis of...

Variance matrix priors for Dirichlet process mixture models with Gaussian kernels

February 8, 2022

88% Match
Wei Jing, Michail Papathomas, Silvia Liverani
Methodology

The Dirichlet Process Mixture Model (DPMM) is a Bayesian non-parametric approach widely used for density estimation and clustering. In this manuscript, we study the choice of prior for the variance or precision matrix when Gaussian kernels are adopted. Typically, in the relevant literature, the assessment of mixture models is done by considering observations in a space of only a handful of dimensions. Instead, we are concerned with more realistic problems of higher dimensiona...

Repulsive Mixture Models of Exponential Family PCA for Clustering

April 7, 2020

88% Match
Maoying Qiao, Tongliang Liu, Jun Yu, ..., Dacheng Tao
Machine Learning

The mixture extension of exponential family principal component analysis (EPCA) was designed to encode much more structural information about the data distribution than traditional EPCA does. For example, due to the linearity of EPCA's essential form, nonlinear cluster structures cannot be easily handled, but they are explicitly modeled by the mixture extensions. However, the traditional mixture of local EPCAs has the problem of model redundancy, i.e., overlaps among mixing co...

A Bayesian approach for clustering skewed data using mixtures of multivariate normal-inverse Gaussian distributions

May 6, 2020

88% Match
Yuan Fang, Dimitris Karlis, Sanjeena Subedi
Computation

Non-Gaussian mixture models are gaining increasing attention for mixture model-based clustering, particularly when dealing with data that exhibit features such as skewness and heavy tails. Here, such a mixture distribution is presented, based on the multivariate normal inverse Gaussian (MNIG) distribution. For parameter estimation of the mixture, a Bayesian approach via a Gibbs sampler is used; for this, a novel approach to simulate univariate generalized inverse Gaussian random...
