ID: 1204.5243

Repulsive Mixtures

April 24, 2012

Francesca Petralia, Vinayak Rao, David B. Dunson
Statistics
Methodology

Discrete mixture models are routinely used for density estimation and clustering. While conducting inferences on the cluster-specific parameters, current frequentist and Bayesian methods often encounter problems when clusters are placed too close together to be scientifically meaningful. Current Bayesian practice generates component-specific parameters independently from a common prior, which tends to favor similar components and often leads to substantial probability assigned to redundant components that are not needed to fit the data. As an alternative, we propose to generate components from a repulsive process, which leads to fewer, better separated and more interpretable clusters. We characterize this repulsive prior theoretically and propose a Markov chain Monte Carlo sampling algorithm for posterior computation. The methods are illustrated using simulated data as well as real datasets.
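The repulsive-prior idea in the abstract can be sketched as an unnormalized density: independent base priors on the component locations, multiplied by a pairwise repulsion term that decays to zero as any two locations coincide. Below is a minimal sketch; the Gaussian base prior, the particular repulsion function `1 - exp(-tau * d^2)`, and all parameter values are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def log_repulsive_prior(mu, tau=1.0, sigma0=10.0):
    """Unnormalized log-density of a toy repulsive prior on 1-D
    component locations mu (array of length k).

    Each location gets an independent N(0, sigma0^2) base prior,
    multiplied by a pairwise repulsion factor 1 - exp(-tau * d_ij^2)
    that vanishes as two locations coincide (tau sets the strength).
    Both choices are illustrative, not those of any specific paper.
    """
    mu = np.asarray(mu, dtype=float)
    # independent base prior: sum_j log N(mu_j; 0, sigma0^2)
    log_base = (-0.5 * np.sum(mu**2) / sigma0**2
                - mu.size * np.log(sigma0 * np.sqrt(2.0 * np.pi)))
    # pairwise repulsion: sum_{i<j} log(1 - exp(-tau * d_ij^2))
    d2 = (mu[:, None] - mu[None, :]) ** 2
    iu = np.triu_indices(mu.size, k=1)
    log_rep = np.sum(np.log1p(-np.exp(-tau * d2[iu])))
    return log_base + log_rep

# well-separated locations receive much higher prior mass
# than near-identical (redundant) ones
print(log_repulsive_prior([-2.0, 0.0, 2.0]) >
      log_repulsive_prior([-0.01, 0.0, 0.01]))
```

The key qualitative property is visible directly: as two locations approach each other, `log1p(-exp(-tau*d^2))` tends to minus infinity, so near-duplicate components are heavily penalized, which is what drives the "fewer, better separated" clusters described above.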

Similar papers

Parsimonious Hierarchical Modeling Using Repulsive Distributions

January 16, 2017

93% Match
J. J. Quinlan, F. A. Quintana, G. L. Page
Methodology

Employing nonparametric methods for density estimation has become routine in Bayesian statistical practice. Models based on discrete nonparametric priors such as Dirichlet Process Mixture (DPM) models are very attractive choices due to their flexibility and tractability. However, a common problem in fitting DPMs or other discrete models to data is that they tend to produce a large number of (sometimes) redundant clusters. In this work we propose a method that produces parsimo...


Bayesian Repulsive Gaussian Mixture Model

March 27, 2017

93% Match
Fangzheng Xie, Yanxun Xu
Methodology

We develop a general class of Bayesian repulsive Gaussian mixture models that encourage well-separated clusters, aiming at reducing potentially redundant components produced by independent priors for locations (such as the Dirichlet process). The asymptotic results for the posterior distribution of the proposed models are derived, including posterior consistency and posterior contraction rate in the context of nonparametric density estimation. More importantly, we show that c...


Bayesian Repulsive Mixture Modeling with Matérn Point Processes

October 9, 2022

92% Match
Hanxi Sun, Boqian Zhang, Vinayak Rao
Methodology

Mixture models are a standard tool in statistical analysis, widely used for density modeling and model-based clustering. Current approaches typically model the parameters of the mixture components as independent variables. This can result in overlapping or poorly separated clusters when either the number of clusters or the form of the mixture components is misspecified. Such model misspecification can undermine the interpretability and simplicity of these mixture models. To a...


MCMC computations for Bayesian mixture models using repulsive point processes

November 12, 2020

92% Match
Mario Beraha, Raffaele Argiento, ... , Alessandra Guglielmi
Methodology
Computation

Repulsive mixture models have recently gained popularity for Bayesian cluster detection. Compared to more traditional mixture models, repulsive mixture models produce a smaller number of well-separated clusters. The most commonly used methods for posterior inference either require fixing the number of components a priori or are based on reversible jump MCMC computation. We present a general framework for mixture models, when the prior of the 'cluster centres' is a finite repu...
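The kind of sampler this abstract describes, posterior inference over cluster centres under a repulsive prior, can be caricatured with a random-walk Metropolis update of the centre vector. Everything here is an illustrative assumption (unit-variance Gaussian likelihood, fixed cluster labels, a simple pairwise repulsion term, the step size), not the algorithm from any of the listed papers.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(centres, data, labels, tau=1.0):
    """Toy log-posterior: unit-variance Gaussian likelihood per cluster,
    a weak N(0, 100) base prior, and a pairwise repulsion term."""
    ll = sum(-0.5 * np.sum((data[labels == j] - c) ** 2)
             for j, c in enumerate(centres))
    d2 = (centres[:, None] - centres[None, :]) ** 2
    iu = np.triu_indices(len(centres), k=1)
    log_rep = np.sum(np.log1p(-np.exp(-tau * d2[iu])))
    return ll + log_rep - 0.5 * np.sum(centres**2) / 100.0

def metropolis_centres(centres, data, labels, step=0.3, iters=200):
    """Random-walk Metropolis on the vector of cluster centres."""
    cur = centres.copy()
    cur_lp = log_post(cur, data, labels)
    for _ in range(iters):
        prop = cur + step * rng.standard_normal(cur.shape)
        prop_lp = log_post(prop, data, labels)
        # accept with probability min(1, exp(prop_lp - cur_lp))
        if np.log(rng.uniform()) < prop_lp - cur_lp:
            cur, cur_lp = prop, prop_lp
    return cur
```

A full sampler for these models would also update the cluster labels, mixture weights, and possibly the number of components (the point where reversible jump or point-process machinery enters); the sketch isolates only the centre update, where the repulsive prior enters the acceptance ratio.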


Repulsion, Chaos and Equilibrium in Mixture Models

June 19, 2023

91% Match
Andrea Cremaschi, Timothy M. Wertz, Maria De Iorio
Methodology
Statistics Theory

Mixture models are commonly used in applications with heterogeneity and overdispersion in the population, as they allow the identification of subpopulations. In the Bayesian framework, this entails the specification of suitable prior distributions for the weights and location parameters of the mixture. Widely used are Bayesian semi-parametric models based on mixtures with an infinite or random number of components, such as Dirichlet process mixtures or mixtures with random numbe...


Bayesian Finite Mixture Models

July 7, 2024

90% Match
Bettina Grün, Gertraud Malsiner-Walli
Methodology

Finite mixture models are a useful statistical model class for clustering and density approximation. In the Bayesian framework finite mixture models require the specification of suitable priors in addition to the data model. These priors allow one to avoid spurious results and provide a principled way to define cluster shapes and a preference for specific cluster solutions. A generic model estimation scheme for finite mixtures with a fixed number of components is available using ...


Bayesian clustering of high-dimensional data via latent repulsive mixtures

March 4, 2023

89% Match
Lorenzo Ghilotti, Mario Beraha, Alessandra Guglielmi
Methodology

Model-based clustering of moderate or large dimensional data is notoriously difficult. We propose a model for simultaneous dimensionality reduction and clustering by assuming a mixture model for a set of latent scores, which are then linked to the observations via a Gaussian latent factor model. This approach was recently investigated by Chandra et al. (2020). The authors use a factor-analytic representation and assume a mixture model for the latent factors. However, performa...


Model based clustering of multinomial count data

July 28, 2022

88% Match
Panagiotis Papastamoulis
Methodology
Computation

We consider the problem of inferring an unknown number of clusters in replicated multinomial data. Under a model-based clustering point of view, this task can be treated by estimating finite mixtures of multinomial distributions with or without covariates. Both Maximum Likelihood (ML) and Bayesian estimation are considered. Under a Maximum Likelihood approach, we provide an Expectation-Maximization (EM) algorithm which exploits a careful initialization procedu...


Identifying Mixtures of Mixtures Using Bayesian Estimation

February 23, 2015

88% Match
Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter, Bettina Grün
Methodology

The use of a finite mixture of normal distributions in model-based clustering allows one to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is in general achieved either by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior where the hyperpar...


Distributed Bayesian clustering using finite mixture of mixtures

March 31, 2020

87% Match
Hanyu Song, Yingjian Wang, David B. Dunson
Computation
Methodology

In many modern applications, there is interest in analyzing enormous data sets that cannot be easily moved across computers or loaded into memory on a single computer. In such settings, it is very common to be interested in clustering. Existing distributed clustering algorithms are mostly distance or density based without a likelihood specification, precluding the possibility of formal statistical inference. Model-based clustering allows statistical inference, yet research on...
