Optimal Bayesian clustering using non-negative matrix factorization

Bayesian cluster analysis: Point estimation and credible balls

May 13, 2015

87% Match

Sara Wade, Zoubin Ghahramani

Methodology

Clustering is widely studied in statistics and machine learning, with applications in a variety of fields. As opposed to classical algorithms which return a single clustering solution, Bayesian nonparametric models provide a posterior over the entire space of partitions, allowing one to assess statistical properties, such as uncertainty on the number of clusters. However, an important problem is how to summarize the posterior; the huge dimension of partition space and difficu...

Minimum Message Length Clustering Using Gibbs Sampling

January 16, 2013

86% Match

Ian Davidson

Machine Learning

The K-means and EM algorithms are popular in clustering and mixture modeling, due to their simplicity and ease of implementation. However, they have several significant limitations. Both converge to a local optimum of their respective objective functions (ignoring the uncertainty in the model space), require the a priori specification of the number of classes/clusters, and are inconsistent. In this work we overcome these limitations by using the Minimum Message Length (MML) pri...

Clustering and Latent Semantic Indexing Aspects of the Nonnegative Matrix Factorization

December 17, 2011

86% Match

Andri Mirzal

Machine Learning

This paper provides theoretical support for the clustering aspect of the nonnegative matrix factorization (NMF). By utilizing the Karush-Kuhn-Tucker optimality conditions, we show that the NMF objective is equivalent to a graph clustering objective, so the clustering aspect of the NMF has a solid justification. Unlike previous approaches, which usually discard the nonnegativity constraints, our approach guarantees the stationary point being used in deriving the equivalence is loca...

A Bayesian approach for clustering skewed data using mixtures of multivariate normal-inverse Gaussian distributions

May 6, 2020

86% Match

Yuan Fang, Dimitris Karlis, Sanjeena Subedi

Computation

Non-Gaussian mixture models are gaining increasing attention for mixture model-based clustering particularly when dealing with data that exhibit features such as skewness and heavy tails. Here, such a mixture distribution is presented, based on the multivariate normal inverse Gaussian (MNIG) distribution. For parameter estimation of the mixture, a Bayesian approach via Gibbs sampler is used; for this, a novel approach to simulate univariate generalized inverse Gaussian random...

Variational Inference and Sparsity in High-Dimensional Deep Gaussian Mixture Models

May 4, 2021

86% Match

Lucas Kock, Nadja Klein, David J. Nott

Methodology

Gaussian mixture models are a popular tool for model-based clustering, and mixtures of factor analyzers are Gaussian mixture models having parsimonious factor covariance structure for mixture components. There are several recent extensions of mixture of factor analyzers to deep mixtures, where the Gaussian model for the latent factors is replaced by a mixture of factor analyzers. This construction can be iterated to obtain a model with many layers. These deep models are chall...

An Experimental Comparison of Several Clustering and Initialization Methods

January 30, 2013

86% Match

Marina Meila, David Heckerman

Machine Learning

We examine methods for clustering in high dimensions. In the first part of the paper, we perform an experimental comparison between three batch clustering algorithms: the Expectation-Maximization (EM) algorithm, a winner-take-all version of the EM algorithm reminiscent of the K-means algorithm, and model-based hierarchical agglomerative clustering. We learn naive-Bayes models with a hidden root node, using high-dimensional discrete-variable data sets (both real and synthetic)...

Clustering by Orthogonal NMF Model and Non-Convex Penalty Optimization

June 3, 2019

86% Match

Shuai Wang, Tsung-Hui Chang, ... , Pang Jong-Shi

Machine Learning

The non-negative matrix factorization (NMF) model with an additional orthogonality constraint on one of the factor matrices, called the orthogonal NMF (ONMF), has been found to be a promising clustering model and can outperform the classical K-means. However, solving the ONMF model is a challenging optimization problem because the coupling of the orthogonality and non-negativity constraints introduces a mixed combinatorial aspect into the problem due to the determination of the cor...

Initialization for Nonnegative Matrix Factorization: a Comprehensive Review

September 8, 2021

86% Match

Sajad Fathi Hafshejani, Zahra Moaberfard

Optimization and Control

Machine Learning

Non-negative matrix factorization (NMF) has become a popular method for representing meaningful data by extracting a non-negative basis feature from an observed non-negative data matrix. Some of the unique features of this method in identifying hidden data put this method amongst the powerful methods in the machine learning area. The NMF is a known non-convex optimization problem and the initial point has a significant effect on finding an efficient local solution. In this pa...

Learning Generative Models of Similarity Matrices

October 19, 2012

86% Match

Romer Rosales, Brendan J. Frey

Machine Learning

We describe a probabilistic (generative) view of affinity matrices along with inference algorithms for a subclass of problems associated with data clustering. This probabilistic view is helpful in understanding different models and algorithms that are based on affinity functions of the data. In particular, we show how (greedy) inference for a specific probabilistic model is equivalent to the spectral clustering algorithm. It also provides a framework for developing new algorith...

Kernel learning approaches for summarising and combining posterior similarity matrices

September 27, 2020

86% Match

Alessandra Cabassi, Sylvia Richardson, Paul D. W. Kirk

Methodology

Machine Learning

When using Markov chain Monte Carlo (MCMC) algorithms to perform inference for Bayesian clustering models, such as mixture models, the output is typically a sample of clusterings (partitions) drawn from the posterior distribution. In practice, a key challenge is how to summarise this output. Here we build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models. A k...
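
As a rough illustration of the posterior similarity matrix mentioned in this abstract, the sketch below estimates, for each pair of items, the fraction of sampled clusterings in which the two items share a cluster. It is not taken from the paper: the function name `posterior_similarity_matrix` and the toy `draws` array are assumptions made for this example, and the input is assumed to be an integer label array with one row per MCMC draw.

```python
import numpy as np

def posterior_similarity_matrix(samples):
    """Estimate a PSM from sampled clusterings.

    samples: (S, n) integer array; row s holds the cluster labels of the
    n items in draw s (hypothetical input layout, assumed for this sketch).
    Returns an (n, n) matrix whose entry (i, j) is the fraction of draws
    in which items i and j carry the same label.
    """
    samples = np.asarray(samples)
    # Per-draw co-clustering indicators, shape (S, n, n).
    same = samples[:, :, None] == samples[:, None, :]
    # Average over draws to get the pairwise co-clustering frequency.
    return same.mean(axis=0)

# Toy example: three draws over four items. The third draw is the first
# one with labels swapped, which leaves the PSM unchanged.
draws = np.array([
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
])
psm = posterior_similarity_matrix(draws)
print(psm)
```

Because the matrix depends only on whether two items share a label, it is unaffected by relabelling clusters between draws, which is one reason it is a convenient summary of MCMC output.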
