ID: 1809.07850

Optimal Bayesian clustering using non-negative matrix factorization

September 20, 2018


Similar papers 2

On the clustering aspect of nonnegative matrix factorization

May 29, 2010

88% Match
Andri Mirzal, Masashi Furukawa
Machine Learning

This paper provides a theoretical explanation of the clustering aspect of nonnegative matrix factorization (NMF). We prove that even without imposing orthogonality or sparsity constraints on the basis and/or coefficient matrices, NMF can still give clustering results, thus providing theoretical support for many works, e.g., Xu et al. [1] and Kim et al. [2], that show the superiority of standard NMF as a clustering method.
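As a minimal sketch of this clustering aspect (hypothetical toy data and plain Lee-Seung multiplicative updates, not the authors' code): factor X ≈ WH with no orthogonality or sparsity constraints, then read a cluster label off each row of the coefficient matrix W by argmax.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two groups of rows with different dominant features.
X = np.vstack([
    rng.poisson(5.0, size=(20, 4)) + np.array([10, 10, 0, 0]),
    rng.poisson(5.0, size=(20, 4)) + np.array([0, 0, 10, 10]),
]).astype(float)

def nmf(X, k, n_iter=200, eps=1e-9):
    """Plain Lee-Seung multiplicative updates minimizing ||X - WH||_F^2,
    with no orthogonality or sparsity constraints."""
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

W, H = nmf(X, k=2)

# Clustering falls out of the factorization itself: each row of X is
# assigned to the basis component with the largest coefficient.
labels = W.argmax(axis=1)
```

The multiplicative updates keep W and H nonnegative throughout, which is what makes the argmax reading of W meaningful as a soft-membership-to-hard-label step.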


A multiscale Bayesian nonparametric framework for partial hierarchical clustering

June 28, 2024

88% Match
Lorenzo Schiavon, Mattia Stival
Methodology

In recent years, there has been a growing demand to discern clusters of subjects in datasets characterized by a large set of features. Often, these clusters may be highly variable in size and present partial hierarchical structures. In this context, model-based clustering approaches with nonparametric priors are gaining attention in the literature due to their flexibility and adaptability to new data. However, current approaches still face challenges in recognizing hierarchic...


Distributed Bayesian clustering using finite mixture of mixtures

March 31, 2020

88% Match
Hanyu Song, Yingjian Wang, David B. Dunson
Computation
Methodology

In many modern applications, there is interest in analyzing enormous data sets that cannot be easily moved across computers or loaded into memory on a single computer. Clustering is a common goal in such settings. Existing distributed clustering algorithms are mostly distance or density based, without a likelihood specification, precluding the possibility of formal statistical inference. Model-based clustering allows statistical inference, yet research on...


Identifying Mixtures of Mixtures Using Bayesian Estimation

February 23, 2015

88% Match
Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter, Bettina Grün
Methodology

The use of a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is in general achieved either by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior where the hyperpar...


A model selection approach for clustering a multinomial sequence with non-negative factorization

December 29, 2013

88% Match
Nam H. Lee, Runze Tang, ... , Michael Rosen
Machine Learning

We consider the problem of clustering a sequence of multinomial observations by way of a model selection criterion. We propose a form of penalty term for the model selection procedure. Our approach subsumes both the conventional AIC and BIC criteria but also extends them so that they are applicable to a sequence of sparse multinomial observations, where, even within the same cluster, the number of multinomial trials may be different for different...
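The paper's specific penalty term is not reproduced here, but the generic mechanism it extends can be sketched: fit a k-cluster multinomial model for several candidate k and compare penalized log-likelihoods via AIC and BIC. The toy data and the hard-assignment EM fit below are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy multinomial sequence: two latent clusters with distinct category profiles.
p1 = np.array([0.6, 0.2, 0.1, 0.1])
p2 = np.array([0.1, 0.1, 0.2, 0.6])
counts = np.vstack([rng.multinomial(50, p1, size=15),
                    rng.multinomial(50, p2, size=15)])
n, m = counts.shape

def hard_em_loglik(counts, k, n_iter=50):
    """Classification-EM fit of a k-cluster multinomial model; returns the
    maximized log-likelihood (up to the constant multinomial coefficient)."""
    labels = rng.integers(k, size=len(counts))
    for _ in range(n_iter):
        # M-step: smoothed per-cluster category probabilities.
        probs = np.vstack([counts[labels == j].sum(axis=0) + 1.0
                           for j in range(k)])
        probs /= probs.sum(axis=1, keepdims=True)
        # E-step: hard-assign each row to its best-fitting cluster.
        ll = counts @ np.log(probs).T
        labels = ll.argmax(axis=1)
    return ll.max(axis=1).sum()

scores = {}
for k in (1, 2, 3, 4):
    ll = hard_em_loglik(counts, k)
    n_params = k * (m - 1)  # free category probabilities per cluster
    scores[k] = {"AIC": 2 * n_params - 2 * ll,
                 "BIC": np.log(n) * n_params - 2 * ll}
```

With well-separated profiles, moving from k=1 to k=2 buys a large log-likelihood gain that dwarfs the penalty, so both criteria prefer the two-cluster model; the paper's contribution is adjusting the penalty so this comparison remains sensible when trial counts vary sparsely across observations.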


An Empirical Comparison of Sampling Quality Metrics: A Case Study for Bayesian Nonnegative Matrix Factorization

June 20, 2016

88% Match
Arjumand Masood, Weiwei Pan, Finale Doshi-Velez
Machine Learning

In this work, we empirically explore the question: how can we assess the quality of samples from some target distribution? We assume that the samples are provided by some valid Monte Carlo procedure, so we are guaranteed that the collection of samples will asymptotically approximate the true distribution. Most current evaluation approaches focus on two questions: (1) Has the chain mixed, that is, is it sampling from the distribution? and (2) How independent are the samples (a...
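One concrete instance of the second question, sample independence, is the effective sample size computed from empirical autocorrelations. The sketch below uses a simulated AR(1) chain as a stand-in for real MCMC output and a simple initial-positive-sequence truncation; both are illustrative assumptions, not the metrics compared in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate an autocorrelated chain (AR(1), phi = 0.9) as a stand-in
# for correlated MCMC samples of a scalar quantity.
phi, n = 0.9, 20000
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

def effective_sample_size(x, max_lag=1000):
    """ESS = n / (1 + 2 * sum of autocorrelations), truncating the sum
    at the first non-positive empirical lag."""
    y = x - x.mean()
    denom = y @ y
    s = 0.0
    for lag in range(1, max_lag):
        rho = (y[:-lag] @ y[lag:]) / denom
        if rho <= 0:
            break
        s += rho
    return len(x) / (1 + 2 * s)

ess = effective_sample_size(x)
```

For an AR(1) chain the theoretical ESS is n(1 - phi)/(1 + phi), about 1050 here, so the estimate lands well below the nominal 20000 samples; a large gap between nominal and effective sample size is exactly the kind of diagnostic the paper's comparison is concerned with.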


A Tutorial on Bayesian Nonparametric Models

June 14, 2011

88% Match
Samuel J. Gershman, David M. Blei
Machine Learning
Methodology

A key problem in statistical modeling is model selection: how to choose a model at an appropriate level of complexity. This problem appears in many settings, most prominently in choosing the number of clusters in mixture models or the number of factors in factor analysis. In this tutorial we describe Bayesian nonparametric methods, a class of methods that side-steps this issue by allowing the data to determine the complexity of the model. This tutorial is a high-level introduc...
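A minimal sketch of how such a prior lets the data determine complexity is the Chinese restaurant process, the partition distribution underlying Dirichlet process mixtures: each item joins an existing cluster with probability proportional to its size, or opens a new one with probability proportional to a concentration parameter alpha. The toy simulation below is generic, not specific to this tutorial.

```python
import numpy as np

rng = np.random.default_rng(3)

def crp(n, alpha):
    """Sample a partition of n items from a Chinese restaurant process."""
    tables = []       # number of items at each existing table (cluster)
    assignment = []
    for _ in range(n):
        # Join table j with probability proportional to its size;
        # open a new table with probability proportional to alpha.
        probs = np.array(tables + [alpha], dtype=float)
        probs /= probs.sum()
        j = rng.choice(len(probs), p=probs)
        if j == len(tables):
            tables.append(1)
        else:
            tables[j] += 1
        assignment.append(j)
    return assignment, len(tables)

a_small, k_small = crp(500, alpha=1.0)
a_large, k_large = crp(500, alpha=10.0)
```

The number of clusters is never fixed in advance: it grows roughly as alpha * log(n), so a larger concentration parameter yields more clusters for the same data size, and more data can always reveal new clusters.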


Non-Negative Matrix Factorization with Scale Data Structure Preservation

September 22, 2022

88% Match
Rachid Hedjam, Abdelhamid Abdesselam, ... , Mohamed Cheriet
Machine Learning

The model described in this paper belongs to the family of non-negative matrix factorization methods designed for data representation and dimension reduction. In addition to preserving the data positivity property, it also aims to preserve the structure of the data during matrix factorization. The idea is to add, to the NMF cost function, a penalty term that imposes a scale relationship between the pairwise similarity matrices of the original and transformed data points. The solutio...
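The paper's exact penalty and solver are not reproduced here; as a hedged sketch, the kind of cost it describes can be written as reconstruction error plus a term tying the pairwise similarity matrix of the reduced representations to that of the original points, up to a fitted scale c. The inner-product similarity and least-squares scale below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

X = rng.random((30, 8))          # original data points (rows)
k, lam = 3, 0.5                  # rank and penalty weight (assumed values)
W, H = rng.random((30, k)), rng.random((k, 8))

def pairwise_sim(A):
    # Inner-product similarity between rows (one of many possible choices).
    return A @ A.T

def objective(X, W, H, lam):
    recon = np.linalg.norm(X - W @ H) ** 2
    S_x, S_w = pairwise_sim(X), pairwise_sim(W)
    # Scale factor c aligning the two similarity matrices in least squares,
    # standing in for the paper's "scale relationship".
    c = (S_x * S_w).sum() / (S_w * S_w).sum()
    penalty = np.linalg.norm(S_x - c * S_w) ** 2
    return recon + lam * penalty

val = objective(X, W, H, lam)
```

Setting lam = 0 recovers plain NMF's cost; increasing it trades reconstruction accuracy for structure preservation in the reduced space.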


Clustering Multivariate Data using Factor Analytic Bayesian Mixtures with an Unknown Number of Components

June 2, 2019

88% Match
Panagiotis Papastamoulis
Methodology
Computation

Recent work on overfitting Bayesian mixtures of distributions offers a powerful framework for clustering multivariate data using a latent Gaussian model which resembles the factor analysis model. The flexibility provided by overfitting mixture models yields a simple and efficient way to estimate the unknown number of clusters and model parameters by Markov chain Monte Carlo (MCMC) sampling. The present study extends this approach by considering a set of eight paramet...


Flexible and Hierarchical Prior for Bayesian Nonnegative Matrix Factorization

May 23, 2022

88% Match
Jun Lu, Xuanyu Ye
Machine Learning
Information Theory

In this paper, we introduce a probabilistic model for learning nonnegative matrix factorization (NMF), which is commonly used for predicting missing values and finding hidden patterns in data, in which the matrix factors are latent variables associated with each data dimension. The nonnegativity constraint on the latent factors is handled by choosing priors with support on the nonnegative subspace. A Bayesian inference procedure based on Gibbs sampling is employed. We evalua...
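A common concrete instance of this setup (not necessarily the paper's prior) is a Gaussian likelihood with exponential priors on the factors, in the style of Schmidt et al.'s Bayesian NMF: each entry's conditional posterior is then a normal truncated to the nonnegative half-line, which Gibbs sampling can draw directly. The model sizes and hyperparameters below are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Tiny synthetic data from a rank-2 nonnegative model plus Gaussian noise.
n, m, k = 12, 8, 2
W_true = rng.exponential(1.0, (n, k))
H_true = rng.exponential(1.0, (k, m))
X = W_true @ H_true + rng.normal(0, 0.1, (n, m))

sigma2, lam = 0.1 ** 2, 1.0   # noise variance and exponential-prior rate

def sample_factor(X, W, H):
    """One Gibbs sweep over the entries of W; each conditional is a
    normal truncated to [0, inf). H is handled by symmetry via X.T."""
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            h = H[j]
            # Residual of row i with W[i, j]'s contribution removed.
            E = X[i] - W[i] @ H + W[i, j] * h
            prec = (h @ h) / sigma2
            mu = (E @ h / sigma2 - lam) / prec
            sd = prec ** -0.5
            W[i, j] = stats.truncnorm.rvs(
                (0 - mu) / sd, np.inf, loc=mu, scale=sd, random_state=rng)
    return W

W = rng.exponential(1.0, (n, k))
H = rng.exponential(1.0, (k, m))
for _ in range(20):
    W = sample_factor(X, W, H)
    H = sample_factor(X.T, H.T, W.T).T
```

Because the exponential prior has support only on the nonnegative subspace, every Gibbs draw is nonnegative by construction, so no projection or clipping step is needed.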
