ID: 1601.04650

Statistical Mechanics of High-Dimensional Inference

January 18, 2016


Similar papers (page 3)

An equivalence between high dimensional Bayes optimal inference and M-estimation

September 22, 2016

87% Match
Madhu Advani, Surya Ganguli
Machine Learning
Disordered Systems and Neural Networks
Statistics Theory
Neurons and Cognition

When recovering an unknown signal from noisy measurements, the computational difficulty of performing optimal Bayesian MMSE (minimum mean squared error) inference often necessitates the use of maximum a posteriori (MAP) inference, a special case of regularized M-estimation, as a surrogate. However, MAP is suboptimal in high dimensions, when the number of unknown signal components is similar to the number of measurements. In this work we demonstrate, when the signal distribution...
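To make the MAP-versus-MMSE gap concrete, here is a minimal scalar sketch (our own toy setup, not the paper's model; all parameter values are illustrative): with a Laplace prior and Gaussian noise, MAP inference reduces to soft thresholding, a simple regularized M-estimate, while Bayes-optimal MMSE inference is the posterior mean, and the two estimates visibly disagree.

```python
import numpy as np

# Toy scalar channel: x ~ Laplace(1) prior, y = x + z with z ~ N(0, sigma^2).
# MAP inference here is soft thresholding (an l1-regularized M-estimate);
# Bayes-optimal MMSE inference is the posterior mean E[x | y].
sigma = 0.5                           # noise standard deviation (illustrative)
grid = np.linspace(-8.0, 8.0, 4001)   # integration grid over the signal x

for y in [0.2, 0.5, 1.0, 2.0]:
    log_post = -np.abs(grid) - (y - grid) ** 2 / (2 * sigma ** 2)
    w = np.exp(log_post - log_post.max())               # unnormalized posterior
    mmse = np.sum(grid * w) / np.sum(w)                 # posterior mean
    map_est = np.sign(y) * max(abs(y) - sigma ** 2, 0)  # soft threshold
    print(f"y = {y:4.1f}: MMSE = {mmse:+.3f}, MAP = {map_est:+.3f}")
```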


Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

October 10, 2016

87% Match
Thibault Lesieur, Caterina De Bacco, Jess Banks, Florent Krzakala, ... , Lenka Zdeborová
Machine Learning
Disordered Systems and Neural Networks
Information Theory

We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of $m$ points in $n$ dimensions, $n,m \rightarrow \infty$ and $\alpha = m/n$ stays finite. Using exact but non-rigorous methods from statistical physics, we determine the critical value of $\alpha$ and the distance between the clusters at which it becomes information-theoretically possible to reconstruct the membership into clusters better than chance. We also determine...
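A quick way to see the kind of transition the abstract describes is a toy spectral experiment (parameter names and values are ours; this sketch does not reproduce the paper's statistical-physics analysis): plant two clusters at separation $\rho$ in the regime $m = \alpha n$ and check whether the top eigenvector of the Gram matrix correlates with the hidden labels.

```python
import numpy as np

# Two symmetric Gaussian clusters in the high-dimensional regime m = alpha * n.
rng = np.random.default_rng(0)
n, alpha, rho = 500, 2.0, 1.5        # dimension, sample ratio, separation
m = int(alpha * n)

v = rng.standard_normal(n)
v /= np.linalg.norm(v)               # unit-norm cluster direction
s = rng.choice([-1.0, 1.0], size=m)  # hidden +/-1 cluster labels
X = np.outer(s, rho * v) + rng.standard_normal((m, n))

# Spectral estimate: above a critical combination of alpha and rho, the top
# eigenvector of the Gram matrix correlates with the labels; below it, the
# overlap decays toward chance level.
gram = X @ X.T / n
_, eigvecs = np.linalg.eigh(gram)    # eigh returns eigenvalues in ascending order
u = eigvecs[:, -1]
overlap = abs(u @ s) / (np.linalg.norm(u) * np.linalg.norm(s))
print(f"alpha = {alpha}, rho = {rho}: overlap with true labels = {overlap:.3f}")
```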


High dimensionality: The latest challenge to data analysis

February 12, 2019

87% Match
A. M. Pires, J. A. Branco
Methodology

The advent of modern technology, permitting the measurement of thousands of characteristics simultaneously, has given rise to floods of data in the form of many large or even huge datasets. This new paradigm presents extraordinary challenges to data analysis, and the question arises: how can conventional data analysis methods, devised for moderate or small datasets, cope with the complexities of modern data? The case of high-dimensional data is particularly revealing of some...


Information-Theoretic Limits for the Matrix Tensor Product

May 22, 2020

87% Match
Galen Reeves
Information Theory
Probability
Machine Learning

This paper studies a high-dimensional inference problem involving the matrix tensor product of random matrices. This problem generalizes a number of contemporary data science problems including the spiked matrix models used in sparse principal component analysis and covariance estimation and the stochastic block model used in network analysis. The main results are single-letter formulas (i.e., analytical expressions that can be approximated numerically) for the mutual information...
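The spiked matrix models mentioned in the abstract are the simplest special case. A hedged sketch of the rank-one spiked Wigner model in our own notation (`lambda_` is the signal-to-noise ratio), together with the classical BBP overlap prediction, looks like this:

```python
import numpy as np

# Rank-one spiked Wigner model: Y = lambda * u u^T + W, with W symmetric
# Gaussian noise normalized so its bulk spectrum fills [-2, 2].
rng = np.random.default_rng(1)
n, lambda_ = 1000, 2.5

u = rng.standard_normal(n)
u /= np.linalg.norm(u)                  # unit-norm planted spike
W = rng.standard_normal((n, n))
W = (W + W.T) / np.sqrt(2 * n)          # GOE-like noise matrix
Y = lambda_ * np.outer(u, u) + W

# Above lambda = 1 (the BBP transition) the top eigenvector of Y carries
# information about u, with squared overlap approaching 1 - 1/lambda^2.
_, vecs = np.linalg.eigh(Y)
overlap = abs(vecs[:, -1] @ u)
theory = np.sqrt(1 - 1 / lambda_ ** 2)
print(f"lambda = {lambda_}: |<v_top, u>| = {overlap:.3f} (theory ~ {theory:.3f})")
```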


Statistical mechanical analysis of sparse linear regression as a variable selection problem

May 29, 2018

87% Match
Tomoyuki Obuchi, Yoshinori Nakanishi-Ohno, ... , Yoshiyuki Kabashima
Disordered Systems and Neural Networks
Information Theory
Machine Learning

An algorithmic limit of compressed sensing and related variable-selection problems is analytically evaluated when the design matrix is given by an overcomplete random matrix. The replica method from statistical mechanics is employed to derive the result. The analysis is conducted through evaluation of the entropy, the exponential rate of the number of combinations of variables giving a specific value of fit error to given data, which is assumed to be generated from a linear process...
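For orientation, a minimal $\ell_1$-regression experiment with an overcomplete Gaussian design, purely illustrative of the variable-selection setting (it does not reproduce the entropy calculation), might look like:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Sparse linear regression with an overcomplete i.i.d. Gaussian design matrix.
rng = np.random.default_rng(2)
n, p, k = 200, 500, 20                # measurements, variables, true support size

A = rng.standard_normal((n, p)) / np.sqrt(n)   # random design, unit-norm columns
x_true = np.zeros(p)
support = rng.choice(p, size=k, replace=False)
x_true[support] = rng.standard_normal(k)
y = A @ x_true + 0.01 * rng.standard_normal(n)

# l1-regularized fit; the regularization strength is an illustrative choice.
fit = Lasso(alpha=0.005, max_iter=50_000).fit(A, y)
found = np.flatnonzero(np.abs(fit.coef_) > 1e-3)
hits = len(set(found) & set(support))
print(f"recovered {hits}/{k} true variables, {len(found)} selected in total")
```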


Four lectures on probabilistic methods for data science

December 20, 2016

87% Match
Roman Vershynin
Probability
Data Structures and Algorithms
Information Theory
Statistics Theory

Methods of high-dimensional probability play a central role in applications in statistics, signal processing, theoretical computer science, and related fields. These lectures present a sample of particularly useful tools of high-dimensional probability, focusing on the classical and matrix Bernstein inequalities and the uniform matrix deviation inequality. We illustrate these tools with applications to dimension reduction, network analysis, covariance estimation, matrix completion...
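For reference, the matrix Bernstein inequality at the center of these lectures can be stated as follows (a standard form, e.g. Tropp's):

```latex
% Let S_1, ..., S_N be independent d_1 x d_2 random matrices with
% E[S_i] = 0 and ||S_i|| <= L almost surely, and set
\[
  \sigma^2 = \max\left\{ \Bigl\| \sum_i \mathbb{E}\, S_i S_i^{\mathsf{T}} \Bigr\|,\;
                         \Bigl\| \sum_i \mathbb{E}\, S_i^{\mathsf{T}} S_i \Bigr\| \right\}.
\]
% Then for every t >= 0,
\[
  \Pr\left\{ \Bigl\| \sum_i S_i \Bigr\| \ge t \right\}
  \;\le\; (d_1 + d_2)\,
  \exp\!\left( -\frac{t^2/2}{\sigma^2 + Lt/3} \right).
\]
```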


High dimensional statistical inference: theoretical development to data analytics

August 19, 2019

87% Match
Deepak Nag Ayyala
Statistics Theory
Methodology

This article is due to appear in the Handbook of Statistics, Vol. 43, Elsevier/North-Holland, Amsterdam, edited by Arni S. R. Srinivasa Rao and C. R. Rao. In modern-day analytics, there is an ever-growing need to develop statistical models to study high-dimensional data. Several approaches have been developed so far, among them dimension reduction, asymptotics-driven methods, and random-projection-based methods. For high-dimensional parametric models, estimation and hypothesis testing...


Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in High Dimensions

June 16, 2020

87% Match
Hossein Taheri, Ramtin Pedarsani, Christos Thrampoulidis
Machine Learning
Information Theory
Signal Processing

Empirical Risk Minimization (ERM) algorithms are widely used in a variety of estimation and prediction tasks in signal-processing and machine learning applications. Despite their popularity, a theory that explains their statistical properties in modern regimes, where both the number of measurements and the number of unknown parameters are large, is only recently emerging. In this paper, we characterize for the first time the fundamental limits on the statistical accuracy of convex...
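A generic instance of the objective studied here, ridge-regularized ERM with a convex loss in the proportional regime, can be sketched as follows (the logistic loss and all constants are our illustrative choices, not the paper's):

```python
import numpy as np
from scipy.optimize import minimize

# Ridge-regularized ERM with logistic loss; m and n grow proportionally.
rng = np.random.default_rng(3)
m, n, lam = 400, 200, 0.1             # measurements, parameters, ridge strength

w_true = rng.standard_normal(n) / np.sqrt(n)
X = rng.standard_normal((m, n))
y = np.sign(X @ w_true + 0.3 * rng.standard_normal(m))   # noisy binary labels

def objective(w):
    # (1/m) * sum_i log(1 + exp(-y_i x_i^T w)) + lam * ||w||^2
    margins = y * (X @ w)
    return np.logaddexp(0.0, -margins).mean() + lam * w @ w

w_hat = minimize(objective, np.zeros(n), method="L-BFGS-B").x
cos = (w_hat @ w_true) / (np.linalg.norm(w_hat) * np.linalg.norm(w_true))
print(f"cosine similarity with the true parameter: {cos:.3f}")
```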


Optimal Shrinkage Estimator for High-Dimensional Mean Vector

October 28, 2016

87% Match
Taras Bodnar, Ostap Okhrin, Nestor Parolya
Statistics Theory
Statistical Finance

In this paper we derive the optimal linear shrinkage estimator for the high-dimensional mean vector using random matrix theory. The results are obtained under the assumption that both the dimension $p$ and the sample size $n$ tend to infinity in such a way that $p/n \to c\in(0,\infty)$. Under weak conditions imposed on the underlying data generating mechanism, we find the asymptotic equivalents to the optimal shrinkage intensities and estimate them consistently. The proposed ...
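A naive plug-in version of linear shrinkage for the mean vector, a simplification in the spirit of the abstract rather than the paper's exact random-matrix estimator, can be sketched as:

```python
import numpy as np

# Shrink the sample mean toward a target b with a data-driven intensity.
# The oracle intensity minimizing E||a*xbar + (1-a)*b - mu||^2 is
# ||mu - b||^2 / (||mu - b||^2 + tr(Sigma)/n); below we plug in estimates.
rng = np.random.default_rng(4)
p, n = 500, 250                        # dimension and sample size, p/n = 2
mu = 0.2 * rng.standard_normal(p)      # true mean (illustrative)
X = mu + rng.standard_normal((n, p))   # i.i.d. rows, identity covariance

xbar = X.mean(axis=0)
b = np.zeros(p)                        # shrinkage target

noise = np.trace(np.cov(X, rowvar=False)) / n          # estimates tr(Sigma)/n
signal = max(np.sum((xbar - b) ** 2) - noise, 0.0)     # estimates ||mu - b||^2
a = signal / (signal + noise)                          # plug-in intensity
mu_hat = a * xbar + (1 - a) * b

print(f"intensity a = {a:.3f}")
print(f"squared loss of sample mean: {np.sum((xbar - mu) ** 2):.2f}")
print(f"squared loss of shrinkage  : {np.sum((mu_hat - mu) ** 2):.2f}")
```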


Inference in High-dimensional Linear Regression

June 22, 2021

87% Match
Heather S. Battey, Nancy Reid
Methodology
Statistics Theory

This paper develops an approach to inference in a linear regression model when the number of potential explanatory variables is larger than the sample size. The approach treats each regression coefficient in turn as the interest parameter, the remaining coefficients being nuisance parameters, and seeks an optimal interest-respecting transformation, inducing sparsity on the relevant blocks of the notional Fisher information matrix. The induced sparsity is exploited through a m...
