Statistical Mechanics of High-Dimensional Inference

January 18, 2016


Madhu Advani, Surya Ganguli

Statistics

Condensed Matter

Mathematics

Quantitative Biology

stat.ML

cond-mat.dis-nn

cond-mat.stat-mech

math.ST

q-bio.QM

stat.TH

To model modern large-scale datasets, we need efficient algorithms to infer a set of $P$ unknown model parameters from $N$ noisy measurements. What are fundamental limits on the accuracy of parameter inference, given finite signal-to-noise ratios, limited measurements, prior information, and computational tractability requirements? How can we combine prior information with measurements to achieve these limits? Classical statistics gives incisive answers to these questions as the measurement density $\alpha = \frac{N}{P}\rightarrow \infty$. However, these classical results are not relevant to modern high-dimensional inference problems, which instead occur at finite $\alpha$. We formulate and analyze high-dimensional inference as a problem in the statistical physics of quenched disorder. Our analysis uncovers fundamental limits on the accuracy of inference in high dimensions, and reveals that widely cherished inference algorithms like maximum likelihood (ML) and maximum a posteriori (MAP) inference cannot achieve these limits. We further find optimal, computationally tractable algorithms that can achieve these limits. Intriguingly, in high dimensions, these optimal algorithms become computationally simpler than MAP and ML, while still outperforming them. For example, such optimal algorithms can lead to as much as a 20% reduction in the amount of data needed to achieve the same performance relative to MAP. Moreover, our analysis reveals simple relations between optimal high-dimensional inference and low-dimensional scalar Bayesian inference, insights into the nature of generalization and predictive power in high dimensions, information-theoretic limits on compressed sensing, phase transitions in quadratic inference, and connections to central mathematical objects in convex optimization theory and random matrix theory.
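For reference, the two estimators named in the abstract can be written in their standard textbook form. The notation here is an assumption and is not taken from the paper: $\theta \in \mathbb{R}^P$ denotes the unknown parameters, $y$ the $N$ measurements, $P(y \mid \theta)$ the likelihood, and $P(\theta)$ the prior.

```latex
\hat{\theta}_{\mathrm{ML}} = \arg\max_{\theta} \, \log P(y \mid \theta),
\qquad
\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta} \, \bigl[ \log P(y \mid \theta) + \log P(\theta) \bigr].
```

Under a flat prior the second term is constant, so MAP reduces to ML. The abstract's claim concerns the regime where $\alpha = N/P$ stays finite as both $N$ and $P$ grow.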

High-dimensional inference: a statistical mechanics perspective

October 28, 2020

92% Match

Jean Barbier

Disordered Systems and Neura...

Statistical Mechanics

Information Theory

Machine Learning


Statistical inference is the science of drawing conclusions about some system from data. In modern signal processing and machine learning, inference is done in very high dimension: very many unknown characteristics about the system have to be deduced from a lot of high-dimensional noisy data. This "high-dimensional regime" is reminiscent of statistical mechanics, which aims at describing the macroscopic behavior of a complex system based on the knowledge of its microscopic in...


Statistical Physics and Information Theory Perspectives on Linear Inverse Problems

May 15, 2017

90% Match

Junan Zhu

Information Theory

Many real-world problems in machine learning, signal processing, and communications assume that an unknown vector $x$ is measured by a matrix $A$, resulting in a vector $y=Ax+z$, where $z$ denotes the noise; we call this a single measurement vector (SMV) problem. Sometimes, multiple dependent vectors $x^{(j)}, j\in \{1,...,J\}$, are measured at the same time, forming the so-called multi-measurement vector (MMV) problem. Both SMV and MMV are linear models (LMs), and the process...
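The SMV measurement model $y = Ax + z$ described above can be simulated in a few lines. This is a minimal sketch, not the paper's setup: the i.i.d. standard Gaussian entries for $A$ and $x$, the function name `simulate_smv`, and its parameters are all assumptions made for illustration.

```python
import random


def simulate_smv(num_measurements, num_unknowns, noise_std, seed=0):
    """Draw one instance of y = A x + z.

    Assumption: A and x have i.i.d. standard Gaussian entries and z is
    i.i.d. Gaussian with standard deviation noise_std. The abstract does
    not specify these ensembles.
    """
    rng = random.Random(seed)
    # Measurement matrix A with num_measurements rows and num_unknowns columns.
    A = [[rng.gauss(0.0, 1.0) for _ in range(num_unknowns)]
         for _ in range(num_measurements)]
    # Unknown vector x and additive noise z.
    x = [rng.gauss(0.0, 1.0) for _ in range(num_unknowns)]
    z = [rng.gauss(0.0, noise_std) for _ in range(num_measurements)]
    # Each measurement is a noisy inner product of one row of A with x.
    y = [sum(a * v for a, v in zip(row, x)) + noise
         for row, noise in zip(A, z)]
    return A, x, y


if __name__ == "__main__":
    A, x, y = simulate_smv(num_measurements=4, num_unknowns=3, noise_std=0.1)
    print(len(y), "measurements of", len(x), "unknowns")
```

An MMV instance could be sketched by calling the same routine once per vector $x^{(j)}$, although that would ignore the dependence between the vectors that the abstract mentions.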


Exact results on high-dimensional linear regression via statistical physics

September 28, 2020

89% Match

Alexander Mozeika, Mansoor Sheikh, Fabian Aguirre-Lopez, ... , Anthony CC Coolen

Statistics Theory

Disordered Systems and Neura...


It is clear that conventional statistical inference protocols need to be revised to deal correctly with the high-dimensional data that are now common. Most recent studies aimed at achieving this revision rely on powerful approximation techniques, that call for rigorous results against which they can be tested. In this context, the simplest case of high-dimensional linear regression has acquired significant new relevance and attention. In this paper we use the statistical phys...


Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models

August 10, 2017

89% Match

Jean Barbier, Florent Krzakala, Nicolas Macris, ... , Lenka Zdeborová

cs.IT

cond-mat.dis-nn

cs.AI

cs.LG

math.IT

math.MP

Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmark models in neural networks. We evaluate the mutual information (or "free entropy") from which we deduce the Bayes-optimal estimation and generalization errors. Our analysis applies to the high-dimensional...


Mean-field methods and algorithmic perspectives for high-dimensional machine learning

March 10, 2021

89% Match

Benjamin Aubin

Disordered Systems and Neura...

Machine Learning

The main difficulty that arises in the analysis of most machine learning algorithms is to handle, analytically and numerically, a large number of interacting random variables. In this Ph.D. manuscript, we revisit an approach based on the tools of statistical physics of disordered systems. Developed through a rich literature, they have been precisely designed to infer the macroscopic behavior of a large number of particles from their microscopic interactions. At the heart of th...


Phase retrieval in high dimensions: Statistical and computational phase transitions

June 9, 2020

88% Match

Antoine Maillard, Bruno Loureiro, ... , Lenka Zdeborová

math.ST

cond-mat.dis-nn

cs.IT

cs.LG

math.IT

math.PR

stat.TH

We consider the phase retrieval problem of reconstructing an $n$-dimensional real or complex signal $\mathbf{X}^{\star}$ from $m$ (possibly noisy) observations $Y_\mu = | \sum_{i=1}^n \Phi_{\mu i} X^{\star}_i/\sqrt{n}|$, for a large class of correlated real and complex random sensing matrices $\mathbf{\Phi}$, in a high-dimensional setting where $m,n\to\infty$ while $\alpha = m/n=\Theta(1)$. First, we derive sharp asymptotics for the lowest possible estimation error achievable ...
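The observation model $Y_\mu = | \sum_{i=1}^n \Phi_{\mu i} X^{\star}_i/\sqrt{n}|$ can be sketched for the real, noiseless case as follows. Several simplifications are assumed here and are not from the paper: the sensing matrix has i.i.d. standard Gaussian entries (the paper treats a broader class of correlated matrices), the signal is real, and the function names `phase_retrieval_measurements` and `random_instance` are hypothetical.

```python
import math
import random


def phase_retrieval_measurements(phi, signal):
    """Return Y_mu = |sum_i phi[mu][i] * signal[i] / sqrt(n)| for each row mu."""
    n = len(signal)
    return [abs(sum(p * s for p, s in zip(row, signal)) / math.sqrt(n))
            for row in phi]


def random_instance(m, n, seed=0):
    """Draw an m-by-n sensing matrix and an n-dimensional real signal.

    Assumption: i.i.d. standard Gaussian entries, chosen only for illustration.
    """
    rng = random.Random(seed)
    phi = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    signal = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return phi, signal


if __name__ == "__main__":
    phi, signal = random_instance(m=6, n=4)
    y = phase_retrieval_measurements(phi, signal)
    y_flipped = phase_retrieval_measurements(phi, [-s for s in signal])
    # The absolute value discards the global sign, so both signals give the same data.
    print(all(math.isclose(a, b) for a, b in zip(y, y_flipped)))
```

The sign check illustrates why a real signal can be recovered from such measurements only up to a global sign: $\mathbf{X}^{\star}$ and $-\mathbf{X}^{\star}$ produce identical observations.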


The estimation error of general first order methods

February 28, 2020

88% Match

Michael Celentano, Andrea Montanari, Yuchen Wu

Machine Learning

Statistics Theory

Modern large-scale statistical models require estimating thousands to millions of parameters. This is often accomplished by iterative algorithms such as gradient descent, projected gradient descent or their accelerated versions. What are the fundamental limits to these approaches? This question is well understood from an optimization viewpoint when the underlying objective is convex. Work in this area characterizes the gap to global optimality as a function of the number of ...


High-dimensional Learning with Noisy Labels

May 23, 2024

88% Match

Aymane El Firdoussi, Mohamed El Amine Seddik

Machine Learning

Artificial Intelligence


This paper provides theoretical insights into high-dimensional binary classification with class-conditional noisy labels. Specifically, we study the behavior of a linear classifier with a label noisiness aware loss function, when both the dimension of data $p$ and the sample size $n$ are large and comparable. Relying on random matrix theory by supposing a Gaussian mixture data model, the performance of the linear classifier when $p,n\to \infty$ is shown to converge towards a ...


Statistical physics of inference: Thresholds and algorithms

November 8, 2015

88% Match

Lenka Zdeborová, Florent Krzakala

Statistical Mechanics

Data Structures and Algorith...

Machine Learning

Many questions of fundamental interest in today's science can be formulated as inference problems: Some partial, or noisy, observations are performed over a set of variables and the goal is to recover, or infer, the values of the variables based on the indirect information contained in the measurements. For such problems, the central scientific questions are: Under what conditions is the information contained in the measurements sufficient for a satisfactory inference to be po...


Understanding Phase Transitions via Mutual Information and MMSE

July 3, 2019

88% Match

Galen Reeves, Henry Pfister

Information Theory

Statistics Theory

The ability to understand and solve high-dimensional inference problems is essential for modern data science. This article examines high-dimensional inference problems through the lens of information theory and focuses on the standard linear model as a canonical example that is both rich enough to be practically useful and simple enough to be studied rigorously. In particular, this model can exhibit phase transitions where an arbitrarily small change in the model parameters c...
