October 28, 2020
Statistical inference is the science of drawing conclusions about some system from data. In modern signal processing and machine learning, inference is done in very high dimension: very many unknown characteristics about the system have to be deduced from a lot of high-dimensional noisy data. This "high-dimensional regime" is reminiscent of statistical mechanics, which aims at describing the macroscopic behavior of a complex system based on the knowledge of its microscopic interactions. It is by now clear that there are many connections between inference and statistical physics. This article aims at emphasizing some of the deep links connecting these apparently separated disciplines through the description of paradigmatic models of high-dimensional inference in the language of statistical mechanics. This article has been published in the issue on artificial intelligence of Ithaca, an Italian popularization-of-science journal. The selected topics and references are highly biased and not intended to be exhaustive in any ways. Its purpose is to serve as introduction to statistical mechanics of inference through a very specific angle that corresponds to my own tastes and limited knowledge.
Similar papers 1
January 18, 2016
To model modern large-scale datasets, we need efficient algorithms to infer a set of $P$ unknown model parameters from $N$ noisy measurements. What are fundamental limits on the accuracy of parameter inference, given finite signal-to-noise ratios, limited measurements, prior information, and computational tractability requirements? How can we combine prior information with measurements to achieve these limits? Classical statistics gives incisive answers to these questions as ...
It is clear that conventional statistical inference protocols need to be revised to deal correctly with the high-dimensional data that are now common. Most recent studies aimed at achieving this revision rely on powerful approximation techniques, that call for rigorous results against which they can be tested. In this context, the simplest case of high-dimensional linear regression has acquired significant new relevance and attention. In this paper we use the statistical phys...
September 6, 2002
Using a variational technique, we generalize the statistical physics approach of learning from random examples to make it applicable to real data. We demonstrate the validity and relevance of our method by computing approximate estimators for generalization errors that are based on training data alone.
May 15, 2017
Many real-world problems in machine learning, signal processing, and communications assume that an unknown vector $x$ is measured by a matrix A, resulting in a vector $y=Ax+z$, where $z$ denotes the noise; we call this a single measurement vector (SMV) problem. Sometimes, multiple dependent vectors $x^{(j)}, j\in \{1,...,J\}$, are measured at the same time, forming the so-called multi-measurement vector (MMV) problem. Both SMV and MMV are linear models (LM's), and the process...
June 5, 2017
We expand upon a natural analogy between Bayesian statistics and statistical physics in which sample size corresponds to inverse temperature. This analogy motivates the definition of two novel statistical quantities: a learning capacity and a Gibbs entropy. The analysis of the learning capacity, corresponding to the heat capacity in thermal physics, leads to new insight into the mechanism of learning and explains why some models have anomalously-high learning performance. We ...
July 20, 2022
This work in progress aims to provide a unified introduction to statistical learning, building up slowly from classical models like the GMM and HMM to modern neural networks like the VAE and diffusion models. There are today many internet resources that explain this or that new machine-learning algorithm in isolation, but they do not (and cannot, in so brief a space) connect these algorithms with each other or with the classical literature on statistical models, out of which ...
January 30, 2013
Recent experimental advances in neuroscience have opened new vistas into the immense complexity of neuronal networks. This proliferation of data challenges us on two parallel fronts. First, how can we form adequate theoretical frameworks for understanding how dynamical network processes cooperate across widely disparate spatiotemporal scales to solve important computational problems? And second, how can we extract meaningful models of neuronal systems from high dimensional da...
August 7, 2023
In these lecture notes we present different methods and concepts developed in statistical physics to analyze gradient descent dynamics in high-dimensional non-convex landscapes. Our aim is to show how approaches developed in physics, mainly statistical physics of disordered systems, can be used to tackle open questions on high-dimensional dynamics in Machine Learning.
February 11, 2022
The recent progresses in Machine Learning opened the door to actual applications of learning algorithms but also to new research directions both in the field of Machine Learning directly and, at the edges with other disciplines. The case that interests us is the interface with physics, and more specifically Statistical Physics. In this short lecture, I will try to present first a brief introduction to Machine Learning from the angle of neural networks. After explaining quickl...
December 1, 2021
Inferring the value of a property of a large stochastic system is a difficult task when the number of samples is insufficient to reliably estimate the probability distribution. The Bayesian estimator of the property of interest requires the knowledge of the prior distribution, and in many situations, it is not clear which prior should be used. Several estimators have been developed so far, in which the proposed prior was individually tailored for each property of interest; su...