Similar papers 2
March 10, 2021
The main difficulty that arises in the analysis of most machine learning algorithms is to handle, analytically and numerically, a large number of interacting random variables. In this Ph.D. manuscript, we revisit an approach based on the tools of the statistical physics of disordered systems. Developed through a rich literature, these tools were designed precisely to infer the macroscopic behavior of a large number of particles from their microscopic interactions. At the heart of th...
May 15, 2017
Many real-world problems in machine learning, signal processing, and communications assume that an unknown vector $x$ is measured by a matrix $A$, resulting in a vector $y=Ax+z$, where $z$ denotes the noise; we call this a single measurement vector (SMV) problem. Sometimes, multiple dependent vectors $x^{(j)}, j\in \{1,\dots,J\}$, are measured at the same time, forming the so-called multi-measurement vector (MMV) problem. Both SMV and MMV are linear models (LMs), and the process...
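A minimal sketch, assuming Gaussian signals and noise, of the SMV and MMV measurement models just described; the dimensions, noise level, and shared-component construction for the dependent vectors are illustrative choices, not the paper's:

```python
# Sketch of the SMV and MMV linear measurement models (illustrative sizes).
import numpy as np

rng = np.random.default_rng(0)
n, m, J = 50, 100, 4                   # signal dim, measurements, number of vectors

A = rng.standard_normal((m, n))        # shared measurement matrix

# Single measurement vector (SMV): y = A x + z
x = rng.standard_normal(n)
z = 0.1 * rng.standard_normal(m)
y = A @ x + z

# Multi-measurement vector (MMV): J dependent signals measured jointly.
# Here the x^{(j)} share a common component to mimic dependence (an assumption).
common = rng.standard_normal(n)
X = np.stack([common + 0.3 * rng.standard_normal(n) for _ in range(J)], axis=1)
Y = A @ X + 0.1 * rng.standard_normal((m, J))
print(y.shape, Y.shape)                # (100,) (100, 4)
```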
July 8, 2016
We consider the estimation of a signal from the knowledge of its noisy linear random Gaussian projections, a problem relevant to compressed sensing, sparse superposition codes, and code division multiple access, to cite a few. There have been a number of works considering the mutual information for this problem using the heuristic replica method from statistical physics. Here we put these considerations on a firm rigorous basis. First, we show, using a Guerra-type interpolatio...
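For the special case of a Gaussian prior, the mutual information of this channel has the closed form $I(x;y) = \tfrac{1}{2}\log\det(I + AA^T/\sigma^2)$, which makes a simple numerical companion to the abstract; the sizes and noise level below are our assumptions:

```python
# Exact mutual information of y = A x + z for a Gaussian prior x ~ N(0, I).
# This is a sanity check for the per-component limit, not the replica proof.
import numpy as np

rng = np.random.default_rng(1)
N, M, sigma = 200, 100, 0.5
A = rng.standard_normal((M, N)) / np.sqrt(N)   # random Gaussian projections

_, logdet = np.linalg.slogdet(np.eye(M) + A @ A.T / sigma**2)
print("mutual information per signal component (nats):", 0.5 * logdet / N)
```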
Nearly all statistical inference methods were developed for the regime where the number $N$ of data samples is much larger than the data dimension $p$. Inference protocols such as maximum likelihood (ML) or maximum a posteriori probability (MAP) are unreliable if $p=O(N)$, due to overfitting. For many disciplines with increasingly high-dimensional data, this limitation has become a serious bottleneck. We recently showed that in Cox regression for time-to-event data the overfit...
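A quick numerical illustration of the overfitting regime described above, using ordinary least squares (the ML estimator under Gaussian noise) with $p$ comparable to $N$; all sizes are illustrative:

```python
# Overfitting with p = O(N): train error collapses while test error stays large.
import numpy as np

rng = np.random.default_rng(2)
N, p = 100, 90                          # p comparable to N
beta = rng.standard_normal(p) / np.sqrt(p)

X_train = rng.standard_normal((N, p))
y_train = X_train @ beta + rng.standard_normal(N)
X_test = rng.standard_normal((N, p))
y_test = X_test @ beta + rng.standard_normal(N)

beta_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)  # ML estimate
print("train MSE:", np.mean((X_train @ beta_hat - y_train) ** 2))  # near 0
print("test  MSE:", np.mean((X_test @ beta_hat - y_test) ** 2))    # much larger
```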
December 2, 2020
We analyze phase transitions in the conditional entropy of a sequence caused by a change in the conditioning variables. Such transitions happen, for example, when learning the parameters of a system during training, since the transition from the training phase to the data phase causes a discontinuous jump in the conditional entropy of the measured system response. For large-scale systems, we present a method of computing a bound on the mutual information obtained with one-shot traini...
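A toy worked example (our construction, not the paper's system) of such a jump: with response $Y = X \oplus \theta$ and an unknown binary parameter $\theta$, the conditional entropy drops discontinuously by one bit once a single training pair reveals $\theta$:

```python
# One-bit jump in conditional entropy between the training and data phases.
# Before training, theta is uniform and H(Y|X) = 1 bit; one training pair
# reveals theta, so H(Y|X, theta) = 0 bits. The jump is the information gained.
import numpy as np

def entropy_bits(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Data phase, theta unknown: given X = x, Y = X XOR theta is uniform on {0, 1}.
H_unknown = entropy_bits([0.5, 0.5])   # 1.0 bit

# After training, theta known: Y is a deterministic function of X.
H_known = entropy_bits([1.0])          # 0.0 bits

print("jump in conditional entropy (bits):", H_unknown - H_known)
```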
August 29, 2024
Variable selection is a problem in statistics that aims to find the subset of the $N$ possible explanatory variables that is truly related to the process generating the response variable. In high-dimensional setups, where the input dimension $N$ is comparable to the data size $M$, it is difficult to use classic methods based on $p$-values. Therefore, methods based on ensemble learning are often used. In this review article, we describe how the performance...
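A minimal sketch of one ensemble-style selector of the flavor such reviews discuss (a stability-selection-like procedure; the lasso penalty, subsample size, and threshold below are our illustrative choices, not the article's):

```python
# Ensemble variable selection: fit the lasso on many random half-samples
# and keep the variables selected in a large fraction of the fits.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
M, N, k = 120, 200, 5                  # data size, input dimension, true support
beta = np.zeros(N); beta[:k] = 3.0
X = rng.standard_normal((M, N))
y = X @ beta + rng.standard_normal(M)

B, freq = 100, np.zeros(N)
for _ in range(B):
    idx = rng.choice(M, size=M // 2, replace=False)       # random half-sample
    coef = Lasso(alpha=0.3).fit(X[idx], y[idx]).coef_
    freq += (np.abs(coef) > 1e-8)
freq /= B

print("selected variables:", np.where(freq > 0.8)[0])     # ideally 0..4
```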
May 29, 2018
An algorithmic limit of compressed sensing and related variable-selection problems is analytically evaluated when the design matrix is given by an overcomplete random matrix. The replica method from statistical mechanics is employed to derive the result. The analysis is conducted through evaluation of the entropy, the exponential growth rate of the number of combinations of variables that yield a specific fit error on given data, which is assumed to be generated from a linear proces...
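A small brute-force companion (our construction, feasible only at toy sizes) to the entropy just described: enumerate every $k$-column submodel of an overcomplete random design, record its least-squares fit error, and read off the log-count of subsets at each error level:

```python
# Finite-size analogue of the subset entropy: log-count of k-column submodels
# of an overcomplete design achieving a given fit error.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)
M, N, k = 8, 12, 3                         # overcomplete: N > M
A = rng.standard_normal((M, N))
x0 = np.zeros(N); x0[:k] = 1.0
y = A @ x0 + 0.05 * rng.standard_normal(M)

errors = []
for S in combinations(range(N), k):        # all k-column submodels
    cols = list(S)
    coef, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
    errors.append(np.mean((A[:, cols] @ coef - y) ** 2))
errors = np.array(errors)

counts, edges = np.histogram(errors, bins=10)
for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
    if c:
        print(f"fit error in [{lo:.3f}, {hi:.3f}): {c} subsets, log-count {np.log(c):.2f}")
```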
November 23, 2022
As one of the central tasks in machine learning, regression finds many applications across different fields. A common existing practice for solving regression problems is the mean square error (MSE) minimization approach or its regularized variants, which require prior knowledge about the models. Recently, Yi et al. proposed a mutual-information-based supervised learning framework in which they introduced a label entropy regularization that does not require any prior knowledge...
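A minimal sketch of the common practice the abstract contrasts against, plain MSE minimization by gradient descent for a linear model; this is the baseline, not the mutual-information framework of Yi et al., and all sizes and learning-rate choices are illustrative:

```python
# Baseline regression practice: minimize the MSE by gradient descent.
import numpy as np

rng = np.random.default_rng(5)
n, d = 200, 10
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_true + 0.1 * rng.standard_normal(n)

w, lr = np.zeros(d), 0.05
for _ in range(500):
    grad = 2.0 * X.T @ (X @ w - y) / n      # gradient of the mean square error
    w -= lr * grad
print("final MSE:", np.mean((X @ w - y) ** 2))
```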
May 8, 2017
In recent years important progress has been achieved towards proving the validity of the replica predictions for the (asymptotic) mutual information (or "free energy") in Bayesian inference problems. The proof techniques that have emerged appear to be quite general, even though they have been worked out on a case-by-case basis. Unfortunately, a common feature of all these schemes is their relatively high level of technicality. We present a new proof scheme that is quite straig...
June 22, 2021
This paper develops an approach to inference in a linear regression model when the number of potential explanatory variables is larger than the sample size. The approach treats each regression coefficient in turn as the interest parameter, the remaining coefficients being nuisance parameters, and seeks an optimal interest-respecting transformation, inducing sparsity on the relevant blocks of the notional Fisher information matrix. The induced sparsity is exploited through a m...
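A loose sketch in the spirit of treating one coefficient as the interest parameter with the rest as nuisances, here via lasso partialling out (a double-selection-style device, not the authors' exact construction; the penalty and problem sizes are assumptions):

```python
# For one interest coefficient, remove the nuisance contribution from both
# the response and the interest variable with lasso fits, then refit in 1-D.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
n, p, j = 100, 150, 0                  # sample size smaller than variable count
beta = np.zeros(p); beta[0], beta[1] = 1.5, -1.0
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)

others = np.delete(np.arange(p), j)
# Partial out the nuisance variables from y and from x_j.
r_y = y - Lasso(alpha=0.1).fit(X[:, others], y).predict(X[:, others])
r_x = X[:, j] - Lasso(alpha=0.1).fit(X[:, others], X[:, j]).predict(X[:, others])

beta_j_hat = (r_x @ r_y) / (r_x @ r_x)
print("estimate of the interest coefficient:", beta_j_hat)  # roughly 1.5
```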