August 18, 2013
In this paper, we develop new statistical theory for probabilistic principal component analysis models in high dimensions. The focus is the estimation of the noise variance, which is an important and unresolved issue when the number of variables is large compared with the sample size. We first unveil the reasons for a widely observed downward bias of the maximum likelihood estimator of the variance when the data dimension is high. We then propose a bias-corrected estimato...
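The downward bias described in this abstract is easy to reproduce in a quick simulation. A hedged sketch (the dimensions, eigenvalue strengths, and spiked-covariance setup below are illustrative choices, not the paper's model; the estimator is the classical PPCA MLE, which averages the smallest sample-covariance eigenvalues):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, q = 100, 100, 10   # dimension, sample size, number of factors
sigma2 = 1.0             # true noise variance
# Population covariance: q "signal" eigenvalues of size 11, the rest sigma2 = 1.
pop_eigs = np.concatenate([np.full(q, 11.0), np.full(p - q, sigma2)])

est = []
for _ in range(20):  # average a few replicates to smooth out noise
    X = rng.standard_normal((n, p)) * np.sqrt(pop_eigs)  # rows ~ N(0, diag(pop_eigs))
    S = X.T @ X / n                                      # sample covariance
    lam = np.linalg.eigvalsh(S)                          # eigenvalues, ascending
    est.append(lam[: p - q].mean())                      # PPCA MLE of the noise variance
sigma2_hat = float(np.mean(est))
print(sigma2_hat)  # noticeably below the true value sigma2 = 1
```

With p comparable to n, the top sample eigenvalues absorb more than their population share of the trace, so the average of the remaining ones falls below the true noise variance.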
March 30, 2021
Large-dimensional random matrix theory (RMT for short), which originates in the research field of quantum physics, has shown a tremendous capability to provide deep insights into large-dimensional systems. Given that we have entered an unprecedented era of massive data and large complex systems, RMT is expected to play increasingly important roles in the analysis and design of modern systems. In this paper, we review the key results of RMT and its application...
October 28, 2020
Statistical inference is the science of drawing conclusions about some system from data. In modern signal processing and machine learning, inference is done in very high dimension: very many unknown characteristics about the system have to be deduced from a lot of high-dimensional noisy data. This "high-dimensional regime" is reminiscent of statistical mechanics, which aims at describing the macroscopic behavior of a complex system based on the knowledge of its microscopic in...
June 6, 2024
We study sample covariance matrices arising from multi-level components of variance. Thus, let $B_n=\frac{1}{N}\sum_{j=1}^N T_{j}^{1/2}x_jx_j^TT_{j}^{1/2}$, where $x_j\in\mathbb{R}^n$ are i.i.d. standard Gaussian, and $T_{j}=\sum_{r=1}^k l_{jr}^2\Sigma_{r}$ are $n\times n$ real symmetric matrices with bounded spectral norm, corresponding to $k$ levels of variation. As the matrix dimensions $n$ and $N$ increase proportionally, we show that the linear spectral statistics (LSS) of $B_n$ ...
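The matrix $B_n$ defined in this abstract can be built directly from its formula. A minimal sketch (the level matrices $\Sigma_r$ and the weights $l_{jr}^2$ below are arbitrary illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, k = 30, 60, 2
# k fixed symmetric PSD "level" matrices Sigma_r with bounded spectral norm
A = [rng.standard_normal((n, n)) for _ in range(k)]
Sigmas = [a @ a.T / n + np.eye(n) for a in A]

B = np.zeros((n, n))
for j in range(N):
    l2 = rng.uniform(0.5, 1.5, size=k)               # weights l_{jr}^2 for sample j
    T = sum(l2[r] * Sigmas[r] for r in range(k))     # T_j = sum_r l_{jr}^2 Sigma_r
    w, V = np.linalg.eigh(T)                         # symmetric square root of T_j
    T_half = V @ np.diag(np.sqrt(w)) @ V.T
    x = rng.standard_normal(n)                       # x_j i.i.d. standard Gaussian
    v = T_half @ x
    B += np.outer(v, v)                              # T_j^{1/2} x_j x_j^T T_j^{1/2}
B /= N
# B is symmetric and positive semidefinite by construction
print(B.shape)
```

Each summand is a rank-one positive semidefinite matrix, so $B_n$ is automatically symmetric PSD; the paper's interest is in the fluctuations of its linear spectral statistics as $n, N \to \infty$ proportionally.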
February 3, 2009
In this paper, we give an explanation for the failure of two likelihood ratio procedures for testing hypotheses about covariance matrices of Gaussian populations when the dimension is large compared to the sample size. Next, using recent central limit theorems for linear spectral statistics of sample covariance matrices and of random F-matrices, we propose necessary corrections for these LR tests to cope with high-dimensional effects. The asymptotic distributions of these corrected tes...
October 30, 2023
These lecture notes were written for the course 18.657, High Dimensional Statistics, at MIT. They build on a set of notes prepared at Princeton University in 2013-14 that has been modified (and hopefully improved) over the years.
May 21, 2018
The present work provides an original framework for random matrix analysis based on revisiting concentration-of-measure theory from a probabilistic point of view. By introducing various notions of vector concentration ($q$-exponential, linear, Lipschitz, convex), we lay out a set of elementary tools that allows for the immediate extension of classical random matrix theory results to random concentrated vectors in place of vectors with independent entries. These...
August 27, 2023
These lecture notes provide an overview of existing methodologies and recent developments for estimation and inference with high-dimensional time series regression models. First, we present main limit theory results for high-dimensional dependent data which are relevant to covariance matrix structures as well as to dependent time series sequences. Second, we present main aspects of the asymptotic theory related to time series regression models with many covariates. Third, we d...
December 20, 2016
Methods of high-dimensional probability play a central role in applications to statistics, signal processing, theoretical computer science, and related fields. These lectures present a sample of particularly useful tools of high-dimensional probability, focusing on the classical and matrix Bernstein inequalities and the uniform matrix deviation inequality. We illustrate these tools with applications to dimension reduction, network analysis, covariance estimation, matrix compl...
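As a toy illustration of the covariance-estimation application mentioned in this abstract, here is a sketch of the concentration phenomenon such matrix inequalities quantify: the operator-norm error of the sample covariance shrinks as the sample size grows. The diagonal covariance and the specific sample sizes are arbitrary illustrative choices, not from the lectures:

```python
import numpy as np

rng = np.random.default_rng(2)
p = 20
Sigma = np.diag(np.linspace(1.0, 2.0, p))  # true covariance (illustrative)

def opnorm_err(n):
    """Operator-norm error of the sample covariance from n Gaussian samples."""
    X = rng.standard_normal((n, p)) * np.sqrt(np.diag(Sigma))  # rows ~ N(0, Sigma)
    S = X.T @ X / n
    return np.linalg.norm(S - Sigma, 2)  # largest singular value of the error

err_small, err_large = opnorm_err(200), opnorm_err(20_000)
print(err_small, err_large)
```

Matrix Bernstein-type bounds predict the error decays roughly like $\sqrt{p/n}$, which the two sample sizes above make visible.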
October 12, 2011
The purpose of this paper is to propose methodologies for statistical inference of low-dimensional parameters with high-dimensional data. We focus on constructing confidence intervals for individual coefficients and linear combinations of several of them in a linear regression model, although our ideas are applicable in a much broader context. The theoretical results presented here provide sufficient conditions for the asymptotic normality of the proposed estimators along with ...