ID: math/0611589

High Dimensional Statistical Inference and Random Matrices

November 19, 2006

Iain M. Johnstone
Mathematics
Statistics
Statistics Theory
Probability

Multivariate statistical analysis is concerned with observations on several variables which are thought to possess some degree of inter-dependence. Driven by problems in genetics and the social sciences, it first flowered in the earlier half of the last century. Subsequently, random matrix theory (RMT) developed, initially within physics, and more recently widely in mathematics. While some of the central objects of study in RMT are identical to those of multivariate statistics, statistical theory was slow to exploit the connection. However, with vast data collection ever more common, data sets now often have as many variables as, or more than, the number of individuals observed. In such contexts, the techniques and results of RMT have much to offer multivariate statistics. The paper reviews some of the progress to date.
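A central object linking the two fields is the largest eigenvalue of a white Wishart matrix, whose Tracy-Widom fluctuation limit with the centering and scaling of Johnstone (2001) is among the results reviewed. The following is a minimal simulation sketch of that limit, with illustrative sample sizes; it is not code from the paper.

```python
# Minimal sketch (not from the paper): largest eigenvalue of a white Wishart
# matrix, standardized with Johnstone's (2001) centering and scaling constants.
# Sample sizes and repetition count are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 200, 100, 500

def largest_eigenvalue(n, p, rng):
    """Largest eigenvalue of X'X for an n x p matrix of iid N(0,1) entries."""
    X = rng.standard_normal((n, p))
    return np.linalg.eigvalsh(X.T @ X)[-1]   # eigvalsh returns ascending order

# Centering and scaling for the null (identity covariance) case.
mu = (np.sqrt(n - 1) + np.sqrt(p)) ** 2
sigma = (np.sqrt(n - 1) + np.sqrt(p)) * (1 / np.sqrt(n - 1) + 1 / np.sqrt(p)) ** (1 / 3)

standardized = np.array(
    [(largest_eigenvalue(n, p, rng) - mu) / sigma for _ in range(reps)]
)

# Under the Tracy-Widom (beta = 1) limit the standardized values have a
# unit-order spread; the empirical quantiles below give a rough check.
print("empirical quantiles:", np.quantile(standardized, [0.1, 0.5, 0.9]).round(2))
```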

Similar papers

High dimensional statistical inference: theoretical development to data analytics

August 19, 2019

90% Match
Deepak Nag Ayyala
Statistics Theory
Methodology

This article is due to appear in the Handbook of Statistics, Vol. 43, Elsevier/North-Holland, Amsterdam, edited by Arni S. R. Srinivasa Rao and C. R. Rao. In modern day analytics, there is an ever-growing need to develop statistical models to study high dimensional data. Several approaches have been developed so far, including dimension reduction, asymptotics-driven methods and random projection based methods. For high dimensional parametric models, estimation and hypothesis tes...


Tests for High-Dimensional Covariance Matrices Using Random Matrix Projection

November 5, 2015

90% Match
Tung-Lung Wu, Ping Li
Methodology

The classic likelihood ratio test for testing the equality of two covariance matrices breaks down due to the singularity of the sample covariance matrices when the data dimension $p$ is larger than the sample size $n$. In this paper, we present a conceptually simple method using random projection to project the data onto a one-dimensional random subspace so that the conventional methods can be applied. Both one-sample and two-sample tests for high-dimensional covariance matr...
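As a rough illustration of the projection idea in this abstract, the sketch below projects two high-dimensional samples onto a single random direction and compares the projected variances with a classical F-test. The direction, the choice of test, and the sample sizes are illustrative assumptions, not the authors' exact procedure.

```python
# Minimal sketch of projecting high-dimensional data onto a one-dimensional
# random subspace and applying a conventional test to the projected values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
p, n1, n2 = 200, 50, 60                      # dimension exceeds both sample sizes

X = rng.standard_normal((n1, p))             # sample 1: identity covariance
Y = rng.standard_normal((n2, p)) * 1.5       # sample 2: inflated covariance

u = rng.standard_normal(p)
u /= np.linalg.norm(u)                       # random unit projection direction

x_proj, y_proj = X @ u, Y @ u                # one-dimensional projected data

# Classical two-sided F-test for equality of the projected variances.
F = np.var(x_proj, ddof=1) / np.var(y_proj, ddof=1)
pval = 2 * min(stats.f.cdf(F, n1 - 1, n2 - 1), stats.f.sf(F, n1 - 1, n2 - 1))
print(f"F = {F:.3f}, two-sided p-value = {pval:.4f}")
```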


Simultaneous test of the mean vectors and covariance matrices for high-dimensional data using RMT

March 9, 2024

90% Match
Zhenzhen Niu, Jianghao Li, ... , Bai Zhidong
Applications

In this paper, we propose a new modified likelihood ratio test (LRT) for simultaneously testing mean vectors and covariance matrices of two-sample populations in high-dimensional settings. By employing tools from Random Matrix Theory (RMT), we derive the limiting null distribution of the modified LRT for generally distributed populations. Furthermore, we compare the proposed test with existing tests using simulation results, demonstrating that the modified LRT exhibits favora...


Estimation of the Covariance Matrix of Large Dimensional Data

January 23, 2012

89% Match
Jianfeng Yao (LTCI), Abla Kammoun (LTCI), Jamal Najim (LTCI)
Information Theory

This paper deals with the problem of estimating the covariance matrix of a series of independent multivariate observations, in the case where the dimension of each observation is of the same order as the number of observations. Although such a regime is of interest for many current statistical signal processing and wireless communication issues, traditional methods fail to produce consistent estimators and only recently results relying on large random matrix theory have been ...
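A minimal sketch of the phenomenon motivating this line of work, assuming an identity population covariance and an illustrative ratio p/n = 0.5: the sample eigenvalues spread over the Marchenko-Pastur support instead of concentrating at the true value, which is why traditional spectrum-based estimators fail in this regime.

```python
# Minimal sketch (not from the paper): eigenvalue dispersion of the sample
# covariance matrix when the dimension is comparable to the sample size.
import numpy as np

rng = np.random.default_rng(2)
n, p = 400, 200
X = rng.standard_normal((n, p))            # true covariance is the identity
S = X.T @ X / n                            # sample covariance matrix
eigs = np.linalg.eigvalsh(S)

# All population eigenvalues equal 1, yet the sample eigenvalues roughly fill
# the Marchenko-Pastur support [(1 - sqrt(p/n))^2, (1 + sqrt(p/n))^2].
c = p / n
print("sample eigenvalue range:", eigs.min().round(2), "to", eigs.max().round(2))
print("Marchenko-Pastur support:",
      round((1 - np.sqrt(c)) ** 2, 2), "to", round((1 + np.sqrt(c)) ** 2, 2))
```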


High dimensionality: The latest challenge to data analysis

February 12, 2019

89% Match
A. M. Pires, J. A. Branco
Methodology

The advent of modern technology, permitting the measurement of thousands of characteristics simultaneously, has given rise to floods of data in the form of many large or even huge datasets. This new paradigm presents extraordinary challenges to data analysis, and the question arises: how can conventional data analysis methods, devised for moderate or small datasets, cope with the complexities of modern data? The case of high dimensional data is particularly revealing of some...


Regularization in High-Dimensional Regression and Classification via Random Matrix Theory

March 30, 2020

89% Match
Panagiotis Lolas
Statistics Theory

We study general singular value shrinkage estimators in high-dimensional regression and classification, when the number of features and the sample size both grow proportionally to infinity. We allow models with general covariance matrices that include a large class of data generating distributions. As far as the implications of our results are concerned, we find exact asymptotic formulas for both the training and test errors in regression models fitted by gradient descent, wh...
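As a rough illustration of singular value shrinkage in this proportional-growth regime, the sketch below soft-thresholds the singular values of a noisy low-rank matrix. The shrinkage rule and threshold are generic illustrative choices, not the estimators derived in the paper.

```python
# Generic singular value shrinkage sketch: soft-threshold the singular values
# of a noisy matrix and reconstruct. The threshold is an illustrative choice
# near the operator norm of the noise, roughly sqrt(n) + sqrt(p) for unit-variance entries.
import numpy as np

def shrink_singular_values(A: np.ndarray, tau: float) -> np.ndarray:
    """Reconstruct A after soft-thresholding its singular values at level tau."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt

rng = np.random.default_rng(3)
signal = np.outer(rng.standard_normal(100), rng.standard_normal(50)) * 3.0  # rank-one signal
noisy = signal + rng.standard_normal((100, 50))                             # additive Gaussian noise
denoised = shrink_singular_values(noisy, tau=np.sqrt(100) + np.sqrt(50))

print("relative error before:", round(np.linalg.norm(noisy - signal) / np.linalg.norm(signal), 3))
print("relative error after: ", round(np.linalg.norm(denoised - signal) / np.linalg.norm(signal), 3))
```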

Alexander Mozeika, Mansoor Sheikh, Fabian Aguirre-Lopez, ... , Coolen Anthony CC
Statistics Theory
Disordered Systems and Neural Networks

It is clear that conventional statistical inference protocols need to be revised to deal correctly with the high-dimensional data that are now common. Most recent studies aimed at achieving this revision rely on powerful approximation techniques that call for rigorous results against which they can be tested. In this context, the simplest case of high-dimensional linear regression has acquired significant new relevance and attention. In this paper we use the statistical phys...

Testing independence with high-dimensional correlated samples

March 26, 2017

89% Match
Xi Chen, Weidong Liu
Statistics Theory

Testing independence among a number of (ultra) high-dimensional random samples is a fundamental and challenging problem. By arranging $n$ identically distributed $p$-dimensional random vectors into a $p \times n$ data matrix, we investigate the problem of testing independence among columns under the matrix-variate normal modeling of data. We propose a computationally simple and tuning-free test statistic, characterize its limiting null distribution, analyze the statistical po...


A unified framework for correlation mining in ultra-high dimension

January 12, 2021

89% Match
Yun Wei, Bala Rajaratnam, Alfred O. Hero
Statistics Theory

Many applications benefit from theory relevant to the identification of variables having large correlations or partial correlations in high dimension. Recently there has been progress in the ultra-high dimensional setting when the sample size $n$ is fixed and the dimension $p$ tends to infinity. Despite these advances, the correlation screening framework suffers from practical, methodological and theoretical deficiencies. For instance, previous correlation screening theory re...


Signal Processing in Large Systems: a New Paradigm

April 30, 2011

89% Match
Romain Couillet, Merouane Debbah
Information Theory

For a long time, detection and parameter estimation methods for signal processing have relied on asymptotic statistics as the number $n$ of observations of a population grows large relative to the population size $N$, i.e. $n/N\to \infty$. Modern technological and societal advances now demand the study of sometimes extremely large populations and simultaneously require fast signal processing due to accelerated system dynamics. This results in not-so-large practical ratio...
