ID: math/0611589

High Dimensional Statistical Inference and Random Matrices

November 19, 2006


Similar papers

Optimal Shrinkage Estimator for High-Dimensional Mean Vector

October 28, 2016

88% Match
Taras Bodnar, Ostap Okhrin, Nestor Parolya
Statistics Theory
Statistical Finance

In this paper we derive the optimal linear shrinkage estimator for the high-dimensional mean vector using random matrix theory. The results are obtained under the assumption that both the dimension $p$ and the sample size $n$ tend to infinity in such a way that $p/n \to c\in(0,\infty)$. Under weak conditions imposed on the underlying data generating mechanism, we find the asymptotic equivalents to the optimal shrinkage intensities and estimate them consistently. The proposed ...
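The linear-shrinkage idea described in this abstract can be sketched in a few lines. The following is a generic oracle-style shrinkage of the sample mean toward a fixed target, with a plug-in intensity estimate using tr(Σ)/n as the noise level of the sample mean; it illustrates the technique only, and the function name and intensity formula are my own, not the authors' estimator.

```python
import numpy as np

def shrinkage_mean(X, target=None):
    """Linear shrinkage of the sample mean toward a target vector.

    The intensity alpha is a plug-in estimate of the MSE-optimal weight,
    using tr(Sigma)/n as the noise level of the sample mean
    (a generic recipe, not the paper's exact estimator).
    """
    n, p = X.shape
    xbar = X.mean(axis=0)
    if target is None:
        target = np.zeros(p)
    noise = np.trace(np.cov(X, rowvar=False)) / n   # estimate of E||xbar - mu||^2
    gap = np.sum((xbar - target) ** 2)
    alpha = max(0.0, 1.0 - noise / gap)             # shrinkage intensity in [0, 1)
    return target + alpha * (xbar - target)
```

When $p$ is comparable to $n$, the noise term tr(Σ)/n is non-negligible, so the estimated intensity pulls the estimate noticeably toward the target.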


Detecting Change Points of Covariance Matrices in High Dimensions

September 23, 2024

88% Match
Nina Dörnemann, Holger Dette
Statistics Theory
Probability

Testing for change points in sequences of high-dimensional covariance matrices is an important and equally challenging problem in statistical methodology with applications in various fields. Motivated by the observation that even in cases where the ratio between dimension and sample size is as small as $0.05$, tests based on a fixed-dimension asymptotics do not keep their preassigned level, we propose to derive critical values of test statistics using an asymptotic regime whe...
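The change-point problem described here can be illustrated with a naive scan statistic: maximize, over candidate split points, the Frobenius distance between the sample covariances before and after the split. This is a bare-bones illustration of the problem, not the paper's calibrated test, and the function name and minimum-segment parameter are my own.

```python
import numpy as np

def cov_change_stat(X, min_seg=20):
    """Naive scan for a single change point in the covariance: the split
    maximizing the Frobenius distance between the sample covariances of
    the two segments (an illustration only; the paper derives calibrated
    critical values under p/n asymptotics)."""
    n = len(X)
    best_t, best_val = None, -np.inf
    for t in range(min_seg, n - min_seg):
        d = np.linalg.norm(np.cov(X[:t], rowvar=False) - np.cov(X[t:], rowvar=False))
        if d > best_val:
            best_t, best_val = t, d
    return best_t, best_val
```

The hard part, which the paper addresses, is calibrating the null distribution of such a maximum when the dimension grows with the sample size.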


Improved Gaussian Mean Matrix Estimators In High-Dimensional Data

November 24, 2023

88% Match
Arash A. Foroushani, Severien Nkurunziza
Statistics Theory

In this paper, we introduce a class of improved estimators for the mean parameter matrix of a multivariate normal distribution with an unknown variance-covariance matrix. In particular, the main results of [D. Chételat and M. T. Wells (2012). Improved Multivariate Normal Mean Estimation with Unknown Covariance when $p$ is Greater than $n$. The Annals of Statistics, Vol. 40, No. 6, 3137--3160] are established in full generality, and we provide the corrected version of th...


Test of Independence for High-dimensional Random Vectors Based on Block Correlation Matrices

October 19, 2014

87% Match
Zhigang Bao, Jiang Hu, ... , Zhou Wang
Statistics Theory

In this paper, we are concerned with the independence test for $k$ high-dimensional sub-vectors of a normal vector, with fixed positive integer $k$. A natural high-dimensional extension of the classical sample correlation matrix, namely block correlation matrix, is raised for this purpose. We then construct the so-called Schott type statistic as our test statistic, which turns out to be a particular linear spectral statistic of the block correlation matrix. Interestingly, the...
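The idea of a Schott-type statistic built from cross-block correlations can be sketched as follows. This computes the sum of squared sample correlations across blocks, a simplified variant of the Schott-type construction; it is an illustration and not necessarily the exact statistic analyzed in the paper.

```python
import numpy as np

def schott_type_statistic(X, block_sizes):
    """Sum of squared cross-block sample correlations: a simplified
    Schott-type statistic for testing independence of k sub-vectors
    (an illustrative variant, not necessarily the paper's exact form)."""
    R = np.corrcoef(X, rowvar=False)
    edges = np.cumsum([0] + list(block_sizes))
    stat = 0.0
    for a in range(len(block_sizes)):
        for b in range(a + 1, len(block_sizes)):
            blk = R[edges[a]:edges[a + 1], edges[b]:edges[b + 1]]
            stat += np.sum(blk ** 2)
    return stat
```

Under independence the cross-block correlations are all small (each squared entry is of order $1/n$), so the statistic concentrates near a computable null value; dependence between blocks inflates it.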


Foundational principles for large scale inference: Illustrations through correlation mining

May 11, 2015

87% Match
Alfred O. Hero, Bala Rajaratnam
Statistics Theory
Machine Learning

When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number $n$ of acquired samples (statistical replicates) is far fewer than the number $p$ of obser...


Statistical applications of random matrix theory: comparison of two populations I

February 28, 2020

87% Match
Rémy Mariétan, Stephan Morgenthaler
Statistics Theory
Methodology

This paper investigates a statistical procedure for testing the equality of two independent estimated covariance matrices when the number of potentially dependent data vectors is large and proportional to the size of the vectors, that is, the number of variables. Inspired by the spike models used in random matrix theory, we concentrate on the largest eigenvalues of the matrices in order to determine significance. To avoid false rejections we must guard against residual spikes...
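The spiked-model viewpoint mentioned in this abstract can be illustrated with a rough single-sample spike detector: flag sample-covariance eigenvalues that exceed the Marchenko–Pastur bulk edge $\sigma^2(1+\sqrt{p/n})^2$. This is a bare-bones sketch under an assumed known noise variance; the paper's two-sample procedure is considerably more careful, in particular about residual spikes.

```python
import numpy as np

def top_spikes(X, var=1.0):
    """Eigenvalues of the sample covariance exceeding the Marchenko-Pastur
    bulk edge var * (1 + sqrt(p/n))^2 -- a rough spike detector in the
    spirit of spiked models (noise variance assumed known here)."""
    n, p = X.shape
    edge = var * (1.0 + np.sqrt(p / n)) ** 2
    evals = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return evals[evals > edge]
```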


Statistical Mechanics of High-Dimensional Inference

January 18, 2016

87% Match
Madhu Advani, Surya Ganguli
Machine Learning
Disordered Systems and Neural Networks
Statistical Mechanics
Statistics Theory
Quantitative Methods

To model modern large-scale datasets, we need efficient algorithms to infer a set of $P$ unknown model parameters from $N$ noisy measurements. What are fundamental limits on the accuracy of parameter inference, given finite signal-to-noise ratios, limited measurements, prior information, and computational tractability requirements? How can we combine prior information with measurements to achieve these limits? Classical statistics gives incisive answers to these questions as ...


Invariance-based Inference in High-Dimensional Regression with Finite-Sample Guarantees

December 22, 2023

87% Match
Wenxuan Guo, Panos Toulis
Methodology
Statistics Theory

In this paper, we develop invariance-based procedures for testing and inference in high-dimensional regression models. These procedures, also known as randomization tests, provide several important advantages. First, for the global null hypothesis of significance, our test is valid in finite samples. It is also simple to implement and comes with finite-sample guarantees on statistical power. Remarkably, despite its simplicity, this testing idea has escaped the attention of ea...
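The invariance idea behind such randomization tests can be sketched with a generic sign-flip test of the global null $\beta = 0$: if the errors are symmetric about zero, then under the null, $y$ and a sign-flipped copy of $y$ have the same distribution, giving an exact finite-sample p-value. This is an illustration of the invariance principle, not the paper's construction; the statistic and function name are my own.

```python
import numpy as np

def signflip_pvalue(X, y, n_perm=500, seed=0):
    """Finite-sample randomization p-value for the global null beta = 0
    in y = X beta + eps, assuming eps is symmetric about zero: under H0,
    flipping the signs of y leaves its distribution unchanged.
    (A generic sign-flip test illustrating the invariance idea.)"""
    rng = np.random.default_rng(seed)
    stat = lambda v: np.linalg.norm(X.T @ v)        # simple alignment statistic
    t_obs = stat(y)
    count = sum(
        stat(rng.choice([-1.0, 1.0], size=len(y)) * y) >= t_obs
        for _ in range(n_perm)
    )
    return (1 + count) / (1 + n_perm)               # valid finite-sample p-value
```

The validity argument needs no asymptotics in $n$ or $p$, which is what makes this style of test attractive in the high-dimensional regime.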


Variable selection in multivariate linear models with high-dimensional covariance matrix estimation

July 13, 2017

87% Match
Marie Perrot-Dockès, Céline Lévy-Leduc, ... , Julien Chiquet
Statistics Theory

In this paper, we propose a novel variable selection approach in the framework of multivariate linear models that takes into account the dependence that may exist between the responses. It consists in estimating beforehand the covariance matrix of the responses and plugging this estimator into a Lasso criterion, in order to obtain a sparse estimator of the coefficient matrix. The properties of our approach are investigated both from a theoretical and a numerical point of view. More ...
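The plug-in idea described in this abstract can be sketched in two steps: estimate the residual covariance of the responses from OLS residuals, whiten the responses with $\hat\Sigma^{-1/2}$, then run a Lasso on the transformed model, whose errors are now roughly uncorrelated. This is an illustration under my own simplifications (response-by-response Lasso on the whitened model, with a hand-rolled coordinate-descent solver), not the authors' exact criterion.

```python
import numpy as np

def lasso_cd(X, y, alpha, n_iter=200):
    """Minimal coordinate-descent Lasso for (1/2n)||y - Xb||^2 + alpha*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()                     # residual for b = 0
    for _ in range(n_iter):
        for j in range(p):
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)            # update residual in place
            b[j] = new
    return b

def whitened_lasso(X, Y, alpha=0.2):
    """Two-step plug-in sketch: (1) estimate the residual covariance of the
    responses from OLS residuals, (2) whiten the responses with Sigma^{-1/2}
    and run a Lasso per whitened response.  Returns the sparse coefficient
    matrix of the whitened model (an illustration, not the paper's method)."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Sigma = np.cov(Y - X @ B_ols, rowvar=False)
    w, V = np.linalg.eigh(Sigma)
    S_inv_half = V @ np.diag(1.0 / np.sqrt(np.clip(w, 1e-12, None))) @ V.T
    Yw = Y @ S_inv_half                            # decorrelated responses
    return np.column_stack([lasso_cd(X, Yw[:, j], alpha) for j in range(Yw.shape[1])])
```

Whitening matters because an unmodeled correlation between responses distorts the effective noise level seen by each Lasso fit.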


Statistical properties of large data sets with linear latent features

November 8, 2021

87% Match
Philipp Fleig, Ilya Nemenman
Disordered Systems and Neural Networks
Statistics Theory
Data Analysis, Statistics and Probability

Analytical understanding of how low-dimensional latent features reveal themselves in large-dimensional data is still lacking. We study this by defining a linear latent feature model with additive noise constructed from probabilistic matrices, and analytically and numerically computing the statistical distributions of pairwise correlations and eigenvalues of the correlation matrix. This allows us to resolve the latent feature structure across a wide range of data regimes set b...
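A toy version of such a linear latent feature model is easy to simulate: draw a few latent features, mix them linearly into many observed dimensions, add noise, and inspect the eigenvalues of the empirical correlation matrix. The parameterization below is my own, not the paper's exact construction.

```python
import numpy as np

def latent_feature_spectrum(n_samples=2000, n_dims=100, n_latent=3, noise=1.0, seed=0):
    """Toy linear latent feature model with additive noise: observations
    are a linear mixture of a few latent features.  Returns the
    eigenvalues of the empirical correlation matrix, sorted descending
    (an illustrative parameterization, not the paper's construction)."""
    rng = np.random.default_rng(seed)
    F = rng.normal(size=(n_latent, n_samples))    # latent features
    W = rng.normal(size=(n_dims, n_latent))       # mixing weights
    X = W @ F + noise * rng.normal(size=(n_dims, n_samples))
    C = np.corrcoef(X)                            # n_dims x n_dims correlation matrix
    return np.sort(np.linalg.eigvalsh(C))[::-1]
```

In this regime the few latent features appear as a handful of eigenvalues well separated from the Marchenko–Pastur-like noise bulk.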
