Similar papers
October 28, 2016
In this paper we derive the optimal linear shrinkage estimator for the high-dimensional mean vector using random matrix theory. The results are obtained under the assumption that both the dimension $p$ and the sample size $n$ tend to infinity in such a way that $p/n \to c\in(0,\infty)$. Under weak conditions imposed on the underlying data generating mechanism, we find the asymptotic equivalents to the optimal shrinkage intensities and estimate them consistently. The proposed ...
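As a rough illustration of the idea, the sketch below shrinks the sample mean toward a fixed target with a naive plug-in intensity; the function name, the target `mu0`, and the intensity rule are placeholders, not the paper's consistent random-matrix estimator of the optimal intensity.

```python
import numpy as np

def shrunk_mean(X, mu0=None):
    # X: (n, p) data matrix; mu0: shrinkage target (defaults to the zero vector).
    # Returns alpha * xbar + (1 - alpha) * mu0 with a simple plug-in intensity;
    # the paper's random-matrix-theory estimator of alpha is not reproduced here.
    n, p = X.shape
    xbar = X.mean(axis=0)
    if mu0 is None:
        mu0 = np.zeros(p)
    noise = X.var(axis=0, ddof=1).sum() / n          # rough noise level in xbar
    signal = float(np.sum((xbar - mu0) ** 2))
    alpha = max(signal - noise, 0.0) / signal if signal > 0 else 0.0
    return alpha * xbar + (1.0 - alpha) * mu0
```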
September 23, 2024
Testing for change points in sequences of high-dimensional covariance matrices is an important and equally challenging problem in statistical methodology with applications in various fields. Motivated by the observation that even in cases where the ratio between dimension and sample size is as small as $0.05$, tests based on fixed-dimensional asymptotics do not keep their preassigned level, we propose to derive critical values of test statistics using an asymptotic regime whe...
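The scan below gives a schematic picture of detecting a single covariance change point by comparing segment covariances in Frobenius norm; the statistic, the minimum segment length, and the lack of calibrated critical values are all simplifications relative to the paper.

```python
import numpy as np

def cov_change_scan(X, min_seg=30):
    # X: (n, p) time-ordered observations (assumes n > 2 * min_seg).
    # For every admissible split t, record a weighted Frobenius distance
    # between the segment covariances; this only illustrates the scan and
    # is not the paper's statistic or its high-dimensional critical values.
    n, _ = X.shape
    stats = {}
    for t in range(min_seg, n - min_seg):
        S1 = np.cov(X[:t], rowvar=False)
        S2 = np.cov(X[t:], rowvar=False)
        stats[t] = (t * (n - t) / n) * np.linalg.norm(S1 - S2, "fro") ** 2
    best = max(stats, key=stats.get)
    return best, stats[best]
```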
November 24, 2023
In this paper, we introduce a class of improved estimators for the mean parameter matrix of a multivariate normal distribution with an unknown variance-covariance matrix. In particular, the main results of [D. Chételat and M. T. Wells (2012). Improved Multivariate Normal Mean Estimation with Unknown Covariance when $p$ is Greater than $n$. The Annals of Statistics, Vol. 40, No. 6, 3137--3160] are established in full generality and we provide the corrected version of th...
October 19, 2014
In this paper, we are concerned with the independence test for $k$ high-dimensional sub-vectors of a normal vector, with fixed positive integer $k$. A natural high-dimensional extension of the classical sample correlation matrix, namely the block correlation matrix, is introduced for this purpose. We then construct the so-called Schott type statistic as our test statistic, which turns out to be a particular linear spectral statistic of the block correlation matrix. Interestingly, the...
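A minimal version of a Schott-type statistic, summing the squared entries of the off-diagonal blocks of the sample correlation matrix; the centering and scaling needed for the limiting distribution, and the exact block-correlation construction used in the paper, are omitted.

```python
import numpy as np

def schott_type_statistic(X, block_sizes):
    # X: (n, p) sample; block_sizes: sizes of the k sub-vectors (summing to p).
    # Computes sum_{i<j} trace(R_ij R_ij^T) over the off-diagonal blocks of
    # the sample correlation matrix R (a Schott-type statistic, uncentred).
    R = np.corrcoef(X, rowvar=False)
    edges = np.cumsum([0] + list(block_sizes))
    k = len(block_sizes)
    stat = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            Rij = R[edges[i]:edges[i + 1], edges[j]:edges[j + 1]]
            stat += np.sum(Rij ** 2)        # equals trace(Rij Rij^T)
    return stat
```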
May 11, 2015
When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics, the dataset is often variable-rich but sample-starved: a regime where the number $n$ of acquired samples (statistical replicates) is far smaller than the number $p$ of obser...
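A bare-bones correlation-screening step of the kind this framework studies: flag all variable pairs whose absolute sample correlation exceeds a user-chosen threshold. Choosing and calibrating that threshold in the sample-starved $n \ll p$ regime is what the paper analyzes, and is not attempted here.

```python
import numpy as np

def correlation_screen(X, rho):
    # X: (n, p) data matrix; rho: user-chosen correlation threshold in (0, 1).
    # Returns the variable pairs whose absolute sample correlation exceeds rho.
    # Under the null, the number of such discoveries grows rapidly as rho
    # decreases when n << p, which is the phenomenon the framework quantifies.
    R = np.corrcoef(X, rowvar=False)
    p = R.shape[0]
    iu = np.triu_indices(p, k=1)
    hits = np.abs(R[iu]) > rho
    return list(zip(iu[0][hits], iu[1][hits]))
```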
February 28, 2020
This paper investigates a statistical procedure for testing the equality of two independent estimated covariance matrices when the number of potentially dependent data vectors is large and proportional to the size of the vectors, that is, the number of variables. Inspired by the spike models used in random matrix theory, we concentrate on the largest eigenvalues of the matrices in order to determine significance. To avoid false rejections we must guard against residual spikes...
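In the spirit of the spike-based approach, the sketch below compares two estimated covariances through the largest eigenvalue of $S_1^{-1/2} S_2 S_1^{-1/2}$; the guard against residual spikes that the paper develops is not included, and the construction assumes $S_1$ is well conditioned.

```python
import numpy as np

def largest_eigenvalue_contrast(X1, X2):
    # X1, X2: (n1, p) and (n2, p) samples.  Returns the largest eigenvalue of
    # S1^{-1/2} S2 S1^{-1/2}; values far above the bulk predicted by random
    # matrix theory suggest unequal covariances.  Assumes S1 is positive
    # definite (e.g. n1 > p); the paper's calibration is not reproduced.
    S1 = np.cov(X1, rowvar=False)
    S2 = np.cov(X2, rowvar=False)
    w, V = np.linalg.eigh(S1)
    S1_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    M = S1_inv_sqrt @ S2 @ S1_inv_sqrt
    return np.linalg.eigvalsh(M).max()
```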
January 18, 2016
To model modern large-scale datasets, we need efficient algorithms to infer a set of $P$ unknown model parameters from $N$ noisy measurements. What are fundamental limits on the accuracy of parameter inference, given finite signal-to-noise ratios, limited measurements, prior information, and computational tractability requirements? How can we combine prior information with measurements to achieve these limits? Classical statistics gives incisive answers to these questions as ...
December 22, 2023
In this paper, we develop invariance-based procedures for testing and inference in high-dimensional regression models. These procedures, also known as randomization tests, provide several important advantages. First, for the global null hypothesis of significance, our test is valid in finite samples. It is also simple to implement and comes with finite-sample guarantees on statistical power. Remarkably, despite its simplicity, this testing idea has escaped the attention of ea...
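A simple instance of such an invariance-based test for the global null of no signal: permute the response, recompute a statistic, and report the randomization p-value, which is exact in finite samples under exchangeability. The invariance group and the statistic used in the paper may differ.

```python
import numpy as np

def randomization_global_null_test(X, y, n_draws=999, rng=None):
    # X: (n, p) design matrix; y: (n,) response.  Under the global null the
    # response carries no signal, so permuting y leaves the distribution of
    # the statistic unchanged and the p-value below is exact in finite samples.
    rng = np.random.default_rng(rng)
    stat = lambda yy: np.max(np.abs(X.T @ (yy - yy.mean())))
    observed = stat(y)
    draws = [stat(rng.permutation(y)) for _ in range(n_draws)]
    p_value = (1 + sum(d >= observed for d in draws)) / (n_draws + 1)
    return observed, p_value
```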
July 13, 2017
In this paper, we propose a novel variable selection approach in the framework of multivariate linear models, taking into account the dependence that may exist between the responses. It consists in first estimating the covariance matrix of the responses and then plugging this estimator into a Lasso criterion, in order to obtain a sparse estimator of the coefficient matrix. The properties of our approach are investigated both from a theoretical and a numerical point of view. More ...
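A hedged sketch of the plug-in idea: estimate the residual covariance of the responses (here crudely, from an ordinary least-squares fit), whiten the responses with its inverse square root, and run a Lasso on the whitened problem. The covariance estimator and penalty calibration studied in the paper are not reproduced, and the helper below is illustrative only.

```python
import numpy as np
from sklearn.linear_model import Lasso

def whitened_multivariate_lasso(X, Y, alpha=0.1):
    # X: (n, p) predictors; Y: (n, q) responses.  Estimates the residual
    # covariance of the responses from an OLS fit, whitens Y with its inverse
    # square root, then fits one Lasso per whitened response column.
    n, q = Y.shape
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B_ols
    Sigma = np.cov(resid, rowvar=False)              # assumes enough residual df
    w, V = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    Yw = Y @ Sigma_inv_sqrt
    # rows of B index predictors, columns index (whitened) responses
    B = np.column_stack(
        [Lasso(alpha=alpha).fit(X, Yw[:, j]).coef_ for j in range(q)]
    )
    return B
```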
November 8, 2021
Analytical understanding of how low-dimensional latent features reveal themselves in large-dimensional data is still lacking. We study this by defining a linear latent feature model with additive noise constructed from probabilistic matrices, and analytically and numerically computing the statistical distributions of pairwise correlations and eigenvalues of the correlation matrix. This allows us to resolve the latent feature structure across a wide range of data regimes set b...
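A small simulation of a linear latent feature model with additive noise, returning the eigenvalues of the sample correlation matrix and the pairwise correlations; the Gaussian construction of the feature and loading matrices is an assumption standing in for the paper's probabilistic matrices.

```python
import numpy as np

def latent_feature_spectrum(n=1000, p=200, k=3, noise=1.0, rng=None):
    # Simulate X = F W + E with k latent features and additive noise, then
    # return the eigenvalues of the sample correlation matrix together with
    # the pairwise correlations (upper triangle of R).
    rng = np.random.default_rng(rng)
    F = rng.standard_normal((n, k))            # latent features
    W = rng.standard_normal((k, p))            # loadings
    E = noise * rng.standard_normal((n, p))    # additive noise
    X = F @ W + E
    R = np.corrcoef(X, rowvar=False)
    pairwise = R[np.triu_indices(p, k=1)]
    return np.linalg.eigvalsh(R), pairwise
```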