Similar papers
June 22, 2021
This paper develops an approach to inference in a linear regression model when the number of potential explanatory variables is larger than the sample size. The approach treats each regression coefficient in turn as the interest parameter, the remaining coefficients being nuisance parameters, and seeks an optimal interest-respecting transformation, inducing sparsity on the relevant blocks of the notional Fisher information matrix. The induced sparsity is exploited through a m...
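The interest-respecting transformation described here is in the same family as debiased-lasso inference. Below is a minimal sketch of that generic recipe (my illustration, not the paper's exact method; the function name, the penalty `lam`, and the nodewise-regression step are assumptions):

```python
# Sketch of debiased-lasso-style inference for one coefficient when p may
# exceed n; illustrative of interest-respecting transformations, not the
# paper's exact procedure.
import numpy as np
from sklearn.linear_model import Lasso

def debiased_coefficient(X, y, j, lam=0.1):
    """Debiased point estimate and standard error for beta_j."""
    n, p = X.shape
    # Initial sparse fit treating all coefficients jointly.
    beta_hat = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_
    # Nodewise regression: residualize column j against the nuisance columns,
    # which (approximately) orthogonalizes the interest parameter.
    mask = np.arange(p) != j
    gamma = Lasso(alpha=lam, fit_intercept=False).fit(X[:, mask], X[:, j]).coef_
    z = X[:, j] - X[:, mask] @ gamma
    # One-step bias correction of the regularized estimate.
    resid = y - X @ beta_hat
    b_j = beta_hat[j] + (z @ resid) / (z @ X[:, j])
    sigma2 = resid @ resid / n
    se = np.sqrt(sigma2 * (z @ z)) / abs(z @ X[:, j])
    return b_j, se
```

Looping `j` over all columns yields one confidence interval per coefficient, mirroring the abstract's coefficient-by-coefficient treatment.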
October 9, 2017
We propose a family of variational approximations to Bayesian posterior distributions, called $\alpha$-VB, with provable statistical guarantees. The standard variational approximation is a special case of $\alpha$-VB with $\alpha=1$. When $\alpha \in(0,1]$, a novel class of variational inequalities is developed for linking the Bayes risk under the variational approximation to the objective function in the variational optimization problem, implying that maximizing the evidenc...
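For reference, the $\alpha$-VB objective is usually written as a tempered evidence lower bound; a sketch of the standard formulation ($\ell_n$ for the log-likelihood, $\pi$ for the prior, and $\Gamma$ for the variational family are my notation, not the excerpt's):

$$\widehat{q}_\alpha \;=\; \operatorname*{arg\,max}_{q\in\Gamma}\;\Big\{\,\alpha\,\mathbb{E}_q\big[\ell_n(\theta)\big] \;-\; D_{\mathrm{KL}}\big(q\,\|\,\pi\big)\Big\}, \qquad \alpha\in(0,1],$$

so that $\alpha = 1$ recovers the usual ELBO and smaller $\alpha$ downweights the likelihood relative to the prior.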
July 20, 2012
We study the behavior of the posterior distribution in high-dimensional Bayesian Gaussian linear regression models having $p\gg n$, with $p$ the number of predictors and $n$ the sample size. Our focus is on obtaining quantitative finite sample bounds ensuring sufficient posterior probability assigned in neighborhoods of the true regression coefficient vector, $\beta^0$, with high probability. We assume that $\beta^0$ is approximately $S$-sparse and obtain universal bounds, wh...
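Bounds of the kind described typically take the following generic shape (the constant $M$, probability level, and rate are placeholders assumed for illustration, not quoted from the truncated abstract):

$$\Pi\Big(\beta:\ \|\beta-\beta^0\|_2 \le M\,\epsilon_n \,\Big|\, y\Big) \;\ge\; 1-\delta \quad\text{with high probability}, \qquad \epsilon_n \asymp \sqrt{\frac{S\log p}{n}},$$

i.e., the posterior concentrates most of its mass on a ball around $\beta^0$ whose radius shrinks at the familiar sparse-regression rate.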
November 21, 2016
In stochastic optimization, the population risk is generally approximated by the empirical risk. However, in the large-scale setting, minimization of the empirical risk may be computationally prohibitive. In this paper, we design an efficient algorithm to approximate the population risk minimizer in generalized linear problems such as binary classification with surrogate losses and generalized linear regression models. We focus on large-scale problems, where the iterative min...
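One standard iterative scheme in this setting is averaged stochastic gradient descent. A self-contained sketch for logistic regression follows (illustrative only; the step-size schedule and Polyak-Ruppert averaging are the common recipe, not necessarily the paper's algorithm):

```python
# Averaged SGD for logistic regression with labels in {-1, +1}; a generic
# Polyak-Ruppert sketch, not necessarily the paper's algorithm.
import numpy as np

def averaged_sgd_logistic(X, y, step=0.5, passes=1, seed=0):
    n, p = X.shape
    rng = np.random.default_rng(seed)
    w = np.zeros(p)
    w_bar = np.zeros(p)
    t = 0
    for _ in range(passes):
        for i in rng.permutation(n):
            t += 1
            margin = y[i] * (X[i] @ w)
            grad = -y[i] * X[i] / (1.0 + np.exp(margin))  # grad of log(1 + e^{-y w.x})
            w -= step / np.sqrt(t) * grad
            w_bar += (w - w_bar) / t                      # running iterate average
    return w_bar
```

The returned average of the iterates, rather than the last iterate, is what carries the usual statistical guarantees for this family of methods.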
March 19, 2018
Every student in statistics or data science learns early on that when the sample size largely exceeds the number of variables, fitting a logistic model produces estimates that are approximately unbiased. Every student also learns that there are formulas to predict the variability of these estimates, which are used for statistical inference; for instance, to produce p-values for testing the significance of regression coefficients. Although these formulas come fro...
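A small simulation makes the failure mode concrete (my illustration; the dimensions, signal strength, and use of statsmodels are assumptions): with $p/n = 0.2$, the unpenalized MLE systematically overshoots the true coefficients, so classically computed p-values are miscalibrated.

```python
# Illustrative simulation (not from the paper): the high-dimensional logistic
# MLE inflates coefficient magnitudes when p/n is non-negligible.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p = 1000, 200                               # kappa = p/n = 0.2
beta = np.zeros(p)
beta[:10] = 10.0                               # a few strong signals
X = rng.standard_normal((n, p)) / np.sqrt(n)   # Var(x' beta) = 1
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta)))

mle = sm.Logit(y, X).fit(disp=0)               # may fail if data happen to separate
print(mle.params[:10].mean() / 10.0)           # inflation factor, typically > 1
```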
February 11, 2022
We study full Bayesian procedures for high-dimensional linear regression. We adopt the data-dependent empirical priors introduced in [1], which were shown there to have good posterior contraction properties and to be easy to compute. Our paper extends their theoretical results to the case of unknown error variance. Under suitable sparsity assumptions, we establish model selection consistency, posterior contraction rates, and a Bernstein-von Mises theorem by analyzing multivariate...
February 28, 2020
Modern large-scale statistical models require estimating thousands to millions of parameters. This is often accomplished by iterative algorithms such as gradient descent, projected gradient descent, or their accelerated versions. What are the fundamental limits of these approaches? This question is well understood from an optimization viewpoint when the underlying objective is convex. Work in this area characterizes the gap to global optimality as a function of the number of ...
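For concreteness, here is one member of the algorithm family in question: projected gradient descent for least squares over an $\ell_1$-ball (a generic sketch; the loss, constraint set, and step size are illustrative choices, not the paper's setting):

```python
# Projected gradient descent for 0.5*||y - Xw||^2 over an l1-ball; a generic
# sketch of one of the iterative schemes whose limits the paper studies.
import numpy as np

def project_l1(v, radius):
    """Euclidean projection onto {w : ||w||_1 <= radius} (Duchi et al., 2008)."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * idx > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def projected_gd(X, y, radius, iters=200):
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L for the quadratic loss
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y)             # gradient of the squared loss
        w = project_l1(w - step * grad, radius)
    return w
```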
January 25, 2020
We study the distribution of the maximum likelihood estimate (MLE) in high-dimensional logistic models, extending the recent results from Sur (2019) to the case where the Gaussian covariates may have an arbitrary covariance structure. We prove that in the limit of large problems holding the ratio between the number $p$ of covariates and the sample size $n$ constant, every finite list of MLE coordinates follows a multivariate normal distribution. Concretely, the $j$th coordina...
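In this line of work (Zhao, Sur, and Candès), the limiting law typically takes the following shape (notation is mine, not quoted from the truncated abstract):

$$\sqrt{n}\,\big(\widehat{\beta}_j - \alpha_\star\,\beta_j\big) \;\xrightarrow{\;d\;}\; \mathcal{N}\!\big(0,\ \sigma_\star^2/\tau_j^2\big),$$

where $(\alpha_\star, \sigma_\star)$ solve a low-dimensional system of fixed-point equations determined by the limit of $p/n$ and the signal strength, and $\tau_j$ is the conditional standard deviation of the $j$th covariate given the remaining ones. The inflation factor $\alpha_\star > 1$ is the same bias phenomenon discussed in the logistic-regression abstract above.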
February 17, 2020
We study convex empirical risk minimization for high-dimensional inference in binary models. Our first result sharply predicts the statistical performance of such estimators in the linear asymptotic regime under isotropic Gaussian features. Importantly, the predictions hold for a wide class of convex loss functions, which we exploit in order to prove a bound on the best achievable performance among them. Notably, we show that the proposed bound is tight for popular binary mod...
February 20, 2014
Simultaneously achieving parsimony and good predictive power in high dimensions is a main challenge in statistics. Non-local priors (NLPs) possess appealing properties for high-dimensional model choice, but their use for estimation has not been studied in detail. We show that, for regular models, Bayesian model averaging (BMA) estimates based on NLPs shrink spurious parameters either at fast polynomial or quasi-exponential rates as the sample size $n$ increases (depending on ...
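As a concrete instance of an NLP, consider the product-moment (pMOM) prior of Johnson and Rossell, shown here for a scalar coefficient with scale hyperparameter $\tau$ (an illustrative example, not necessarily the prior used in the paper):

$$\pi_M(\beta) \;=\; \frac{\beta^2}{\tau\sigma^2}\,\mathcal{N}\!\big(\beta;\, 0,\ \tau\sigma^2\big),$$

which vanishes at $\beta = 0$; this zero at the origin is what drives the faster shrinkage of spurious parameters described in the abstract.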