Similar papers
April 30, 2013
In recent years, the ultrahigh-dimensional linear regression problem has attracted enormous attention from the research community. Under the sparsity assumption, most of the published work is devoted to the selection and estimation of the significant predictor variables. This paper studies a different but fundamentally important aspect of this problem: uncertainty quantification for parameter estimates and model choices. To be more specific, this paper proposes methods for der...
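The listing truncates before the paper's specific procedures, so as a generic illustration of uncertainty quantification in this setting (not the paper's method), here is a minimal sketch pairing an ISTA lasso solver with a residual bootstrap to produce pointwise intervals. The penalty `lam`, the sparsity level, and the bootstrap scheme are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_lasso(X, y, lam, n_iter=300):
    """Proximal gradient (ISTA) for min_b 0.5/n ||y - Xb||^2 + lam ||b||_1."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n          # Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = soft_threshold(b - grad / L, lam / L)
    return b

rng = np.random.default_rng(0)
n, p, s = 100, 500, 5                           # n << p: ultrahigh-dimensional regime
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:s] = 1.0
y = X @ beta + 0.5 * rng.standard_normal(n)

lam = 0.1
b_hat = ista_lasso(X, y, lam)

# Residual bootstrap: resample centered residuals to gauge estimator variability.
resid = y - X @ b_hat
resid -= resid.mean()
boot = np.array([ista_lasso(X, X @ b_hat + rng.choice(resid, n, replace=True), lam)
                 for _ in range(100)])
ci = np.percentile(boot, [2.5, 97.5], axis=0)   # pointwise 95% intervals
print(ci[:, :s].T)                              # intervals for the active coordinates
```

A naive residual bootstrap is known to be delicate for lasso-type estimators; the sketch only illustrates the shape of the problem the abstract raises.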
January 18, 2016
To model modern large-scale datasets, we need efficient algorithms to infer a set of $P$ unknown model parameters from $N$ noisy measurements. What are fundamental limits on the accuracy of parameter inference, given finite signal-to-noise ratios, limited measurements, prior information, and computational tractability requirements? How can we combine prior information with measurements to achieve these limits? Classical statistics gives incisive answers to these questions as ...
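As a toy illustration of the trade-off this abstract raises, combining prior information with limited, noisy measurements, the sketch below compares the minimum-norm MLE with the Bayes posterior mean under an assumed Gaussian prior $\theta \sim N(0, \tau^2 I)$; the dimensions and noise level are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P, sigma, tau = 50, 200, 1.0, 1.0        # few measurements, many parameters
theta = tau * rng.standard_normal(P)        # true parameters drawn from the prior
X = rng.standard_normal((N, P)) / np.sqrt(P)
y = X @ theta + sigma * rng.standard_normal(N)

# The MLE (minimum-norm least squares) ignores the prior; the Bayes posterior
# mean under theta ~ N(0, tau^2 I) shrinks according to the signal-to-noise ratio.
theta_mle = np.linalg.pinv(X) @ y
A = X.T @ X + (sigma**2 / tau**2) * np.eye(P)
theta_bayes = np.linalg.solve(A, X.T @ y)

print("MLE   error:", np.linalg.norm(theta_mle - theta))
print("Bayes error:", np.linalg.norm(theta_bayes - theta))
```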
July 3, 2019
We study the problem of finding the best linear model that minimizes the least-squares loss on a given dataset. While this problem is trivial in the low-dimensional regime, it becomes more interesting in high dimensions, where the population minimizer is assumed to lie on a manifold such as the set of sparse vectors. We propose a projected gradient descent (PGD) algorithm to estimate the population minimizer in the finite-sample regime. We establish a linear convergence rate and data-dependent ...
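A minimal sketch of the PGD idea described here, instantiated with projection onto $s$-sparse vectors by hard thresholding (equivalently, iterative hard thresholding); the step size, iteration count, and problem sizes are illustrative, not the paper's exact settings.

```python
import numpy as np

def project_sparse(b, s):
    """Project onto the set of s-sparse vectors: keep the s largest entries."""
    out = np.zeros_like(b)
    idx = np.argsort(np.abs(b))[-s:]
    out[idx] = b[idx]
    return out

def pgd_sparse_ls(X, y, s, n_iter=200):
    """Projected gradient descent for least squares over s-sparse vectors."""
    n, p = X.shape
    eta = n / np.linalg.norm(X, 2) ** 2        # step size 1/L for the smooth part
    b = np.zeros(p)
    for _ in range(n_iter):
        b = project_sparse(b - eta * X.T @ (X @ b - y) / n, s)
    return b

rng = np.random.default_rng(2)
n, p, s = 200, 1000, 10
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:s] = rng.standard_normal(s)
y = X @ beta + 0.1 * rng.standard_normal(n)
print(np.linalg.norm(pgd_sparse_ls(X, y, s) - beta))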
May 23, 2024
This paper provides theoretical insights into high-dimensional binary classification with class-conditional noisy labels. Specifically, we study the behavior of a linear classifier with a label-noisiness-aware loss function when both the dimension of data $p$ and the sample size $n$ are large and comparable. Relying on random matrix theory and assuming a Gaussian mixture data model, the performance of the linear classifier as $p,n\to \infty$ is shown to converge towards a ...
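To make the setup concrete, the sketch below draws a two-component Gaussian mixture with $p$ and $n$ comparable, flips labels with class-conditional rates, and trains a linear classifier with the unbiased noise-corrected logistic loss of Natarajan et al.; whether this matches the paper's noisiness-aware loss is an assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 2000, 400                               # p and n large and comparable
mu = np.ones(p) / np.sqrt(p)                   # unit class-mean direction
y = rng.choice([-1.0, 1.0], n)
X = y[:, None] * mu + rng.standard_normal((n, p))   # Gaussian mixture data

rho_pos, rho_neg = 0.3, 0.1                    # class-conditional flip rates
flip = rng.random(n) < np.where(y > 0, rho_pos, rho_neg)
y_noisy = np.where(flip, -y, y)

def corrected_loss_grad(w, X, yn):
    """Gradient of the unbiased noise-corrected logistic loss (Natarajan et al.):
    ((1 - rho_{-y}) l(t, y) - rho_y l(t, -y)) / (1 - rho_+ - rho_-)."""
    rho_same = np.where(yn > 0, rho_pos, rho_neg)
    rho_other = np.where(yn > 0, rho_neg, rho_pos)
    z = X @ w
    g_same = -yn / (1.0 + np.exp(yn * z))      # d/dt log(1 + e^{-yt})
    g_flip = yn / (1.0 + np.exp(-yn * z))      # same, with the flipped label
    g = ((1 - rho_other) * g_same - rho_same * g_flip) / (1 - rho_pos - rho_neg)
    return X.T @ g / len(yn)

w = np.zeros(p)
for _ in range(500):
    w -= 0.5 * corrected_loss_grad(w, X, y_noisy)
print("alignment with class-mean direction:", w @ mu / np.linalg.norm(w))
```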
September 30, 2013
It has been over 200 years since Gauss's and Legendre's famous priority dispute on who discovered the method of least squares. Nevertheless, we argue that the normal equations are still relevant in many facets of modern statistics, particularly in the domain of high-dimensional inference. Even today, we are still learning new things about the law of large numbers, first described in Bernoulli's Ars Conjectandi 300 years ago, as it applies to high-dimensional inference. The ot...
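A small reminder of what the normal equations are: for a well-conditioned design they recover the same least-squares solution as numpy's SVD-based solver, in two lines.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 500, 20
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + rng.standard_normal(n)

# Gauss/Legendre least squares via the normal equations X'X b = X'y ...
b_normal = np.linalg.solve(X.T @ X, X.T @ y)
# ... and via numpy's SVD-based solver, the numerically preferred route.
b_svd, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.max(np.abs(b_normal - b_svd)))        # agree to machine precision here
```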
March 28, 2015
We consider Bayesian variable selection in sparse high-dimensional regression, where the number of covariates $p$ may be large relative to the sample size $n$, but at most a moderate number $q$ of covariates are active. Specifically, we treat generalized linear models. For a single fixed sparse model with a well-behaved prior distribution, classical theory proves that the Laplace approximation to the marginal likelihood of the model is accurate for sufficiently large sample si...
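A minimal sketch of the Laplace approximation the abstract refers to, for a single fixed logistic-regression model (a GLM) with a Gaussian prior on the $q$ active coefficients; the prior scale `tau` and the optimizer are illustrative choices, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n, q = 200, 3                                  # small active model
X = rng.standard_normal((n, q))
beta_true = np.array([1.0, -2.0, 0.5])
y = (rng.random(n) < 1 / (1 + np.exp(-X @ beta_true))).astype(float)

tau = 2.0                                      # N(0, tau^2 I) prior on coefficients

def neg_log_joint(b):
    z = X @ b
    loglik = y @ z - np.logaddexp(0.0, z).sum()          # Bernoulli log-likelihood
    logprior = -0.5 * b @ b / tau**2 - 0.5 * q * np.log(2 * np.pi * tau**2)
    return -(loglik + logprior)

res = minimize(neg_log_joint, np.zeros(q), method="BFGS")
b_map = res.x

# Hessian of the negative log joint at the mode: X' W X + I/tau^2,
# with W = diag(p_i (1 - p_i)) from the logistic likelihood.
pbar = 1 / (1 + np.exp(-X @ b_map))
H = X.T @ ((pbar * (1 - pbar))[:, None] * X) + np.eye(q) / tau**2
# Laplace: log p(y) ~= log p(y, b_map) + (q/2) log 2*pi - (1/2) log det H
log_marginal = -res.fun + 0.5 * q * np.log(2 * np.pi) - 0.5 * np.linalg.slogdet(H)[1]
print("Laplace log marginal likelihood:", log_marginal)
```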
June 25, 2023
We consider the problem of heteroscedastic linear regression, where, given $n$ samples $(\mathbf{x}_i, y_i)$ from $y_i = \langle \mathbf{w}^{*}, \mathbf{x}_i \rangle + \epsilon_i \cdot \langle \mathbf{f}^{*}, \mathbf{x}_i \rangle$ with $\mathbf{x}_i \sim N(0,\mathbf{I})$, $\epsilon_i \sim N(0,1)$, we aim to estimate $\mathbf{w}^{*}$. Beyond classical applications of such models in statistics, econometrics, time series analysis, etc., it is also particularly relevant in machine...
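One way to see the model in code: generate data exactly as specified, then run a simple alternating scheme (weighted least squares for $\mathbf{w}$, and a spectral step for $\mathbf{f}$ using $E[r^2 \mathbf{x}\mathbf{x}^\top] = \|\mathbf{f}\|^2 I + 2\mathbf{f}\mathbf{f}^\top$ under Gaussian designs). This is an illustrative heuristic, not necessarily the estimator the paper analyzes; note that $\mathbf{f}^{*}$ is only identified up to sign.

```python
import numpy as np

rng = np.random.default_rng(6)
n, d = 5000, 10
w_star = rng.standard_normal(d)
f_star = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_star + rng.standard_normal(n) * (X @ f_star)

w = np.linalg.lstsq(X, y, rcond=None)[0]       # initialize with plain OLS
for _ in range(5):
    r = y - X @ w
    # Spectral step: top eigenvector of (1/n) sum r_i^2 x_i x_i' estimates f's
    # direction; its eigenvalue is ~3||f||^2 versus ~||f||^2 elsewhere.
    M = (X * r[:, None] ** 2).T @ X / n
    evals, evecs = np.linalg.eigh(M)
    f = np.sqrt(max(evals[-1] / 3.0, 0.0)) * evecs[:, -1]
    var = (X @ f) ** 2 + 1e-6                  # per-sample noise variance estimate
    Xw = X / var[:, None]
    w = np.linalg.solve(Xw.T @ X, Xw.T @ y)    # weighted least squares for w
print("w error:", np.linalg.norm(w - w_star))
```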
March 2, 2019
Variational Bayes (VB) is a recent approximate method for Bayesian inference. It has the merit of being a fast and scalable alternative to Markov Chain Monte Carlo (MCMC), but its approximation error is often unknown. In this paper, we derive the approximation error of VB in terms of mean, mode, variance, predictive density and KL divergence for the linear Gaussian multi-equation regression. Our results indicate that VB approximates the posterior mean perfectly. Factors affect...
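For intuition about what VB computes here, below is a mean-field CAVI sketch for a single-equation Bayesian linear regression with a Gaussian prior on the coefficients and a Gamma prior on the noise precision (the standard conjugate toy model); the paper's multi-equation setting is richer than this.

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 100, 5
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.3 * rng.standard_normal(n)

alpha = 1.0                       # prior precision: w ~ N(0, alpha^{-1} I)
a0 = b0 = 1e-2                    # Gamma prior on the noise precision lambda

# Mean-field CAVI: q(w, lambda) = q(w) q(lambda), alternating closed-form updates.
E_lam = 1.0
for _ in range(50):
    S = np.linalg.inv(alpha * np.eye(d) + E_lam * X.T @ X)    # q(w) covariance
    m = E_lam * S @ X.T @ y                                   # q(w) mean
    a_n = a0 + 0.5 * n
    b_n = b0 + 0.5 * (np.sum((y - X @ m) ** 2) + np.trace(X @ S @ X.T))
    E_lam = a_n / b_n                                         # E_q[lambda]
print("VB posterior mean:", m)
```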
April 10, 2018
For optimization on large-scale data, exactly computing the solution may be computationally difficult because of the size of the data. In this paper we consider subsampled optimization for quickly approximating the exact solution. In this approach, one obtains a surrogate dataset by sampling from the full data, and then an approximate solution by solving the subsampled optimization problem on the surrogate. Our main theoretical contributions are to provide the asympt...
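The approach is easy to state in code: draw a uniform subsample, solve the same optimization problem on it, and compare with the full-data solution. The sketch below does this for least squares; the subsample size and uniform sampling scheme are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(8)
n, p, m = 1_000_000, 10, 5_000                 # full data vs. subsample size
X = rng.standard_normal((n, p))
y = X @ np.arange(1.0, p + 1) + rng.standard_normal(n)

# Uniform subsampling: solve the surrogate problem on a random subset.
idx = rng.choice(n, m, replace=False)
b_sub = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
b_full = np.linalg.lstsq(X, y, rcond=None)[0]
print("max deviation:", np.max(np.abs(b_sub - b_full)))   # shrinks like 1/sqrt(m)
```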
May 25, 2023
In high-dimensional Bayesian statistics, several methods have been developed, including many prior distributions that lead to sparsity of the estimated parameters. However, such priors are limited in their ability to handle the spectral eigenvector structure of data, and as a result they are ill-suited for analyzing the over-parameterized models (high-dimensional linear models that do not assume sparsity) developed in recent years. This paper introduces a Bayesian approach ...
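As background for the over-parameterized ($p \gg n$) regime this abstract targets, the sketch below computes the posterior mean under an isotropic Gaussian prior in kernel form and shows it approaching the minimum-norm interpolator as the regularization vanishes; the spectral, eigenvector-aware priors the paper introduces go beyond this isotropic baseline.

```python
import numpy as np

rng = np.random.default_rng(9)
n, p = 50, 500                                  # over-parameterized: p >> n
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) / np.sqrt(p) + 0.1 * rng.standard_normal(n)

# Under w ~ N(0, tau^2 I) the posterior mean is a ridge estimator; as the
# noise-to-prior ratio `reg` goes to 0 it tends to the minimum-norm
# interpolator X'(XX')^{-1} y, the usual over-parameterized baseline.
for reg in (1.0, 1e-3, 1e-6):
    w = X.T @ np.linalg.solve(X @ X.T + reg * np.eye(n), y)   # kernel-form ridge
    print(reg, np.linalg.norm(X @ w - y))       # training residual shrinks with reg
```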