ID: 2412.15633

Lecture Notes on High Dimensional Linear Regression

December 20, 2024

Alberto Quaini
Statistics
Methodology
Computation
Machine Learning

These lecture notes cover advanced topics in linear regression, with an in-depth exploration of the existence, uniqueness, relations, computation, and non-asymptotic properties of the most prominent estimators in this setting. The covered estimators include least squares, ridgeless, ridge, and lasso. The content follows a proposition-proof structure, making it suitable for students seeking a formal and rigorous understanding of the statistical theory underlying machine learning methods.
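As a minimal numerical sketch (not taken from the notes, with arbitrary illustration values for the penalty levels `lam` and `alpha`), the following Python snippet computes the four estimators on a simulated design with more predictors than observations:

```python
# A minimal sketch of the four estimators the notes cover, on a simulated
# design with more predictors than observations (p > n).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 50, 100                       # high-dimensional: p > n
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:5] = 1.0   # sparse true coefficients
y = X @ beta + 0.1 * rng.normal(size=n)

# Plain least squares is not unique when p > n; the ridgeless estimator is
# the minimum-norm interpolating solution, given by the pseudoinverse.
beta_ridgeless = np.linalg.pinv(X) @ y

# Ridge: (X'X + lam * I)^{-1} X'y for a fixed penalty lam > 0.
lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Lasso: l1-penalized least squares, solved here by coordinate descent.
beta_lasso = Lasso(alpha=0.1).fit(X, y).coef_

print(np.linalg.norm(beta_ridgeless), np.linalg.norm(beta_ridge),
      np.count_nonzero(beta_lasso))
```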

Similar papers

Lecture notes on ridge regression

September 30, 2015

94% Match
Wessel N. van Wieringen
Methodology

The linear regression model cannot be fitted to high-dimensional data, as the high-dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmenting the loss function with a penalty (i.e., a function of the regression coefficients). The ridge penalty is the sum of squared regression coefficients, giving rise to ridge regression. Here many aspects of ridge regression are reviewed, e.g. moments, mean squared error, its ...
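For reference, the ridge criterion described in this abstract and its standard closed form can be written as

$$\hat\beta_{\mathrm{ridge}}(\lambda) = \arg\min_{\beta \in \mathbb{R}^p} \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2 = (X^\top X + \lambda I_p)^{-1} X^\top y, \qquad \lambda > 0.$$

Since $X^\top X + \lambda I_p$ is positive definite for any $\lambda > 0$, the estimator is unique even when the number of predictors exceeds the sample size, which is precisely how the penalty resolves the non-identifiability mentioned above.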


Ridge Regularization: an Essential Concept in Data Science

May 30, 2020

91% Match
Trevor Hastie
Methodology
Machine Learning

Ridge or more formally $\ell_2$ regularization shows up in many areas of statistics and machine learning. It is one of those essential devices that any good data scientist needs to master for their craft. In this brief ridge fest I have collected together some of the magic and beauty of ridge that my colleagues and I have encountered over the past 40 years in applied statistics.


A Survey of Tuning Parameter Selection for High-dimensional Regression

August 10, 2019

91% Match
Yunan Wu, Lan Wang
Methodology
Machine Learning

Penalized (or regularized) regression, as represented by Lasso and its variants, has become a standard technique for analyzing high-dimensional data when the number of variables substantially exceeds the sample size. The performance of penalized regression relies crucially on the choice of the tuning parameter, which determines the amount of regularization and hence the sparsity level of the fitted model. The optimal choice of tuning parameter depends on both the structure of...
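As one concrete instance of the selectors such a survey covers, the sketch below (an illustration, not the survey's own code) picks the lasso penalty by K-fold cross-validation with scikit-learn:

```python
# A minimal sketch of one standard tuning-parameter selector: K-fold
# cross-validation over a grid of lasso penalties.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n, p = 80, 200
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:3] = 2.0
y = X @ beta + rng.normal(size=n)

# LassoCV fits the full regularization path and picks the penalty that
# minimizes 5-fold cross-validated prediction error.
fit = LassoCV(cv=5).fit(X, y)
print("selected alpha:", fit.alpha_)
print("nonzero coefficients:", np.count_nonzero(fit.coef_))
```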


No penalty no tears: Least squares in high-dimensional linear models

June 7, 2015

90% Match
Xiangyu Wang, David Dunson, Chenlei Leng
Methodology
Machine Learning
Statistics Theory

Ordinary least squares (OLS) is the default method for fitting linear models, but is not applicable for problems with dimensionality larger than the sample size. For these problems, we advocate the use of a generalized version of OLS motivated by ridge regression, and propose two novel three-step algorithms involving least squares fitting and hard thresholding. The algorithms are methodologically simple to understand intuitively, computationally easy to implement efficiently,...
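A hedged sketch of this general recipe, under assumptions of my own choosing (the cutoff `tau` and penalty `lam` are illustrative, and the steps only approximate the paper's algorithms): a ridge-type preliminary fit, hard thresholding, then a least squares refit on the retained variables.

```python
# Not the authors' exact procedure: a ridge-motivated generalized OLS fit,
# hard thresholding to screen variables, then OLS refit on the kept set.
import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 150
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:4] = 1.5
y = X @ beta + 0.5 * rng.normal(size=n)

# Step 1: ridge-type preliminary estimate (well-defined even when p > n).
lam = 1.0
beta_init = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Step 2: hard thresholding keeps only coefficients above a cutoff.
tau = 0.5
keep = np.abs(beta_init) > tau

# Step 3: least squares refit on the selected columns.
beta_refit = np.linalg.lstsq(X[:, keep], y, rcond=None)[0]
print("selected:", np.flatnonzero(keep), "refit:", beta_refit.round(2))
```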


High Dimensional Time Series Regression Models: Applications to Statistical Learning Methods

August 27, 2023

90% Match
Christis Katsouris
Econometrics
Machine Learning

These lecture notes provide an overview of existing methodologies and recent developments for estimation and inference with high dimensional time series regression models. First, we present main limit theory results for high dimensional dependent data which is relevant to covariance matrix structures as well as to dependent time series sequences. Second, we present main aspects of the asymptotic theory related to time series regression models with many covariates. Third, we d...


On the Distribution, Model Selection Properties and Uniqueness of the Lasso Estimator in Low and High Dimensions

August 31, 2017

90% Match
Karl Ewald, Ulrike Schneider
Statistics Theory
Methodology

We derive expressions for the finite-sample distribution of the Lasso estimator in the context of a linear regression model in low as well as in high dimensions by exploiting the structure of the optimization problem defining the estimator. In low dimensions, we assume full rank of the regressor matrix and present expressions for the cumulative distribution function as well as the densities of the absolutely continuous parts of the estimator. Our results are presented for the...


High-Dimensional Statistics

October 30, 2023

90% Match
Philippe Rigollet, Jan-Christian Hütter
Statistics Theory

These lecture notes were written for the course 18.657, High Dimensional Statistics at MIT. They build on a set of notes that was prepared at Princeton University in 2013-14 that was modified (and hopefully improved) over the years.


Tuning-free ridge estimators for high-dimensional generalized linear models

February 27, 2020

90% Match
Shih-Ting Huang, Fang Xie, Johannes Lederer
Methodology
Applications
Computation
Machine Learning

Ridge estimators regularize the squared Euclidean lengths of parameters. Such estimators are mathematically and computationally attractive but involve tuning parameters that can be difficult to calibrate. In this paper, we show that ridge estimators can be modified such that tuning parameters can be avoided altogether. We also show that these modified versions can improve on the empirical prediction accuracies of standard ridge estimators combined with cross-validation, and w...


Comment: Fisher Lecture: Dimension Reduction in Regression

August 28, 2007

89% Match
Bing Li
Methodology

Comment: Fisher Lecture: Dimension Reduction in Regression [arXiv:0708.3774]


Comment: Fisher Lecture: Dimension Reduction in Regression

August 28, 2007

89% Match
Ronald Christensen
Methodology

Comment: Fisher Lecture: Dimension Reduction in Regression [arXiv:0708.3774]
