May 22, 2024
We investigate analytically the behaviour of the penalized maximum partial likelihood estimator (PMPLE). Our results are derived for a generic separable regularization, but we focus on the elastic net. This penalization is routinely adopted for survival analysis in the high-dimensional regime, where the maximum partial likelihood estimator (without regularization) might not even exist. Previous theoretical results require that the number $s$ of non-zero association coefficients is $O(n^{\alpha})$, with $\alpha \in (0,1)$ and $n$ the sample size. Here we accurately characterize the behaviour of the PMPLE when $s$ is proportional to $n$, via the solution of a system of six non-linear equations that can easily be solved by fixed-point iteration. These equations are derived by means of the replica method, under the assumption that the covariates $\mathbf{X}\in \mathbb{R}^p$ follow a multivariate Gaussian law with covariance $\mathbf{I}_p/p$. Their solution allows us to investigate how various metrics of interest depend on the ratio $\zeta = p/n$, the fraction of truly active components $\nu = s/p$, and the regularization strength. We validate our results by extensive numerical simulations.
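The six replica equations themselves are specific to the paper and not reproduced here, but the fixed-point scheme used to solve such systems can be illustrated generically. Below is a minimal sketch of damped fixed-point iteration in Python; the map `g` is a toy stand-in (the scalar equation $x = \cos x$), not the paper's system, and the function name and parameter values are illustrative assumptions.

```python
import numpy as np

def fixed_point(g, x0, damping=0.5, tol=1e-10, max_iter=10_000):
    """Damped fixed-point iteration: x <- (1 - d) * x + d * g(x).

    Damping (d < 1) is a standard device to stabilize fixed-point
    schemes that would otherwise oscillate or diverge.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = (1.0 - damping) * x + damping * g(x)
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")

# Toy stand-in map (NOT the paper's replica equations): solve x = cos(x).
x_star = fixed_point(np.cos, np.zeros(1))
print(x_star)  # ~0.739085 (the Dottie number)
```

In practice the same loop applies verbatim to a vector-valued map, such as the six coupled order-parameter equations of the paper, with `x0` a length-6 array.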
Similar papers
April 14, 2019
The Cox proportional hazards model is ubiquitous in the analysis of time-to-event data. However, when the data dimension $p$ is comparable to the sample size $N$, maximum likelihood estimates for its regression parameters are known to be biased or break down entirely due to overfitting. This prompted the introduction of the so-called regularized Cox model. In this paper we use the replica method from statistical physics to investigate the relationship between the true and infer...
April 3, 2022
The proportional hazards model has been extensively used in many fields such as biomedicine to estimate and perform statistical significance testing on the effects of covariates influencing the survival time of patients. The classical theory of maximum partial-likelihood estimation (MPLE) is used by most software packages to produce inference, e.g., the coxph function in R and the PHREG procedure in SAS. In this paper, we investigate the asymptotic behavior of the MPLE in the...
May 4, 2017
Overfitting, which happens when the number of parameters in a model is too large compared to the number of data points available for determining these parameters, is a serious and growing problem in survival analysis. While modern medicine presents us with data of unprecedented dimensionality, these data cannot yet be used effectively for clinical outcome prediction. Standard error measures in maximum likelihood regression, such as p-values and z-scores, are blind to overfitt...
August 10, 2019
Penalized (or regularized) regression, as represented by Lasso and its variants, has become a standard technique for analyzing high-dimensional data when the number of variables substantially exceeds the sample size. The performance of penalized regression relies crucially on the choice of the tuning parameter, which determines the amount of regularization and hence the sparsity level of the fitted model. The optimal choice of tuning parameter depends on both the structure of...
December 2, 2017
This paper develops a new scalable sparse Cox regression tool for sparse high-dimensional massive sample size (sHDMSS) survival data. The method is a local $L_0$-penalized Cox regression via repeatedly performing reweighted $L_2$-penalized Cox regression. We show that the resulting estimator enjoys the best of $L_0$- and $L_2$-penalized Cox regressions while overcoming their limitations. Specifically, the estimator is selection consistent, oracle for parameter estimation, and...
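The core computational idea in the abstract above, approximating an $L_0$ penalty by repeatedly solving reweighted $L_2$-penalized problems, can be sketched in a self-contained way. The snippet below applies it to a plain linear model rather than Cox regression (the paper's actual setting); the function name, weighting rule, and parameter values are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def reweighted_ridge(X, y, lam=0.1, eps=1e-6, n_iter=50):
    """Surrogate for L0-penalized least squares via iteratively
    reweighted ridge: at each step solve
        argmin_b ||y - X b||^2 + lam * sum_j w_j * b_j**2
    with w_j = 1 / (b_j_prev**2 + eps), so coefficients near zero are
    penalized ever more strongly -- a smooth stand-in for counting
    non-zeros.
    """
    b = np.linalg.lstsq(X, y, rcond=None)[0]  # unpenalized start
    for _ in range(n_iter):
        w = 1.0 / (b ** 2 + eps)
        b = np.linalg.solve(X.T @ X + lam * np.diag(w), X.T @ y)
    return b

# Synthetic example: 2 active coefficients out of 10.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
beta = np.zeros(10)
beta[0], beta[3] = 2.0, -3.0
y = X @ beta + 0.01 * rng.standard_normal(200)
b_hat = reweighted_ridge(X, y)
```

On this toy problem the iteration drives the eight inactive coefficients essentially to zero while leaving the two active ones nearly unshrunk, which is the qualitative behaviour the paper attributes to its $L_0$-via-$L_2$ estimator.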
March 1, 2015
The purpose of this article is to provide an adaptive estimator of the baseline function in the Cox model with high-dimensional covariates. We consider a two-step procedure: first, we estimate the regression parameter of the Cox model via a Lasso procedure based on the partial log-likelihood; second, we plug this Lasso estimator into a least-squares type criterion and then perform a model selection procedure to obtain an adaptive penalized contrast estimator of the baselin...
November 1, 2018
We consider high-dimensional inference for potentially misspecified Cox proportional hazard models based on low dimensional results by Lin and Wei [1989]. A de-sparsified Lasso estimator is proposed based on the log partial likelihood function and shown to converge to a pseudo-true parameter vector. Interestingly, the sparsity of the true parameter can be inferred from that of the above limiting parameter. Moreover, each component of the above (non-sparse) estimator is shown ...
April 2, 2018
We propose a computationally and statistically efficient divide-and-conquer (DAC) algorithm to fit sparse Cox regression to massive datasets where the sample size $n_0$ is exceedingly large and the covariate dimension $p$ is not small but $n_0\gg p$. The proposed algorithm achieves computational efficiency through a one-step linear approximation followed by a least square approximation to the partial likelihood (PL). This sequence of linearizations enables us to maximize the ...
August 2, 2018
Penalized likelihood approaches are widely used for high-dimensional regression. Although many methods have been proposed and the associated theory is now well-developed, the relative efficacy of different approaches in finite-sample settings, as encountered in practice, remains incompletely understood. There is therefore a need for empirical investigations in this area that can offer practical insight and guidance to users. In this paper we present a large-scale comparison o...
July 7, 2021
Heavy-tailed error distributions and predictors with anomalous values are ubiquitous in high-dimensional regression problems and can seriously jeopardize the validity of statistical analyses if not properly addressed. For more reliable estimation under these adverse conditions, we propose a new robust regularized estimator for simultaneous variable selection and coefficient estimation. This estimator, called adaptive PENSE, possesses the oracle property without prior knowledg...