ID: 2405.13690

The effect of regularization in high dimensional Cox regression

May 22, 2024

View on ArXiv
Emanuele Massa
Mathematics
Condensed Matter
Statistics
Statistics Theory
Disordered Systems and Neura...
Statistics Theory

We investigate analytically the behaviour of the penalized maximum partial likelihood estimator (PMPLE). Our results are derived for a generic separable regularization, but we focus on the elastic net. This penalization is routinely adopted for survival analysis in the high dimensional regime, where the Maximum Partial Likelihood estimator (no regularization) might not even exist. Previous theoretical results require that the number $s$ of non-zero association coefficients is $O(n^{\alpha})$, with $\alpha \in (0,1)$ and $n$ the sample size. Here we accurately characterize the behaviour of the PMPLE when $s$ is proportional to $n$ via the solution of a system of six non-linear equations that can be easily obtained by fixed point iteration. These equations are derived by means of the replica method and under the assumption that the covariates $\mathbf{X}\in \mathbb{R}^p$ follow a multivariate Gaussian law with covariance $\mathbf{I}_p/p$. The solution of the previous equations allows us to investigate the dependency of various metrics of interest and hence their dependency on the ratio $\zeta = p/n$, the fraction of true active components $\nu = s/p$, and the regularization strength. We validate our results by extensive numerical simulations.

Similar papers 1