ID: 2006.06997

Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

June 12, 2020

View on ArXiv
Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová
Computer Science
Condensed Matter
Mathematics
Statistics
Machine Learning
Disordered Systems and Neura...
Statistics Theory
Machine Learning
Statistics Theory

Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability of finding good minima instead of being trapped in spurious ones remains to a large extent an open problem. Here we focus on gradient flow dynamics for phase retrieval from random measurements. When the ratio of the number of measurements over the input dimension is small the dynamics remains trapped in spurious minima with large basins of attraction. We find analytically that above a critical ratio those critical points become unstable developing a negative direction toward the signal. By numerical experiments we show that in this regime the gradient flow algorithm is not trapped; it drifts away from the spurious critical points along the unstable direction and succeeds in finding the global minimum. Using tools from statistical physics we characterize this phenomenon, which is related to a BBP-type transition in the Hessian of the spurious minima.

Similar papers 1

From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima

March 4, 2024

94% Match
Tony Bonnaire, Giulio Biroli, Chiara Cammarota
Machine Learning
Disordered Systems and Neura...
Statistical Mechanics

We investigate the optimization dynamics of gradient descent in a non-convex and high-dimensional setting, with a focus on the phase retrieval problem as a case study for complex loss landscapes. We first study the high-dimensional limit where both the number $M$ and the dimension $N$ of the data are going to infinity at fixed signal-to-noise ratio $\alpha = M/N$. By analyzing how the local curvature changes during optimization, we uncover that for intermediate $\alpha$, the ...

Find SimilarView on arXiv

Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem

March 8, 2021

90% Match
Francesca Mignacco, Pierfrancesco Urbani, Lenka Zdeborová
Disordered Systems and Neura...
Machine Learning
Statistics Theory
Machine Learning
Statistics Theory

In this paper we investigate how gradient-based algorithms such as gradient descent, (multi-pass) stochastic gradient descent, its persistent variant, and the Langevin algorithm navigate non-convex loss-landscapes and which of them is able to reach the best generalization error at limited sample complexity. We consider the loss landscape of the high-dimensional phase retrieval problem as a prototypical highly non-convex example. We observe that for phase retrieval the stochas...

Find SimilarView on arXiv

The Local Landscape of Phase Retrieval Under Limited Samples

November 26, 2023

90% Match
Kaizhao Liu, Zihao Wang, Lei Wu
cs.IT
cs.LG
eess.SP
math.IT
math.OC
math.ST
stat.ML
stat.TH

In this paper, we provide a fine-grained analysis of the local landscape of phase retrieval under the regime with limited samples. Our aim is to ascertain the minimal sample size necessary to guarantee a benign local landscape surrounding global minima in high dimensions. Let $n$ and $d$ denote the sample size and input dimension, respectively. We first explore the local convexity and establish that when $n=o(d\log d)$, for almost every fixed point in the local ball, the Hess...

Find SimilarView on arXiv

Solving phase retrieval with random initial guess is nearly as good as by spectral initialization

January 10, 2021

90% Match
Jianfeng Cai, Meng Huang, ... , Wang Yang
Information Theory
Numerical Analysis
Information Theory
Numerical Analysis
Optimization and Control

The problem of recovering a signal $\mathbf{x}\in \mathbb{R}^n$ from a set of magnitude measurements $y_i=|\langle \mathbf{a}_i, \mathbf{x} \rangle |, \; i=1,\ldots,m$ is referred as phase retrieval, which has many applications in fields of physical sciences and engineering. In this paper we show that the smoothed amplitude flow model for phase retrieval has benign geometric structure under the optimal sampling complexity. In particular, we show that when the measurements $\m...

Find SimilarView on arXiv

High-Dimensional Non-Convex Landscapes and Gradient Descent Dynamics

August 7, 2023

89% Match
Tony Bonnaire, Davide Ghio, Kamesh Krishnamurthy, Francesca Mignacco, ... , Biroli Giulio
Disordered Systems and Neura...
Statistical Mechanics

In these lecture notes we present different methods and concepts developed in statistical physics to analyze gradient descent dynamics in high-dimensional non-convex landscapes. Our aim is to show how approaches developed in physics, mainly statistical physics of disordered systems, can be used to tackle open questions on high-dimensional dynamics in Machine Learning.

Find SimilarView on arXiv

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

February 1, 2022

89% Match
Rodrigo Veiga, Ludovic Stephan, Bruno Loureiro, ... , Zdeborová Lenka
Machine Learning
Disordered Systems and Neura...
Machine Learning

Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent. The picture can be radically different for narrow networks, which tend to get stuck in badly-generalizing local minima. Here we investigate the cross-over between these two regimes in the high-dimensional setting, and in particular investigate the connection between the so-called mean-field/hydrodynamic regime and the seminal approac...

Find SimilarView on arXiv

Phase Retrieval via Wirtinger Flow: Theory and Algorithms

July 3, 2014

89% Match
Emmanuel Candes, Xiaodong Li, Mahdi Soltanolkotabi
cs.IT
math.FA
math.IT
math.NA
math.OC
math.ST
stat.TH

We study the problem of recovering the phase from magnitude measurements; specifically, we wish to reconstruct a complex-valued signal x of C^n about which we have phaseless samples of the form y_r = |< a_r,x >|^2, r = 1,2,...,m (knowledge of the phase of these samples would yield a linear system). This paper develops a non-convex formulation of the phase retrieval problem as well as a concrete solution algorithm. In a nutshell, this algorithm starts with a careful initializa...

Find SimilarView on arXiv

Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval

March 21, 2018

89% Match
Yuxin Chen, Yuejie Chi, ... , Ma Cong
stat.ML
cs.IT
cs.LG
cs.NA
math.IT
math.OC

This paper considers the problem of solving systems of quadratic equations, namely, recovering an object of interest $\mathbf{x}^{\natural}\in\mathbb{R}^{n}$ from $m$ quadratic equations/samples $y_{i}=(\mathbf{a}_{i}^{\top}\mathbf{x}^{\natural})^{2}$, $1\leq i\leq m$. This problem, also dubbed as phase retrieval, spans multiple domains including physical sciences and machine learning. We investigate the efficiency of gradient descent (or Wirtinger flow) designed for the no...

Find SimilarView on arXiv

A Geometric Analysis of Phase Retrieval

February 22, 2016

88% Match
Ju Sun, Qing Qu, John Wright
Information Theory
Information Theory
Optimization and Control
Machine Learning

Can we recover a complex signal from its Fourier magnitudes? More generally, given a set of $m$ measurements, $y_k = |\mathbf a_k^* \mathbf x|$ for $k = 1, \dots, m$, is it possible to recover $\mathbf x \in \mathbb{C}^n$ (i.e., length-$n$ complex vector)? This **generalized phase retrieval** (GPR) problem is a fundamental task in various disciplines, and has been the subject of much recent investigation. Natural nonconvex heuristics often work remarkably well for GPR in prac...

Find SimilarView on arXiv

The global landscape of phase retrieval II: quotient intensity models

December 15, 2021

88% Match
Jian-Feng Cai, Meng Huang, ... , Wang Yang
Numerical Analysis
Information Theory
Numerical Analysis
Information Theory

A fundamental problem in phase retrieval is to reconstruct an unknown signal from a set of magnitude-only measurements. In this work we introduce three novel quotient intensity-based models (QIMs) based a deep modification of the traditional intensity-based models. A remarkable feature of the new loss functions is that the corresponding geometric landscape is benign under the optimal sampling complexity. When the measurements $ a_i\in \Rn$ are Gaussian random vectors and the ...

Find SimilarView on arXiv