Similar papers
July 2, 2021
The success of deep learning has revealed the application potential of neural networks across the sciences and opened up fundamental theoretical problems. In particular, the fact that learning algorithms based on simple variants of gradient methods are able to find near-optimal minima of highly nonconvex loss functions is an unexpected feature of neural networks. Moreover, such algorithms are able to fit the data even in the presence of noise, and yet they have excellent pred...
April 8, 2020
The loss surfaces of deep neural networks have been the subject of several studies, theoretical and experimental, over the last few years. One strand of work considers the complexity, in the sense of local optima, of high-dimensional random functions, with the aim of informing how local optimisation methods may perform in such complicated settings. Prior work of Choromanska et al. (2015) established a direct link between the training loss surfaces of deep multi-layer perceptron...
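For context, the "complexity" in this strand of work usually means the expected number of critical points (or local minima) of the random loss, and the standard tool for computing it is a Kac-Rice formula. A schematic statement, under the usual smoothness and non-degeneracy assumptions (which the snippet above does not spell out), is:

```latex
\mathbb{E}\,[\mathrm{Crt}_L(B)]
  = \int_{B} \mathbb{E}\!\left[\,\bigl|\det \nabla^{2} L(w)\bigr|
      \;\middle|\; \nabla L(w) = 0 \right]
    \varphi_{\nabla L(w)}(0)\, \mathrm{d}w ,
```

where B is the region of weight space under consideration and \varphi_{\nabla L(w)} is the density of the gradient evaluated at zero; restricting the index of the Hessian inside the expectation counts critical points of a given type (minima, or saddles of fixed index).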
December 7, 2020
The success of deep learning in many real-world tasks has triggered an intense effort to understand the power and limitations of deep learning in the training and generalization of complex tasks, so far with limited progress. In this work, we study the statistical mechanics of learning in Deep Linear Neural Networks (DLNNs) in which the input-output function of an individual unit is linear. Despite the linearity of the units, learning in DLNNs is nonlinear, hence studying its...
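To see concretely why learning in a deep linear network is nonlinear even though each unit is linear, here is a minimal gradient-descent sketch (plain NumPy; the two-layer product, layer sizes, and learning rate are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Teacher: a fixed linear map the student has to recover.
d_in, d_hidden, d_out, n = 10, 10, 5, 200
W_star = rng.normal(size=(d_out, d_in))
X = rng.normal(size=(n, d_in))
Y = X @ W_star.T

# Student: product of two weight matrices.  The map x -> W2 W1 x is linear in x,
# but the loss L(W1, W2) = ||X W1^T W2^T - Y||^2 is a non-convex quartic in the weights.
W1 = 0.3 * rng.normal(size=(d_hidden, d_in))
W2 = 0.3 * rng.normal(size=(d_out, d_hidden))

lr = 0.01
for step in range(5000):
    err = X @ W1.T @ W2.T - Y                 # (n, d_out) residuals
    # The gradients couple the layers, so the weight dynamics are nonlinear.
    grad_W2 = err.T @ (X @ W1.T) / n          # dL/dW2
    grad_W1 = W2.T @ err.T @ X / n            # dL/dW1
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2

print("final mean squared error:", float(np.mean((X @ W1.T @ W2.T - Y) ** 2)))
```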
September 27, 2024
Learning in neural networks critically hinges on the intricate geometry of the loss landscape associated with a given task. Traditionally, most research has focused on finding specific weight configurations that minimize the loss. In this work, born from the cross-fertilization of machine learning and theoretical soft matter physics, we introduce a novel, computationally efficient approach to examine the weight space across all loss values. Employing the Wang-Landau enhanced ...
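As a rough illustration of what Wang-Landau enhanced sampling does when pointed at a weight space, here is a minimal sketch on a toy low-dimensional "loss landscape" (the loss function, binning, and flat-histogram criterion below are assumptions for the example, not the authors' setup):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "loss landscape" over a low-dimensional weight vector.
def loss(w):
    return 0.5 * np.sum(w ** 2) + 0.3 * np.sum(np.sin(3.0 * w))

dim = 4
n_bins = 40
lo, hi = -1.0, 6.0                 # range of loss values we histogram over
log_g = np.zeros(n_bins)           # running estimate of the log density of states
hist = np.zeros(n_bins)
f = 1.0                            # modification factor, halved when the histogram is flat

def bin_of(E):
    b = int((E - lo) / (hi - lo) * n_bins)
    return min(max(b, 0), n_bins - 1)

w = np.zeros(dim)
b = bin_of(loss(w))

for sweep in range(200000):
    w_new = w + 0.2 * rng.normal(size=dim)     # random-walk proposal in weight space
    b_new = bin_of(loss(w_new))
    # Wang-Landau acceptance: favour loss bins with a lower current g estimate.
    if np.log(rng.random()) < log_g[b] - log_g[b_new]:
        w, b = w_new, b_new
    log_g[b] += f
    hist[b] += 1
    # Flat-histogram check: halve f and reset the histogram.
    if sweep % 10000 == 0 and sweep > 0:
        visited = hist[hist > 0]
        if visited.size and visited.min() > 0.8 * visited.mean():
            f *= 0.5
            hist[:] = 0

# log_g (up to an additive constant) estimates how much weight-space volume sits
# at each loss level, i.e. the landscape across *all* loss values, not just minima.
print("log g(E) per bin:", np.round(log_g - log_g.max(), 1))
```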
July 15, 2020
How a neural network behaves during training under different choices of hyperparameters is an important question in the study of neural networks. In this work, inspired by the phase diagram in statistical mechanics, we draw the phase diagram for the two-layer ReLU neural network in the infinite-width limit, giving a complete characterization of its dynamical regimes and their dependence on hyperparameters related to initialization. Through both experimental and theoretical appro...
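A toy diagnostic of the kind of initialization-dependent regimes such a phase diagram organizes: train a wide two-layer ReLU network under different output scalings m^(-gamma) and record how far the input weights move. The scaling parameter gamma, the learning-rate rescaling that keeps the two settings on a comparable time scale, and the "relative weight change" diagnostic are assumptions made for illustration, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(2)

def relative_weight_change(m, gamma, steps=3000, base_lr=0.05):
    """Two-layer ReLU net f(x) = m**(-gamma) * sum_k a_k relu(w_k . x), trained by
    full-batch gradient descent on a tiny 1-d regression task.  Returns
    ||W_final - W_init|| / ||W_init|| for the input weights."""
    n = 20
    X = rng.uniform(-1.0, 1.0, size=(n, 1))
    Y = np.sin(np.pi * X[:, 0])
    W = rng.normal(size=(m, 1))       # input weights
    a = rng.normal(size=(m,))         # output weights
    W0 = W.copy()
    scale = m ** (-gamma)
    # Rescale the learning rate with the parameterization so the function-space
    # step size is comparable across gammas (an assumption of this sketch).
    lr = base_lr * m ** (2 * gamma - 1)
    for _ in range(steps):
        pre = X @ W.T                              # (n, m) pre-activations
        h = np.maximum(pre, 0.0)
        err = scale * h @ a - Y                    # residuals, shape (n,)
        grad_a = scale * h.T @ err / n
        grad_W = scale * ((err[:, None] * (pre > 0.0)) * a).T @ X / n
        a -= lr * grad_a
        W -= lr * grad_W
    # Small values suggest kernel-like ("lazy") dynamics; order-one values suggest
    # feature learning / condensation of the input weights.
    return np.linalg.norm(W - W0) / np.linalg.norm(W0)

for gamma in (0.5, 1.0):
    print(f"gamma = {gamma}: relative weight change = "
          f"{relative_weight_change(m=1000, gamma=gamma):.3f}")
```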
January 30, 2013
Recent experimental advances in neuroscience have opened new vistas into the immense complexity of neuronal networks. This proliferation of data challenges us on two parallel fronts. First, how can we form adequate theoretical frameworks for understanding how dynamical network processes cooperate across widely disparate spatiotemporal scales to solve important computational problems? And second, how can we extract meaningful models of neuronal systems from high dimensional da...
September 23, 2017
Our understanding of supercooled liquids and glasses has lagged significantly behind that of simple liquids and crystalline solids. This is due in part to the many possibly relevant degrees of freedom arising from the disorder inherent to these systems, and in part to non-equilibrium effects that are difficult to treat in the standard context of statistical physics. Together these issues have resulted in a field whose theories are under-constrained by experiment an...
November 10, 2023
Empirical data, on which deep learning relies, has substantial internal structure, yet prevailing theories often disregard this aspect. Recent research has led to the definition of structured data ensembles, aimed at equipping established theoretical frameworks with interpretable structural elements, a pursuit that aligns with the broader objectives of spin glass theory. We consider a one-parameter structured ensemble where data consists of correlated pairs of patterns, and a...
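A concrete way to read "a one-parameter structured ensemble of correlated pairs of patterns" is sketched below; the +/-1 patterns and the overlap parameter rho are illustrative assumptions, and the paper's precise construction may differ.

```python
import numpy as np

rng = np.random.default_rng(3)

def correlated_pattern_pairs(n_pairs, dim, rho):
    """Generate pairs of +/-1 patterns (xi_a, xi_b) whose expected overlap
    (1/dim) xi_a . xi_b equals rho; different pairs are independent."""
    pairs = []
    for _ in range(n_pairs):
        xi_a = rng.choice([-1, 1], size=dim)
        # Flip each spin of xi_a independently with probability (1 - rho) / 2,
        # which gives E[xi_a_i * xi_b_i] = rho.
        flips = rng.random(dim) < (1 - rho) / 2
        xi_b = np.where(flips, -xi_a, xi_a)
        pairs.append((xi_a, xi_b))
    return pairs

pairs = correlated_pattern_pairs(n_pairs=5, dim=10000, rho=0.6)
overlaps = [xa @ xb / xa.size for xa, xb in pairs]
print("empirical overlaps:", np.round(overlaps, 3))  # should cluster near rho
```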
November 10, 2021
Around the glass transition, the dynamics of a supercooled liquid slow down dramatically, as exhibited by the caging of particles, while the structural changes remain subtle. As an alternative to recent machine learning studies that search for structural predictors of glassy dynamics, here we propose to learn particle caging features defined purely in terms of the dynamics. We focus on three transitions in a simulated hard sphere glass model, the melting of ultra-stable glasses, the ...
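One simple example of a "caging feature defined purely from dynamics" would be a per-particle displacement over a lag time compared against a cage size (this particular observable and threshold are an illustrative assumption, not necessarily the descriptors used in the paper):

```python
import numpy as np

def cage_escape_fraction(traj, lag, cage_radius, box):
    """traj: array (T, N, d) of particle positions in a periodic box of side `box`.
    Returns, for each valid start time, the fraction of particles that moved
    farther than `cage_radius` after `lag` frames (a purely dynamical feature)."""
    disp = traj[lag:] - traj[:-lag]
    disp -= box * np.round(disp / box)            # minimum-image convention
    dist = np.linalg.norm(disp, axis=-1)          # (T - lag, N)
    return (dist > cage_radius).mean(axis=1)

# Toy usage on random-walk "particles" in a periodic box (placeholder data).
rng = np.random.default_rng(4)
T, N, d, box = 200, 100, 3, 10.0
traj = np.cumsum(0.05 * rng.normal(size=(T, N, d)), axis=0) % box
print(cage_escape_fraction(traj, lag=50, cage_radius=0.5, box=box)[:5])
```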
November 30, 2014
We study the connection between the highly non-convex loss function of a simple model of a fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of results from random matrix theory. We show that for larg...
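For reference, the spherical spin-glass Hamiltonian this line of work maps the loss onto has the form below, written schematically for interaction order H with i.i.d. standard Gaussian couplings X and weights constrained to the sphere (the precise correspondence relies on the assumptions i)-iii) quoted above):

```latex
H_{N,H}(w) \;=\; \frac{1}{N^{(H-1)/2}}
  \sum_{i_1,\dots,i_H=1}^{N} X_{i_1 \dots i_H}\, w_{i_1} \cdots w_{i_H},
\qquad \frac{1}{N}\sum_{i=1}^{N} w_i^{2} = 1 .
```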