Similar papers
January 31, 2025
Deep Neural Networks (DNNs) excel at many tasks, often rivaling or surpassing human performance. Yet their internal processes remain elusive, frequently described as "black boxes." While performance can be refined experimentally, achieving a fundamental grasp of their inner workings is still a challenge. Statistical Mechanics has long tackled computational problems, and this thesis applies physics-based insights to understand DNNs via three complementary approaches. First...
January 21, 2020
In this paper we study the properties of the quenched pressure of a multi-layer spin-glass model (a deep Boltzmann Machine in artificial-intelligence jargon) whose pairwise interactions are allowed only between spins lying in adjacent layers, not within the same layer nor between layers at distance larger than one. We prove a theorem that bounds the quenched pressure of such a K-layer machine in terms of K Sherrington-Kirkpatrick spin glasses and use it to investigate its anneale...
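For concreteness, here is a minimal sketch of the setting in my own notation (the precise normalization is an assumption, not taken from the paper): a K-layer machine with $N_k$ Ising spins $\sigma^{(k)}_i = \pm 1$ in layer $k$ and Gaussian couplings only across adjacent layers,

    H_K(\sigma) = -\frac{1}{\sqrt{N}} \sum_{k=1}^{K-1} \sum_{i,j} J^{(k)}_{ij}\, \sigma^{(k)}_i \sigma^{(k+1)}_j, \qquad J^{(k)}_{ij} \sim \mathcal{N}(0,1),

whose quenched pressure is

    p_N(\beta) = \frac{1}{N}\, \mathbb{E}_J \ln \sum_{\sigma} e^{-\beta H_K(\sigma)}, \qquad N = \sum_k N_k.

The bound mentioned above controls $p_N$ for this machine using the pressures of K single-layer Sherrington-Kirkpatrick models.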
May 13, 2018
As deep neural networks grow in size, from thousands to millions to billions of weights, the performance of those networks becomes limited by our ability to accurately train them. A common naive question arises: if we have a system with billions of degrees of freedom, don't we also need billions of samples to train it? Of course, the success of deep learning indicates that reliable models can be learned with reasonable amounts of data. Similar questions arise in protein foldi...
September 13, 2023
This paper first describes, from a high-level viewpoint, the main challenges that had to be solved in order to develop a theory of spin glasses over the last fifty years. It then explains how important inference problems, notably those occurring in machine learning, can be formulated as problems in the statistical physics of disordered systems. However, the main questions that we face in the analysis of deep networks require the development of a new chapter of spin-glass theory, which will...
June 12, 2019
In the past decade, deep neural networks (DNNs) came to the fore as the leading machine learning algorithms for a variety of tasks. Their rise was founded on market needs and engineering craftsmanship, the latter based more on trial and error than on theory. While still far behind the application forefront, the theoretical study of DNNs has recently made important advances in analyzing the highly over-parameterized regime, where some exact results have been obtained. Leve...
November 25, 2022
We consider dense associative neural networks trained with no supervision and investigate their computational capabilities analytically, via a statistical-mechanics approach, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing their performance as a function of control parameters such as the quality and quantity of the training dataset and the network storage, valid in the limit of large network size and structureless dat...
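To illustrate the kind of numerics involved, here is a minimal Monte Carlo sketch in Python (a toy Hopfield-style setup of my own, not the paper's dense model): Hebbian couplings are built, without supervision, from noisy examples of a hidden archetype, and retrieval is measured by the overlap with that archetype.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 200   # network size
    M = 50    # number of training examples (dataset quantity)
    r = 0.8   # dataset quality: P(example bit = archetype bit) = (1 + r) / 2

    # One hidden archetype and M noisy examples of it (the unsupervised dataset).
    xi = rng.choice([-1, 1], size=N)
    examples = xi * np.where(rng.random((M, N)) < (1 + r) / 2, 1, -1)

    # Hebbian couplings built from the examples alone.
    J = examples.T @ examples / N
    np.fill_diagonal(J, 0.0)

    # Metropolis dynamics at inverse temperature beta, from a random start.
    beta = 2.0
    sigma = rng.choice([-1, 1], size=N)
    for sweep in range(200):
        for i in rng.permutation(N):
            dE = 2.0 * sigma[i] * (J[i] @ sigma)  # energy cost of flipping spin i
            if dE <= 0 or rng.random() < np.exp(-beta * dE):
                sigma[i] = -sigma[i]

    # Retrieval order parameter: overlap with the hidden archetype.
    print("overlap m =", sigma @ xi / N)

Sweeping beta, M, and r while recording the overlap is how a phase diagram of the kind described above is assembled numerically.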
May 25, 2022
This work reports first-order and second-order phase transitions unique to deep learning, whose phenomenology closely follows that of statistical physics. In particular, we prove that the competition between prediction error and model complexity in the training loss leads to a second-order phase transition for nets with one hidden layer and a first-order phase transition for nets with more than one hidden layer. The proposed theory is directly relevant to the optimization of...
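To fix the terminology (a schematic reading using the standard statistical-physics definitions, not the paper's exact construction), write the training loss as prediction error plus a weighted model-complexity term,

    \mathcal{L}(\theta) = E(\theta) + \lambda\, C(\theta),

and track the complexity $C^*$ of the optimal solution as $\lambda$ varies. In a second-order transition $C^*$ vanishes continuously at a critical $\lambda_c$; in a first-order transition $C^*$ jumps discontinuously as the global minimum switches between a trivial and a non-trivial branch of $\mathcal{L}$. The claim above is that one hidden layer realizes the continuous scenario and deeper nets the discontinuous one.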
October 1, 2024
Deep neural network architectures often consist of repetitive structural elements. We introduce a new approach that reveals these patterns and can be broadly applied to the study of deep learning. Similar to how a power strip helps untangle and organize complex cable connections, this approach treats neurons as additional degrees of freedom in interactions, simplifying the structure and enhancing the intuitive understanding of interactions within deep neural networks. Further...
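One standard construction matching this description (my generic guess at the mechanism, not necessarily the authors' formulation) is to promote the hidden activations to explicit degrees of freedom and enforce the layer maps by constraints,

    Z = \int \prod_{k} dh^{(k)}\, \prod_{k} \delta\!\left( h^{(k+1)} - \phi\big(W^{(k)} h^{(k)}\big) \right) e^{-\beta\, \mathcal{L}(h^{(K)})},

so that a nested composition of layers becomes a chain of local interactions between adjacent layers only; representing each delta function in Fourier form then introduces conjugate fields and leaves purely pairwise, layer-to-layer couplings.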
May 10, 2024
The purpose of this manuscript is to review my recent activity on three main research topics. The first concerns the nature of low-temperature amorphous solids and their relation to the spin glass transition in a magnetic field. This is the subject of the first chapter, where I discuss a new model, the KHGPS model, which allows one to make some progress. In the second chapter I review a second research line concerning the study of the rigidity/jamming transitions in particle ...
January 19, 2015
In this paper, we present a statistical-mechanical analysis of deep learning. We elucidate some of the essential components of deep learning: pre-training by unsupervised learning and fine-tuning by supervised learning. We formulate the extraction of features from the training data as a margin criterion in a high-dimensional feature-vector space. The self-organized classifier is then supplied with small amounts of labelled data, as in deep learning. Although we employ a simp...
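A minimal numerical caricature of that two-stage pipeline (with my own stand-ins: PCA for the unsupervised pre-training and a margin perceptron for the supervised fine-tuning; the paper's actual components differ):

    import numpy as np

    rng = np.random.default_rng(1)
    D, N, n_lab = 100, 1000, 20  # input dim, unlabelled samples, labelled samples

    # Unlabelled data: two Gaussian clusters along a hidden direction u.
    u = rng.normal(size=D)
    u /= np.linalg.norm(u)
    y = rng.choice([-1, 1], size=N)
    X = 2.0 * y[:, None] * u + rng.normal(size=(N, D))

    # "Pre-training": unsupervised feature extraction via PCA.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    feats = Xc @ Vt[:2].T  # project onto the top two principal directions

    # "Fine-tuning": perceptron with a margin criterion on a few labels.
    idx = rng.choice(N, size=n_lab, replace=False)
    w = np.zeros(2)
    kappa = 0.5  # required margin in feature space
    for _ in range(100):
        for i in idx:
            if y[i] * (w @ feats[i]) <= kappa:
                w += y[i] * feats[i]

    print("accuracy on all data:", np.mean(np.sign(feats @ w) == y))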