Why does Deep Learning work?

The Modern Mathematics of Deep Learning

May 9, 2021

89% Match

Julius Berner, Philipp Grohs, ... , Philipp Petersen

Machine Learning

We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the pro...

A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations

February 6, 2023

89% Match

Bilal Chughtai, Lawrence Chan, Neel Nanda

Machine Learning

Artificial Intelligence

Representation Theory

Universality is a key hypothesis in mechanistic interpretability -- that different models learn similar features and circuits when trained on similar tasks. In this work, we study the universality hypothesis by examining how small neural networks learn to implement group composition. We present a novel algorithm by which neural networks may implement composition for any finite group via mathematical representation theory. We then show that networks consistently learn this alg...

Harmonics of Learning: Universal Fourier Features Emerge in Invariant Networks

December 13, 2023

89% Match

Giovanni Luca Marchetti, Christopher Hillar, ... , Sophia Sanborn

Machine Learning

Artificial Intelligence

Signal Processing

In this work, we formally prove that, under certain conditions, if a neural network is invariant to a finite group then its weights recover the Fourier transform on that group. This provides a mathematical explanation for the emergence of Fourier features -- a ubiquitous phenomenon in both biological and artificial learning systems. The results hold even for non-commutative groups, in which case the Fourier transform encodes all the irreducible unitary group representations. ...

Structure preserving deep learning

June 5, 2020

89% Match

Elena Celledoni, Matthias J. Ehrhardt, Christian Etmann, Robert I McLachlan, Brynjulf Owren, ... , Ferdia Sherry

Machine Learning

Numerical Analysis

Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the tradeoff between computational effort, amount of data and model complexity is required to suc...

How deep learning works --The geometry of deep learning

October 30, 2017

89% Match

Xiao Dong, Jiasong Wu, Ling Zhou

Machine Learning

Why and how deep learning works well on different tasks remains a mystery from a theoretical perspective. In this paper we draw a geometric picture of the deep learning system by finding its analogies with two existing geometric structures, the geometry of quantum computations and the geometry of the diffeomorphic template matching. In this framework, we give the geometric structures of different deep learning systems including convolutional neural networks, residual net...

Towards Understanding Learning Representations: To What Extent Do Different Neural Networks Learn the Same Representation

October 28, 2018

88% Match

Liwei Wang, Lunjia Hu, Jiayuan Gu, Yue Wu, Zhiqiang Hu, ... , John Hopcroft

Machine Learning

It is widely believed that learning good representations is one of the main reasons for the success of deep neural networks. Although this belief is highly intuitive, there is a lack of theory and of systematic approaches that quantitatively characterize what representations deep neural networks learn. In this work, we move a tiny step towards a theory and better understanding of the representations. Specifically, we study a simpler problem: How similar are the representations learned by two net...

On the Symmetries of Deep Learning Models and their Internal Representations

May 27, 2022

88% Match

Charles Godfrey, Davis Brown, ... , Henry Kvinge

Machine Learning

Artificial Intelligence

Symmetry is a fundamental tool in the exploration of a broad range of complex systems. In machine learning symmetry has been explored in both models and data. In this paper we seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family's internal representation of data. We do this by calculating a set of fundamental symmetry groups, which we call the intertwiner groups of the model. We connect intertwiner groups to a m...

Recent advances in deep learning theory

December 20, 2020

88% Match

Fengxiang He, Dacheng Tao

Machine Learning

Deep learning is usually described as an experiment-driven field under continuous criticism for lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic dif...

Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges

April 27, 2021

88% Match

Michael M. Bronstein, Joan Bruna, ... , Petar Veličković

Machine Learning

Artificial Intelligence

Computational Geometry

Computer Vision and Pattern ...

The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to be beyond reach -- such as computer vision, playing Go, or protein folding -- are in fact feasible with appropriate computational scale. Remarkably, the essence of deep learning is built from two simple algorithmic principles: first, the notion of representation or feature learnin...

When Representations Align: Universality in Representation Learning Dynamics

February 14, 2024

88% Match

Loek van Rossem, Andrew M. Saxe

Machine Learning

Neurons and Cognition

Deep neural networks come in many sizes and architectures. The choice of architecture, in conjunction with the dataset and learning algorithm, is commonly understood to affect the learned neural representations. Yet, recent results have shown that different architectures learn representations with striking qualitative similarities. Here we derive an effective theory of representation learning under the assumption that the encoding map from input to hidden representation and t...

Why does Deep Learning work? - A perspective from Group Theory