Why does Deep Learning work?

Dynamic neurons: A statistical physics approach for analyzing deep neural networks

October 1, 2024

88% Match

Donghee Lee, Hye-Sung Lee, Jaeok Yi

Statistical Mechanics

Disordered Systems and Neura...

Machine Learning

Deep neural network architectures often consist of repetitive structural elements. We introduce a new approach that reveals these patterns and can be broadly applied to the study of deep learning. Similar to how a power strip helps untangle and organize complex cable connections, this approach treats neurons as additional degrees of freedom in interactions, simplifying the structure and enhancing the intuitive understanding of interactions within deep neural networks. Further...

A Study of the Mathematics of Deep Learning

April 28, 2021

88% Match

Anirbit Mukherjee

Machine Learning

Optimization and Control

Applications

Machine Learning

"Deep Learning"/"Deep Neural Nets" is a technological marvel that is now increasingly deployed at the cutting-edge of artificial intelligence tasks. This dramatic success of deep learning in the last few years has been hinged on an enormous amount of heuristics and it has turned out to be a serious mathematical challenge to be able to rigorously explain them. In this thesis, submitted to the Department of Applied Mathematics and Statistics, Johns Hopkins University we take se...

Learning Stable Group Invariant Representations with Convolutional Networks

January 16, 2013

88% Match

Joan Bruna, Arthur Szlam, Yann LeCun

Artificial Intelligence

Numerical Analysis

Transformation groups, such as translations or rotations, effectively express part of the variability observed in many recognition problems. The group structure enables the construction of invariant signal representations with appealing mathematical properties, where convolutions, together with pooling operators, bring stability to additive and geometric perturbations of the input. Whereas physical transformation groups are ubiquitous in image and audio applications, they do ...

What do AI algorithms actually learn? - On false structures in deep learning

June 4, 2019

88% Match

Laura Thesing, Vegard Antun, Anders C. Hansen

Machine Learning

Cryptography and Security

Computer Vision and Pattern ...

Machine Learning

There are two big unsolved mathematical questions in artificial intelligence (AI): (1) Why is deep learning so successful in classification problems and (2) why are neural nets based on deep learning at the same time universally unstable, where the instabilities make the networks vulnerable to adversarial attacks. We present a solution to these questions that can be summed up in two words: false structures. Indeed, deep learning does not learn the original structures that hum...

Why Unsupervised Deep Networks Generalize

December 7, 2020

88% Match

Anita de Mello Koch, Ellen de Mello Koch, Robert de Mello Koch

Machine Learning

Artificial Intelligence

Machine Learning

Promising resolutions of the generalization puzzle observe that the actual number of parameters in a deep network is much smaller than naive estimates suggest. The renormalization group is a compelling example of a problem which has very few parameters, despite the fact that naive estimates suggest otherwise. Our central hypothesis is that the mechanisms behind the renormalization group are also at work in deep learning, and that this leads to a resolution of the generalizati...

Algebraically-Informed Deep Networks (AIDN): A Deep Learning Approach to Represent Algebraic Structures

December 2, 2020

88% Match

Mustafa Hajij, Ghada Zamzmi, ..., Greg Muller

Machine Learning

Algebraic Topology

Group Theory

Geometric Topology

Representation Theory

One of the central problems in the interface of deep learning and mathematics is that of building learning systems that can automatically uncover underlying mathematical laws from observed data. In this work, we make one step towards building a bridge between algebraic structures and deep learning, and introduce AIDN, Algebraically-Informed Deep Networks. AIDN is a deep learning algorithm to represent any finitely-presented algebraic object with a s...

Why does deep and cheap learning work so well?

August 29, 2016

88% Match

Henry W. Lin (Harvard), Max Tegmark (MIT), David Rolnick (MIT)

Disordered Systems and Neura...

Machine Learning

Neural and Evolutionary Comp...

Machine Learning

We show how the success of deep learning could depend not only on mathematics but also on physics: although well-known mathematical theorems guarantee that neural networks can approximate arbitrary functions well, the class of functions of practical interest can frequently be approximated through "cheap learning" with exponentially fewer parameters than generic ones. We explore how properties frequently encountered in physics such as symmetry, locality, compositionality, and ...

Deep Feature Space: A Geometrical Perspective

June 30, 2020

88% Match

Ioannis Kansizoglou, Loukas Bampis, Antonios Gasteratos

Computer Vision and Pattern ...

Computational Geometry

Machine Learning

One of the most prominent attributes of Neural Networks (NNs) is their capability to learn to extract robust and descriptive features from high-dimensional data, like images. Hence, this ability makes their use as feature extractors particularly frequent in an abundance of modern reasoning systems. Their application scope mainly includes complex cascade tasks, like multi-modal recognition and deep Reinforcement Learning (RL). However, NNs induce impli...

On the Generalization Mystery in Deep Learning

March 18, 2022

88% Match

Satrajit Chatterjee, Piotr Zielinski

Machine Learning

The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of comparable size? Furthermore, from among all solutions that fit the training data, how does GD find one that generalizes well (when such a well-generalizing solution exists)? We argue that the answer to both questions lies in the interaction of the ...

A Probabilistic Representation of Deep Learning

August 26, 2019

88% Match

Xinjie Lan, Kenneth E. Barner

Machine Learning

In this work, we introduce a novel probabilistic representation of deep learning, which provides an explicit explanation for the Deep Neural Networks (DNNs) in three aspects: (i) neurons define the energy of a Gibbs distribution; (ii) the hidden layers of DNNs formulate Gibbs distributions; and (iii) the whole architecture of DNNs can be interpreted as a Bayesian neural network. Based on the proposed probabilistic representation, we investigate two fundamental properties of d...

Why does Deep Learning work? - A perspective from Group Theory
