Why does Deep Learning work? - A perspective from Group Theory

Deep Learning: A Critical Appraisal

January 2, 2018

Gary Marcus

Artificial Intelligence

Machine Learning

Although deep learning has historical roots going back decades, neither the term "deep learning" nor the approach was popular just over five years ago, when the field was reignited by papers such as Krizhevsky, Sutskever and Hinton's now classic (2012) deep network model of Imagenet. What has the field discovered in the five subsequent years? Against a background of considerable progress in areas such as speech recognition, image recognition, and game playing, and considerabl...

Deep Learning: An Introduction for Applied Mathematicians

January 17, 2018

Catherine F. Higham, Desmond J. Higham

History and Overview

Machine Learning

Numerical Analysis

Multilayered artificial neural networks are becoming a pervasive tool in a host of application fields. At the heart of this deep learning revolution are familiar concepts from applied and computational mathematics; notably, in calculus, approximation theory, optimization and linear algebra. This article provides a very brief introduction to the basic ideas that underlie deep learning from an applied mathematics perspective. Our target audience includes postgraduate and final ...

Hyper-Representations: Learning from Populations of Neural Networks

October 7, 2024

Konstantin Schürholt

Machine Learning

This thesis addresses the challenge of understanding Neural Networks through the lens of their most fundamental component: the weights, which encapsulate the learned information and determine the model behavior. At the core of this thesis is a fundamental question: Can we learn general, task-agnostic representations from populations of Neural Network models? The key contribution of this thesis toward answering that question is hyper-representations, a self-supervised method to lear...

An Overview on Data Representation Learning: From Traditional Feature Learning to Recent Deep Learning

November 25, 2016

Guoqiang Zhong, Li-Na Wang, Junyu Dong

Machine Learning

Since about 100 years ago, to learn the intrinsic structure of data, many representation learning approaches have been proposed, including both linear ones and nonlinear ones, supervised ones and unsupervised ones. Particularly, deep architectures are widely applied for representation learning in recent years, and have delivered top results in many tasks, such as image classification, object detection and speech recognition. In this paper, we review the development of data re...

Unsupervised Learning of Group Invariant and Equivariant Representations

February 15, 2022

Robin Winter, Marco Bertolini, Tuan Le, ... , Djork-Arné Clevert

Machine Learning

Equivariant neural networks, whose hidden features transform according to representations of a group G acting on the data, exhibit training efficiency and an improved generalisation performance. In this work, we extend group invariant and equivariant representation learning to the field of unsupervised deep learning. We propose a general learning strategy based on an encoder-decoder framework in which the latent representation is separated in an invariant term and an equivari...

Why Deep Learning Generalizes

November 17, 2022

Benjamin L. Badger

Machine Learning

Very large deep learning models trained using gradient descent are remarkably resistant to memorization given their huge capacity, but are at the same time capable of fitting large datasets of pure noise. Here methods are introduced by which models may be trained to memorize datasets that normally are generalized. We find that memorization is difficult relative to generalization, but that adding noise makes memorization easier. Increasing the dataset size exaggerates the char...

LieGG: Studying Learned Lie Group Generators

October 9, 2022

Artem Moskalev, Anna Sepliarskaia, ... , Arnold Smeulders

Machine Learning

Symmetries built into a neural network have proven very beneficial for a wide range of tasks, as they save the data otherwise needed to learn them. We depart from the position that when symmetries are not built into a model a priori, it is advantageous for robust networks to learn symmetries directly from the data to fit a task function. In this paper, we present a method to extract symmetries learned by a neural network and to evaluate the degree to which a network is invariant to them...

On Generalization and Regularization in Deep Learning

April 5, 2017

Pirmin Lemberger

Machine Learning

Statistics Theory

Why do large neural networks generalize so well on complex tasks such as image classification or speech recognition? What exactly is the role of regularization for them? These are arguably among the most important open questions in machine learning today. In a recent and thought-provoking paper [C. Zhang et al.] several authors performed a number of numerical experiments that hint at the need for novel theoretical concepts to account for this phenomenon. The paper stirred quite a ...

Mathematics of Deep Learning

December 13, 2017

Rene Vidal, Joan Bruna, ... , Stefano Soatto

Machine Learning

Computer Vision and Pattern ...

Recently there has been a dramatic increase in the performance of recognition systems due to the introduction of deep architectures for representation learning and classification. However, the mathematical reasons for this success remain elusive. This tutorial will review recent work that aims to provide a mathematical justification for several properties of deep networks, such as global optimality, geometric stability, and invariance of the learned representations.

Mechanisms of dimensionality reduction and decorrelation in deep neural networks

October 4, 2017

Haiping Huang

Machine Learning

Statistical Mechanics

Deep neural networks are widely used in various domains. However, the nature of computations at each layer of the deep networks is far from being well understood. Increasing the interpretability of deep neural networks is thus important. Here, we construct a mean-field framework to understand how compact representations are developed across layers, not only in deterministic deep networks with random weights but also in generative deep networks where an unsupervised learning i...
