A Group Theoretic Perspective on Unsupervised Deep Learning

April 8, 2015

Arnab Paul, Suresh Venkatasubramanian

Computer Science

Statistics

Machine Learning

Neural and Evolutionary Comp...

Machine Learning

Why does deep learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of deep learning. One factor behind the recent resurgence of the subject is a key algorithmic step called pretraining: first search for a good generative model for the input samples, and repeat the process one layer at a time. We show deeper implications of this simple principle by establishing a connection with the interplay of orbits and stabilizers of group actions. Although the neural networks themselves may not form groups, we show the existence of "shadow" groups whose elements serve as close approximations. Over the shadow groups, the pretraining step, originally introduced as a mechanism to better initialize a network, becomes equivalent to a search for features with minimal orbits. Intuitively, these features are in a way the simplest, which explains why a deep learning network learns simple features first. Next, we show how the same principle, when repeated in the deeper layers, can capture higher-order representations, and why representation complexity increases as the layers get deeper.
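To make the orbit and stabilizer vocabulary concrete, here is a minimal toy sketch. It is not the paper's construction: the choice of group (cyclic shifts of a length-4 tuple) and of "features" (binary 4-tuples) is an assumption made purely for illustration, and the names `cyclic_shift`, `orbit`, and `stabilizer` are hypothetical helpers. It checks the standard orbit-stabilizer relation for finite groups, |orbit| × |stabilizer| = |group|, and ranks the features by orbit size, so that the most symmetric patterns (largest stabilizers) come first.

```python
from itertools import product

def cyclic_shift(x, k):
    """Act on the tuple x by rotating it k positions to the left."""
    k %= len(x)
    return x[k:] + x[:k]

def orbit(x):
    """All distinct images of x under the cyclic group of shifts."""
    return {cyclic_shift(x, k) for k in range(len(x))}

def stabilizer(x):
    """All shifts k (group elements) that leave x unchanged."""
    return [k for k in range(len(x)) if cyclic_shift(x, k) == x]

# Toy assumption: the group is Z_4 acting on binary 4-tuples.
n = 4
features = list(product((0, 1), repeat=n))

# Orbit-stabilizer relation: |orbit| * |stabilizer| = |group| = n.
for x in features:
    assert len(orbit(x)) * len(stabilizer(x)) == n

# Rank features by orbit size; small orbits mean large stabilizers.
by_orbit_size = sorted(features, key=lambda x: len(orbit(x)))
```

In this toy setting the constant tuples have orbits of size 1, the alternating tuples have orbits of size 2, and every other tuple has an orbit of size 4. This is only an analogy for the abstract's statement that features with minimal orbits are, in a sense, the simplest.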

Why does Deep Learning work? - A perspective from Group Theory

December 20, 2014

99% Match

Arnab Paul, Suresh Venkatasubramanian

Machine Learning

Neural and Evolutionary Comp...

Machine Learning

Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

October 24, 2023

90% Match

Leonardo Petrini

Machine Learning

Artificial intelligence, particularly the subfield of machine learning, has seen a paradigm shift towards data-driven models that learn from and adapt to data. This has resulted in unprecedented advancements in various domains such as natural language processing and computer vision, largely attributed to deep learning, a special class of machine learning models. Deep learning arguably surpasses traditional approaches by learning the relevant features from raw data through a s...

The Unreasonable Effectiveness of Deep Learning in Artificial Intelligence

February 12, 2020

89% Match

Terrence J. Sejnowski

Neurons and Cognition

Artificial Intelligence

Machine Learning

Neural and Evolutionary Comp...

Deep learning networks have been trained to recognize speech, caption photographs and translate text between languages at high levels of performance. Although applications of deep learning networks to real world problems have become ubiquitous, our understanding of why they are so effective is lacking. These empirical results should not be possible according to sample complexity in statistics and non-convex optimization theory. However, paradoxes in the training and effective...

A Selective Overview of Deep Learning

April 10, 2019

89% Match

Jianqing Fan, Cong Ma, Yiqiao Zhong

Machine Learning

Statistics Theory

Methodology

Statistics Theory

Deep learning has arguably achieved tremendous success in recent years. In simple words, deep learning uses the composition of many nonlinear functions to model the complex dependency between input features and labels. While neural networks have a long history, recent advances have greatly improved their performance in computer vision, natural language processing, etc. From the statistical and scientific perspective, it is natural to ask: What is deep learning? What are the n...

Why Unsupervised Deep Networks Generalize

December 7, 2020

88% Match

Anita de Mello Koch, Ellen de Mello Koch, Robert de Mello Koch

Machine Learning

Artificial Intelligence

Machine Learning

Promising resolutions of the generalization puzzle observe that the actual number of parameters in a deep network is much smaller than naive estimates suggest. The renormalization group is a compelling example of a problem which has very few parameters, despite the fact that naive estimates suggest otherwise. Our central hypothesis is that the mechanisms behind the renormalization group are also at work in deep learning, and that this leads to a resolution of the generalizati...

Deep Learning of Representations: Looking Forward

May 2, 2013

88% Match

Yoshua Bengio

Machine Learning

Deep learning research aims at discovering learning algorithms that discover multiple levels of distributed representations, with higher levels representing more abstract concepts. Although the study of deep learning has already led to impressive theoretical results, learning algorithms and breakthrough experiments, several challenges lie ahead. This paper proposes to examine some of these challenges, centering on the questions of scaling deep learning algorithms to much larg...

The many faces of deep learning

August 25, 2019

88% Match

Raul Vicente

Machine Learning

Data Analysis, Statistics an...

Neurons and Cognition

Machine Learning

Deep learning has sparked a network of mutual interactions between different disciplines and AI. Naturally, each discipline focuses and interprets the workings of deep learning in different ways. This diversity of perspectives on deep learning, from neuroscience to statistical physics, is a rich source of inspiration that fuels novel developments in the theory and applications of machine learning. In this perspective, we collect and synthesize different intuitions scattered a...

Unsupervised Learning of Group Invariant and Equivariant Representations

February 15, 2022

88% Match

Robin Winter, Marco Bertolini, Tuan Le, ..., Djork-Arné Clevert

Machine Learning

Equivariant neural networks, whose hidden features transform according to representations of a group G acting on the data, exhibit training efficiency and an improved generalisation performance. In this work, we extend group invariant and equivariant representation learning to the field of unsupervised deep learning. We propose a general learning strategy based on an encoder-decoder framework in which the latent representation is separated in an invariant term and an equivari...

Structure preserving deep learning

June 5, 2020

88% Match

Elena Celledoni, Matthias J. Ehrhardt, Christian Etmann, Robert I McLachlan, Brynjulf Owren, ..., Ferdia Sherry

Machine Learning

Numerical Analysis

Machine Learning

Over the past few years, deep learning has risen to the foreground as a topic of massive interest, mainly as a result of successes obtained in solving large-scale image processing tasks. There are multiple challenging mathematical problems involved in applying deep learning: most deep learning methods require the solution of hard optimisation problems, and a good understanding of the tradeoff between computational effort, amount of data and model complexity is required to suc...

The Principles of Deep Learning Theory

June 18, 2021

88% Match

Daniel A. Roberts, Sho Yaida, Boris Hanin

Machine Learning

Artificial Intelligence

Machine Learning

This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio...
