ID: 1412.6621

Why does Deep Learning work? - A perspective from Group Theory

December 20, 2014

Arnab Paul, Suresh Venkatasubramanian
Computer Science
Statistics
Machine Learning
Neural and Evolutionary Computing

Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of deep learning. One factor behind the recent resurgence of the subject is a key algorithmic step called pre-training: first search for a good generative model for the input samples, and repeat the process one layer at a time. We show deeper implications of this simple principle by establishing a connection with the interplay of orbits and stabilizers of group actions. Although the neural networks themselves may not form groups, we show the existence of {\em shadow} groups whose elements serve as close approximations. Over the shadow groups, the pre-training step, originally introduced as a mechanism to better initialize a network, becomes equivalent to a search for features with minimal orbits. Intuitively, these features are in a way the {\em simplest}, which explains why a deep learning network learns simple features first. Next, we show how the same principle, when repeated in the deeper layers, can capture higher-order representations, and why representation complexity increases as the layers get deeper.
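As an illustrative sketch (not code from the paper), the orbit/stabilizer intuition can be made concrete with a toy example: the cyclic group of shifts acting on binary patterns. By the orbit-stabilizer theorem, small orbits correspond to large stabilizers, so the most symmetric patterns (e.g. constant ones) have minimal orbits, matching the abstract's notion of the "simplest" features. All names below are hypothetical illustration.

```python
from itertools import product

def rotate(pattern, k):
    """Act on a length-n tuple by cyclic shift k (an element of Z_n)."""
    n = len(pattern)
    return tuple(pattern[(i + k) % n] for i in range(n))

def orbit(pattern):
    """Orbit of a pattern under the full group of cyclic shifts."""
    return {rotate(pattern, k) for k in range(len(pattern))}

def stabilizer(pattern):
    """Shifts that leave the pattern fixed (a subgroup of Z_n)."""
    return [k for k in range(len(pattern)) if rotate(pattern, k) == pattern]

# Rank all binary patterns of length 4 by orbit size under Z_4.
# The constant patterns come first: orbit size 1, full stabilizer.
patterns = sorted(product([0, 1], repeat=4), key=lambda p: len(orbit(p)))
for p in patterns[:4]:
    print(p, "orbit size:", len(orbit(p)), "stabilizer:", stabilizer(p))
```

For every pattern, |orbit| x |stabilizer| = |group| = 4, so searching for minimal orbits is the same as searching for maximal symmetry.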

Similar papers

A Group Theoretic Perspective on Unsupervised Deep Learning

April 8, 2015

99% Match
Arnab Paul, Suresh Venkatasubramanian
Machine Learning
Neural and Evolutionary Computing

Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep learning. One factor behind the recent resurgence of the subject is a key algorithmic step called {\em pretraining}: first search for a good generative model for the input samples, and repeat the process one layer at a time. We show deeper implicat...


Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

October 24, 2023

90% Match
Leonardo Petrini
Machine Learning

Artificial intelligence, particularly the subfield of machine learning, has seen a paradigm shift towards data-driven models that learn from and adapt to data. This has resulted in unprecedented advancements in various domains such as natural language processing and computer vision, largely attributed to deep learning, a special class of machine learning models. Deep learning arguably surpasses traditional approaches by learning the relevant features from raw data through a s...


Why & When Deep Learning Works: Looking Inside Deep Learnings

May 10, 2017

90% Match
Ronny Ronen
Machine Learning

The Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI) has been heavily supporting Machine Learning and Deep Learning research from its foundation in 2012. We have asked six leading ICRI-CI Deep Learning researchers to address the challenge of "Why & When Deep Learning works", with the goal of looking inside Deep Learning, providing insights on how deep networks function, and uncovering key observations on their expressiveness, limitations, and po...


The Unreasonable Effectiveness of Deep Learning in Artificial Intelligence

February 12, 2020

90% Match
Terrence J. Sejnowski
Neurons and Cognition
Artificial Intelligence
Machine Learning
Neural and Evolutionary Computing

Deep learning networks have been trained to recognize speech, caption photographs and translate text between languages at high levels of performance. Although applications of deep learning networks to real world problems have become ubiquitous, our understanding of why they are so effective is lacking. These empirical results should not be possible according to sample complexity in statistics and non-convex optimization theory. However, paradoxes in the training and effective...


The many faces of deep learning

August 25, 2019

90% Match
Raul Vicente
Machine Learning
Data Analysis, Statistics and Probability
Neurons and Cognition

Deep learning has sparked a network of mutual interactions between different disciplines and AI. Naturally, each discipline focuses and interprets the workings of deep learning in different ways. This diversity of perspectives on deep learning, from neuroscience to statistical physics, is a rich source of inspiration that fuels novel developments in the theory and applications of machine learning. In this perspective, we collect and synthesize different intuitions scattered a...


What Really is Deep Learning Doing?

November 6, 2017

89% Match
Chuyu Xiong
Machine Learning
Neural and Evolutionary Computing

Deep learning has achieved a great success in many areas, from computer vision to natural language processing, to game playing, and much more. Yet, what deep learning is really doing is still an open question. There are a lot of works in this direction. For example, [5] tried to explain deep learning by group renormalization, and [6] tried to explain deep learning from the view of functional approximation. In order to address this very crucial question, here we see deep learn...


A Selective Overview of Deep Learning

April 10, 2019

89% Match
Jianqing Fan, Cong Ma, Yiqiao Zhong
Machine Learning
Statistics Theory
Methodology

Deep learning has arguably achieved tremendous success in recent years. In simple words, deep learning uses the composition of many nonlinear functions to model the complex dependency between input features and labels. While neural networks have a long history, recent advances have greatly improved their performance in computer vision, natural language processing, etc. From the statistical and scientific perspective, it is natural to ask: What is deep learning? What are the n...


The Principles of Deep Learning Theory

June 18, 2021

89% Match
Daniel A. Roberts, Sho Yaida, Boris Hanin
Machine Learning
Artificial Intelligence

This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio...


Deep Learning of Representations: Looking Forward

May 2, 2013

89% Match
Yoshua Bengio
Machine Learning

Deep learning research aims at discovering learning algorithms that discover multiple levels of distributed representations, with higher levels representing more abstract concepts. Although the study of deep learning has already led to impressive theoretical results, learning algorithms and breakthrough experiments, several challenges lie ahead. This paper proposes to examine some of these challenges, centering on the questions of scaling deep learning algorithms to much larg...


Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges

November 27, 2022

89% Match
Kourosh T. Baghaei, Amirreza Payandeh, Pooya Fayyazsanavi, Shahram Rahimi, ... , Somayeh Bakhtiari Ramezani
Machine Learning
Artificial Intelligence
Computer Vision and Pattern Recognition

Machine Learning algorithms have had a profound impact on the field of computer science over the past few decades. These algorithms' performance is greatly influenced by the representations that are derived from the data in the learning process. The representations learned in a successful learning process should be concise, discrete, meaningful, and able to be applied across a variety of tasks. A recent effort has been directed toward developing Deep Learning models, which hav...
