Exploring the Function Space of Deep-Learning Machines

Towards a theory of machine learning

April 15, 2020

87% Match

Vitaly Vanchurin

Machine Learning

Disordered Systems and Neura...

We define a neural network as a septuple consisting of (1) a state vector, (2) an input projection, (3) an output projection, (4) a weight matrix, (5) a bias vector, (6) an activation map and (7) a loss function. We argue that the loss function can be imposed either on the boundary (i.e. input and/or output neurons) or in the bulk (i.e. hidden neurons) for both supervised and unsupervised systems. We apply the principle of maximum entropy to derive a canonical ensemble of the...

Learning Curves for Deep Neural Networks: A Gaussian Field Theory Perspective

June 12, 2019

87% Match

Omry Cohen, Or Malka, Zohar Ringel

Machine Learning

Statistical Mechanics

Neural and Evolutionary Comp...

Data Analysis, Statistics an...

Machine Learning

In the past decade, deep neural networks (DNNs) came to the fore as the leading machine learning algorithms for a variety of tasks. Their rise was founded on market needs and engineering craftsmanship, the latter based more on trial and error than on theory. While still far behind the application forefront, the theoretical study of DNNs has recently made important advancements in analyzing the highly over-parameterized regime where some exact results have been obtained. Leve...

The many faces of deep learning

August 25, 2019

87% Match

Raul Vicente

Machine Learning

Data Analysis, Statistics an...

Neurons and Cognition

Machine Learning

Deep learning has sparked a network of mutual interactions between different disciplines and AI. Naturally, each discipline focuses on and interprets the workings of deep learning in different ways. This diversity of perspectives on deep learning, from neuroscience to statistical physics, is a rich source of inspiration that fuels novel developments in the theory and applications of machine learning. In this perspective, we collect and synthesize different intuitions scattered a...

Flow of Information in Feed-Forward Deep Neural Networks

March 20, 2016

87% Match

Pejman Khadivi, Ravi Tandon, Naren Ramakrishnan

Information Theory

Machine Learning

Information Theory

Feed-forward deep neural networks have been used extensively in various machine learning applications. Developing a precise understanding of the underlying behavior of neural networks is crucial for their efficient deployment. In this paper, we use an information theoretic approach to study the flow of information in a neural network and to determine how entropy of information changes between consecutive layers. Moreover, using the Information Bottleneck principle, we develop ...

On the Temperature of Machine Learning Systems

April 20, 2024

87% Match

Dong Zhang

Machine Learning

Artificial Intelligence

Neural and Evolutionary Comp...

We develop a thermodynamic theory for machine learning (ML) systems. Similar to physical thermodynamic systems which are characterized by energy and entropy, ML systems possess these characteristics as well. This comparison inspires us to integrate the concept of temperature into ML systems grounded in the fundamental principles of thermodynamics, and establish a basic thermodynamic framework for machine learning systems with non-Boltzmann distributions. We introduce the conce...

Shaping the learning landscape in neural networks around wide flat minima

May 20, 2019

87% Match

Carlo Baldassi, Fabrizio Pittorino, Riccardo Zecchina

Machine Learning

Disordered Systems and Neura...

Machine Learning

Learning in Deep Neural Networks (DNN) takes place by minimizing a non-convex high-dimensional loss function, typically by a stochastic gradient descent (SGD) strategy. The learning process is observed to find good minimizers without getting stuck in local critical points, and such minimizers are often satisfactory at avoiding overfitting. How these two features can be kept under control in nonlinear devices composed of millions of tunable connections is a pro...

Full error analysis for the training of deep neural networks

September 30, 2019

87% Match

Christian Beck, Arnulf Jentzen, Benno Kuckuck

Numerical Analysis

Machine Learning

Numerical Analysis

Deep learning algorithms have been applied very successfully in recent years to a range of problems out of reach for classical solution paradigms. Nevertheless, there is no completely rigorous mathematical error and convergence analysis which explains the success of deep learning algorithms. The error of a deep learning algorithm can in many situations be decomposed into three parts, the approximation error, the generalization error, and the optimization error. In this work w...

Why & When Deep Learning Works: Looking Inside Deep Learnings

May 10, 2017

87% Match

Ronny Ronen

Machine Learning

The Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI) has been heavily supporting Machine Learning and Deep Learning research since its foundation in 2012. We have asked six leading ICRI-CI Deep Learning researchers to address the challenge of "Why & When Deep Learning works", with the goal of looking inside Deep Learning, providing insights on how deep networks function, and uncovering key observations on their expressiveness, limitations, and po...

Geometry of energy landscapes and the optimizability of deep neural networks

August 1, 2018

86% Match

Simon Becker, Yao Zhang, Alpha A. Lee

Disordered Systems and Neura...

Machine Learning

Deep neural networks are workhorse models in machine learning with multiple layers of non-linear functions composed in series. Their loss function is highly non-convex, yet empirically even gradient descent minimisation is sufficient to arrive at accurate and predictive models. It is hitherto unknown why deep neural networks are easily optimizable. We analyze the energy landscape of a spin glass model of deep neural networks using random matrix theory and algebraic geometry. ...

Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

October 24, 2023

86% Match

Leonardo Petrini

Machine Learning

Artificial intelligence, particularly the subfield of machine learning, has seen a paradigm shift towards data-driven models that learn from and adapt to data. This has resulted in unprecedented advancements in various domains such as natural language processing and computer vision, largely attributed to deep learning, a special class of machine learning models. Deep learning arguably surpasses traditional approaches by learning the relevant features from raw data through a s...
