ID: 1909.11522

Neural networks are a priori biased towards Boolean functions with low entropy

September 25, 2019

Chris Mingard, Joar Skalse, Guillermo Valle-Pérez, David Martínez-Rubio, Vladimir Mikulik, Ard A. Louis
Computer Science
Statistics
Machine Learning
Machine Learning

Understanding the inductive bias of neural networks is critical to explaining their ability to generalise. Here, for one of the simplest neural networks -- a single-layer perceptron with $n$ input neurons, one output neuron, and no threshold bias term -- we prove that upon random initialisation of weights, the a priori probability $P(t)$ that it represents a Boolean function that classifies $t$ points in $\{0,1\}^n$ as 1 has a remarkably simple form: $P(t) = 2^{-n}$ for $0 \leq t < 2^n$. Since a perceptron can express far fewer Boolean functions with small or large values of $t$ (low entropy) than with intermediate values of $t$ (high entropy), there is, on average, a strong intrinsic a priori bias towards individual functions with low entropy. Furthermore, within a class of functions with fixed $t$, we often observe a further intrinsic bias towards functions of lower complexity. Finally, we prove that, regardless of the distribution of inputs, the bias towards low entropy becomes monotonically stronger upon adding ReLU layers, and empirically show that increasing the variance of the bias term has a similar effect.
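The uniform law $P(t) = 2^{-n}$ is straightforward to probe numerically. The following minimal sketch (not taken from the paper) samples i.i.d. Gaussian weights for a bias-free perceptron, counts how many of the $2^n$ inputs in $\{0,1\}^n$ it maps to 1, and checks that the resulting histogram over $t$ is approximately flat; the choice $n = 7$, the Gaussian initialisation, and the sign-of-preactivation classification rule are illustrative assumptions.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n = 7                                                         # number of input neurons (illustrative choice)
X = np.array(list(product([0, 1], repeat=n)), dtype=float)    # all 2^n Boolean inputs
n_samples = 100_000

W = rng.standard_normal((n_samples, n))       # each row: i.i.d. Gaussian weights, no bias term (assumption)
T = (X @ W.T > 0).sum(axis=0)                 # t = number of inputs the sampled perceptron maps to 1

# If P(t) = 2^{-n} for 0 <= t < 2^n, the histogram over t should be approximately flat.
freqs = np.bincount(T, minlength=2 ** n) / n_samples
print(f"empirical P(t) ranges over [{freqs.min():.5f}, {freqs.max():.5f}]")
print(f"uniform prediction 2^-n = {2.0 ** -n:.5f}")
```

Since the classes with small or large $t$ contain far fewer Boolean functions than the intermediate-$t$ classes, a flat $P(t)$ concentrates probability mass on individual low-entropy functions, which is the bias described above.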

Similar papers 1

Random deep neural networks are biased towards simple functions

December 25, 2018

88% Match
Giacomo De Palma, Bobak Toussi Kiani, Seth Lloyd
Machine Learning
Disordered Systems and Neural Networks
Machine Learning
Mathematical Physics

We prove that the binary classifiers of bit strings generated by random wide deep neural networks with ReLU activation function are biased towards simple functions. The simplicity is captured by the following two properties. For any given input bit string, the average Hamming distance of the closest input bit string with a different classification is at least $\sqrt{n / (2\pi \log n)}$, where $n$ is the length of the string. Moreover, if the bits of the initial string are flipped...
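As a rough numerical companion (not from the paper), the sketch below enumerates all bit strings for a small $n$, classifies them with a randomly initialised single-hidden-layer ReLU network, and compares the average Hamming distance to the nearest differently-classified string with $\sqrt{n / (2\pi \log n)}$. The depth, width, He-style initialisation, and the choice $n = 10$ are assumptions for illustration; the paper's bound concerns wide deep networks and large $n$.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n, width, n_nets = 10, 512, 20        # bit-string length, hidden width, nets to average over (illustrative)

X = np.array(list(product([0, 1], repeat=n)), dtype=float)   # all 2^n bit strings
H = (X[:, None, :] != X[None, :, :]).sum(-1)                 # pairwise Hamming distances (2^n x 2^n)

dists = []
for _ in range(n_nets):
    # One random ReLU hidden layer with a linear readout (He-style scaling, an assumption).
    W1 = rng.standard_normal((n, width)) * np.sqrt(2.0 / n)
    b1 = 0.1 * rng.standard_normal(width)
    w2 = rng.standard_normal(width) * np.sqrt(2.0 / width)
    labels = np.maximum(X @ W1 + b1, 0.0) @ w2 > 0           # binary label of every string

    if labels.all() or not labels.any():
        continue                                             # constant classifier: no opposite class
    opposite = labels[:, None] != labels[None, :]            # pairs with different labels
    nearest = np.where(opposite, H, n + 1).min(axis=1)       # distance to closest oppositely-labelled string
    dists.append(nearest.mean())

print("mean Hamming distance to the nearest differently-classified string:", np.mean(dists))
print("lower bound sqrt(n / (2*pi*log n)):", np.sqrt(n / (2 * np.pi * np.log(n))))
```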


Stably unactivated neurons in ReLU neural networks

December 6, 2024

87% Match
Natalie Brownlowe, Christopher R. Cornwell, Ethan Montes, Gabriel Quijano, ... , Na Zhang
Machine Learning
Probability
Machine Learning

The choice of architecture of a neural network influences which functions will be realizable by that neural network and, as a result, studying the expressiveness of a chosen architecture has received much attention. In ReLU neural networks, the presence of stably unactivated neurons can reduce the network's expressiveness. In this work, we investigate the probability of a neuron in the second hidden layer of such neural networks being stably unactivated when the weights and b...
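One way to see how such neurons arise, sketched below under illustrative assumptions (i.i.d. standard Gaussian weights and biases, first hidden layer of width 8): because first-layer ReLU outputs are non-negative, a second-layer neuron whose incoming weights and bias are all non-positive has a non-positive pre-activation for every input and is therefore stably unactivated. The code estimates how often this particular sufficient condition holds under random initialisation; the paper's analysis of the full probability is finer than this.

```python
import numpy as np

rng = np.random.default_rng(0)
n1 = 8               # width of the first hidden layer (illustrative choice)
n2 = 64              # width of the second hidden layer
n_trials = 20_000    # random initialisations

count = 0
for _ in range(n_trials):
    W2 = rng.standard_normal((n2, n1))   # weights into the second hidden layer (i.i.d. Gaussian, an assumption)
    b2 = rng.standard_normal(n2)         # biases of the second hidden layer
    # First-layer ReLU outputs are always >= 0, so a second-layer neuron whose incoming
    # weights and bias are all <= 0 has a non-positive pre-activation for every input:
    # it is stably unactivated.  (This is only a sufficient condition.)
    count += ((W2 <= 0).all(axis=1) & (b2 <= 0)).sum()

empirical = count / (n_trials * n2)
print("fraction of second-layer neurons satisfying the all-non-positive condition:", empirical)
print("prediction 2^-(n1+1) for symmetric i.i.d. initialisation:", 2.0 ** -(n1 + 1))
```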

Alexander Mozeika, Bo Li, David Saad
Machine Learning
Disordered Systems and Neural Networks
Machine Learning

We study the space of functions computed by random-layered machines, including deep neural networks and Boolean circuits. Investigating the distribution of Boolean functions computed on the recurrent and layer-dependent architectures, we find that it is the same in both models. Depending on the initial conditions and computing elements used, we characterize the space of functions computed at the large depth limit and show that the macroscopic entropy of Boolean functions is e...

Lower bounds over Boolean inputs for deep neural networks with ReLU gates

November 8, 2017

85% Match
Anirbit Mukherjee, Amitabh Basu
Computational Complexity
Machine Learning
Neural and Evolutionary Computation
Machine Learning

Motivated by the resurgence of neural networks in being able to solve complex learning tasks, we undertake a study of high-depth networks using ReLU gates, which implement the function $x \mapsto \max\{0,x\}$. We try to understand the role of depth in such neural networks by showing size lower bounds against such network architectures in parameter regimes hitherto unexplored. In particular, we show the following two main results about neural nets computing Boolean functions of in...


Exploiting the equivalence between quantum neural networks and perceptrons

July 5, 2024

85% Match
Chris Mingard, Jessica Pointing, Charles London, ... , Ard A. Louis
Artificial Intelligence

Quantum machine learning models based on parametrized quantum circuits, also called quantum neural networks (QNNs), are considered to be among the most promising candidates for applications on near-term quantum devices. Here we explore the expressivity and inductive bias of QNNs by exploiting an exact mapping from QNNs with inputs $x$ to classical perceptrons acting on $x \otimes x$ (generalised to complex inputs). The simplicity of the perceptron architecture allows us to pr...
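To make the flavour of that mapping concrete, here is a small real-valued sketch (not the paper's exact construction, which handles complex inputs and the specific QNN parametrisation): a classical perceptron acting on the Kronecker product $x \otimes x$ is simply a quadratic classifier in $x$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                    # input dimension (illustrative choice)

# A perceptron acting on the tensor product x (x) x is a quadratic classifier in x:
# its decision function equals x^T W x for some weight matrix W.
w = rng.standard_normal(n * n)           # perceptron weights on the n^2 product features

def classify(x):
    phi = np.kron(x, x)                  # feature map x -> x (x) x
    return int(phi @ w > 0)

for _ in range(5):
    x = rng.standard_normal(n)
    # Equivalent quadratic form: reshaping w into an n x n matrix gives the same score.
    assert np.isclose(np.kron(x, x) @ w, x @ w.reshape(n, n) @ x)
    print(x.round(2), "->", classify(x))
```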


Do deep neural networks have an inbuilt Occam's razor?

April 13, 2023

85% Match
Chris Mingard, Henry Rees, ... , Ard A. Louis
Machine Learning
Artificial Intelligence
Machine Learning

The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data. To disentangle these three components, we apply a Bayesian picture, based on the functions expressed by a DNN, to supervised learning. The prior over functions is determined by the network, and is varied by exploiting a transition between ordered and chaotic regimes. For Boolean function classifi...
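The prior over functions referred to here can be estimated by direct sampling. The sketch below (an illustration, not the paper's experimental setup) draws random parameters for a small tanh network at two weight scales, records the Boolean function computed on all of $\{0,1\}^5$, and compares the entropy of the sampled function distribution; the architecture, the tanh nonlinearity, the weight scales standing in for the ordered and chaotic regimes, and the naive entropy estimate are all assumptions.

```python
import numpy as np
from itertools import product
from collections import Counter

rng = np.random.default_rng(0)
n, width, depth = 5, 32, 4          # small input space so functions can be recorded exactly
X = np.array(list(product([0, 1], repeat=n)), dtype=float)
n_samples = 20_000

def sample_function(sigma_w):
    """Boolean function on {0,1}^n computed by a random tanh network with weight scale sigma_w."""
    a = X
    for _ in range(depth):
        W = rng.standard_normal((a.shape[1], width)) * sigma_w / np.sqrt(a.shape[1])
        a = np.tanh(a @ W)
    w_out = rng.standard_normal(width) / np.sqrt(width)
    return tuple((a @ w_out > 0).astype(int))

for sigma_w in [0.5, 4.0]:          # weight scales standing in for "ordered" vs "chaotic" (illustrative)
    counts = Counter(sample_function(sigma_w) for _ in range(n_samples))
    p = np.array(list(counts.values())) / n_samples
    entropy = -(p * np.log2(p)).sum()    # naive plug-in estimate from the samples
    print(f"sigma_w = {sigma_w}: {len(counts)} distinct functions sampled, "
          f"entropy of sampled prior = {entropy:.2f} bits")
```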


Deep learning generalizes because the parameter-function map is biased towards simple functions

May 22, 2018

85% Match
Guillermo Valle-Pérez, Chico Q. Camargo, Ard A. Louis
Machine Learning
Artificial Intelligence
Machine Learning
Neural and Evolutionary Computation

Deep neural networks (DNNs) generalize remarkably well without explicit regularization even in the strongly over-parametrized regime where classical learning theory would instead predict that they would severely overfit. While many proposals for some kind of implicit regularization have been made to rationalise this success, there is no consensus for the fundamental reason why DNNs do not strongly overfit. In this paper, we provide a new explanation. By applying a very genera...


The empirical size of trained neural networks

November 29, 2016

84% Match
Kevin K. Chen, Anthony Gamst, Alden Walker
Machine Learning
Machine Learning

ReLU neural networks define piecewise linear functions of their inputs. However, initializing and training a neural network is very different from fitting a linear spline. In this paper, we expand empirically upon previous theoretical work to demonstrate features of trained neural networks. Standard network initialization and training produce networks vastly simpler than a naive parameter count would suggest and can impart odd features to the trained network. However, we also...
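A minimal sketch of the underlying object (the untrained network, single hidden layer, and 1-D slice through input space are all simplifying assumptions; the paper's measurements concern trained, deeper networks): the linear pieces of a ReLU network along a line can be counted by tracking where its activation pattern changes.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, width, n_points = 2, 64, 10_000

# Random single-hidden-layer ReLU network, evaluated along a 1-D line through input space.
W1 = rng.standard_normal((d_in, width)) / np.sqrt(d_in)
b1 = rng.standard_normal(width)

direction = rng.standard_normal(d_in)
ts = np.linspace(-5, 5, n_points)
inputs = ts[:, None] * direction[None, :]          # points on the line

patterns = (inputs @ W1 + b1) > 0                  # ReLU activation pattern at each point

# Each distinct activation pattern corresponds to one linear piece of the network
# restricted to this line; count how often the pattern changes along the line.
changes = (patterns[1:] != patterns[:-1]).any(axis=1).sum()
print("linear pieces along the line:", changes + 1)
print("naive upper bound (one kink per hidden unit along a line):", width + 1)
```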


Bayesian Perceptron: Towards fully Bayesian Neural Networks

September 3, 2020

84% Match
Marco F. Huber
Machine Learning
Machine Learning

Artificial neural networks (NNs) have become the de facto standard in machine learning. They allow learning highly nonlinear transformations in a plethora of applications. However, NNs usually only provide point estimates without systematically quantifying corresponding uncertainties. In this paper a novel approach towards fully Bayesian NNs is proposed, where training and predictions of a perceptron are performed within the Bayesian inference framework in closed form. The we...


On the Spectral Bias of Neural Networks

June 22, 2018

84% Match
Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred A. Hamprecht, ... , Aaron Courville
Machine Learning
Machine Learning

Neural networks are known to be a class of highly expressive functions able to fit even random input-output mappings with $100\%$ accuracy. In this work, we present properties of neural networks that complement this aspect of expressivity. By using tools from Fourier analysis, we show that deep ReLU networks are biased towards low frequency functions, meaning that they cannot have local fluctuations without affecting their global behavior. Intuitively, this property is in lin...
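As a loose illustration of low-frequency dominance (at initialisation only, whereas the paper's analysis concerns which frequencies are fitted during training), the sketch below evaluates a randomly initialised deep ReLU network on a 1-D grid and measures how much of its output's spectral power sits at low frequencies; the depth, width, and the cut-off frequency of 10 are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
width, depth, n_points = 256, 3, 4096

# Evaluate a randomly initialised deep ReLU network on a 1-D grid.
x = np.linspace(0.0, 1.0, n_points, endpoint=False)
a = x[:, None]
for _ in range(depth):
    W = rng.standard_normal((a.shape[1], width)) * np.sqrt(2.0 / a.shape[1])
    b = 0.1 * rng.standard_normal(width)
    a = np.maximum(a @ W + b, 0.0)
f = a @ rng.standard_normal(width) / np.sqrt(width)

# Fourier spectrum of the network output over the unit interval.
spectrum = np.abs(np.fft.rfft(f - f.mean())) ** 2
freqs = np.fft.rfftfreq(n_points, d=x[1] - x[0])
low = spectrum[freqs <= 10].sum() / spectrum.sum()
print(f"fraction of spectral power at frequencies <= 10: {low:.3f}")
```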
