Neural networks are a priori biased towards Boolean functions with low entropy

Do Quantum Neural Networks have Simplicity Bias?

July 3, 2024

84% Match

Jessica Pointing

Artificial Intelligence

Machine Learning

Applied Physics

One hypothesis for the success of deep neural networks (DNNs) is that they are highly expressive, which enables them to be applied to many problems, and that they have a strong inductive bias towards simple solutions, known as simplicity bias, which allows them to generalise well on unseen data because most real-world data is structured (i.e. simple). In this work, we explore the inductive bias and expressivity of quantum neural networks (QNNs), which gives us a way to co...

Limits on representing Boolean functions by linear combinations of simple functions: thresholds, ReLUs, and low-degree polynomials

February 26, 2018

84% Match

R. Ryan Williams

Computational Complexity

Discrete Mathematics

Neural and Evolutionary Comp...

We consider the problem of representing Boolean functions exactly by "sparse" linear combinations (over $\mathbb{R}$) of functions from some "simple" class ${\cal C}$. In particular, given ${\cal C}$ we are interested in finding low-complexity functions lacking sparse representations. When ${\cal C}$ is the set of PARITY functions or the set of conjunctions, this sort of problem has a well-understood answer; the problem becomes interesting when ${\cal C}$ is "overcomplete" an...

A measure for the complexity of Boolean functions related to their implementation in neural networks

November 9, 2001

84% Match

Leonardo Franco

Disordered Systems and Neura...

Neurons and Cognition

We define a measure for the complexity of Boolean functions related to their implementation in neural networks, and in particular closely related to the generalization ability that could be obtained through the learning process. The measure is computed by counting the number of neighboring examples that differ in their output value. Pairs of these examples have been previously shown to be part of the minimum size training set needed to obtain perfect generalization in ...
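
The counting idea in this abstract can be illustrated with a minimal sketch. It is not the paper's exact measure: the function name `boundary_pair_fraction`, the restriction to Hamming-distance-1 neighbors, and the normalization by the total number of neighbor pairs are all assumptions made here for illustration.

```python
from itertools import product


def boundary_pair_fraction(f, n):
    """Fraction of Hamming-distance-1 input pairs on which f changes value.

    f maps a tuple of n bits to 0 or 1. The normalization (dividing by the
    number of neighbor pairs) is an illustrative assumption.
    """
    differing = 0
    total = 0
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:  # flip 0 -> 1 only, so each unordered pair is counted once
                y = x[:i] + (1,) + x[i + 1:]
                total += 1
                differing += int(f(x) != f(y))
    return differing / total


def parity(x):
    return sum(x) % 2


def constant_zero(x):
    return 0


def majority(x):
    return int(sum(x) > len(x) / 2)


if __name__ == "__main__":
    for name, fn in [("parity", parity), ("constant", constant_zero), ("majority", majority)]:
        print(name, boundary_pair_fraction(fn, 3))
```

Under this sketch, parity on 3 bits scores 1.0 (every neighbor pair disagrees), a constant function scores 0.0, and 3-bit majority scores 0.5.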

Random Feature Amplification: Feature Learning and Generalization in Neural Networks

February 15, 2022

83% Match

Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett

Machine Learning

Statistics Theory

In this work, we provide a characterization of the feature-learning process in two-layer ReLU networks trained by gradient descent on the logistic loss following random initialization. We consider data with binary labels that are generated by an XOR-like function of the input features. We permit a constant fraction of the training labels to be corrupted by an adversary. We show that, although linear classifiers are no better than random guessing for the distribution we consid...

Deep ReLU Networks Have Surprisingly Few Activation Patterns

June 3, 2019

83% Match

Boris Hanin, David Rolnick

Machine Learning

Statistics Theory

The success of deep networks has been attributed in part to their expressivity: per parameter, deep networks can approximate a richer class of functions than shallow networks. In ReLU networks, the number of activation patterns is one measure of expressivity; and the maximum number of patterns grows exponentially with the depth. However, recent work has shown that the practical expressivity of deep networks - the functions they can learn rather than express - is often far fr...
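
To make "activation pattern" concrete, here is a minimal sketch that records which ReLUs are on for each input and counts the distinct patterns seen over a finite sample. The helper names (`activation_pattern`, `count_patterns`, `random_net`), the Gaussian initialization, and the 1D input grid are assumptions for illustration, not the paper's procedure; a count over samples is only a lower bound on the true number of patterns.

```python
import random


def activation_pattern(weights, biases, x):
    """Return a tuple of booleans: the on/off state of every ReLU for input x."""
    pattern = []
    h = list(x)
    for W, b in zip(weights, biases):
        # Pre-activations of this layer: W @ h + b, written out in plain Python.
        pre = [sum(w * v for w, v in zip(row, h)) + b_i for row, b_i in zip(W, b)]
        pattern.extend(p > 0 for p in pre)
        h = [max(p, 0.0) for p in pre]  # ReLU
    return tuple(pattern)


def count_patterns(weights, biases, inputs):
    """Number of distinct activation patterns observed on the given inputs."""
    return len({activation_pattern(weights, biases, x) for x in inputs})


def random_net(widths, rng):
    """Gaussian weights and biases for layer widths [d_in, h1, h2, ...] (assumed init)."""
    weights, biases = [], []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        weights.append([[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)])
        biases.append([rng.gauss(0, 1) for _ in range(d_out)])
    return weights, biases


if __name__ == "__main__":
    rng = random.Random(0)
    weights, biases = random_net([1, 8, 8], rng)
    grid = [(-3 + 6 * i / 999,) for i in range(1000)]  # 1000 points in [-3, 3]
    print(count_patterns(weights, biases, grid), "patterns out of at most", 2 ** 16)
```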

A Study of the Mathematics of Deep Learning

April 28, 2021

83% Match

Anirbit Mukherjee

Machine Learning

Optimization and Control

Applications

"Deep Learning"/"Deep Neural Nets" is a technological marvel that is now increasingly deployed at the cutting-edge of artificial intelligence tasks. This dramatic success of deep learning in the last few years has hinged on an enormous amount of heuristics, and it has turned out to be a serious mathematical challenge to be able to rigorously explain them. In this thesis, submitted to the Department of Applied Mathematics and Statistics, Johns Hopkins University, we take se...

Towards Exact Computation of Inductive Bias

June 22, 2024

83% Match

Akhilan Boopathy, William Yue, Jaedong Hwang, ..., Ila Fiete

Machine Learning

Much research in machine learning involves finding appropriate inductive biases (e.g. convolutional neural networks, momentum-based optimizers, transformers) to promote generalization on tasks. However, quantification of the amount of inductive bias associated with these architectures and hyperparameters has been limited. We propose a novel method for efficiently computing the inductive bias required for generalization on a task with a fixed training data budget; formally, th...

The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks

February 5, 2022

83% Match

Samuel Lippl, L. F. Abbott, SueYeon Chung

Machine Learning

Neurons and Cognition

Understanding the asymptotic behavior of gradient-descent training of deep neural networks is essential for revealing inductive biases and improving network performance. We derive the infinite-time training limit of a mathematically tractable class of deep nonlinear neural networks, gated linear networks (GLNs), and generalize these results to gated networks described by general homogeneous polynomials. We study the implications of our results, focusing first on two-layer GLN...

Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs

June 10, 2023

83% Match

Dmitry Chistikov, Matthias Englert, Ranko Lazic

Machine Learning

We prove that, for the fundamental regression task of learning a single neuron, training a one-hidden-layer ReLU network of any width by gradient flow from a small initialisation converges to zero loss and is implicitly biased to minimise the rank of network parameters. By assuming that the training points are correlated with the teacher neuron, we complement previous work that considered orthogonal datasets. Our results are based on a detailed non-asymptotic analysis of the ...

Adaptive n-ary Activation Functions for Probabilistic Boolean Logic

March 16, 2022

83% Match

Jed A. Duersch, Thomas A. Catanach, Niladri Das

Machine Learning

Artificial Intelligence

Balancing model complexity against the information contained in observed data is the central challenge to learning. In order for complexity-efficient models to exist and be discoverable in high dimensions, we require a computational framework that relates a credible notion of complexity to simple parameter representations. Further, this framework must allow excess complexity to be gradually removed via gradient-based optimization. Our n-ary, or n-argument, activation function...
