Lower bounds over Boolean inputs for dee...

Learning Deep ReLU Networks Is Fixed-Parameter Tractable

September 28, 2020

86% Match

Sitan Chen, Adam R. Klivans, Raghu Meka

Machine Learning

Data Structures and Algorith...

Machine Learning

We consider the problem of learning an unknown ReLU network with respect to Gaussian inputs and obtain the first nontrivial results for networks of depth more than two. We give an algorithm whose running time is a fixed polynomial in the ambient dimension and some (exponentially large) function of only the network's parameters. Our bounds depend on the number of hidden units, depth, spectral norm of the weight matrices, and Lipschitz constant of the overall network (we show...

Find SimilarView on arXiv

Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice Polytopes

February 24, 2023

86% Match

Christian Haase, Christoph Hertrich, Georg Loho

Machine Learning

Discrete Mathematics

Neural and Evolutionary Comp...

Combinatorics

Machine Learning

We prove that the set of functions representable by ReLU neural networks with integer weights strictly increases with the network depth while allowing arbitrary width. More precisely, we show that $\lceil\log_2(n)\rceil$ hidden layers are indeed necessary to compute the maximum of $n$ numbers, matching known upper bounds. Our results are based on the known duality between neural networks and Newton polytopes via tropical geometry. The integrality assumption implies that these...

Find SimilarView on arXiv

How Many Neurons Does it Take to Approximate the Maximum?

July 18, 2023

86% Match

Itay Safran, Daniel Reichman, Paul Valiant

Machine Learning

We study the size of a neural network needed to approximate the maximum function over $d$ inputs, in the most basic setting of approximating with respect to the $L_2$ norm, for continuous distributions, for a network that uses ReLU activations. We provide new lower and upper bounds on the width required for approximation across various depths. Our results establish new depth separations between depth 2 and 3, and depth 3 and 5 networks, as well as providing a depth $\mathcal{...

Find SimilarView on arXiv

On the Complexity of Learning Neural Networks

July 14, 2017

86% Match

Le Song, Santosh Vempala, ... , Xie Bo

Machine Learning

Computational Complexity

The stunning empirical successes of neural networks currently lack rigorous theoretical explanation. What form would such an explanation take, in the face of existing complexity-theoretic lower bounds? A first step might be to show that data generated by neural networks with a single hidden layer, smooth activation functions and benign input distributions can be learned efficiently. We demonstrate here a comprehensive lower bound ruling out this possibility: for a wide class ...

Find SimilarView on arXiv

Neural Networks with Small Weights and Depth-Separation Barriers

May 31, 2020

85% Match

Gal Vardi, Ohad Shamir

Machine Learning

Computational Complexity

Machine Learning

In studying the expressiveness of neural networks, an important question is whether there are functions which can only be approximated by sufficiently deep networks, assuming their size is bounded. However, for constant depths, existing results are limited to depths $2$ and $3$, and achieving results for higher depths has been an important open question. In this paper, we focus on feedforward ReLU networks, and prove fundamental barriers to proving such results beyond depth $...

Find SimilarView on arXiv

On the growth of the parameters of approximating ReLU neural networks

June 21, 2024

85% Match

Erion Morina, Martin Holler

Machine Learning

Numerical Analysis

This work focuses on the analysis of fully connected feed forward ReLU neural networks as they approximate a given, smooth function. In contrast to conventionally studied universal approximation properties under increasing architectures, e.g., in terms of width or depth of the networks, we are concerned with the asymptotic growth of the parameters of approximating networks. Such results are of interest, e.g., for error analysis or consistency results for neural network traini...

Find SimilarView on arXiv

Neural networks are a priori biased towards Boolean functions with low entropy

September 25, 2019

85% Match

Chris Mingard, Joar Skalse, Guillermo Valle-Pérez, David Martínez-Rubio, ... , Louis Ard A.

Machine Learning

Understanding the inductive bias of neural networks is critical to explaining their ability to generalise. Here, for one of the simplest neural networks -- a single-layer perceptron with n input neurons, one output neuron, and no threshold bias term -- we prove that upon random initialisation of weights, the a priori probability $P(t)$ that it represents a Boolean function that classifies t points in ${0,1}^n$ as 1 has a remarkably simple form: $P(t) = 2^{-n}$ for $0\leq t < ...

Find SimilarView on arXiv

Sharp Representation Theorems for ReLU Networks with Precise Dependence on Depth

June 7, 2020

85% Match

Guy Bresler, Dheeraj Nagaraj

Machine Learning

We prove sharp dimension-free representation results for neural networks with $D$ ReLU layers under square loss for a class of functions $\mathcal{G}_D$ defined in the paper. These results capture the precise benefits of depth in the following sense: 1. The rates for representing the class of functions $\mathcal{G}_D$ via $D$ ReLU layers is sharp up to constants, as shown by matching lower bounds. 2. For each $D$, $\mathcal{G}_{D} \subseteq \mathcal{G}_{D+1}$ and as $D$ g...

Find SimilarView on arXiv

Learning ReLU networks to high uniform accuracy is intractable

May 26, 2022

85% Match

Julius Berner, Philipp Grohs, Felix Voigtlaender

Machine Learning

Statistical learning theory provides bounds on the necessary number of training samples needed to reach a prescribed accuracy in a learning problem formulated over a given target class. This accuracy is typically measured in terms of a generalization error, that is, an expected value of a given loss function. However, for several applications -- for example in a security-critical context or for problems in the computational sciences -- accuracy in this sense is not sufficient...

Find SimilarView on arXiv

Error bounds for approximations with deep ReLU networks

October 4, 2016

85% Match

Dmitry Yarotsky

Machine Learning

Neural and Evolutionary Comp...

We study expressive power of shallow and deep neural networks with piece-wise linear activation functions. We establish new rigorous upper and lower bounds for the network complexity in the setting of approximations in Sobolev spaces. In particular, we prove that deep ReLU networks more efficiently approximate smooth functions than shallow networks. In the case of approximations of 1D Lipschitz functions we describe adaptive depth-6 network architectures more efficient than t...

Find SimilarView on arXiv

Lower bounds over Boolean inputs for deep neural networks with ReLU gates

Learning Deep ReLU Networks Is Fixed-Parameter Tractable

Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice Polytopes

How Many Neurons Does it Take to Approximate the Maximum?

On the Complexity of Learning Neural Networks

Neural Networks with Small Weights and Depth-Separation Barriers

On the growth of the parameters of approximating ReLU neural networks

Neural networks are a priori biased towards Boolean functions with low entropy

Sharp Representation Theorems for ReLU Networks with Precise Dependence on Depth

Learning ReLU networks to high uniform accuracy is intractable

Error bounds for approximations with deep ReLU networks