ID: 1711.03073

Lower bounds over Boolean inputs for deep neural networks with ReLU gates

November 8, 2017


Similar papers (page 2)

Learning Deep ReLU Networks Is Fixed-Parameter Tractable

September 28, 2020

86% Match
Sitan Chen, Adam R. Klivans, Raghu Meka
Machine Learning
Data Structures and Algorithms
Machine Learning

We consider the problem of learning an unknown ReLU network with respect to Gaussian inputs and obtain the first nontrivial results for networks of depth more than two. We give an algorithm whose running time is a fixed polynomial in the ambient dimension and some (exponentially large) function of only the network's parameters. Our bounds depend on the number of hidden units, depth, spectral norm of the weight matrices, and Lipschitz constant of the overall network (we show...
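
As a minimal sketch of the setting described in this abstract (not the authors' algorithm), the snippet below generates labelled data from an unknown ReLU network under Gaussian inputs; the dimension, width and weight scales are illustrative assumptions:

    # Sketch of the learning setup: the learner receives i.i.d. samples (x, f(x))
    # where x is standard Gaussian in R^d and f is an unknown ReLU network with
    # two hidden layers. All sizes and weights below are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    d, k = 20, 5  # ambient dimension, hidden width (assumed for illustration)

    # Unknown target network: two hidden ReLU layers plus a linear output.
    W1 = rng.standard_normal((k, d)) / np.sqrt(d)
    W2 = rng.standard_normal((k, k)) / np.sqrt(k)
    w3 = rng.standard_normal(k) / np.sqrt(k)

    def target(x):
        """f(x) = w3 . ReLU(W2 ReLU(W1 x))."""
        h1 = np.maximum(W1 @ x, 0.0)
        h2 = np.maximum(W2 @ h1, 0.0)
        return w3 @ h2

    # Gaussian-input training set: the only access the learner gets.
    n_samples = 1000
    X = rng.standard_normal((n_samples, d))
    y = np.array([target(x) for x in X])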


Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice Polytopes

February 24, 2023

86% Match
Christian Haase, Christoph Hertrich, Georg Loho
Machine Learning
Discrete Mathematics
Neural and Evolutionary Computing
Combinatorics
Machine Learning

We prove that the set of functions representable by ReLU neural networks with integer weights strictly increases with the network depth while allowing arbitrary width. More precisely, we show that $\lceil\log_2(n)\rceil$ hidden layers are indeed necessary to compute the maximum of $n$ numbers, matching known upper bounds. Our results are based on the known duality between neural networks and Newton polytopes via tropical geometry. The integrality assumption implies that these...
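
A small sketch of the matching upper bound cited here: the exact maximum of $n$ numbers computed by $\lceil\log_2(n)\rceil$ rounds of the one-hidden-layer gadget $\max(a,b)=\mathrm{ReLU}(a-b)+\mathrm{ReLU}(b)-\mathrm{ReLU}(-b)$, whose weights are integers.

    # Computing the exact maximum of n numbers with ceil(log2(n)) hidden ReLU
    # layers, via max(a, b) = ReLU(a - b) + ReLU(b) - ReLU(-b) (integer weights).
    import math
    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def max2(a, b):
        """Exact max of two scalars using one hidden layer of three ReLU units."""
        return relu(a - b) + relu(b) - relu(-b)

    def deep_relu_max(values):
        """Pairwise tournament: each round corresponds to one hidden ReLU layer."""
        layer = list(values)
        rounds = 0
        while len(layer) > 1:
            nxt = [max2(layer[i], layer[i + 1]) for i in range(0, len(layer) - 1, 2)]
            if len(layer) % 2 == 1:  # odd element passes through the layer
                nxt.append(relu(layer[-1]) - relu(-layer[-1]))  # ReLU(x) - ReLU(-x) = x
            layer = nxt
            rounds += 1
        return layer[0], rounds

    x = np.array([3.0, -1.5, 7.2, 0.4, 5.9])
    y, depth = deep_relu_max(x)
    assert np.isclose(y, x.max()) and depth == math.ceil(math.log2(len(x)))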


How Many Neurons Does it Take to Approximate the Maximum?

July 18, 2023

86% Match
Itay Safran, Daniel Reichman, Paul Valiant
Machine Learning

We study the size of a neural network needed to approximate the maximum function over $d$ inputs, in the most basic setting of approximating with respect to the $L_2$ norm, for continuous distributions, for a network that uses ReLU activations. We provide new lower and upper bounds on the width required for approximation across various depths. Our results establish new depth separations between depth 2 and 3, and depth 3 and 5 networks, as well as providing a depth $\mathcal{...
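
To make the approximation criterion concrete, the sketch below estimates the $L_2$ error of a deliberately crude, purely illustrative shallow candidate against the maximum of $d$ Gaussian inputs by Monte Carlo; it is not a construction from the paper.

    # Monte Carlo estimate of the L2 error of a candidate network N against
    # max(x_1, ..., x_d) under a continuous input distribution (here Gaussian).
    import numpy as np

    rng = np.random.default_rng(1)
    d, n_samples = 8, 200_000

    def candidate(X):
        """A crude depth-2 ReLU surrogate: the average of ReLU(x_i) per row."""
        return np.maximum(X, 0.0).mean(axis=1)

    X = rng.standard_normal((n_samples, d))   # x ~ N(0, I_d)
    err = candidate(X) - X.max(axis=1)
    l2_error = np.sqrt(np.mean(err ** 2))     # estimate of ||N - max||_{L2}
    print(f"estimated L2 error of the crude candidate: {l2_error:.3f}")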


On the Complexity of Learning Neural Networks

July 14, 2017

86% Match
Le Song, Santosh Vempala, ... , Bo Xie
Machine Learning
Computational Complexity

The stunning empirical successes of neural networks currently lack rigorous theoretical explanation. What form would such an explanation take, in the face of existing complexity-theoretic lower bounds? A first step might be to show that data generated by neural networks with a single hidden layer, smooth activation functions and benign input distributions can be learned efficiently. We demonstrate here a comprehensive lower bound ruling out this possibility: for a wide class ...


Neural Networks with Small Weights and Depth-Separation Barriers

May 31, 2020

85% Match
Gal Vardi, Ohad Shamir
Machine Learning
Computational Complexity
Machine Learning

In studying the expressiveness of neural networks, an important question is whether there are functions which can only be approximated by sufficiently deep networks, assuming their size is bounded. However, for constant depths, existing results are limited to depths $2$ and $3$, and achieving results for higher depths has been an important open question. In this paper, we focus on feedforward ReLU networks, and prove fundamental barriers to proving such results beyond depth $...


On the growth of the parameters of approximating ReLU neural networks

June 21, 2024

85% Match
Erion Morina, Martin Holler
Machine Learning
Numerical Analysis
Numerical Analysis

This work focuses on the analysis of fully connected feed forward ReLU neural networks as they approximate a given, smooth function. In contrast to conventionally studied universal approximation properties under increasing architectures, e.g., in terms of width or depth of the networks, we are concerned with the asymptotic growth of the parameters of approximating networks. Such results are of interest, e.g., for error analysis or consistency results for neural network traini...


Neural networks are a priori biased towards Boolean functions with low entropy

September 25, 2019

85% Match
Chris Mingard, Joar Skalse, Guillermo Valle-Pérez, David Martínez-Rubio, ... , Ard A. Louis
Machine Learning
Machine Learning

Understanding the inductive bias of neural networks is critical to explaining their ability to generalise. Here, for one of the simplest neural networks -- a single-layer perceptron with $n$ input neurons, one output neuron, and no threshold bias term -- we prove that upon random initialisation of weights, the a priori probability $P(t)$ that it represents a Boolean function that classifies $t$ points in $\{0,1\}^n$ as 1 has a remarkably simple form: $P(t) = 2^{-n}$ for $0\leq t < ...
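
A quick empirical check of the stated distribution (a sketch, not a proof, under the assumed convention that the perceptron outputs 1 exactly when $w\cdot x>0$): draw many random weight vectors, count how many points of $\{0,1\}^n$ each classifies as 1, and compare the histogram of $t$ with the flat value $2^{-n}$.

    # Histogram of t = number of Boolean points classified as 1 by a bias-free
    # single-layer perceptron with random Gaussian weights.
    import itertools
    import numpy as np

    rng = np.random.default_rng(0)
    n, trials = 3, 200_000
    points = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)  # {0,1}^n

    W = rng.standard_normal((trials, n))
    t_values = ((points @ W.T) > 0).sum(axis=0)          # t for each weight draw
    counts = np.bincount(t_values, minlength=2 ** n + 1)

    print(f"t : empirical P(t)   (flat reference 2^-n = {2.0 ** -n:.4f})")
    for t, c in enumerate(counts):
        print(f"{t} : {c / trials:.4f}")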


Sharp Representation Theorems for ReLU Networks with Precise Dependence on Depth

June 7, 2020

85% Match
Guy Bresler, Dheeraj Nagaraj
Machine Learning
Machine Learning

We prove sharp dimension-free representation results for neural networks with $D$ ReLU layers under square loss for a class of functions $\mathcal{G}_D$ defined in the paper. These results capture the precise benefits of depth in the following sense: 1. The rates for representing the class of functions $\mathcal{G}_D$ via $D$ ReLU layers are sharp up to constants, as shown by matching lower bounds. 2. For each $D$, $\mathcal{G}_{D} \subseteq \mathcal{G}_{D+1}$ and as $D$ g...


Learning ReLU networks to high uniform accuracy is intractable

May 26, 2022

85% Match
Julius Berner, Philipp Grohs, Felix Voigtlaender
Machine Learning
Machine Learning

Statistical learning theory provides bounds on the number of training samples needed to reach a prescribed accuracy in a learning problem formulated over a given target class. This accuracy is typically measured in terms of a generalization error, that is, an expected value of a given loss function. However, for several applications -- for example in a security-critical context or for problems in the computational sciences -- accuracy in this sense is not sufficient...
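
The contrast between accuracy in expectation and uniform accuracy can be seen on a toy pair of functions, chosen here purely for illustration and not taken from the paper: a small mean-squared error can coexist with a large worst-case error.

    # Expected (generalization-style) error vs. uniform (sup-norm) error for an
    # illustrative target/candidate pair on [-1, 1].
    import numpy as np

    rng = np.random.default_rng(2)

    target = np.abs                               # f(x) = |x| on [-1, 1]
    candidate = lambda x: np.maximum(x, 0.0)      # g(x) = ReLU(x): wrong for x < 0

    x_mc = rng.uniform(-1.0, 1.0, size=100_000)   # Monte Carlo expected squared loss
    expected_err = np.mean((candidate(x_mc) - target(x_mc)) ** 2)

    x_grid = np.linspace(-1.0, 1.0, 10_001)       # dense grid for a sup-norm estimate
    uniform_err = np.max(np.abs(candidate(x_grid) - target(x_grid)))

    print(f"expected squared error ~ {expected_err:.3f}")   # about 1/6
    print(f"uniform (sup) error    ~ {uniform_err:.3f}")    # about 1.0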


Error bounds for approximations with deep ReLU networks

October 4, 2016

85% Match
Dmitry Yarotsky
Machine Learning
Neural and Evolutionary Computing

We study the expressive power of shallow and deep neural networks with piecewise linear activation functions. We establish new rigorous upper and lower bounds for the network complexity in the setting of approximations in Sobolev spaces. In particular, we prove that deep ReLU networks more efficiently approximate smooth functions than shallow networks. In the case of approximations of 1D Lipschitz functions we describe adaptive depth-6 network architectures more efficient than t...
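
As a point of reference for the shallow end of such bounds, the sketch below builds the classical one-hidden-layer ReLU network that exactly represents a piecewise-linear interpolant of a smooth 1D function; the target $\sin(x)$ and the grid size are illustrative choices, not constructions from the paper.

    # One-hidden-layer ReLU network realizing the piecewise-linear interpolant of
    # a 1D function on a uniform grid: g(x) = f(a) + sum_i c_i * ReLU(x - x_i),
    # where c_0 is the first slope and c_i the change of slope at knot x_i.
    import numpy as np

    f = np.sin
    a, b, m = 0.0, 2.0 * np.pi, 32
    knots = np.linspace(a, b, m + 1)
    fk = f(knots)
    slopes = np.diff(fk) / np.diff(knots)

    c = np.empty(m)
    c[0] = slopes[0]
    c[1:] = np.diff(slopes)

    def relu_interpolant(x):
        return fk[0] + np.maximum(x[:, None] - knots[:-1][None, :], 0.0) @ c

    x_test = np.linspace(a, b, 5001)
    sup_err = np.max(np.abs(relu_interpolant(x_test) - f(x_test)))
    print(f"sup-norm error with {m} ReLU units: {sup_err:.5f}")  # O(1/m^2) for smooth f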
