Lower bounds over Boolean inputs for dee...

Computation complexity of deep ReLU neural networks in high-dimensional approximation

March 1, 2021

84% Match

Dinh Dũng, Van Kien Nguyen, Mai Xuan Thao

Numerical Analysis

The purpose of the present paper is to study the computation complexity of deep ReLU neural networks to approximate functions in Hölder-Nikol'skii spaces of mixed smoothness $H_\infty^\alpha(\mathbb{I}^d)$ on the unit cube $\mathbb{I}^d:=[0,1]^d$. In this context, for any function $f\in H_\infty^\alpha(\mathbb{I}^d)$, we explicitly construct nonadaptive and adaptive deep ReLU neural networks having an output that approximates $f$ with a prescribed accuracy $\varepsilon$, an...

Deep vs. shallow networks : An approximation theory perspective

August 10, 2016

84% Match

Hrushikesh Mhaskar, Tomaso Poggio

Machine Learning

Functional Analysis

The paper briefly reviews several recent results on hierarchical architectures for learning from examples, that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in function approximation problems than shallow, one-hidden layer architectures. The paper announces new results for a non-smooth activation function - the ReLU function - used in present-day neural networks, as well as for the Gaussian networks. We propose a new de...

Three Quantization Regimes for ReLU Networks

May 3, 2024

84% Match

Weigutian Ou, Philipp Schenkel, Helmut Bölcskei

Machine Learning

Artificial Intelligence

Information Theory

We establish the fundamental limits in the approximation of Lipschitz functions by deep ReLU neural networks with finite-precision weights. Specifically, three regimes, namely under-, over-, and proper quantization, in terms of minimax approximation error behavior as a function of network weight precision, are identified. This is accomplished by deriving nonasymptotic tight lower and upper bounds on the minimax approximation error. Notably, in the proper-quantization regime, ...

Rethinking Arithmetic for Deep Neural Networks

May 7, 2019

84% Match

George A. Constantinides

Machine Learning

Hardware Architecture

Neural and Evolutionary Comp...

We consider efficiency in the implementation of deep neural networks. Hardware accelerators are gaining interest as machine learning becomes one of the drivers of high-performance computing. In these accelerators, the directed graph describing a neural network can be implemented as a directed graph describing a Boolean circuit. We make this observation precise, leading naturally to an understanding of practical neural networks as discrete functions, and show that so-called bi...

Deep ReLU Networks Preserve Expected Length

February 21, 2021

84% Match

Boris Hanin, Ryan Jeong, David Rolnick

Machine Learning

Assessing the complexity of functions computed by a neural network helps us understand how the network will learn and generalize. One natural measure of complexity is how the network distorts length - if the network takes a unit-length curve as input, what is the length of the resulting curve of outputs? It has been widely believed that this length grows exponentially in network depth. We prove that in fact this is not the case: the expected length distortion does not grow wi...

The Computational Complexity of Training ReLU(s)

October 9, 2018

84% Match

Pasin Manurangsi, Daniel Reichman

Computational Complexity

We consider the computational complexity of training depth-2 neural networks composed of rectified linear units (ReLUs). We show that, even for the case of a single ReLU, finding a set of weights that minimizes the squared error (even approximately) for a given training set is NP-hard. We also show that for a simple network consisting of two ReLUs, the error minimization problem is NP-hard, even in the realizable case. We complement these hardness results by showing that, whe...

Nonlinear Approximation and (Deep) ReLU Networks

May 5, 2019

84% Match

I. Daubechies, R. DeVore, S. Foucart, ... , G. Petrova

Machine Learning

This article is concerned with the approximation and expressive powers of deep neural networks. This is an active research area currently producing many interesting papers. The results most commonly found in the literature prove that neural networks approximate functions with classical smoothness to the same accuracy as classical linear methods of approximation, e.g. approximation by polynomials or by piecewise polynomials on prescribed partitions. However, approximation by n...

On the Optimal Memorization Power of ReLU Neural Networks

October 7, 2021

84% Match

Gal Vardi, Gilad Yehudai, Ohad Shamir

Machine Learning

Neural and Evolutionary Comp...

We study the memorization power of feedforward ReLU neural networks. We show that such networks can memorize any $N$ points that satisfy a mild separability assumption using $\tilde{O}\left(\sqrt{N}\right)$ parameters. Known VC-dimension upper bounds imply that memorizing $N$ samples requires $\Omega(\sqrt{N})$ parameters, and hence our construction is optimal up to logarithmic factors. We also give a generalized construction for networks with depth bounded by $1 \leq L \leq ...

Information Theoretic Lower Bounds for Feed-Forward Fully-Connected Deep Networks

July 1, 2020

84% Match

Xiaochen Yang, Jean Honorio

Machine Learning

In this paper, we study the sample complexity lower bounds for the exact recovery of parameters and for a positive excess risk of a feed-forward, fully-connected neural network for binary classification, using information-theoretic tools. We prove these lower bounds by the existence of a generative network characterized by a backwards data generating process, where the input is generated based on the binary output, and the network is parametrized by weight parameters for the ...

Tight Hardness Results for Training Depth-2 ReLU Networks

November 27, 2020

84% Match

Surbhi Goel, Adam Klivans, ... , Daniel Reichman

Machine Learning

Computational Complexity

Data Structures and Algorith...

We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of $k$ ReLUs minimizing the squ...

Lower bounds over Boolean inputs for deep neural networks with ReLU gates
