Similar papers
November 27, 2019
A recent line of research on deep learning focuses on the extremely over-parameterized setting, and shows that when the network width is larger than a high-degree polynomial of the training sample size $n$ and the inverse target error $\epsilon^{-1}$, deep neural networks learned by (stochastic) gradient descent enjoy nice optimization and generalization guarantees. Very recently, it has been shown that under certain margin assumptions on the training data, a polylogarithmic...
November 15, 2021
In this article we present new results on neural networks with linear threshold activation functions. We precisely characterize the class of functions that are representable by such neural networks and show that 2 hidden layers are necessary and sufficient to represent any function representable in the class. This is a surprising result in the light of recent exact representability investigations for neural networks using other popular activation functions like rectified line...
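As a toy illustration of the objects in this abstract (my own example, not the paper's construction), the following NumPy sketch builds a network with two hidden layers of linear threshold units that computes the indicator of the unit square in $\mathbb{R}^2$: the first layer produces four half-space indicators and the second layer thresholds their sum, i.e. takes their AND.

import numpy as np

def step(z):
    # Linear threshold activation: 1 if z >= 0, else 0.
    return (z >= 0).astype(float)

def unit_square_indicator(x):
    # x: array of shape (n, 2).
    # First hidden layer: the four half-space indicators
    # 1[x1 >= 0], 1[1 - x1 >= 0], 1[x2 >= 0], 1[1 - x2 >= 0].
    W1 = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    b1 = np.array([0.0, 1.0, 0.0, 1.0])
    h = step(x @ W1.T + b1)
    # Second hidden layer: AND of the four indicators, 1[h1 + h2 + h3 + h4 - 4 >= 0].
    g = step(h.sum(axis=1) - 4.0)
    # The linear output layer just reads off this single unit.
    return g

points = np.array([[0.5, 0.5], [1.5, 0.5], [-0.2, 0.3]])
print(unit_square_indicator(points))  # expected: [1. 0. 0.]

The output is a piecewise constant function with polyhedral pieces, which is the flavor of function class whose exact characterization the abstract describes.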
March 8, 2017
We prove new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of weights and $L$ be the number of layers, we prove that the VC-dimension is $O(W L \log(W))$, and provide examples with VC-dimension $\Omega( W L \log(W/L) )$. This improves both the previously known upper bounds and lower bounds. In terms of the number $U$ of non-linear...
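To make the scaling concrete, one can specialize these bounds to a heuristic special case (my own back-of-the-envelope instantiation, not from the paper): a fully connected network with $L$ layers of uniform width $k$ has on the order of $W \approx L k^2$ weights, so the stated bounds become
\[
O\!\big(W L \log W\big) = O\!\big(k^2 L^2 \log(kL)\big),
\qquad
\Omega\!\big(W L \log(W/L)\big) = \Omega\!\big(k^2 L^2 \log k\big),
\]
so in this regime the upper and lower bounds match up to the $\log(kL)$ versus $\log k$ factor.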
February 10, 2022
We give superpolynomial statistical query (SQ) lower bounds for learning two-hidden-layer ReLU networks with respect to Gaussian inputs in the standard (noise-free) model. No general SQ lower bounds were known for learning ReLU networks of any depth in this setting: previous SQ lower bounds held only for adversarial noise models (agnostic learning) or restricted models such as correlational SQ. Prior work hinted at the impossibility of our result: Vempala and Wilmes showed ...
August 1, 2023
We investigate the descriptive complexity of a class of neural networks with unrestricted topologies and piecewise polynomial activation functions. We consider the general scenario where the running time is unlimited and floating-point numbers are used for simulating reals. We characterize these neural networks with a rule-based logic for Boolean networks. In particular, we show that the sizes of the neural networks and the corresponding Boolean rule formulae are polynomially...
March 7, 2021
Artificial neural networks (ANNs) have become a very powerful tool in the approximation of high-dimensional functions. In particular, deep ANNs, consisting of a large number of hidden layers, have been used very successfully in a series of practically relevant computational problems involving high-dimensional input data, ranging from classification tasks in supervised learning to optimal decision problems in reinforcement learning. There are also a number of mathematical results in...
February 11, 2024
We prove an exponential separation between depth 2 and depth 3 neural networks, when approximating an $\mathcal{O}(1)$-Lipschitz target function to constant accuracy, with respect to a distribution with support in $[0,1]^{d}$, assuming exponentially bounded weights. This addresses an open problem posed in \citet{safran2019depth}, and proves that the curse of dimensionality manifests in depth 2 approximation, even in cases where the target function can be represented efficient...
January 4, 2019
The threshold degree of a Boolean function $f\colon\{0,1\}^n\to\{0,1\}$ is the minimum degree of a real polynomial $p$ that represents $f$ in sign: $\mathrm{sgn}\; p(x)=(-1)^{f(x)}.$ A related notion is sign-rank, defined for a Boolean matrix $F=[F_{ij}]$ as the minimum rank of a real matrix $M$ with $\mathrm{sgn}\; M_{ij}=(-1)^{F_{ij}}$. Determining the maximum threshold degree and sign-rank achievable by constant-depth circuits ($\text{AC}^{0}$) is a well-known and extensiv...
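As a small worked example of the sign-representation definition (a standard observation, not taken from the paper), the OR function on $n$ bits has threshold degree $1$: taking
\[
p(x) = \tfrac{1}{2} - \sum_{i=1}^{n} x_i
\]
gives $p(0^n) = \tfrac12 > 0$, so $\mathrm{sgn}\, p(0^n) = +1 = (-1)^{\mathrm{OR}(0^n)}$, while for any $x \neq 0^n$ we have $p(x) \le -\tfrac12 < 0$, so $\mathrm{sgn}\, p(x) = -1 = (-1)^{\mathrm{OR}(x)}$. The difficulty the abstract refers to is the converse direction: exhibiting constant-depth circuits for which no polynomial of low degree (or no matrix of low rank) achieves such a sign representation.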
March 29, 2019
We consider deep feedforward neural networks with rectified linear units from a signal processing perspective. In this view, such representations mark the transition from using a single (data-driven) linear representation to utilizing a large collection of affine linear representations tailored to particular regions of the signal space. This paper provides a precise description of the individual affine linear representations and corresponding domain regions that the (data-dri...
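A minimal sketch of this viewpoint (with arbitrary made-up weights; the notation is mine, not the paper's): on the activation region containing a given input, a ReLU network collapses to a single affine map $x \mapsto Ax + c$, and $A$, $c$ can be read off by freezing the on/off pattern of the units.

import numpy as np

rng = np.random.default_rng(0)

# A small 2-hidden-layer ReLU network with arbitrary (hypothetical) weights.
W1, b1 = rng.standard_normal((5, 3)), rng.standard_normal(5)
W2, b2 = rng.standard_normal((4, 5)), rng.standard_normal(4)
W3, b3 = rng.standard_normal((2, 4)), rng.standard_normal(2)

def forward(x):
    h1 = np.maximum(W1 @ x + b1, 0.0)
    h2 = np.maximum(W2 @ h1 + b2, 0.0)
    return W3 @ h2 + b3

def local_affine_map(x):
    # Freeze the activation pattern at x: D1, D2 are diagonal 0/1 masks.
    z1 = W1 @ x + b1
    D1 = np.diag((z1 > 0).astype(float))
    z2 = W2 @ np.maximum(z1, 0.0) + b2
    D2 = np.diag((z2 > 0).astype(float))
    # On the whole region sharing this activation pattern, the network equals A x + c.
    A = W3 @ D2 @ W2 @ D1 @ W1
    c = W3 @ D2 @ (W2 @ D1 @ b1 + b2) + b3
    return A, c

x = rng.standard_normal(3)
A, c = local_affine_map(x)
print(np.allclose(forward(x), A @ x + c))  # True: the affine map agrees with the network at x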
December 27, 2023
In this paper, we investigate the expressivity and approximation properties of deep neural networks employing the ReLU$^k$ activation function for $k \geq 2$. Although deep ReLU networks can approximate polynomials effectively, deep ReLU$^k$ networks can represent higher-degree polynomials exactly. Our initial contribution is a comprehensive, constructive proof of polynomial representation using deep ReLU$^k$ networks. This allows us to establish an uppe...
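A one-line instance of this kind of exact representation (a standard identity, spelled out here only to make the claim concrete; it is not claimed to be the paper's construction): writing $\sigma_k(t) = \max(0, t)^k$, a single hidden layer with two ReLU$^2$ units reproduces the square function exactly,
\[
\sigma_2(t) + \sigma_2(-t) = \max(0,t)^2 + \max(0,-t)^2 = t^2 \quad \text{for all } t \in \mathbb{R},
\]
and products then follow from the polarization identity $xy = \tfrac14\big((x+y)^2 - (x-y)^2\big)$, which is the basic mechanism by which deeper ReLU$^k$ networks assemble higher-degree polynomials exactly.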