November 9, 2001
Similar papers 4
September 13, 2020
Computational learning theory states that many classes of boolean formulas are learnable in polynomial time. This paper addresses the understudied subject of how, in practice, such formulas can be learned by deep neural networks. Specifically, we analyze boolean formulas associated with model-sampling benchmarks, combinatorial optimization problems, and random 3-CNFs with varying degrees of constrainedness. Our experiments indicate that: (i) neural learning generalizes better...
September 1, 2022
Deep neural networks use multiple layers of functions to map an object represented by an input vector progressively to different representations, and with sufficient training, eventually to a single score for each class that is the output of the final decision function. Ideally, in this output space, the objects of different classes achieve maximum separation. Motivated by the need to better understand the inner working of a deep neural network, we analyze the effectiveness o...
February 13, 2018
The learnability of different neural architectures can be characterized directly by computable measures of data complexity. In this paper, we reframe the problem of architecture selection as understanding how data determines the most expressive and generalizable architectures suited to that data, beyond inductive bias. After suggesting algebraic topology as a measure for data complexity, we show that the power of a network to express the topological complexity of a dataset in...
September 17, 2020
For supervised learning models, the analysis of generalization ability (generalizability) is vital because the generalizability expresses how well a model will perform on unseen data. Traditional generalization methods, such as the VC dimension, do not apply to deep neural network (DNN) models. Thus, new theories to explain the generalizability of DNNs are required. In this study, we hypothesize that the DNN with a simpler decision boundary has better generalizability by the ...
March 2, 2017
We suggest analyzing neural networks through the prism of space constraints. We observe that most training algorithms applied in practice use bounded memory, which enables us to use a new notion introduced in the study of space-time tradeoffs that we call mixing complexity. This notion was devised in order to measure the (in)ability to learn using a bounded-memory algorithm. In this paper we describe how we use mixing complexity to obtain new results on what can and cannot be...
September 25, 2019
Understanding the inductive bias of neural networks is critical to explaining their ability to generalise. Here, for one of the simplest neural networks -- a single-layer perceptron with n input neurons, one output neuron, and no threshold bias term -- we prove that upon random initialisation of weights, the a priori probability $P(t)$ that it represents a Boolean function that classifies t points in ${0,1}^n$ as 1 has a remarkably simple form: $P(t) = 2^{-n}$ for $0\leq t < ...
August 1, 2023
This manuscript includes some classical results we select apart from the new results we've found on the Analysis of Boolean Functions and Fourier-Entropy-Influence conjecture. We try to ensure the self-completeness of this work so that readers could probably read it independently. Among the new results, what is the most remarkable is that we prove that the entropy of a boolean function $f$ could be upper bounded by $O(I(f))+O(\sum\limits_{k}I_k(f)\log (1/I_k(f)))$.
September 5, 2016
Although neural networks are routinely and successfully trained in practice using simple gradient-based methods, most existing theoretical results are negative, showing that learning such networks is difficult, in a worst-case sense over all data distributions. In this paper, we take a more nuanced view, and consider whether specific assumptions on the "niceness" of the input distribution, or "niceness" of the target function (e.g. in terms of smoothness, non-degeneracy, inco...
May 28, 2022
We discuss ways in which tools from topology can be used to derive lower bounds for the circuit complexity of Boolean functions.
October 5, 2014
It is well-known that neural networks are computationally hard to train. On the other hand, in practice, modern day neural networks are trained efficiently using SGD and a variety of tricks that include different activation functions (e.g. ReLU), over-specification (i.e., train networks which are larger than needed), and regularization. In this paper we revisit the computational complexity of training neural networks from a modern perspective. We provide both positive and neg...