March 31, 2013
There are numerous examples of computation in nature, exemplified across multiple species in biology. One crucial aim of these computations across all life forms is the ability to learn and thereby increase the chance of survival. In the current paper a formal definition of autonomous learning is proposed. From that definition we establish a Turing Machine model for learning, in which rule tables can be added or deleted but cannot be modified. Sequential and pa...
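The add-or-delete-only constraint on rule tables can be illustrated with a minimal sketch; the class and method names below are hypothetical, chosen only to show a transition table that refuses in-place modification, and are not taken from the paper.

```python
# Sketch of a Turing-machine-style rule table where rules may be added or
# deleted but never overwritten in place. Names are illustrative only.

class RuleTable:
    def __init__(self):
        # (state, symbol) -> (next_state, write_symbol, move)
        self._rules = {}

    def add_rule(self, state, symbol, next_state, write_symbol, move):
        key = (state, symbol)
        if key in self._rules:
            raise ValueError("existing rules cannot be modified; delete first")
        self._rules[key] = (next_state, write_symbol, move)

    def delete_rule(self, state, symbol):
        self._rules.pop((state, symbol), None)

    def lookup(self, state, symbol):
        return self._rules.get((state, symbol))


table = RuleTable()
table.add_rule("q0", "1", "q0", "0", "R")   # flip 1 -> 0 and move right
table.delete_rule("q0", "1")                # deletion is allowed
table.add_rule("q0", "1", "q1", "1", "L")   # re-adding counts as a new rule, not a modification
```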
January 22, 2024
Determining the capacity $\alpha_c$ of the Binary Perceptron is a long-standing problem. Krauth and Mezard (1989) conjectured an explicit value of $\alpha_c$, approximately equal to $0.833$, and a rigorous lower bound matching this prediction was recently established by Ding and Sun (2019). Regarding the upper bound, Kim and Roche (1998) and Talagrand (1999) independently showed that $\alpha_c < 0.996$, while Krauth and Mezard outlined an argument which can be used to show that $...
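For intuition, the quantity in question can be probed numerically at small sizes: $\alpha_c$ is the threshold of $\alpha = P/N$ below which a weight vector $w \in \{\pm 1\}^N$ typically exists that classifies $P$ random $\pm 1$ patterns correctly. The brute-force enumeration below is a small-$N$ sketch only (finite-size effects smear the transition) and is not code from the cited works.

```python
import itertools
import numpy as np

def solvable(N, P, rng):
    """Is there a w in {-1,+1}^N with y_mu * (w . x_mu) > 0 for all P random patterns?"""
    X = rng.choice([-1, 1], size=(P, N))
    y = rng.choice([-1, 1], size=P)
    all_w = np.array(list(itertools.product([-1, 1], repeat=N)))   # all 2^N candidates
    margins = (all_w @ X.T) * y                                    # shape (2^N, P)
    return bool(np.any(np.all(margins > 0, axis=1)))

rng = np.random.default_rng(0)
N = 13                                         # odd N avoids zero margins
for alpha in (0.5, 0.833, 1.2):
    P = max(1, int(round(alpha * N)))
    frac = np.mean([solvable(N, P, rng) for _ in range(20)])
    print(f"alpha={alpha:.3f}  fraction solvable={frac:.2f}")
```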
October 5, 2014
It is well known that neural networks are computationally hard to train. On the other hand, in practice, modern-day neural networks are trained efficiently using SGD and a variety of tricks that include different activation functions (e.g., ReLU), over-specification (i.e., training networks that are larger than needed), and regularization. In this paper we revisit the computational complexity of training neural networks from a modern perspective. We provide both positive and neg...
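As a toy illustration of the practical recipe mentioned here (SGD, ReLU activations, over-specification, regularization), a hedged PyTorch sketch; the teacher/student setup, widths, and hyperparameters are arbitrary choices for demonstration, not those analysed in the paper.

```python
import torch
import torch.nn as nn

# Labels come from a small "teacher" network; the student is over-specified (much wider).
torch.manual_seed(0)
X = torch.randn(512, 20)
teacher = nn.Sequential(nn.Linear(20, 5), nn.ReLU(), nn.Linear(5, 1))
with torch.no_grad():
    y = (teacher(X) > 0).float()

student = nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.SGD(student.parameters(), lr=0.1, weight_decay=1e-4)  # weight decay as regularization
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(student(X), y)
    loss.backward()
    opt.step()
print(f"final training loss: {loss.item():.4f}")
```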
August 29, 1996
We investigate the VC-dimension of the perceptron and of simple two-layer networks such as the committee and parity machines, with weights restricted to the values $\pm1$. For binary inputs, the VC-dimension is determined by atypical pattern sets, i.e., it cannot be found by replica analysis or numerical Monte Carlo sampling. For small systems, exhaustive enumerations yield exact results. For systems that are too large for enumeration, number-theoretic arguments give lower bounds fo...
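The exhaustive-enumeration approach can be sketched directly: for a candidate pattern set, list every labelling realizable by a $\pm1$-weight perceptron and check whether all $2^d$ dichotomies appear. The snippet below is an illustrative small-scale check, not the authors' code; the tie-breaking convention for zero fields is an assumption.

```python
import itertools
import numpy as np

def realizable_dichotomies(patterns):
    """All labellings of `patterns` producible by sign(w.x) with w in {-1,+1}^N."""
    N = patterns.shape[1]
    labellings = set()
    for bits in itertools.product([-1, 1], repeat=N):
        w = np.array(bits)
        out = np.sign(patterns @ w)
        out[out == 0] = 1                    # tie-breaking convention
        labellings.add(tuple(out))
    return labellings

def is_shattered(patterns):
    return len(realizable_dichotomies(patterns)) == 2 ** len(patterns)

# Example: 3 binary (+-1) patterns in N = 5 dimensions.
rng = np.random.default_rng(1)
patterns = rng.choice([-1, 1], size=(3, 5))
print(is_shattered(patterns))
```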
September 1, 2022
Neural networks (NNs) struggle to efficiently solve certain problems, such as learning parities, even when simple learning algorithms exist for those problems. Can NNs discover learning algorithms on their own? We exhibit an NN architecture that, in polynomial time, learns as well as any efficient learning algorithm describable by a constant-sized program. For example, on parity problems, the NN learns as well as Gaussian elimination, an efficient algorithm that can be suc...
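The parity example is concrete: given noiseless samples $(x, \langle x, s\rangle \bmod 2)$, the hidden support $s$ can be recovered by Gaussian elimination over GF(2). A minimal self-contained sketch of that baseline algorithm (not the paper's architecture):

```python
import numpy as np

def solve_parity_gf2(X, y):
    """Recover s with X s = y (mod 2) via Gaussian elimination over GF(2)."""
    A = np.concatenate([X % 2, (y % 2).reshape(-1, 1)], axis=1).astype(np.uint8)
    m, n = X.shape
    row = 0
    for col in range(n):
        pivot = next((r for r in range(row, m) if A[r, col]), None)
        if pivot is None:
            continue
        A[[row, pivot]] = A[[pivot, row]]        # move pivot row up
        for r in range(m):
            if r != row and A[r, col]:
                A[r] ^= A[row]                   # eliminate column above and below
        row += 1
    s = np.zeros(n, dtype=np.uint8)
    for r in range(row):
        cols = np.flatnonzero(A[r, :n])
        if cols.size:
            s[cols[0]] = A[r, n]                 # free variables set to 0
    return s

rng = np.random.default_rng(0)
n = 16
s_true = rng.integers(0, 2, n)
X = rng.integers(0, 2, (4 * n, n))
y = X @ s_true % 2
print(np.array_equal(solve_parity_gf2(X, y), s_true))
```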
October 23, 2000
An overview is given of the statistical physics of neural networks that generate and analyse time series. Storage capacity, bit and sequence generation, prediction error, antipredictable sequences, interacting perceptrons and the application to the minority game are discussed. Finally, as a demonstration, a perceptron predicts bit sequences produced by human beings.
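The closing demonstration can be sketched as an online perceptron that predicts the next bit from a window of previous bits; in this sketch the human-generated sequence is replaced by a pseudo-random stand-in, and the window size and mistake-driven update are illustrative choices, not the setup of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, 2000)           # stand-in for a human-generated bit sequence
window = 8
w = np.zeros(window)
correct = 0

for t in range(window, len(bits)):
    x = 2 * bits[t - window:t] - 1        # map previous bits to +-1 inputs
    target = 2 * bits[t] - 1
    pred = 1 if w @ x >= 0 else -1
    correct += (pred == target)
    if pred != target:                    # perceptron rule: update only on mistakes
        w += target * x
print("online accuracy:", correct / (len(bits) - window))
```

On a truly random sequence the accuracy stays near 0.5; any regularity in the source (as with human-typed bits) would lift it above chance.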
September 15, 2017
Understanding the memory capacity of neural networks remains a challenging problem in implementing artificial intelligence systems. In this paper, we address the notion of capacity with respect to Hopfield networks and propose a dynamic approach to monitoring a network's capacity. We define capacity as the maximum number of stored patterns that can be retrieved when the network is probed with the stored patterns. Prior work in this area has presented static expressions d...
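A minimal numerical probe of this notion of capacity: store P random patterns with the Hebbian rule, present each stored pattern, and count how many are recovered as fixed points of the dynamics. The synchronous update and sign threshold below are common conventions, not the specific construction of the paper; the classical estimate that retrieval degrades around $P \approx 0.138\,N$ is the Amit-Gutfreund-Sompolinsky result.

```python
import numpy as np

def retrieved_fraction(N, P, rng, steps=20):
    """Store P random +-1 patterns Hebbian-style and probe with the patterns themselves."""
    patterns = rng.choice([-1, 1], size=(P, N))
    W = (patterns.T @ patterns) / N
    np.fill_diagonal(W, 0)
    hits = 0
    for p in patterns:
        s = p.copy()
        for _ in range(steps):                  # synchronous updates
            s = np.where(W @ s >= 0, 1, -1)
        hits += np.array_equal(s, p)
    return hits / P

rng = np.random.default_rng(0)
N = 200
for P in (10, 25, 40):                           # straddles the ~0.138 N classical estimate
    print(P, retrieved_fraction(N, P, rng))
```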
January 18, 1995
We analytically derive the geometrical structure of the weight space in multilayer neural networks (MLN), in terms of the volumes of couplings associated with the internal representations of the training set. Focusing on the parity and committee machines, we deduce their learning and generalization capabilities, both by reinterpreting some known properties and by finding new exact results. The relationship between our approach and information theory as well as the Mitchison--Durbin ca...
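For reference, the two architectures mentioned here differ only in how the hidden-unit signs (the internal representation) are combined: the committee machine takes a majority vote, the parity machine takes their product. A definitional sketch, assuming for simplicity a small fully connected first layer with $\pm1$ couplings:

```python
import numpy as np

def hidden_signs(W, x):
    """Signs of the K hidden units; W has shape (K, N), x has shape (N,)."""
    h = np.sign(W @ x)
    h[h == 0] = 1
    return h

def committee_output(W, x):
    return 1 if hidden_signs(W, x).sum() >= 0 else -1   # majority vote of hidden units

def parity_output(W, x):
    return int(np.prod(hidden_signs(W, x)))             # product of hidden-unit signs

rng = np.random.default_rng(0)
W = rng.choice([-1, 1], size=(3, 9))                     # K = 3 hidden units, N = 9 inputs
x = rng.choice([-1, 1], size=9)
print(committee_output(W, x), parity_output(W, x))
```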
December 2, 2022
We study the compute-optimal trade-off between model size and training data set size for large neural networks. Our result suggests a linear relation similar to the one supported by the empirical analysis of Chinchilla. While that work studies transformer-based large language models trained on the MassiveText corpus (Gopher), as a starting point for the development of a mathematical theory we focus on a simpler learning model and data-generating process, each based on a neural network wi...
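To make the trade-off concrete under a fixed compute budget: if training compute is approximated as $C \approx 6\,N\,D$ (a common rule of thumb for transformers) and the compute-optimal relation between parameters $N$ and tokens $D$ is linear, $N = k\,D$, then both scale as $\sqrt{C}$. The value of $k$ and the $6ND$ approximation below are illustrative assumptions, not results of this paper.

```python
import math

def compute_optimal_sizes(C, k):
    """Assume compute C ~ 6*N*D and a linear compute-optimal relation N = k*D."""
    D = math.sqrt(C / (6.0 * k))   # tokens
    N = k * D                      # parameters
    return N, D

for C in (1e21, 1e23):
    # k = 0.05 corresponds to roughly 20 tokens per parameter (illustrative value)
    N, D = compute_optimal_sizes(C, k=0.05)
    print(f"C={C:.0e}  N~{N:.2e} params  D~{D:.2e} tokens")
```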
April 27, 2023
Although different architectures of quantum perceptrons have recently been put forward, the capabilities of such quantum devices versus their classical counterparts remain debated. Here, we consider random patterns and targets independently distributed with biased probabilities and investigate the storage capacity of a continuous quantum perceptron model that admits a classical limit, thus facilitating the comparison of performances. This more general setting extends a prev...