April 1, 2001
Similar papers 4
April 11, 1997
The cooperative behaviour of interacting neurons and synapses is studied using models and methods from statistical physics. The competition between training error and entropy may lead to discontinuous properties of the neural network. This is demonstrated for a few examples: Perceptron, associative memory, learning from examples, generalization, multilayer networks, structure recognition, Bayesian estimate, on-line training, noise estimation and time series generation.
May 10, 2023
The $\textit{von Neumann Computer Architecture}$ has a distinction between computation and memory. In contrast, the brain has an integrated architecture where computation and memory are indistinguishable. Motivated by the architecture of the brain, we propose a model of $\textit{associative computation}$ where memory is defined by a set of vectors in $\mathbb{R}^n$ (that we call $\textit{anchors}$), computation is performed by convergence from an input vector to a nearest nei...
October 2, 2021
Current deep neural networks are highly overparameterized (up to billions of connection weights) and nonlinear. Yet they can fit data almost perfectly through variants of gradient descent algorithms and achieve unexpected levels of prediction accuracy without overfitting. These are formidable results that defy predictions of statistical learning and pose conceptual challenges for non-convex optimization. In this paper, we use methods from statistical physics of disordered sys...
July 26, 2023
In this work, we bound a machine's ability to learn based on computational limitations implied by physicality. We start by considering the information processing capacity (IPC), a normalized measure of the expected squared error of a collection of signals to a complete basis of functions. We use the IPC to measure the degradation under noise of the performance of reservoir computers, a particular kind of recurrent network, when constrained by physical considerations. First, w...
October 22, 2019
We develop a statistical mechanical approach based on the replica method to study the design space of deep and wide neural networks constrained to meet a large number of training data. Specifically, we analyze the configuration space of the synaptic weights and neurons in the hidden layers in a simple feed-forward perceptron network for two scenarios: a setting with random inputs/outputs and a teacher-student setting. By increasing the strength of constraints,~i.e. increasing...
November 5, 1996
We demonstrate that the fraction of pattern sets that can be stored in single- and hidden-layer perceptrons exhibits finite size scaling. This feature allows to estimate the critical storage capacity \alpha_c from simulations of relatively small systems. We illustrate this approach by determining \alpha_c, together with the finite size scaling exponent \nu, for storing Gaussian patterns in committee and parity machines with binary couplings and up to K=5 hidden units.
September 17, 2023
In these pedagogic notes I review the statistical mechanics approach to neural networks, focusing on the paradigmatic example of the perceptron architecture with binary an continuous weights, in the classification setting. I will review the Gardner's approach based on replica method and the derivation of the SAT/UNSAT transition in the storage setting. Then, I discuss some recent works that unveiled how the zero training error configurations are geometrically arranged, and ho...
July 15, 2002
The performance of a lossy data compression scheme for uniformly biased Boolean messages is investigated via methods of statistical mechanics. Inspired by a formal similarity to the storage capacity problem in the research of neural networks, we utilize a perceptron of which the transfer function is appropriately designed in order to compress and decode the messages. Employing the replica method, we analytically show that our scheme can achieve the optimal performance known i...
August 1, 2023
We investigate the descriptive complexity of a class of neural networks with unrestricted topologies and piecewise polynomial activation functions. We consider the general scenario where the running time is unlimited and floating-point numbers are used for simulating reals. We characterize these neural networks with a rule-based logic for Boolean networks. In particular, we show that the sizes of the neural networks and the corresponding Boolean rule formulae are polynomially...
March 29, 2013
We generalize recent theoretical work on the minimal number of layers of narrow deep belief networks that can approximate any probability distribution on the states of their visible units arbitrarily well. We relax the setting of binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010; Mont\'ufar and Ay, 2011) to units with arbitrary finite state spaces, and the vanishing approximation error to an arbitrary approximation error tolerance. For example, we show ...