ID: cond-mat/0104011

Multilayer neural networks with extensively many hidden units

April 1, 2001

Similar papers 4

Phase Transitions of Neural Networks

April 11, 1997

83% Match
Wolfgang Kinzel
Disordered Systems and Neural Networks

The cooperative behaviour of interacting neurons and synapses is studied using models and methods from statistical physics. The competition between training error and entropy may lead to discontinuous properties of the neural network. This is demonstrated for a few examples: Perceptron, associative memory, learning from examples, generalization, multilayer networks, structure recognition, Bayesian estimate, on-line training, noise estimation and time series generation.

On the Information Capacity of Nearest Neighbor Representations

May 10, 2023

83% Match
Kordag Mehmet Kilic, Jin Sima, Jehoshua Bruck
cs.CC
cs.DM
cs.IT
cs.LG
cs.NE
math.IT

The $\textit{von Neumann Computer Architecture}$ has a distinction between computation and memory. In contrast, the brain has an integrated architecture where computation and memory are indistinguishable. Motivated by the architecture of the brain, we propose a model of $\textit{associative computation}$ where memory is defined by a set of vectors in $\mathbb{R}^n$ (that we call $\textit{anchors}$), computation is performed by convergence from an input vector to a nearest nei...
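As a rough illustration of the associative-computation model described in this abstract, the sketch below retrieves the anchor nearest to an input vector; the Euclidean metric, the toy anchors, and the function name are illustrative assumptions rather than the paper's exact construction.

```python
# Minimal sketch of associative computation: memory is a set of anchor
# vectors in R^n, and "computation" is convergence from an input vector
# to its nearest anchor. Euclidean distance is an assumption here.
import numpy as np

def nearest_anchor(x, anchors):
    """Return the anchor closest to x under the Euclidean metric."""
    dists = np.linalg.norm(anchors - x, axis=1)
    return anchors[np.argmin(dists)]

# Example: three anchors in R^2 acting as stored memories.
anchors = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [-1.0, -1.0]])
noisy_input = np.array([0.9, 0.2])          # corrupted version of the first anchor
print(nearest_anchor(noisy_input, anchors))  # -> [1. 0.]
```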

Learning through atypical "phase transitions" in overparameterized neural networks

October 2, 2021

83% Match
Carlo Baldassi, Clarissa Lauditi, Enrico M. Malatesta, Rosalba Pacelli, ... , Riccardo Zecchina
Machine Learning
Disordered Systems and Neural Networks
Probability

Current deep neural networks are highly overparameterized (up to billions of connection weights) and nonlinear. Yet they can fit data almost perfectly through variants of gradient descent algorithms and achieve unexpected levels of prediction accuracy without overfitting. These are formidable results that defy predictions of statistical learning and pose conceptual challenges for non-convex optimization. In this paper, we use methods from statistical physics of disordered sys...

Limits to Reservoir Learning

July 26, 2023

83% Match
Anthony M. Polloreno
Machine Learning
Information Theory

In this work, we bound a machine's ability to learn based on computational limitations implied by physicality. We start by considering the information processing capacity (IPC), a normalized measure of the expected squared error of a collection of signals to a complete basis of functions. We use the IPC to measure the degradation under noise of the performance of reservoir computers, a particular kind of recurrent network, when constrained by physical considerations. First, w...
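The sketch below illustrates the capacity measure behind the IPC in its simplest special case: the linear short-term memory capacity of a small random reservoir. The reservoir size, spectral radius, and delay range are arbitrary choices for illustration; the full IPC sums analogous squared-correlation terms over a complete basis of functions of past inputs.

```python
# Rough sketch: linear short-term memory capacity of a random reservoir
# (echo state network), a special case of the information processing
# capacity restricted to linear delay functions of the input.
import numpy as np

rng = np.random.default_rng(0)
N, T, washout = 100, 5000, 200
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # set spectral radius to 0.9
w_in = rng.normal(size=N)

u = rng.uniform(-1, 1, size=T)                    # i.i.d. input signal
x = np.zeros((T, N))
for t in range(1, T):
    x[t] = np.tanh(W @ x[t - 1] + w_in * u[t])

X = x[washout:]
capacity = 0.0
for k in range(1, 40):                            # capacities for delays 1..39
    y = u[washout - k:T - k]                      # target: input delayed by k steps
    w_out, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ w_out
    capacity += np.corrcoef(y, y_hat)[0, 1] ** 2  # squared correlation = C_k
print(f"summed linear memory capacity ~ {capacity:.1f} (bounded by N = {N})")
```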

From complex to simple : hierarchical free-energy landscape renormalized in deep neural networks

October 22, 2019

83% Match
Hajime Yoshino
Disordered Systems and Neural Networks
Statistical Mechanics
Machine Learning

We develop a statistical mechanical approach based on the replica method to study the design space of deep and wide neural networks constrained to meet a large number of training data. Specifically, we analyze the configuration space of the synaptic weights and neurons in the hidden layers in a simple feed-forward perceptron network for two scenarios: a setting with random inputs/outputs and a teacher-student setting. By increasing the strength of constraints, i.e. increasing...

Finite size scaling in neural networks

November 5, 1996

83% Match
Walter Nadler (Institut fuer Theoretische Chemie, Universitaet Tuebingen), Wolfgang Fink (Institut fuer Theoretische Physik, Universitaet Tuebingen)
Disordered Systems and Neural Networks
Adaptation and Self-Organizing Systems

We demonstrate that the fraction of pattern sets that can be stored in single- and hidden-layer perceptrons exhibits finite size scaling. This feature allows one to estimate the critical storage capacity $\alpha_c$ from simulations of relatively small systems. We illustrate this approach by determining $\alpha_c$, together with the finite size scaling exponent $\nu$, for storing Gaussian patterns in committee and parity machines with binary couplings and up to $K=5$ hidden units.
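For orientation, the sketch below shows the kind of simulation such an estimate rests on, using a plain perceptron with continuous weights rather than the binary-coupling committee and parity machines studied in the paper: one measures the fraction of random pattern sets that are storable at several sizes $N$ and loads $\alpha$, and locates $\alpha_c$ from where the curves for different $N$ cross under finite size scaling.

```python
# Illustrative sketch (not the paper's binary-coupling machines): estimate
# the probability P(alpha, N) that alpha*N randomly labelled Gaussian
# patterns are storable by a perceptron with continuous weights. The curves
# for different N cross near the critical capacity (alpha_c = 2 here, by
# Cover's theorem for the continuous-weight perceptron).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)

def separable(X, y):
    """Check strict linear separability via the LP feasibility problem
    y_mu * (w . x_mu) >= 1 for all mu, with w unconstrained."""
    P, N = X.shape
    res = linprog(c=np.zeros(N),
                  A_ub=-(y[:, None] * X), b_ub=-np.ones(P),
                  bounds=[(None, None)] * N, method="highs")
    return res.status == 0          # 0 = a feasible solution was found

for N in (20, 40):
    for alpha in (1.8, 2.0, 2.2):
        P = int(alpha * N)
        trials = 200
        hits = sum(separable(rng.normal(size=(P, N)),
                             rng.choice([-1.0, 1.0], size=P))
                   for _ in range(trials))
        print(f"N={N:3d} alpha={alpha:.1f}  P(storable) ~ {hits / trials:.2f}")
```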

High-dimensional manifold of solutions in neural networks: insights from statistical physics

September 17, 2023

83% Match
Enrico M. Malatesta
Disordered Systems and Neural Networks
Machine Learning
Probability
Statistics Theory

In these pedagogic notes I review the statistical mechanics approach to neural networks, focusing on the paradigmatic example of the perceptron architecture with binary and continuous weights, in the classification setting. I will review Gardner's approach based on the replica method and the derivation of the SAT/UNSAT transition in the storage setting. Then, I discuss some recent works that unveiled how the zero training error configurations are geometrically arranged, and ho...
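For readers who want the formulas behind this summary, the block below states the standard Gardner volume and the resulting storage (SAT/UNSAT) capacity of the spherical perceptron; these are textbook results quoted here for orientation, not extracted from the notes themselves.

```latex
% Fractional volume of couplings on the sphere |w|^2 = N that classify
% P = \alpha N random patterns (\xi^\mu, y^\mu) with margin \kappa:
\begin{align}
  V(\kappa) = \frac{\int d\mathbf{w}\,\delta(\mathbf{w}\cdot\mathbf{w}-N)\,
      \prod_{\mu=1}^{\alpha N}\Theta\!\left(\tfrac{y^\mu\,\mathbf{w}\cdot\boldsymbol{\xi}^\mu}{\sqrt{N}}-\kappa\right)}
      {\int d\mathbf{w}\,\delta(\mathbf{w}\cdot\mathbf{w}-N)} .
\end{align}
% The replica-symmetric evaluation of the quenched average of \ln V gives
% the storage (SAT/UNSAT) capacity of the spherical perceptron,
\begin{align}
  \alpha_c(\kappa) = \left[\int_{-\kappa}^{\infty}\!Dt\,(t+\kappa)^2\right]^{-1},
  \qquad Dt = \frac{e^{-t^2/2}}{\sqrt{2\pi}}\,dt,
  \qquad \alpha_c(0) = 2,
\end{align}
% while for binary (Ising) couplings the corresponding capacity is
% \alpha_c \approx 0.83 (Krauth and M\'ezard).
```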

Statistical mechanics of lossy data compression using a non-monotonic perceptron

July 15, 2002

83% Match
T. Hosaka, Y. Kabashima, H. Nishimori
Statistical Mechanics

The performance of a lossy data compression scheme for uniformly biased Boolean messages is investigated via methods of statistical mechanics. Inspired by a formal similarity to the storage capacity problem in the research of neural networks, we utilize a perceptron whose transfer function is appropriately designed in order to compress and decode the messages. Employing the replica method, we analytically show that our scheme can achieve the optimal performance known i...

Descriptive complexity for neural networks via Boolean networks

August 1, 2023

83% Match
Veeti Ahvonen, Damian Heiman, Antti Kuusisto
Computational Complexity
Logic in Computer Science

We investigate the descriptive complexity of a class of neural networks with unrestricted topologies and piecewise polynomial activation functions. We consider the general scenario where the running time is unlimited and floating-point numbers are used for simulating reals. We characterize these neural networks with a rule-based logic for Boolean networks. In particular, we show that the sizes of the neural networks and the corresponding Boolean rule formulae are polynomially...

Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units

March 29, 2013

83% Match
Guido F. Montúfar
Machine Learning
Probability

We generalize recent theoretical work on the minimal number of layers of narrow deep belief networks that can approximate any probability distribution on the states of their visible units arbitrarily well. We relax the setting of binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010; Montúfar and Ay, 2011) to units with arbitrary finite state spaces, and the vanishing approximation error to an arbitrary approximation error tolerance. For example, we show ...
