April 1, 2001
Similar papers 3
January 17, 2020
A robust theoretical framework that can describe and predict the generalization ability of deep neural networks (DNNs) in general circumstances remains elusive. Classical attempts have produced complexity metrics that rely heavily on global measures of compactness and capacity with little investigation into the effects of sub-component collaboration. We demonstrate intriguing regularities in the activation patterns of the hidden nodes within fully-connected feedforward networ...
April 25, 2022
Neuronal network computation and computation by avalanche supporting networks are of interest to the fields of physics, computer science (computation theory as well as statistical or machine learning) and neuroscience. Here we show that computation of complex Boolean functions arises spontaneously in threshold networks as a function of connectivity and antagonism (inhibition), computed by logic automata (motifs) in the form of computational cascades. We explain the emergent i...
July 19, 2019
There is some theoretical evidence that deep neural networks with multiple hidden layers have a potential for more efficient representation of multidimensional mappings than shallow networks with a single hidden layer. The question is whether it is possible to exploit this theoretical advantage for finding such representations with help of numerical training methods. Tests using prototypical problems with a known mean square minimum did not confirm this hypothesis. Minima fou...
August 20, 2017
We derive the calculation of two critical numbers predicting the behavior of perceptron networks. First, we derive the calculation of what we call the lossless memory (LM) dimension. The LM dimension is a generalization of the Vapnik--Chervonenkis (VC) dimension that avoids structured data and therefore provides an upper bound for perfectly fitting almost any training data. Second, we derive what we call the MacKay (MK) dimension. This limit indicates a 50% chance of not bein...
July 30, 2013
This paper examines the memory capacity of generalized neural networks. Hopfield networks trained with a variety of learning techniques are investigated for their capacity both for binary and non-binary alphabets. It is shown that the capacity can be much increased when multilevel inputs are used. New learning strategies are proposed to increase Hopfield network capacity, and the scalability of these methods is also examined in respect to size of the network. The ability to r...
March 15, 2017
Restricted Boltzmann Machines are key tools in Machine Learning and are described by the energy function of bipartite spin-glasses. From a statistical mechanical perspective, they share the same Gibbs measure of Hopfield networks for associative memory. In this equivalence, weights in the former play as patterns in the latter. As Boltzmann machines usually require real weights to be trained with gradient descent like methods, while Hopfield networks typically store binary pat...
January 29, 2002
The article is a lightly edited version of my habilitation thesis at the University Wuerzburg. My aim is to give a self contained, if concise, introduction to the formal methods used when off-line learning in feedforward networks is analyzed by statistical physics. However, due to its origin, the article is not a comprehensive review of the field but is highly skewed towards reporting my own research.
September 20, 2024
Recent years have been marked with the fast-pace diversification and increasing ubiquity of machine learning applications. Yet, a firm theoretical understanding of the surprising efficiency of neural networks to learn from high-dimensional data still proves largely elusive. In this endeavour, analyses inspired by statistical physics have proven instrumental, enabling the tight asymptotic characterization of the learning of neural networks in high dimensions, for a broad class...
May 26, 2003
The time evolution of an exactly solvable layered feedforward neural network with three-state neurons and optimizing the mutual information is studied for arbitrary synaptic noise (temperature). Detailed stationary temperature-capacity and capacity-activity phase diagrams are obtained. The model exhibits pattern retrieval, pattern-fluctuation retrieval and spin-glass phases. It is found that there is an improved performance in the form of both a larger critical capacity and i...
March 16, 2022
Balancing model complexity against the information contained in observed data is the central challenge to learning. In order for complexity-efficient models to exist and be discoverable in high dimensions, we require a computational framework that relates a credible notion of complexity to simple parameter representations. Further, this framework must allow excess complexity to be gradually removed via gradient-based optimization. Our n-ary, or n-argument, activation function...