Similar papers
July 22, 2020
The expressive power of artificial neural networks crucially depends on the nonlinearity of their activation functions. Though a wide variety of nonlinear activation functions have been proposed for use in artificial neural networks, a detailed understanding of their role in determining the expressive power of a network has not emerged. Here, we study how activation functions affect the storage capacity of treelike two-layer networks. We relate the boundedness or divergence o...
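For orientation, a treelike two-layer network of the sort studied here splits an $N$-dimensional input into $K$ disjoint branches $\mathbf{x}_k$, each feeding one hidden unit (this parameterization is the standard committee-machine convention, given here for illustration rather than as the authors' exact notation):

$$ \hat{y}(\mathbf{x}) \;=\; \mathrm{sign}\!\left( \sum_{k=1}^{K} g\!\left( \mathbf{w}_k \cdot \mathbf{x}_k \right) \right), $$

where $g$ is the activation function whose boundedness or divergence is at issue, and the storage capacity is the largest number of random input-label pairs per weight that such a network can typically fit.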
July 17, 2019
Rectified Linear Units (ReLU) have become the main model for the neural units in current deep learning systems. This choice was originally suggested as a way to compensate for the so-called vanishing gradient problem, which can undercut stochastic gradient descent (SGD) learning in networks composed of multiple layers. Here we provide analytical results on the effects of ReLUs on the capacity and on the geometrical landscape of the solution space in two-layer neural netwo...
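As a concrete reminder of the setup, here is a minimal NumPy sketch of a two-layer network with ReLU hidden units and a binary sign output; the variable names and the fixed, non-negative output weights are illustrative assumptions, not the paper's definitions.

    import numpy as np

    def relu(z):
        # Rectified Linear Unit: elementwise max(0, z)
        return np.maximum(0.0, z)

    def two_layer_sign_output(x, W1, w2):
        # x: input vector of length N
        # W1: hidden-layer weights, shape (K, N); w2: output weights, shape (K,)
        # Binary output: sign of a weighted sum of ReLU hidden activations.
        return np.sign(w2 @ relu(W1 @ x))

    rng = np.random.default_rng(0)
    N, K = 100, 5
    x = rng.standard_normal(N)
    W1 = rng.standard_normal((K, N)) / np.sqrt(N)
    w2 = np.ones(K)                      # fixed non-negative output weights
    print(two_layer_sign_output(x, W1, w2))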
August 16, 2020
Symmetric functions, which take as input an unordered, fixed-size set, are known to be universally representable by neural networks that enforce permutation invariance. These architectures only give guarantees for fixed input sizes, yet in many practical applications, including point clouds and particle physics, a relevant notion of generalization should include varying the input size. In this work we treat symmetric functions (of any size) as functions over probability measu...
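The permutation-invariant architectures in question are typically of the sum-decomposition form $f(X) = \rho\big(\sum_i \phi(x_i)\big)$. The toy sketch below (my own illustration, with arbitrary random feature maps) shows why such an output cannot depend on the ordering of the set:

    import numpy as np

    rng = np.random.default_rng(1)
    PHI_W = rng.standard_normal((3, 8))   # per-element map: R^3 -> R^8
    RHO_W = rng.standard_normal(8)        # readout: R^8 -> R

    def symmetric_f(X):
        # X: array of shape (n, 3), one row per set element.
        # Summing the embeddings over rows makes the result invariant
        # to any permutation of the elements.
        pooled = np.tanh(X @ PHI_W).sum(axis=0)
        return pooled @ RHO_W

    X = rng.standard_normal((5, 3))
    assert np.isclose(symmetric_f(X), symmetric_f(X[::-1]))   # order does not matter

Summing (rather than, say, averaging) is one choice of pooling; how such choices behave as the set size $n$ varies is exactly the kind of question the abstract raises.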
February 11, 2022
Recent progress in Machine Learning has opened the door to actual applications of learning algorithms, but also to new research directions, both within Machine Learning itself and at its interfaces with other disciplines. The case that interests us is the interface with physics, and more specifically Statistical Physics. In this short lecture, I first present a brief introduction to Machine Learning from the angle of neural networks. After explaining quickl...
January 31, 2002
The capacity with which a system of independent neuron-like units represents a given set of stimuli is studied by calculating the mutual information between the stimuli and the neural responses. Both discrete noiseless and continuous noisy neurons are analyzed. In both cases, the information grows monotonically with the number of neurons considered. Under the assumption that neurons are independent, the mutual information rises linearly from zero, and approaches exponentially...
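The quantity being computed is the standard mutual information between stimuli $S$ and responses $R$; for discrete variables,

$$ I(S;R) \;=\; \sum_{s,r} p(s,r)\,\log_2 \frac{p(s,r)}{p(s)\,p(r)}, $$

which is bounded above by the stimulus entropy $H(S)$; presumably this ceiling is what the truncated sentence describes the information approaching exponentially.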
April 21, 2020
An autoencoder is a layered neural network whose structure can be viewed as consisting of an encoder, which compresses an input vector of dimension $D$ to a vector of low dimension $d$, and a decoder which transforms the low-dimensional vector back to the original input vector (or one that is very similar). In this paper we explore the compressive power of autoencoders that are Boolean threshold networks by studying the numbers of nodes and layers that are required to ensure ...
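For concreteness, a Boolean threshold unit fires when its weighted input sum reaches its threshold. A toy encoder built from such units, with hand-picked weights purely for illustration and $D=4$, $d=2$, looks like this:

    import numpy as np

    def threshold_layer(x, W, theta):
        # Boolean threshold units: output bit j fires iff (W @ x)[j] >= theta[j].
        return (W @ x >= theta).astype(int)

    # Encoder compressing D = 4 input bits to d = 2 code bits: each code bit
    # reports whether both bits in its half of the input are on.
    W_enc = np.array([[1, 1, 0, 0],
                      [0, 0, 1, 1]])
    theta_enc = np.array([2, 2])

    x = np.array([1, 1, 0, 0])
    print(threshold_layer(x, W_enc, theta_enc))   # -> [1 0]

How many such nodes and layers are needed before a decoder can recover the input (or a close approximation of it) is the question the paper addresses.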
March 8, 1999
The subject of study is a neural network with binary neurons, randomly diluted synapses and variable pattern activity. We look at the system with parallel updating, using a probabilistic approach to solve the one-step dynamics with one condensed pattern. We derive restrictions on the storage capacity and the mutual information content occurring during the retrieval process. Special focus is on the constraints on the threshold for optimal performance. We also look at the effect
March 9, 2015
Deep Neural Networks (DNNs) are analyzed via the theoretical framework of the information bottleneck (IB) principle. We first show that any DNN can be quantified by the mutual information between the layers and the input and output variables. Using this representation we can calculate the optimal information theoretic limits of the DNN and obtain finite sample generalization bounds. The advantage of getting closer to the theoretical limit is quantifiable both by the generaliz...
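In its usual Lagrangian form, the information bottleneck trades compression of the input $X$ against preservation of information about the output variable $Y$ in a representation $T$ (here, a layer):

$$ \min_{p(t\mid x)} \; I(X;T) \;-\; \beta\, I(T;Y), $$

where the minimization runs over the stochastic encoder $p(t\mid x)$ and $\beta$ sets the trade-off between compressing the input and predicting the output.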
September 3, 1999
During the last few years, an area of active research in the field of complex systems has been their information storing and processing abilities. Common opinion has it that the most interesting behaviour of these systems is found "at the edge of chaos", which would seem to suggest that complex systems may have inherently non-trivial information processing abilities in the vicinity of sharp phase transitions. A comprehensive, quantitative understanding of why this is the c...
November 28, 2019
We consider a three-layer Sejnowski machine and show that features learnt via contrastive divergence have a dual representation as patterns in a dense associative memory of order $P=4$. The latter is known to be able to Hebbian-store a number of patterns scaling as $N^{P-1}$, where $N$ denotes the number of constituting binary neurons interacting $P$-wise. We also prove that, by keeping the dense associative network far from the saturation regime (namely, allowing for a number of ...
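For reference, a dense associative memory of order $P$ is conventionally defined (following Krotov and Hopfield; the normalization here is illustrative) by an energy with $P$-wise interactions between the $N$ binary neurons $\sigma_i$ and the $M$ stored patterns $\xi^{\mu}$:

$$ E(\boldsymbol{\sigma}) \;=\; -\sum_{\mu=1}^{M} \left( \sum_{i=1}^{N} \xi_i^{\mu} \sigma_i \right)^{P}, $$

and with Hebbian storage the number of retrievable patterns $M$ scales as $N^{P-1}$, which is the scaling quoted above.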