June 4, 2019
Similar papers 5
June 28, 2021
Representations of the world environment play a crucial role in artificial intelligence. It is often inefficient to conduct reasoning and inference directly in the space of raw sensory representations, such as pixel values of images. Representation learning allows us to automatically discover suitable representations from raw sensory data. For example, given raw sensory data, a deep neural network learns nonlinear representations at its hidden layers, which are subsequently u...
December 30, 2020
Deep learning algorithms are responsible for a technological revolution in a variety of tasks including image recognition or Go playing. Yet, why they work is not understood. Ultimately, they manage to classify data lying in high dimension -- a feat generically impossible due to the geometry of high dimensional space and the associated curse of dimensionality. Understanding what kind of structure, symmetry or invariance makes data such as images learnable is a fundamental cha...
May 23, 2016
In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015. With no unrealistic assumption, we first prove the following statements for the squared loss function of deep linear neural networks with any depth and any widths: 1) the function is non-convex and non-concave, 2) every local minimum is a global minimum, 3) every critical point that is not a global minimum is a saddle po...
January 2, 2018
In this paper, we analyze deep learning from a mathematical point of view and derive several novel results. The results are based on intriguing mathematical properties of high dimensional spaces. We first look at perturbation based adversarial examples and show how they can be understood using topological and geometrical arguments in high dimensions. We point out mistake in an argument presented in prior published literature, and we present a more rigorous, general and correc...
November 24, 2015
Deep learning takes advantage of large datasets and computationally efficient training algorithms to outperform other approaches at various machine learning tasks. However, imperfections in the training phase of deep neural networks make them vulnerable to adversarial samples: inputs crafted by adversaries with the intent of causing deep neural networks to misclassify. In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a nove...
November 30, 2017
Deep CNNs are known to exhibit the following peculiarity: on the one hand they generalize extremely well to a test set, while on the other hand they are extremely sensitive to so-called adversarial perturbations. The extreme sensitivity of high performance CNNs to adversarial examples casts serious doubt that these networks are learning high level abstractions in the dataset. We are concerned with the following question: How can a deep CNN that does not learn any high level s...
November 6, 2017
Deep learning has achieved a great success in many areas, from computer vision to natural language processing, to game playing, and much more. Yet, what deep learning is really doing is still an open question. There are a lot of works in this direction. For example, [5] tried to explain deep learning by group renormalization, and [6] tried to explain deep learning from the view of functional approximation. In order to address this very crucial question, here we see deep learn...
June 22, 2018
Like any field of empirical science, AI may be approached axiomatically. We formulate requirements for a general-purpose, human-level AI system in terms of postulates. We review the methodology of deep learning, examining the explicit and tacit assumptions in deep learning research. Deep Learning methodology seeks to overcome limitations in traditional machine learning research as it combines facets of model richness, generality, and practical applicability. The methodology s...
August 12, 2024
The unwavering success of deep learning in the past decade led to the increasing prevalence of deep learning methods in various application fields. However, the downsides of deep learning, most prominently its lack of trustworthiness, may not be compatible with safety-critical or high-responsibility applications requiring stricter performance guarantees. Recently, several instances of deep learning applications have been shown to be subject to theoretical limitations of compu...
October 1, 2019
We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike. In this work, we: (1) prove the widespread existence of suboptimal local minima in the loss landscape of neural networks, and we use our theory to find examples; (2) show that small-norm parameters are not optimal for generalization; (3) demonstrate that ResNets do not conform to wide-network theories, such as the neural tangent kernel, and that the inte...