Similar papers
July 20, 2016
Next-generation deep neural networks for classification hosted on embedded platforms will rely on fast, efficient, and accurate learning algorithms. Initialization of weights in learning networks has a great impact on classification accuracy. In this paper we focus on deriving good initial weights by modeling the error function of a deep neural network as a high-dimensional landscape. We observe that due to the inherent complexity in its algebraic structure, such an error...
July 18, 2023
Deep neural networks exhibit a fascinating spectrum of phenomena ranging from predictable scaling laws to the unpredictable emergence of new capabilities as a function of training time, dataset size and network size. Analysis of these phenomena has revealed the existence of concepts and algorithms encoded within the learned representations of these networks. While significant strides have been made in explaining observed phenomena separately, a unified framework for understan...
August 1, 2018
Deep neural networks are workhorse models in machine learning with multiple layers of non-linear functions composed in series. Their loss function is highly non-convex, yet empirically even gradient descent minimisation is sufficient to arrive at accurate and predictive models. It is hitherto unknown why deep neural networks are easily optimizable. We analyze the energy landscape of a spin glass model of deep neural networks using random matrix theory and algebraic geometry. ...
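The truncated abstract does not specify the model, but the spherical spin-glass Hamiltonian commonly used as a proxy for the loss of a depth-$H$ network in this line of work takes the following form (an illustrative assumption on my part, not a quotation from the paper):

\[
  \mathcal{L}(w) \;=\; \frac{1}{\Lambda^{(H-1)/2}} \sum_{i_1,\dots,i_H=1}^{\Lambda} X_{i_1 \dots i_H}\, \prod_{k=1}^{H} w_{i_k}, \qquad \sum_{i=1}^{\Lambda} w_i^2 = \Lambda,
\]

where the couplings $X_{i_1 \dots i_H}$ are i.i.d. standard Gaussians and $w$ is constrained to the sphere. Critical points of such a landscape are tractable with random matrix techniques, which is what makes the spin-glass proxy attractive.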
June 12, 2019
In the past decade, deep neural networks (DNNs) came to the fore as the leading machine learning algorithms for a variety of tasks. Their rise was founded on market needs and engineering craftsmanship, the latter based more on trial and error than on theory. While still far behind the application forefront, the theoretical study of DNNs has recently made important advancements in analyzing the highly over-parameterized regime where some exact results have been obtained. Leve...
January 21, 2022
We review recent efforts to machine learn relations between knot invariants. Because these knot invariants have meaning in physics, we explore aspects of Chern-Simons theory and higher dimensional gauge theories. The goal of this work is to translate numerical experiments with Big Data to new analytic results.
April 17, 2022
The goal of identifying the Standard Model of particle physics and its extensions within string theory has been one of the principal driving forces in string phenomenology. Recently, the incorporation of artificial intelligence in string theory and certain theoretical advancements have brought to light unexpected solutions to mathematical hurdles that have so far hindered progress in this direction. In this review we focus on model building efforts in the context of the $E_8\...
July 30, 2020
We revisit the question of predicting both Hodge numbers $h^{1,1}$ and $h^{2,1}$ of complete intersection Calabi-Yau (CICY) 3-folds using machine learning (ML), considering both the old and new datasets built respectively by Candelas-Dale-Lutken-Schimmrigk / Green-H\"ubsch-Lutken and by Anderson-Gao-Gray-Lee. In real-world applications, implementing an ML system rarely reduces to feeding the raw data to the algorithm. Instead, the typical workflow starts with an exploratory dat...
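A minimal sketch of the kind of workflow the abstract alludes to: a quick exploratory pass over the features, then a regression model predicting a Hodge number from a flattened CICY configuration matrix. The random data and the random-forest regressor below are stand-ins of my choosing, not the authors' actual dataset or pipeline.

# Sketch: exploratory look at the features, then regression on placeholder
# CICY-style data. Replace X and y with real configuration matrices and
# Hodge-number labels to reproduce the genuine workflow.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Placeholder features: flattened, zero-padded configuration matrices.
X = rng.integers(0, 5, size=(500, 12 * 15)).astype(float)
# Placeholder target standing in for h^{1,1}; real labels come from the dataset.
y = X[:, :12].sum(axis=1) + rng.normal(scale=0.5, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Exploratory step: inspect basic statistics before modelling.
print("feature mean/std:", X_train.mean(), X_train.std())

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("R^2 on held-out data:", r2_score(y_test, model.predict(X_test)))

On the real datasets one would also validate against the known Hodge numbers and compare several model classes, as the exploratory-analysis framing of the abstract suggests.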
April 27, 2022
We present a statistical approach for the discovery of relationships between mathematical entities that is based on linear regression and deep learning with fully connected artificial neural networks. The strategy is applied to computational knot data and empirical connections between combinatorial and hyperbolic knot invariants are revealed.
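The two methods named here, linear regression and fully connected networks, can be sketched side by side. The synthetic inputs below stand in for combinatorial knot invariants and the target for a hyperbolic invariant such as the volume; the actual knot data and feature choices are the paper's, not reproduced here.

# Sketch: fit a linear model and a small fully connected network on
# placeholder invariant data, then compare held-out R^2 scores.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 6))  # placeholder combinatorial invariants
# Placeholder "hyperbolic volume" with one deliberately nonlinear term.
y = 2.0 * X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.1, size=1000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

linear = LinearRegression().fit(X_tr, y_tr)
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                   random_state=1).fit(X_tr, y_tr)

# A gap between the two scores hints that the relationship between the
# invariants is genuinely nonlinear, which is the kind of empirical signal
# such a statistical approach looks for.
print("linear R^2:", linear.score(X_te, y_te))
print("MLP    R^2:", mlp.score(X_te, y_te))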
May 9, 2021
We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the pro...
May 2, 2019
Artificial Intelligence (AI), defined in its most simple form, is a technological tool that makes machines intelligent. Since learning is at the core of intelligence, machine learning poses itself as a core sub-field of AI. Deep learning, in turn, emerged as a subclass of machine learning to address the limitations of its predecessors. AI has acquired its prominence over the past few years owing to its considerable progress in various fields. AI has vastly in...