January 25, 2018
The backpropagation algorithm, originally introduced in the 1970s, is the workhorse of learning in neural networks. It makes use of gradient descent, a first-order iterative optimization algorithm for finding the minimum of a function. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate...
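As a minimal, hedged sketch of the update rule just described (the objective, its gradient, the learning rate, and the starting point below are illustrative assumptions, not taken from the source), gradient descent fits in a few lines of Python:

    # Gradient descent: repeatedly step proportional to the negative gradient.
    # f, grad_f, lr, and x0 are illustrative choices, not from the source.
    def gradient_descent(grad_f, x0, lr=0.1, steps=100):
        x = x0
        for _ in range(steps):
            x = x - lr * grad_f(x)  # x_{k+1} = x_k - lr * grad f(x_k)
        return x

    # Example: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
    x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
    print(x_min)  # converges toward 3.0, the minimizer

The step size lr controls how far each step moves along the negative gradient; too large a value can overshoot the minimum, while too small a value converges slowly.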
July 6, 2021
This chapter aims to provide a deeper understanding of real-world problems and of the solutions to them that lie within the domain of neural computing and are intelligent in their approach. By highlighting the work done by major researchers and innovators in this field, it seeks to invoke a sense of innovation among educationalists, researchers, academic professionals, students, and other interested readers, thus encouraging the re...
September 8, 2017
This monograph aims at providing an introduction to key concepts, algorithms, and theoretical results in machine learning. The treatment concentrates on probabilistic models for supervised and unsupervised learning problems. It introduces fundamental concepts and algorithms by building on first principles, while also exposing the reader to more advanced topics with extensive pointers to the literature, within a unified notation and mathematical framework. The material is orga...
January 22, 2023
A Deep Neural Network (DNN) is a composite function of vector-valued functions, and in order to train a DNN, it is necessary to calculate the gradient of the loss function with respect to all parameters. This calculation can be a non-trivial task because the loss function of a DNN is a composition of several nonlinear functions, each with numerous parameters. The Backpropagation (BP) algorithm leverages the composite structure of the DNN to efficiently compute the gradient. A...
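As a hedged illustration of this composite structure (the notation below is assumed for exposition, not taken from the abstract), a depth-K network and the chain-rule factorization that BP exploits can be written as:

    % Assumed notation: a_k are layer outputs, theta_k the layer parameters.
    L(\theta) = \ell\bigl( f_K( \cdots f_1(x; \theta_1) \cdots ; \theta_K ) \bigr),
    \qquad a_0 = x, \quad a_k = f_k(a_{k-1}; \theta_k),
    \frac{\partial L}{\partial \theta_k}
      = \frac{\partial \ell}{\partial a_K}\,
        \frac{\partial a_K}{\partial a_{K-1}} \cdots
        \frac{\partial a_{k+1}}{\partial a_k}\,
        \frac{\partial a_k}{\partial \theta_k}.

BP evaluates these Jacobian products starting from the loss and moving backward through the layers, so the prefix shared by all parameters is computed once and reused rather than recomputed per layer, which is what makes the gradient computation efficient.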
July 25, 2024
This book provides an introduction to the mathematical analysis of deep learning. It covers fundamental results in approximation theory, optimization theory, and statistical learning theory, which are the three main pillars of deep neural network theory. Serving as a guide for students and researchers in mathematics and related fields, the book aims to equip readers with foundational knowledge on the topic. It prioritizes simplicity over generality, and presents rigorous yet ...
June 21, 2021
This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code. The entire book is drafted in Jupyter notebooks, seamlessly integrating exposition, figures, math, and interactive examples with self-contained code. Our goal is to offer a resource that could (i) be freely available for everyone; (ii) offer sufficient technical depth to provide a starting point on the path to actually becoming an applied m...
March 16, 2022
We are currently witnessing the spectacular success of artificial intelligence in both science and public life. However, the development of a rigorous mathematical foundation is still at an early stage. In this survey article, which is based on an invited lecture at the International Congress of Mathematicians 2022, we will focus in particular on the current "workhorse" of artificial intelligence, namely deep neural networks. We will present the main theoretical directions along wit...
October 22, 2023
In this study, we explore the integration of neural networks, a potent class of functions known for their exceptional approximation capabilities. Our primary emphasis is on the integration of multi-layer neural networks, a challenging task within this domain. To tackle this challenge, we introduce a novel numerical method that combines the forward algorithm with a corrective procedure. Our experimental results demonstrate the accuracy achieved through our integration approach...
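The abstract does not spell out the forward algorithm or the corrective procedure, so no attempt is made to reproduce them here. Assuming the task is computing integrals of a network function, the sketch below only illustrates that task on a small two-layer network, using plain trapezoidal quadrature as a stand-in baseline; the network shape, random weights, and integration interval are all assumptions:

    # Stand-in baseline for the task only (NOT the paper's method):
    # approximate the integral of a small tanh MLP over [0, 1] by trapezoids.
    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(16, 1)), rng.normal(size=(16, 1))
    W2, b2 = rng.normal(size=(1, 16)), rng.normal(size=(1, 1))

    def mlp(x):
        # Two-layer tanh network f(x), evaluated on a batch of scalar inputs.
        h = np.tanh(W1 @ x.reshape(1, -1) + b1)
        return (W2 @ h + b2).ravel()

    xs = np.linspace(0.0, 1.0, 1001)
    ys = mlp(xs)
    integral = float(np.sum((ys[:-1] + ys[1:]) * np.diff(xs)) / 2.0)
    print(integral)

A quadrature baseline like this needs many network evaluations to reach high accuracy, which is one reason a dedicated method for multi-layer networks could be attractive.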
May 19, 1997
We present a method to solve initial and boundary value problems using artificial neural networks. A trial solution of the differential equation is written as a sum of two parts. The first part satisfies the boundary (or initial) conditions and contains no adjustable parameters. The second part is constructed so as not to affect the boundary conditions. This part involves a feedforward neural network, containing adjustable parameters (the weights). Hence by construction the b...
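A minimal sketch of this trial-solution construction, under stated assumptions (PyTorch, which postdates this 1997 paper, and the illustrative problem y'(x) = -y(x) with y(0) = 1, whose exact solution is exp(-x)): the trial solution psi(x) = 1 + x * N(x) satisfies the initial condition for any network weights, and training minimizes the squared residual of the differential equation.

    # Trial solution psi(x) = 1 + x * N(x): the first part satisfies y(0) = 1
    # and has no adjustable parameters; the x * N(x) part vanishes at x = 0.
    import torch

    torch.manual_seed(0)
    net = torch.nn.Sequential(
        torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    x = torch.linspace(0.0, 2.0, 64).reshape(-1, 1).requires_grad_(True)

    for step in range(2000):
        psi = 1.0 + x * net(x)             # initial condition holds by construction
        dpsi, = torch.autograd.grad(psi.sum(), x, create_graph=True)
        loss = ((dpsi + psi) ** 2).mean()  # least-squares residual of y' + y = 0
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():
        x1 = torch.tensor([[1.0]])
        print(float(1.0 + x1 * net(x1)))   # should approach exp(-1) ~ 0.368

Because the x factor vanishes at x = 0, the adjustable weights never disturb the initial condition, mirroring the two-part construction described in the abstract.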
March 27, 2016
There exists a theory that a single general-purpose learning algorithm in the brain could explain the principles of its operation. This theory assumes that the brain has some initial rough architecture and a small library of simple innate circuits that are prewired at birth, and it proposes that all significant mental algorithms can be learned. Given current understanding and observations, this paper reviews and lists the ingredients of such an algorithm from both architectural and function...