Similar papers
February 12, 2020
Deep learning networks have been trained to recognize speech, caption photographs and translate text between languages at high levels of performance. Although applications of deep learning networks to real-world problems have become ubiquitous, our understanding of why they are so effective is lacking. These empirical results should not be possible according to sample complexity in statistics and non-convex optimization theory. However, paradoxes in the training and effective...
March 27, 2014
We investigate the consequences of natural conjectures of Montgomery type on the non-vanishing of Dirichlet $L$-functions at the central point. We first justify these conjectures using probabilistic arguments. We then show using a result of Bombieri, Friedlander and Iwaniec and a result of the author that they imply that almost all Dirichlet $L$-functions do not vanish at the central point. We also deduce a quantitative upper bound for the proportion of Dirichlet $L$-function...
August 4, 2020
This article considers "compressive learning," an approach to large-scale machine learning where datasets are massively compressed before learning (e.g., clustering, classification, or regression) is performed. In particular, a "sketch" is first constructed by computing carefully chosen nonlinear random features (e.g., random Fourier features) and averaging them over the whole dataset. Parameters are then learned from the sketch, without access to the original dataset. This a...
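As a rough illustration of the sketching step this abstract describes (not the paper's own code), here is a minimal Python sketch assuming complex-exponential random Fourier features; the frequency matrix Omega, the sketch size m, and the toy two-cluster dataset are hypothetical choices made only for the example.

import numpy as np

def compute_sketch(X, Omega):
    # Nonlinear random Fourier features exp(i * x . omega_j), one row per sample
    features = np.exp(1j * X @ Omega)
    # The sketch is the empirical average of the features over the whole dataset
    return features.mean(axis=0)

# Toy usage: a small two-cluster dataset in the plane
rng = np.random.default_rng(0)
n, d, m = 10_000, 2, 64                        # samples, data dimension, sketch size (illustrative)
X = np.concatenate([rng.normal(-2.0, 1.0, (n // 2, d)),
                    rng.normal(+2.0, 1.0, (n // 2, d))])
Omega = rng.normal(0.0, 1.0, (d, m))           # random frequencies drawn i.i.d. Gaussian
sketch = compute_sketch(X, Omega)              # m numbers summarizing the 10,000 points
print(sketch.shape)                            # (64,)

Parameters (for example cluster centres) would then be fitted to the vector sketch alone, without revisiting X.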
January 15, 2021
We review, for a general audience, a variety of recent experiments on extracting structure from machine-learning mathematical data that have been compiled over the years. Focusing on supervised machine-learning on labeled data from different fields ranging from geometry to representation theory, from combinatorics to number theory, we present a comparative study of the accuracies on different problems. The paradigm should be useful for conjecture formulation, finding more eff...
August 22, 2007
In this paper, we obtain an unconditional density theorem concerning the low-lying zeros of Hasse-Weil L-functions for a family of elliptic curves. From this together with the Riemann hypothesis for these L-functions, we infer the upper bound 27/14 (which is strictly less than 2) for the average rank of the elliptic curves in the family under consideration. This upper bound for the average rank enables us to deduce that, under the same assumption, a positive proportion of ell...
July 15, 2022
We use machine learning to predict the dimension of a lattice polytope directly from its Ehrhart series. This is highly effective, achieving almost 100% accuracy. We also use machine learning to recover the volume of a lattice polytope from its Ehrhart series, and to recover the dimension, volume, and quasi-period of a rational polytope from its Ehrhart series. In each case we achieve very high accuracy, and we propose mathematical explanations for why this should be so.
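To make the setup concrete, here is a small hedged sketch in Python: it builds a toy dataset from polytopes whose Ehrhart polynomials are known in closed form (dilates of unit cubes and standard simplices) and fits a simple classifier to predict the dimension from the first few Ehrhart values. The feature choice and model are illustrative, not the paper's pipeline.

import numpy as np
from math import comb
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Known Ehrhart polynomials: the k-th dilate of the unit d-cube has L(t) = (k t + 1)^d,
# and the k-th dilate of the standard d-simplex has L(t) = C(k t + d, d).
def ehrhart_values(poly, T=8):
    # Feature vector: log of the first T Ehrhart values L(1), ..., L(T)
    return [np.log(float(poly(t))) for t in range(1, T + 1)]

X, y = [], []
for d in range(1, 9):            # dimensions 1..8
    for k in range(1, 6):        # dilation factors 1..5
        X.append(ehrhart_values(lambda t, d=d, k=k: (k * t + 1) ** d)); y.append(d)
        X.append(ehrhart_values(lambda t, d=d, k=k: comb(k * t + d, d))); y.append(d)
X, y = np.array(X), np.array(y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)
clf = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))

Since $L_P(t) = \mathrm{vol}(P)\,t^d + \text{lower-order terms}$, $\log L_P(t)$ grows like $d \log t + \log \mathrm{vol}(P)$, so a linear model on log-features can essentially read the dimension off the slope; this is one plausible reason such predictions can be so accurate.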
November 3, 2022
Building on the work of Iwaniec, Luo and Sarnak, we use the $n$-level density to bound the probability of vanishing to order at least $r$ at the central point for families of cuspidal newforms of prime level $N \to \infty$, split by sign. There are three methods to improve bounds on the order of vanishing: optimizing the test functions, increasing the support, and increasing the $n$-level density studied. Previous work determined the optimal test functions for the $1$ and $2$...
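For orientation, a generic instance of the Iwaniec-Luo-Sarnak positivity argument (not the paper's refined bounds): for an even Schwartz test function $\phi \ge 0$ with $\phi(0) > 0$ and suitably restricted Fourier support, the $1$-level density and the resulting bound on vanishing take the form
\[
D_1(f;\phi) \;=\; \sum_{\gamma_f} \phi\!\left(\frac{\gamma_f \log N}{2\pi}\right),
\qquad
\operatorname{Prob}_{f \in \mathcal{F}(N)}\!\bigl(\operatorname{ord}_{s=1/2} L(s,f) \ge r\bigr)
\;\le\; \frac{1}{r\,\phi(0)}\,\Bigl\langle D_1(f;\phi) \Bigr\rangle_{\mathcal{F}(N)},
\]
since, under GRH, each zero at the central point contributes $\phi(0) > 0$ to $D_1(f;\phi)$ and all remaining terms are nonnegative. Choosing $\phi$ optimally, widening its allowed support, or passing to $n$-level densities each tighten the right-hand side.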
March 31, 2022
In this work we consider the problem of data classification in post-classical settings where the number of training examples consists of merely a few data points. We explore the phenomenon and reveal key relationships between the dimensionality of the AI model's feature space, the non-degeneracy of data distributions, and the model's generalisation capabilities. The main thrust of our present analysis is on the influence of nonlinear feature transformations mapping original data into higher- ...
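A toy sketch of the mechanism this abstract points to, with every choice (concentric-circles data, a random cosine feature map, ten labelled points) hypothetical rather than taken from the paper: mapping the data nonlinearly into a higher-dimensional feature space can make a handful of labelled examples linearly separable.

import numpy as np
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X, y = make_circles(n_samples=400, noise=0.05, factor=0.4, random_state=1)

# "Few-shot" training set: five labelled points per class, the rest held out
idx0 = rng.choice(np.flatnonzero(y == 0), size=5, replace=False)
idx1 = rng.choice(np.flatnonzero(y == 1), size=5, replace=False)
train_idx = np.concatenate([idx0, idx1])
test_idx = np.setdiff1d(np.arange(len(X)), train_idx)

# Random nonlinear feature map into a 200-dimensional space (cosine random features)
d_feat = 200
W = rng.normal(0.0, 3.0, (2, d_feat))
b = rng.uniform(0.0, 2.0 * np.pi, d_feat)
nonlinear = lambda Z: np.cos(Z @ W + b)

for name, transform in [("raw 2-D features", lambda Z: Z),
                        ("random nonlinear features", nonlinear)]:
    clf = LogisticRegression(max_iter=2000).fit(transform(X[train_idx]), y[train_idx])
    print(name, "- held-out accuracy:", round(clf.score(transform(X[test_idx]), y[test_idx]), 2))

The comparison between the two runs illustrates how the same linear learner, given the same ten labels, behaves very differently depending on the feature space it sees.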
June 24, 2012
The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implem...
November 26, 2019
Learning representations of data is an important problem in statistics and machine learning. While the origin of learning representations can be traced back to factor analysis and multidimensional scaling in statistics, it has become a central theme in deep learning with important applications in computer vision and computational neuroscience. In this article, we review recent advances in learning representations from a statistical perspective. In particular, we review the fo...