December 7, 2020
We show that standard machine-learning algorithms may be trained to predict certain invariants of low-genus arithmetic curves. Using datasets of size around one hundred thousand, we demonstrate the utility of machine learning in classification problems pertaining to the BSD invariants of an elliptic curve (including its rank and torsion subgroup), and the analogous invariants of a genus 2 curve. Our results show that a trained machine can efficiently classify curves according to these invariants with high accuracy (>0.97). For problems such as distinguishing between torsion orders and recognizing integral points, the accuracy can reach 0.998.
Similar papers
October 2, 2020
We apply some of the latest techniques from machine learning to the arithmetic of hyperelliptic curves. More precisely, we show that, with impressive accuracy and confidence (between 99 and 100 percent precision), and in a very short time (a matter of seconds on an ordinary laptop), a Bayesian classifier can distinguish between Sato-Tate groups given a small number of Euler factors for the L-function. Our observations are in keeping with the Sato-Tate conjecture for curves of low ...
Empirical analysis is often the first step towards the birth of a conjecture. This is the case for the Birch-Swinnerton-Dyer (BSD) Conjecture describing the rational points on an elliptic curve, one of the most celebrated unsolved problems in mathematics. Here we extend the original empirical approach to the analysis of the Cremona database of quantities relevant to BSD, inspecting more than 2.5 million elliptic curves by means of the latest techniques in data science, machin...
November 17, 2020
We show that standard machine-learning algorithms may be trained to predict certain invariants of algebraic number fields to high accuracy. A random-forest classifier that is trained on finitely many Dedekind zeta coefficients is able to distinguish between real quadratic fields with class number 1 and 2, to 0.96 precision. Furthermore, the classifier is able to extrapolate to fields with discriminant outside the range of the training data. When trained on the coefficients of...
March 25, 2024
We use machine learning to study the locus ${\mathcal L}_n$ of genus two curves with $(n, n)$-split Jacobian. More precisely, we design a transformer model which, given values for the Igusa invariants, determines whether the corresponding genus two curve is in the locus ${\mathcal L}_n$, for $n=2, 3, 5, 7$. Such curves are important in isogeny-based cryptography. During this study, we discover that there are no rational points ${\mathfrak p} \in {\mathcal L}_n$ with weighted moduli ...
July 14, 2022
Determining the rank of an elliptic curve E/Q is a hard problem, and in some applications (e.g. when searching for curves of high rank) one has to rely on heuristics aimed at estimating the analytic rank (which is equal to the rank under the Birch and Swinnerton-Dyer conjecture). In this paper, we develop rank classification heuristics modeled by deep convolutional neural networks (CNN). Similarly to widely used Mestre-Nagao sums, it takes as an input the conductor of E and...
January 15, 2021
We review, for a general audience, a variety of recent experiments on extracting structure from machine-learning mathematical data that have been compiled over the years. Focusing on supervised machine-learning on labeled data from different fields ranging from geometry to representation theory, from combinatorics to number theory, we present a comparative study of the accuracies on different problems. The paradigm should be useful for conjecture formulation, finding more eff...
November 28, 2017
This is an introduction to a probabilistic model for the arithmetic of elliptic curves, a model developed in a series of articles of the author with Bhargava, Kane, Lenstra, Park, Rains, Voight, and Wood. We discuss the theoretical evidence for the model, and we make predictions about elliptic curves based on corresponding theorems proved about the model. In particular, the model suggests that all but finitely many elliptic curves over $\mathbb{Q}$ have rank $\le 21$, which w...
September 19, 2022
We implement and interpret various supervised learning experiments involving real quadratic fields with class numbers 1, 2 and 3. We quantify the relative difficulties in separating class numbers of matching/different parity from a data-scientific perspective, apply the methodology of feature analysis and principal component analysis, and use symbolic classification to develop machine-learned formulas for class numbers 1, 2 and 3 that apply to our dataset.
Despite the widespread use of machine learning algorithms to solve problems of technological, economic, and social relevance, provable guarantees on the performance of these data-driven algorithms are critically lacking, especially when the data originates from unreliable sources and is transmitted over unprotected and easily accessible channels. In this paper we take an important step to bridge this gap and formally show that, in a quest to optimize their accuracy, binary cl...
March 22, 2023
We survey some recent applications of machine learning to problems in geometry and theoretical physics. Pure mathematical data has been compiled over the last few decades by the community and experiments in supervised, semi-supervised and unsupervised machine learning have found surprising success. We thus advocate the programme of machine learning mathematical structures, and formulating conjectures via pattern recognition, in other words using artificial intelligence to hel...