November 4, 2019
Empirical analysis is often the first step towards the birth of a conjecture. This was the case for the Birch-Swinnerton-Dyer (BSD) Conjecture, which describes the rational points on an elliptic curve and is one of the most celebrated unsolved problems in mathematics. Here we extend the original empirical approach to the analysis of the Cremona database of quantities relevant to BSD, inspecting more than 2.5 million elliptic curves by means of the latest techniques in data science, machine learning and topological data analysis. Key quantities such as rank, Weierstrass coefficients, period, conductor, Tamagawa number, regulator and order of the Tate-Shafarevich group give rise to a high-dimensional point cloud whose statistical properties we investigate. We reveal patterns and distributions in the rank versus Weierstrass coefficients, as well as the Beta distribution of the BSD ratio of the quantities. Via gradient boosted trees, machine learning is applied to find inter-correlations amongst the various quantities. We anticipate that our approach will spark further research on the statistical properties of large datasets in Number Theory and, more generally, in pure Mathematics.
Similar papers
December 7, 2020
We show that standard machine-learning algorithms may be trained to predict certain invariants of low genus arithmetic curves. Using datasets of size around one hundred thousand, we demonstrate the utility of machine-learning in classification problems pertaining to the BSD invariants of an elliptic curve (including its rank and torsion subgroup), and the analogous invariants of a genus 2 curve. Our results show that a trained machine can efficiently classify curves according...
July 14, 2022
Determining the rank of an elliptic curve E/Q is a hard problem, and in some applications (e.g. when searching for curves of high rank) one has to rely on heuristics aimed at estimating the analytic rank (which is equal to the rank under the Birch and Swinnerton-Dyer conjecture). In this paper, we develop rank classification heuristics modeled by deep convolutional neural networks (CNN). Similarly to widely used Mestre-Nagao sums, it takes as an input the conductor of E and...
October 2, 2020
We apply some of the latest techniques from machine learning to the arithmetic of hyperelliptic curves. More precisely, we show that, with impressive accuracy and confidence (between 99 and 100 percent precision), and in a very short time (a matter of seconds on an ordinary laptop), a Bayesian classifier can distinguish between Sato-Tate groups given a small number of Euler factors for the L-function. Our observations are in keeping with the Sato-Tate conjecture for curves of low ...
November 7, 2016
In this article, we propose a new probabilistic model for the distribution of ranks of elliptic curves in families of fixed Selmer rank, and compare the predictions with previous results, and with the databases of curves over the rationals that we have at our disposal. In addition, we document a phenomenon we refer to as Selmer bias that seems to play an important role in the data and in our models.
November 28, 2017
This is an introduction to a probabilistic model for the arithmetic of elliptic curves, a model developed in a series of articles of the author with Bhargava, Kane, Lenstra, Park, Rains, Voight, and Wood. We discuss the theoretical evidence for the model, and we make predictions about elliptic curves based on corresponding theorems proved about the model. In particular, the model suggests that all but finitely many elliptic curves over $\mathbb{Q}$ have rank $\le 21$, which w...
March 22, 2023
We survey some recent applications of machine learning to problems in geometry and theoretical physics. Pure mathematical data has been compiled over the last few decades by the community and experiments in supervised, semi-supervised and unsupervised machine learning have found surprising success. We thus advocate the programme of machine learning mathematical structures, and formulating conjectures via pattern recognition, in other words using artificial intelligence to hel...
January 15, 2021
We review, for a general audience, a variety of recent experiments on extracting structure from machine-learning mathematical data that have been compiled over the years. Focusing on supervised machine-learning on labeled data from different fields ranging from geometry to representation theory, from combinatorics to number theory, we present a comparative study of the accuracies on different problems. The paradigm should be useful for conjecture formulation, finding more eff...
October 12, 2010
We describe an algorithm to prove the Birch and Swinnerton-Dyer conjectural formula for any given elliptic curve defined over the rational numbers of analytic rank zero or one. With computer assistance we have proved the formula for 16714 of the 16725 such curves of conductor less than 5000.
September 24, 2008
The paper proves that the Birch and Swinnerton-Dyer conjecture is false.
September 9, 2015
The need for new methods to deal with big data is a common theme in most scientific fields, although its definition tends to vary with the context. Statistical ideas are an essential part of this, and as a partial response, a thematic program on statistical inference, learning, and models in big data was held in 2015 in Canada, under the general direction of the Canadian Statistical Sciences Institute, with major funding from, and most activities located at, the Fields Instit...