March 22, 2023
We survey some recent applications of machine learning to problems in geometry and theoretical physics. Pure mathematical data has been compiled over the last few decades by the community and experiments in supervised, semi-supervised and unsupervised machine learning have found surprising success. We thus advocate the programme of machine learning mathematical structures, and formulating conjectures via pattern recognition, in other words using artificial intelligence to hel...
March 4, 2019
Despite the widespread use of machine learning algorithms to solve problems of technological, economic, and social relevance, provable guarantees on the performance of these data-driven algorithms are critically lacking, especially when the data originates from unreliable sources and is transmitted over unprotected and easily accessible channels. In this paper we take an important step to bridge this gap and formally show that, in a quest to optimize their accuracy, binary cl...
November 7, 2016
In this article, we propose a new probabilistic model for the distribution of ranks of elliptic curves in families of fixed Selmer rank, and compare the predictions with previous results, and with the databases of curves over the rationals that we have at our disposal. In addition, we document a phenomenon we refer to as Selmer bias that seems to play an important role in the data and in our models.
April 25, 2006
We consider all genus 2 curves over Q given by an equation y^2 = f(x) with f a squarefree polynomial of degree 5 or 6, with integral coefficients of absolute value at most 3. For each of these roughly 200000 isomorphism classes of curves, we decide if there is a rational point on the curve or not, by a combination of techniques. For a small number of curves, our result is conditional on the BSD conjecture or on GRH.
September 14, 2014
In this technical report we presented a novel approach to machine learning. Once the new framework is presented, we will provide a simple and yet very powerful learning algorithm which will be benchmark on various dataset. The framework we proposed is based on booleen circuits; more specifically the classifier produced by our algorithm have that form. Using bits and boolean gates instead of real numbers and multiplication enable the the learning algorithm and classifier to ...
December 12, 2016
This article will devise data-driven, mathematical laws that generate optimal, statistical classification systems which achieve minimum error rates for data distributions with unchanging statistics. Thereby, I will design learning machines that minimize the expected risk or probability of misclassification. I will devise a system of fundamental equations of binary classification for a classification system in statistical equilibrium. I will use this system of equations to for...
We describe how simple machine learning methods successfully predict geometric properties from Hilbert series (HS). Regressors predict embedding weights in projective space to ${\sim}1$ mean absolute error, whilst classifiers predict dimension and Gorenstein index to $>90\%$ accuracy with ${\sim}0.5\%$ standard error. Binary random forest classifiers managed to distinguish whether the underlying HS describes a complete intersection with high accuracies exceeding $95\%$. Neura...
September 11, 2023
Fano varieties are basic building blocks in geometry - they are `atomic pieces' of mathematical shapes. Recent progress in the classification of Fano varieties involves analysing an invariant called the quantum period. This is a sequence of integers which gives a numerical fingerprint for a Fano variety. It is conjectured that a Fano variety is uniquely determined by its quantum period. If this is true, one should be able to recover geometric properties of a Fano variety dire...
June 28, 2018
While there has been some discussion on how Symbolic Computation could be used for AI there is little literature on applications in the other direction. However, recent results for quantifier elimination suggest that, given enough example problems, there is scope for machine learning tools like Support Vector Machines to improve the performance of Computer Algebra Systems. We survey the authors own work and similar applications for other mathematical software. It may seem t...
November 25, 2022
Plotting a learner's generalization performance against the training set size results in a so-called learning curve. This tool, providing insight in the behavior of the learner, is also practically valuable for model selection, predicting the effect of more training data, and reducing the computational complexity of training. We set out to make the (ideal) learning curve concept precise and briefly discuss the aforementioned usages of such curves. The larger part of this surv...