May 11, 2022
Similar papers 2
November 1, 2021
Machine learning (ML) has emerged as a formidable force for identifying hidden but pertinent patterns within a given data set with the objective of subsequent generation of automated predictive behavior. In recent years, it is safe to conclude that ML and its close cousin deep learning (DL) have ushered in unprecedented developments in all areas of the physical sciences, especially chemistry. Not only the classical variants of ML, even those trainable on near-term quantum hardwa...
June 11, 2018
Motivation: Despite its great success in various physical modeling, differential geometry (DG) has rarely been devised as a versatile tool for analyzing large, diverse and complex molecular and biomolecular datasets due to the limited understanding of its potential power in dimensionality reduction and its ability to encode essential chemical and biological information in differentiable manifolds. Results: We put forward a differential geometry based geometric learning (DG-GL...
April 23, 2019
Herein we review aspects of leading-edge research and innovation in chemistry which exploits big data and machine learning (ML), two computer science fields that combine to yield machine intelligence. ML can accelerate the solution of intricate chemical problems and even solve problems that otherwise would not be tractable. But the potential benefits of ML come at the cost of big data production; that is, the algorithms, in order to learn, demand large volumes of data of vari...
November 28, 2023
Most scientific challenges can be framed into one of the following three levels of complexity of function approximation. Type 1: Approximate an unknown function given input/output data. Type 2: Consider a collection of variables and functions, some of which are unknown, indexed by the nodes and hyperedges of a hypergraph (a generalized graph where edges can connect more than two vertices). Given partial observations of the variables of the hypergraph (satisfying the functiona...
February 1, 2019
In this review, we highlight recent developments in the application of machine learning for molecular modeling and simulation. After giving a brief overview of the foundations, components, and workflow of a typical supervised learning approach for chemical problems, we showcase areas and state-of-the-art examples of their deployment. In this context, we discuss how machine learning relates to, supports, and augments more traditional physics-based approaches in computational r...
We describe how simple machine learning methods successfully predict geometric properties from Hilbert series (HS). Regressors predict embedding weights in projective space to ${\sim}1$ mean absolute error, whilst classifiers predict dimension and Gorenstein index to $>90\%$ accuracy with ${\sim}0.5\%$ standard error. Binary random forest classifiers managed to distinguish whether the underlying HS describes a complete intersection with high accuracies exceeding $95\%$. Neura...
December 8, 2020
Statistical learning algorithms are finding more and more applications in science and technology. Atomic-scale modeling is no exception, with machine learning becoming commonplace as a tool to predict energy, forces and properties of molecules and condensed-phase systems. This short review summarizes recent progress in the field, focusing in particular on the problem of representing an atomic configuration in a mathematically robust and computationally efficient way. We also ...
January 15, 2021
We review, for a general audience, a variety of recent experiments on extracting structure from machine-learning mathematical data that have been compiled over the years. Focusing on supervised machine-learning on labeled data from different fields ranging from geometry to representation theory, from combinatorics to number theory, we present a comparative study of the accuracies on different problems. The paradigm should be useful for conjecture formulation, finding more eff...
November 16, 2015
Capacity control, the bias/variance dilemma, and learning unknown functions from data are all concerned with identifying effective and consistent fits of unknown geometric loci to random data points. A geometric locus is a curve or surface formed by points, all of which possess some uniform property. A geometric locus of an algebraic equation is the set of points whose coordinates are solutions of the equation. Any given curve or surface must pass through each point on a spe...
August 27, 2017
This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a...