Learning to be Simple

December 8, 2023

Yang-Hui He, Vishnu Jejjala, Challenger Mishra, Max Sharnoff

Computer Science

High Energy Physics - Theory

Mathematics

Machine Learning

Group Theory

Mathematical Physics

In this work we employ machine learning to understand structured mathematical data involving finite groups and derive a theorem about necessary properties of generators of finite simple groups. We create a database of all 2-generated subgroups of the symmetric group on n objects and classify the finite simple groups among them using shallow feed-forward neural networks. We show that this neural network classifier can decipher the property of simplicity with varying accuracy, depending on the features. Our neural network model leads to a natural conjecture concerning the generators of a finite simple group, which we subsequently prove. This new toy theorem comments on the necessary properties of generators of finite simple groups. We show this explicitly for a class of sporadic groups for which the result holds. Our work further makes the case for a machine-motivated study of algebraic structures in pure mathematics and highlights the possibility of generating new conjectures and theorems in mathematics with the aid of machine learning.
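To make the data described in the abstract concrete, the following is a minimal sketch of how one might enumerate the 2-generated subgroups of a small symmetric group and label each one as simple or not. It is not the authors' pipeline: the paper trains neural network classifiers, whereas this sketch computes ground-truth labels by brute force. The function names (`generate`, `is_simple`, `two_generated_subgroups`) and the tuple encoding of permutations are assumptions made for illustration.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q, then p'; a tuple p maps i to p[i]."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generate(gens, n):
    """Return the subgroup of S_n generated by gens, as a frozenset of tuples.

    In a finite group, closing {identity} under left multiplication by the
    generators already yields the whole generated subgroup.
    """
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return frozenset(elements)


def is_simple(group, n):
    """Brute-force simplicity test (illustrative, only viable for small groups).

    A finite group is simple iff it is nontrivial and the normal closure of
    every non-identity element is the whole group.
    """
    if len(group) == 1:
        return False
    identity = tuple(range(n))
    for x in group:
        if x == identity:
            continue
        conjugates = {compose(compose(g, x), inverse(g)) for g in group}
        if generate(conjugates, n) != group:
            return False
    return True


def two_generated_subgroups(n):
    """Return the set of distinct subgroups of S_n generated by two elements."""
    perms = list(permutations(range(n)))
    return {generate((a, b), n) for a in perms for b in perms}


if __name__ == "__main__":
    n = 4
    database = two_generated_subgroups(n)
    simple_orders = sorted({len(g) for g in database if is_simple(g, n)})
    print(len(database), "subgroups; orders of the simple ones:", simple_orders)
```

For n = 4 the only simple subgroups are the cyclic groups of order 2 and 3. The alternating group on five objects, the smallest non-abelian simple group, first appears at n = 5. The brute-force test scales poorly with group order, so a computer algebra system would be the more realistic choice for larger n.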

Learning Algebraic Structures: Preliminary Investigations

May 2, 2019

90% Match

Yang-Hui He, Minhyong Kim

Machine Learning

Group Theory

Rings and Algebras

We employ techniques of machine-learning, exemplified by support vector machines and neural classifiers, to initiate the study of whether AI can "learn" algebraic structures. Using finite groups and finite rings as a concrete playground, we find that questions such as identification of simple groups by "looking" at the Cayley table or correctly matching addition and multiplication tables for finite rings can, at least for structures of small size, be performed by the AI, even...

A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations

February 6, 2023

87% Match

Bilal Chughtai, Lawrence Chan, Neel Nanda

Machine Learning

Artificial Intelligence

Representation Theory

Universality is a key hypothesis in mechanistic interpretability -- that different models learn similar features and circuits when trained on similar tasks. In this work, we study the universality hypothesis by examining how small neural networks learn to implement group composition. We present a novel algorithm by which neural networks may implement composition for any finite group via mathematical representation theory. We then show that networks consistently learn this alg...

Machine-Learning Mathematical Structures

January 15, 2021

87% Match

Yang-Hui He

Machine Learning

History and Overview

History and Philosophy of Ph...

We review, for a general audience, a variety of recent experiments on extracting structure from machine-learning mathematical data that have been compiled over the years. Focusing on supervised machine-learning on labeled data from different fields ranging from geometry to representation theory, from combinatorics to number theory, we present a comparative study of the accuracies on different problems. The paradigm should be useful for conjecture formulation, finding more eff...

Grokking Group Multiplication with Cosets

December 11, 2023

87% Match

Dashiell Stander, Qinan Yu, ... , Stella Biderman

Machine Learning

Artificial Intelligence

Representation Theory

We use the group Fourier transform over the symmetric group $S_n$ to reverse engineer a 1-layer feedforward network that has "grokked" the multiplication of $S_5$ and $S_6$. Each model discovers the true subgroup structure of the full group and converges on circuits that decompose the group multiplication into the multiplication of the group's conjugate subgroups. We demonstrate the value of using the symmetries of the data and models to understand their mechanisms and hold u...

Why does Deep Learning work? - A perspective from Group Theory

December 20, 2014

87% Match

Arnab Paul, Suresh Venkatasubramanian

Machine Learning

Neural and Evolutionary Comp...

Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep learning. One factor behind the recent resurgence of the subject is a key algorithmic step called pre-training: first search for a good generative model for the input samples, and repeat the process one layer at a time. We show deeper implications ...

Neural Discovery of Permutation Subgroups

September 11, 2023

87% Match

Pavan Karjol, Rohan Kashyap, Prathosh A P

Machine Learning

We consider the problem of discovering a subgroup $H$ of the permutation group $S_{n}$. Unlike the traditional $H$-invariant networks wherein $H$ is assumed to be known, we present a method to discover the underlying subgroup, given that it satisfies certain conditions. Our results show that one could discover any subgroup of type $S_{k}$ ($k \leq n$) by learning an $S_{n}$-invariant function and a linear transformation. We also prove similar results for cyclic and dihedral subgroups...

Harmonics of Learning: Universal Fourier Features Emerge in Invariant Networks

December 13, 2023

87% Match

Giovanni Luca Marchetti, Christopher Hillar, ... , Sophia Sanborn

Machine Learning

Artificial Intelligence

Signal Processing

In this work, we formally prove that, under certain conditions, if a neural network is invariant to a finite group then its weights recover the Fourier transform on that group. This provides a mathematical explanation for the emergence of Fourier features -- a ubiquitous phenomenon in both biological and artificial learning systems. The results hold even for non-commutative groups, in which case the Fourier transform encodes all the irreducible unitary group representations. ...

A Group Theoretic Perspective on Unsupervised Deep Learning

April 8, 2015

87% Match

Arnab Paul, Suresh Venkatasubramanian

Machine Learning

Neural and Evolutionary Comp...

Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep learning. One factor behind the recent resurgence of the subject is a key algorithmic step called pretraining: first search for a good generative model for the input samples, and repeat the process one layer at a time. We show deeper implicat...

Finding discrete symmetry groups via Machine Learning

July 25, 2023

86% Match

Pablo Calvo-Barlés, Sergio G. Rodrigo, ... , Luis Martín-Moreno

Computational Physics

Chemical Physics

Optics

We introduce a machine-learning approach (denoted Symmetry Seeker Neural Network) capable of automatically discovering discrete symmetry groups in physical systems. This method identifies the finite set of parameter transformations that preserve the system's physical properties. Remarkably, the method accomplishes this without prior knowledge of the system's symmetry or the mathematical relationships between parameters and properties. Demonstrating its versatility, we showcas...

What do AI algorithms actually learn? - On false structures in deep learning

June 4, 2019

86% Match

Laura Thesing, Vegard Antun, Anders C. Hansen

Machine Learning

Cryptography and Security

Computer Vision and Pattern ...

There are two big unsolved mathematical questions in artificial intelligence (AI): (1) Why is deep learning so successful in classification problems and (2) why are neural nets based on deep learning at the same time universally unstable, where the instabilities make the networks vulnerable to adversarial attacks. We present a solution to these questions that can be summed up in two words: false structures. Indeed, deep learning does not learn the original structures that hum...
