ID: 2312.06581

Grokking Group Multiplication with Cosets

December 11, 2023

Dashiell Stander, Qinan Yu, Honglu Fan, Stella Biderman
Computer Science
Mathematics
Machine Learning
Artificial Intelligence
Representation Theory

We use the group Fourier transform over the symmetric group $S_n$ to reverse engineer a 1-layer feedforward network that has "grokked" the multiplication of $S_5$ and $S_6$. Each model discovers the true subgroup structure of the full group and converges on circuits that decompose the group multiplication into the multiplication of the group's conjugate subgroups. We demonstrate the value of using the symmetries of the data and models to understand their mechanisms, and hold up the "coset circuit" that the model uses as a fascinating example of how neural networks implement computations. We also draw attention to current challenges in conducting mechanistic interpretability research by comparing our work to Chughtai et al. [6], which alleges to find a different algorithm for this same problem.
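
The subgroup and coset structure the abstract refers to can be illustrated concretely. Below is a minimal, hypothetical Python sketch (not the paper's code; the helper name `compose` is ours) that enumerates $S_5$, takes the copy of $S_4$ fixing the last point as a subgroup, and checks that its left cosets partition the group into $|G|/|H| = 5$ blocks:

```python
from itertools import permutations

def compose(p, q):
    # Permutations as tuples of images: (p ∘ q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(p)))

n = 5
G = list(permutations(range(n)))       # the symmetric group S_5, |G| = 120
H = [g for g in G if g[n - 1] == n - 1]  # subgroup fixing the last point: a copy of S_4

# The left cosets gH partition G; each coset is determined by where g sends the last point
cosets = {frozenset(compose(g, h) for h in H) for g in G}
print(len(G), len(H), len(cosets))     # 120 24 5
```

Each coset here collects the permutations sending the last point to the same place, which is why exactly five cosets appear.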

Similar papers

A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations

February 6, 2023

92% Match
Bilal Chughtai, Lawrence Chan, Neel Nanda
Machine Learning
Artificial Intelligence
Representation Theory

Universality is a key hypothesis in mechanistic interpretability -- that different models learn similar features and circuits when trained on similar tasks. In this work, we study the universality hypothesis by examining how small neural networks learn to implement group composition. We present a novel algorithm by which neural networks may implement composition for any finite group via mathematical representation theory. We then show that networks consistently learn this alg...


Harmonics of Learning: Universal Fourier Features Emerge in Invariant Networks

December 13, 2023

90% Match
Giovanni Luca Marchetti, Christopher Hillar, ... , Sophia Sanborn
Machine Learning
Artificial Intelligence
Signal Processing

In this work, we formally prove that, under certain conditions, if a neural network is invariant to a finite group then its weights recover the Fourier transform on that group. This provides a mathematical explanation for the emergence of Fourier features -- a ubiquitous phenomenon in both biological and artificial learning systems. The results hold even for non-commutative groups, in which case the Fourier transform encodes all the irreducible unitary group representations. ...
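
The group Fourier transform mentioned here is defined as $\hat{f}(\rho) = \sum_{g} f(g)\,\rho(g)$ over the irreducible representations $\rho$. As a hedged illustration only (the abstract concerns non-commutative groups; this sketch shows just the abelian special case, where the irreducibles are the characters of $\mathbb{Z}_n$ and the transform reduces to the ordinary DFT):

```python
import cmath

def group_fourier_Zn(f):
    """Fourier transform on the cyclic group Z_n: the irreducible representations
    are the 1-dimensional characters g -> exp(-2*pi*i*k*g/n), so
    f_hat(k) = sum_g f(g) * exp(-2*pi*i*k*g/n), i.e. the ordinary DFT."""
    n = len(f)
    return [sum(f[g] * cmath.exp(-2j * cmath.pi * k * g / n) for g in range(n))
            for k in range(n)]

# The delta function at the identity transforms to the constant function
print(group_fourier_Zn([1, 0, 0, 0]))  # ≈ [1, 1, 1, 1] up to floating point
```

For a non-commutative group, the characters are replaced by higher-dimensional irreducible unitary representations and each $\hat{f}(\rho)$ becomes a matrix.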


Neural Discovery of Permutation Subgroups

September 11, 2023

89% Match
Pavan Karjol, Rohan Kashyap, Prathosh A P
Machine Learning

We consider the problem of discovering a subgroup $H$ of the permutation group $S_{n}$. Unlike traditional $H$-invariant networks, wherein $H$ is assumed to be known, we present a method to discover the underlying subgroup, given that it satisfies certain conditions. Our results show that one can discover any subgroup of type $S_{k}$ ($k \leq n$) by learning an $S_{n}$-invariant function and a linear transformation. We also prove similar results for cyclic and dihedral subgroups...


A New Neural Network Architecture Invariant to the Action of Symmetry Subgroups

December 11, 2020

89% Match
Piotr Kicki, Mete Ozay, Piotr Skrzypczyński
Machine Learning
Artificial Intelligence

We propose a computationally efficient $G$-invariant neural network that approximates functions invariant to the action of a given permutation subgroup $G \leq S_n$ of the symmetric group on input data. The key element of the proposed network architecture is a new $G$-invariant transformation module, which produces a $G$-invariant latent representation of the input data. Theoretical considerations are supported by numerical experiments, which demonstrate the effectiveness and...
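
One standard way to obtain a $G$-invariant representation, which the module described above refines, is Reynolds averaging: symmetrize a function by averaging it over the group's action. A minimal sketch (illustrative only, not the authors' architecture; the helper name `invariant` is ours):

```python
from itertools import permutations

def invariant(f, group):
    """Symmetrize f over a permutation group G acting on input coordinates.
    The averaged function is G-invariant by construction (Reynolds operator)."""
    def g_invariant(x):
        return sum(f(tuple(x[i] for i in perm)) for perm in group) / len(group)
    return g_invariant

# Average an asymmetric function over all of S_3; the result is permutation-invariant
S3 = list(permutations(range(3)))
f = lambda x: 1 * x[0] + 2 * x[1] + 3 * x[2]
g = invariant(f, S3)
print(g((4, 5, 6)) == g((6, 4, 5)))  # True
```

Plain averaging costs $|G|$ evaluations per input, which is exactly the inefficiency that specialized $G$-invariant modules like the one proposed here aim to avoid.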


On the Symmetries of Deep Learning Models and their Internal Representations

May 27, 2022

88% Match
Charles Godfrey, Davis Brown, ... , Henry Kvinge
Machine Learning
Artificial Intelligence

Symmetry is a fundamental tool in the exploration of a broad range of complex systems. In machine learning, symmetry has been explored in both models and data. In this paper we seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family's internal representation of data. We do this by calculating a set of fundamental symmetry groups, which we call the intertwiner groups of the model. We connect intertwiner groups to a m...


A Computationally Efficient Neural Network Invariant to the Action of Symmetry Subgroups

February 18, 2020

88% Match
Piotr Kicki, Mete Ozay, Piotr Skrzypczyński
Machine Learning
Neural and Evolutionary Comp...
Machine Learning

We introduce a method to design a computationally efficient $G$-invariant neural network that approximates functions invariant to the action of a given permutation subgroup $G \leq S_n$ of the symmetric group on input data. The key element of the proposed network architecture is a new $G$-invariant transformation module, which produces a $G$-invariant latent representation of the input data. This latent representation is then processed with a multi-layer perceptron in the net...


Learning to be Simple

December 8, 2023

87% Match
Yang-Hui He, Vishnu Jejjala, ... , Max Sharnoff
Machine Learning
Group Theory
Mathematical Physics

In this work we employ machine learning to understand structured mathematical data involving finite groups and derive a theorem about necessary properties of generators of finite simple groups. We create a database of all 2-generated subgroups of the symmetric group on $n$ objects and conduct a classification of finite simple groups among them using shallow feed-forward neural networks. We show that this neural network classifier can decipher the property of simplicity with var...

Finding discrete symmetry groups via Machine Learning

July 25, 2023

87% Match
Pablo Calvo-Barlés, Sergio G. Rodrigo, ... , Luis Martín-Moreno
Computational Physics
Chemical Physics
Optics

We introduce a machine-learning approach (denoted Symmetry Seeker Neural Network) capable of automatically discovering discrete symmetry groups in physical systems. This method identifies the finite set of parameter transformations that preserve the system's physical properties. Remarkably, the method accomplishes this without prior knowledge of the system's symmetry or the mathematical relationships between parameters and properties. Demonstrating its versatility, we showcas...


Feature emergence via margin maximization: case studies in algebraic tasks

November 13, 2023

87% Match
Depen Morwani, Benjamin L. Edelman, Costin-Andrei Oncescu, ... , Sham Kakade
Machine Learning

Understanding the internal representations learned by neural networks is a cornerstone challenge in the science of machine learning. While there have been significant recent strides in some cases towards understanding how neural networks implement specific target functions, this paper explores a complementary question -- why do networks arrive at particular computational strategies? Our inquiry focuses on the algebraic learning tasks of modular addition, sparse parities, and ...


Discovering Symmetry Group Structures via Implicit Orthogonality Bias

February 26, 2024

87% Match
Dongsung Huh
Machine Learning
Group Theory
Representation Theory

We introduce the HyperCube network, a novel approach for autonomously discovering symmetry group structures within data. The key innovation is a unique factorization architecture coupled with a novel regularizer that instills a powerful inductive bias towards learning orthogonal representations. This leverages a fundamental theorem of representation theory that all compact/finite groups can be represented by orthogonal matrices. HyperCube efficiently learns general group oper...
