Grokking Group Multiplication with Cosets

A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations

February 6, 2023

92% Match

Bilal Chughtai, Lawrence Chan, Neel Nanda

Machine Learning

Artificial Intelligence

Representation Theory

Universality is a key hypothesis in mechanistic interpretability -- that different models learn similar features and circuits when trained on similar tasks. In this work, we study the universality hypothesis by examining how small neural networks learn to implement group composition. We present a novel algorithm by which neural networks may implement composition for any finite group via mathematical representation theory. We then show that networks consistently learn this alg...

Harmonics of Learning: Universal Fourier Features Emerge in Invariant Networks

December 13, 2023

90% Match

Giovanni Luca Marchetti, Christopher Hillar, ... , Sophia Sanborn

Machine Learning

Artificial Intelligence

Signal Processing

In this work, we formally prove that, under certain conditions, if a neural network is invariant to a finite group then its weights recover the Fourier transform on that group. This provides a mathematical explanation for the emergence of Fourier features -- a ubiquitous phenomenon in both biological and artificial learning systems. The results hold even for non-commutative groups, in which case the Fourier transform encodes all the irreducible unitary group representations. ...

Neural Discovery of Permutation Subgroups

September 11, 2023

89% Match

Pavan Karjol, Rohan Kashyap, Prathosh A P

Machine Learning

We consider the problem of discovering a subgroup $H$ of the permutation group $S_{n}$. Unlike the traditional $H$-invariant networks wherein $H$ is assumed to be known, we present a method to discover the underlying subgroup, given that it satisfies certain conditions. Our results show that one could discover any subgroup of type $S_{k}$ $(k \leq n)$ by learning an $S_{n}$-invariant function and a linear transformation. We also prove similar results for cyclic and dihedral subgroups...

A New Neural Network Architecture Invariant to the Action of Symmetry Subgroups

December 11, 2020

89% Match

Piotr Kicki, Mete Ozay, Piotr Skrzypczyński

Machine Learning

Artificial Intelligence

We propose a computationally efficient $G$-invariant neural network that approximates functions invariant to the action of a given permutation subgroup $G \leq S_n$ of the symmetric group on input data. The key element of the proposed network architecture is a new $G$-invariant transformation module, which produces a $G$-invariant latent representation of the input data. Theoretical considerations are supported by numerical experiments, which demonstrate the effectiveness and...

On the Symmetries of Deep Learning Models and their Internal Representations

May 27, 2022

88% Match

Charles Godfrey, Davis Brown, ... , Henry Kvinge

Machine Learning

Artificial Intelligence

Symmetry is a fundamental tool in the exploration of a broad range of complex systems. In machine learning symmetry has been explored in both models and data. In this paper we seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family's internal representation of data. We do this by calculating a set of fundamental symmetry groups, which we call the intertwiner groups of the model. We connect intertwiner groups to a m...

A Computationally Efficient Neural Network Invariant to the Action of Symmetry Subgroups

February 18, 2020

88% Match

Piotr Kicki, Mete Ozay, Piotr Skrzypczyński

Machine Learning

Neural and Evolutionary Comp...

Machine Learning

We introduce a method to design a computationally efficient $G$-invariant neural network that approximates functions invariant to the action of a given permutation subgroup $G \leq S_n$ of the symmetric group on input data. The key element of the proposed network architecture is a new $G$-invariant transformation module, which produces a $G$-invariant latent representation of the input data. This latent representation is then processed with a multi-layer perceptron in the net...

Learning to be Simple

December 8, 2023

87% Match

Yang-Hui He, Vishnu Jejjala, ... , Max Sharnoff

Machine Learning

Group Theory

Mathematical Physics

In this work we employ machine learning to understand structured mathematical data involving finite groups and derive a theorem about necessary properties of generators of finite simple groups. We create a database of all 2-generated subgroups of the symmetric group on n-objects and conduct a classification of finite simple groups among them using shallow feed-forward neural networks. We show that this neural network classifier can decipher the property of simplicity with var...

Finding discrete symmetry groups via Machine Learning

July 25, 2023

87% Match

Pablo Calvo-Barlés, Sergio G. Rodrigo, ... , Luis Martín-Moreno

Computational Physics

Chemical Physics

Optics

We introduce a machine-learning approach (denoted Symmetry Seeker Neural Network) capable of automatically discovering discrete symmetry groups in physical systems. This method identifies the finite set of parameter transformations that preserve the system's physical properties. Remarkably, the method accomplishes this without prior knowledge of the system's symmetry or the mathematical relationships between parameters and properties. Demonstrating its versatility, we showcas...

Feature emergence via margin maximization: case studies in algebraic tasks

November 13, 2023

87% Match

Depen Morwani, Benjamin L. Edelman, Costin-Andrei Oncescu, ... , Sham Kakade

Machine Learning

Understanding the internal representations learned by neural networks is a cornerstone challenge in the science of machine learning. While there have been significant recent strides in some cases towards understanding how neural networks implement specific target functions, this paper explores a complementary question -- why do networks arrive at particular computational strategies? Our inquiry focuses on the algebraic learning tasks of modular addition, sparse parities, and ...

Discovering Symmetry Group Structures via Implicit Orthogonality Bias

February 26, 2024

87% Match

Dongsung Huh

Machine Learning

Group Theory

Representation Theory

We introduce the HyperCube network, a novel approach for autonomously discovering symmetry group structures within data. The key innovation is a unique factorization architecture coupled with a novel regularizer that instills a powerful inductive bias towards learning orthogonal representations. This leverages a fundamental theorem of representation theory that all compact/finite groups can be represented by orthogonal matrices. HyperCube efficiently learns general group oper...
