A Toy Model of Universality: Reverse Eng...

Grokking Group Multiplication with Cosets

December 11, 2023

92% Match

Dashiell Stander, Qinan Yu, ... , Stella Biderman

Machine Learning

Artificial Intelligence

Representation Theory

We use the group Fourier transform over the symmetric group $S_n$ to reverse engineer a 1-layer feedforward network that has "grokked" the multiplication of $S_5$ and $S_6$. Each model discovers the true subgroup structure of the full group and converges on circuits that decompose the group multiplication into the multiplication of the group's conjugate subgroups. We demonstrate the value of using the symmetries of the data and models to understand their mechanisms and hold u...

Harmonics of Learning: Universal Fourier Features Emerge in Invariant Networks

December 13, 2023

90% Match

Giovanni Luca Marchetti, Christopher Hillar, ... , Sophia Sanborn

Machine Learning

Artificial Intelligence

Signal Processing

In this work, we formally prove that, under certain conditions, if a neural network is invariant to a finite group then its weights recover the Fourier transform on that group. This provides a mathematical explanation for the emergence of Fourier features -- a ubiquitous phenomenon in both biological and artificial learning systems. The results hold even for non-commutative groups, in which case the Fourier transform encodes all the irreducible unitary group representations. ...

Feature emergence via margin maximization: case studies in algebraic tasks

November 13, 2023

89% Match

Depen Morwani, Benjamin L. Edelman, Costin-Andrei Oncescu, ... , Sham Kakade

Machine Learning

Understanding the internal representations learned by neural networks is a cornerstone challenge in the science of machine learning. While there have been significant recent strides in some cases towards understanding how neural networks implement specific target functions, this paper explores a complementary question -- why do networks arrive at particular computational strategies? Our inquiry focuses on the algebraic learning tasks of modular addition, sparse parities, and ...

Why does Deep Learning work? - A perspective from Group Theory

December 20, 2014

89% Match

Arnab Paul, Suresh Venkatasubramanian

Machine Learning

Neural and Evolutionary Comp...

Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep learning. One factor behind the recent resurgence of the subject is a key algorithmic step called pre-training: first search for a good generative model for the input samples, and repeat the process one layer at a time. We show deeper implications ...

Neural Group Actions

October 8, 2020

89% Match

Span Spanbauer, Luke Sciarappa

Machine Learning

Neural and Evolutionary Comp...

We introduce an algorithm for designing Neural Group Actions, collections of deep neural network architectures which model symmetric transformations satisfying the laws of a given finite group. This generalizes involutive neural networks $\mathcal{N}$, which satisfy $\mathcal{N}(\mathcal{N}(x))=x$ for any data $x$, the group law of $\mathbb{Z}_2$. We show how to optionally enforce an additional constraint that the group action be volume-preserving. We conjecture, by analogy t...

Universal Equivariant Multilayer Perceptrons

February 7, 2020

88% Match

Siamak Ravanbakhsh

Machine Learning

Neural and Evolutionary Comp...

Group Theory

Group invariant and equivariant Multilayer Perceptrons (MLP), also known as Equivariant Networks, have achieved remarkable success in learning on a variety of data structures, such as sequences, images, sets, and graphs. Using tools from group theory, this paper proves the universality of a broad class of equivariant MLPs with a single hidden layer. In particular, it is shown that having a hidden layer on which the group acts regularly is sufficient for universal equivariance...

On the Symmetries of Deep Learning Models and their Internal Representations

May 27, 2022

88% Match

Charles Godfrey, Davis Brown, ... , Henry Kvinge

Machine Learning

Artificial Intelligence

Symmetry is a fundamental tool in the exploration of a broad range of complex systems. In machine learning symmetry has been explored in both models and data. In this paper we seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family's internal representation of data. We do this by calculating a set of fundamental symmetry groups, which we call the intertwiner groups of the model. We connect intertwiner groups to a m...

Learning Linear Groups in Neural Networks

May 29, 2023

88% Match

Emmanouil Theodosis, Karim Helwani, Demba Ba

Machine Learning

Neural and Evolutionary Comp...

Employing equivariance in neural networks leads to greater parameter efficiency and improved generalization performance through the encoding of domain knowledge in the architecture; however, the majority of existing approaches require an a priori specification of the desired symmetries. We present a neural network architecture, Linear Group Networks (LGNs), for learning linear groups acting on the weight space of neural networks. Linear groups are desirable due to their inher...

A Group Theoretic Perspective on Unsupervised Deep Learning

April 8, 2015

88% Match

Arnab Paul, Suresh Venkatasubramanian

Machine Learning

Neural and Evolutionary Comp...

Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep learning. One factor behind the recent resurgence of the subject is a key algorithmic step called pretraining: first search for a good generative model for the input samples, and repeat the process one layer at a time. We show deeper implicat...

Learning to be Simple

December 8, 2023

87% Match

Yang-Hui He, Vishnu Jejjala, ... , Max Sharnoff

Machine Learning

Group Theory

Mathematical Physics

In this work we employ machine learning to understand structured mathematical data involving finite groups and derive a theorem about necessary properties of generators of finite simple groups. We create a database of all 2-generated subgroups of the symmetric group on n-objects and conduct a classification of finite simple groups among them using shallow feed-forward neural networks. We show that this neural network classifier can decipher the property of simplicity with var...

A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations
