Workshop

Equivariance as Design Principle for Modern Deep Learning

  • Jan Stühmer
E1 05 (Leibniz-Saal)

Abstract

Geometric deep learning has emerged as a powerful framework for the study and design of modern machine learning models, with broad applications in chemistry, physics, robotics, and engineering. At its core, it leverages invariance and equivariance with respect to the symmetry groups underlying the data. In this presentation, I will highlight two recent contributions from our research group in this area.
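For readers new to the terminology, the two notions can be stated as follows. This is the standard textbook formulation, not notation taken from the talk: the map $f : X \to Y$, the group $G$, and the group actions $\rho_X$ and $\rho_Y$ are symbols assumed here for illustration.

```latex
\begin{align*}
\text{equivariance:} \quad & f\bigl(\rho_X(g)\,x\bigr) = \rho_Y(g)\,f(x) && \text{for all } g \in G,\ x \in X, \\
\text{invariance:}   \quad & f\bigl(\rho_X(g)\,x\bigr) = f(x)            && \text{for all } g \in G,\ x \in X.
\end{align*}
```

Invariance is the special case of equivariance in which $\rho_Y(g)$ is the identity for every $g$.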

In the first part, I will present our work on protein design, where we introduce a generative model for protein backbone generation that leverages geometric products and higher-order message passing. Building on FrameFlow, a state-of-the-art model for protein backbone generation, we represent the frames of the protein backbone as elements of the projective geometric algebra. This formulation lets us use bilinear geometric products, which are geometrically more expressive, as a paradigm for higher-order message passing. The proposed model achieves high designability and structural diversity while generating protein backbones that more closely match the statistical distribution of secondary structures found in naturally occurring proteins, something that state-of-the-art generative models have so far achieved only insufficiently. I will conclude this part with future directions for protein design and potential applications in materials science.

In the second part, I will present our contributions to a perhaps unexpected application area of geometric deep learning: large language models (LLMs). While LLMs demonstrate impressive capabilities across numerous applications, their robustness and factual correctness remain critical concerns. A further vulnerability of LLMs is their order sensitivity, i.e., a bias towards the sequence in which options or documents are presented. This issue manifests in multiple-choice reasoning, automated evaluation tasks, and retrieval-augmented generation, where input order significantly impacts reliability. To address this, we propose a modification of the transformer architecture that enables the processing of mixed set and text inputs with permutation invariance guarantees. This adaptation eliminates order sensitivity and improves performance on tasks such as multi-document summarization and multi-document question answering, while preserving the runtime efficiency of the original model.
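To make the notion of order sensitivity concrete, the following toy sketch contrasts a position-weighted aggregation of documents with a pooled, permutation-invariant one. It is not the transformer modification described in the talk; the functions `embed`, `pooled_set_features`, and `positional_features` are hypothetical names introduced here, and the "embedding" is a deliberately trivial stand-in.

```python
from itertools import permutations


def embed(doc):
    """Toy, hypothetical per-document embedding: three integer statistics."""
    return (len(doc), doc.count(" "), sum(map(ord, doc)) % 97)


def pooled_set_features(docs):
    """Sum-pool the embeddings coordinate-wise.

    Integer addition is commutative, so the result does not depend on
    the order in which the documents are given.
    """
    embs = [embed(d) for d in docs]
    return tuple(sum(col) for col in zip(*embs))


def positional_features(docs):
    """Weight each embedding by its 1-based position before summing.

    The position weight makes the result depend on document order.
    """
    embs = [embed(d) for d in docs]
    return tuple(sum((i + 1) * x for i, x in enumerate(col)) for col in zip(*embs))


docs = ["alpha report", "beta", "gamma field notes"]

# Collect the distinct outputs over all orderings of the same document set.
invariant_outputs = {pooled_set_features(list(p)) for p in permutations(docs)}
sensitive_outputs = {positional_features(list(p)) for p in permutations(docs)}

print("distinct pooled outputs:    ", len(invariant_outputs))
print("distinct positional outputs:", len(sensitive_outputs))
```

The pooled variant yields a single output across all orderings, whereas the position-weighted variant yields several. A guarantee of the first kind is what "permutation invariance" refers to above, here shown for a far simpler function than a transformer.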

Antje Vandenberg

Max Planck Institute for Mathematics in the Sciences

Michael Bleher

University of Heidelberg & STRUCTURES

Freya Jensen

University of Heidelberg & STRUCTURES

Levin Maier

University of Heidelberg & STRUCTURES

Diaaeldin Taha

Max Planck Institute for Mathematics in the Sciences

Anna Wienhard

Max Planck Institute for Mathematics in the Sciences