Workshop
Effective fluctuating continuum models for Riemannian SGD
- Benjamin Gess
Abstract
In this talk, we present recent results on the derivation of effective models for the training dynamics of Riemannian stochastic gradient descent (SGD) in the limit of small learning rates or of large, shallow networks. The focus is on effective limiting models that also capture the fluctuations inherent in Riemannian SGD. This leads to the novel concepts of stochastic modified flows and distribution-dependent modified flows. The advantage of these limiting models is that they match the SGD dynamics to higher order and recover the correct multi-point distributions. This is joint work with Vitalii Konarovskyi and Sebastian Kassing.