Workshop
Training of deep ResNets and shallow networks through the lens of optimal transport
- François-Xavier Vialard
Abstract
In this talk, we study the mean-field regime of residual network architectures, which are key to modern deep learning models. We show that the landscape of the loss function for the standard quadratic loss is made "nicer" by using such an architecture. More precisely, in the time-continuous limit and in two different over-parametrized regimes, we prove that the loss function satisfies a local Polyak-Łojasiewicz inequality, which guarantees that any critical point is a global minimum and yields a local convergence result. If time permits, we will also present a recent result on feature learning in the context of a shallow network. This is joint work with Raphaël Barboni and Gabriel Peyré.
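For readers unfamiliar with the Polyak-Łojasiewicz inequality, the following is a minimal sketch of its generic textbook form and of why it gives both conclusions stated in the abstract. The notation (loss $L$, parameters $\theta$, constant $\mu$, neighborhood $U$, infimum $L^*$) is assumed here for illustration and is not taken from the talk, whose precise setting and constants may differ.

```latex
% Local PL inequality (generic form; notation is illustrative):
% there exist \mu > 0 and a neighborhood U such that
\[
  \tfrac{1}{2}\,\|\nabla L(\theta)\|^2 \;\ge\; \mu\,\bigl(L(\theta) - L^*\bigr)
  \qquad \text{for all } \theta \in U, \quad L^* = \inf L .
\]
% Consequence 1: if \theta \in U is a critical point, then \nabla L(\theta) = 0,
% so 0 \ge \mu (L(\theta) - L^*), hence L(\theta) = L^* (a global minimum).
%
% Consequence 2: along the gradient flow \dot\theta_t = -\nabla L(\theta_t),
% as long as \theta_t stays in U,
\[
  \frac{d}{dt}\bigl(L(\theta_t) - L^*\bigr)
  = -\|\nabla L(\theta_t)\|^2
  \;\le\; -2\mu\,\bigl(L(\theta_t) - L^*\bigr),
\]
% and Gronwall's lemma gives exponential decay of the loss gap:
\[
  L(\theta_t) - L^* \;\le\; e^{-2\mu t}\,\bigl(L(\theta_0) - L^*\bigr).
\]
```

The convergence statement is local because the argument only holds while the trajectory remains inside the region where the inequality is valid.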