A01 Gradient descent for deep neural network learning

This project aims to advance the understanding of the convergence properties of (stochastic) gradient descent methods for training deep neural networks. We target several extensions of the PIs' initial results on (fully connected) linear networks. For instance, we will investigate convergence to global minimizers when training structured linear and nonlinear neural networks. An important aspect of the project is to explore the Riemannian geometry underlying the corresponding gradient flows.

A01 A03 A06 A09

Prof. Dr. Holger Rauhut
Ludwig-Maximilians-Universität München
+49 89 2180 4618
rauhut@math.lmu.de

A01 C02

Prof. Dr. Michael Westdickenberg
RWTH Aachen University
+49 241 80 94569
mwest@instmath.rwth-aachen.de

A01

Dr. Ulrich Terstiege
Ludwig-Maximilians-Universität München
terstiege@math.lmu.de

Publications

A01

E. M. Achour, K. Kohn, H. Rauhut

The Riemannian Geometry associated to Gradient Flows of Linear Convolutional Networks

Preprint 2025

A01

F. Innocenti, E. M. Achour, C. L. Buckley

μPC: Scaling Predictive Coding to 100+ Layer Networks

Preprint 2025

A01

F. Innocenti, E. M. Achour, R. Singh, C. L. Buckley

Only Strict Saddles in the Energy Landscape of Predictive Coding Networks?

Preprint, 26 pp., 2024
