A01 Gradient descent for deep neural network learning

This project aims to advance the understanding of the convergence properties of (stochastic) gradient descent methods for training deep neural networks. We target several extensions of initial results by the PIs on (fully connected) linear networks. For instance, we will investigate convergence to global minimizers when training structured linear and nonlinear neural networks. An important aspect of the project will be to explore the Riemannian geometry underlying the corresponding gradient flows. A minimal illustrative sketch of the kind of training dynamics in question is given below.
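The following is a minimal sketch (not the project's code) of plain gradient descent on a fully connected deep linear network, i.e. on the loss L(W_1, ..., W_N) = 0.5 ||W_N ... W_1 X - Y||_F^2 referenced above. All dimensions, the step size, the number of iterations, and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 4, 3                        # width and depth (assumed values)
X = rng.standard_normal((d, 10))   # toy inputs (assumed)
Y = rng.standard_normal((d, 10))   # toy targets (assumed)
# near-identity initialization of the factors W_1, ..., W_N
Ws = [np.eye(d) + 0.01 * rng.standard_normal((d, d)) for _ in range(N)]

def product(factors):
    """End-to-end matrix W_k ... W_1 for a list [W_1, ..., W_k]."""
    P = np.eye(d)
    for W in factors:
        P = W @ P
    return P

eta = 1e-3                         # step size (assumed)
for step in range(5000):
    R = product(Ws) @ X - Y        # residual of the end-to-end map
    grads = []
    for j in range(N):
        # gradient w.r.t. W_j: (W_N ... W_{j+1})^T R X^T (W_{j-1} ... W_1)^T
        left = product(Ws[j + 1:])
        right = product(Ws[:j])
        grads.append(left.T @ R @ X.T @ right.T)
    Ws = [W - eta * G for W, G in zip(Ws, grads)]

print("final loss:", 0.5 * np.linalg.norm(product(Ws) @ X - Y) ** 2)
```

Discretizing the corresponding gradient flow with a small step size in this way is only meant to illustrate the object of study; the project's analysis concerns the continuous-time flow and its Riemannian geometry rather than any particular implementation.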

Project Leaders
Postdoctoral Researchers

Publications

  • A01
    F. Innocenti, M. Achour, R. Singh, L. Buckley

    Only Strict Saddles in the Energy Landscape of Predictive Coding Networks?

    Preprint, 26 pages, 2024
