## The ABCs of Sparsity and Singular Structures

### April 3-5, 2024, RWTH Aachen University

#### The workshop will cover the scientific topics of our Collaborative Research Center, with a focus on the mathematics of sparsity-based methods, singular structures in PDEs and interactions between these areas. The aim is to introduce the topics more broadly through lectures and exercises and to deepen the acquired knowledge through talks on current research topics in the field.

#### The workshop will take place from April 3 to 5, 2024, in Hörsaal IV (Templergraben 55, Aachen) of RWTH Aachen University.

## Speakers

**Felix Krahmer**, TU Munich**Richard Küng**, Johannes Kepler University Linz**Ivan Trapasso**, Polytechnic University of Turin**Helmut Harbrecht**, University of Basel**Wolfgang Hackbusch,**Max Planck Institute for Mathematics in the Sciences**Christian Kühn**, TU Munich**Andrea Zanoni**, Stanford University**Philipp Trunschke**, Nantes University**Michael Feischl**, TU Wien**Patrick Henning**, Ruhr-University Bochum**Lubomir Banas**, Bielefeld University

## Programme (PDF)

## Abstracts

## Felix Krahmer: Structure and Randomness in Data Science

The goal of this short course is to convince you that a smart combination of structure and randomness can be very useful for many data science applications, including sensing and data acquisition, dimension reduction, and computational tasks. Randomness can help, as random parameter choices often lead to good conditioning with high probability due to concentration phenomena, while structure arises either as imposed by applications, or to make the computations more feasible. Here the role of randomness can be twofold. On the one hand, randomness can help establish deterministic properties which are too complex in nature to be understood for any deterministic constructions. For example, a sufficient condition for guaranteed recovery of sparse signals in compressive sensing is the so-called Restricted Isometry Property, that holds when the sensing matrix acts as an approximate isometry on sparse vectors. This property is known to hold for random matrices of embedding dimension near-linear in the sparsity level with high probability, while no comparable deterministic construction is known to date. This way of thinking can also help in the analysis of stochastic partial differential equations.

In many applications related to computing, in contrast, randomness plays another essential role: A random preprocessing of the data, such as a random projection or a random subsampling, can allow to reduce high dimensional problems to lower dimensional problems that can be solved more efficiently. In most of these scenarios, comparable deterministic constructions are not feasible, as for any realization of the preprocessing operation, the procedure will fail for some data sets. The role of randomness in this context is that it translates a method that works for most instances - which may be useless as the actual data set may be one of the bad instances - into a method that for any data set works with high probability. Structure is important to ensure that the preprocessing remains efficient and its computational complexity does not exceed the one of the task to be performed. Examples of problems where this strategy is of use include nearest neighbour search and principal component analysis.

In the course, we will give an overview of instances of both these paradigms from various application areas and also present some key ideas of the underlying mathematical theory.

## Richard Küng: Learning to predict ground state properties of gapped Hamiltonians

Classical machine learning (ML) provides a potentially powerful approach to solving challenging quantum many-body problems in physics and chemistry. However, the advantages of ML over traditional methods have not been firmly established. In this work, we prove that classical ML algorithms can efficiently predict ground-state properties of gapped Hamiltonians after learning from other Hamiltonians in the same quantum phase of matter. By contrast, under a widely accepted conjecture, classical algorithms that do not learn from data cannot achieve the same guarantee. Our proof technique combines mathematical signal processing with quantum many-body physics and also builds upon the recently developed framework of classical shadows. I will try to convey the main proof ingredients and also present numerical experiments that address the anti-ferromagnetic Heisenberg model and Rydberg atom systems.

This is joint work with Hsin-Yuan (Robert) Huang, Giacomo Torlai, Victor Albert and John Preskill, see [Huang et al., Provably efficient machine learning for quantum many-body problems, Science 2022]

## Ivan Trapasso: Scattering transforms: a tale of stability and scales

The aim of this minicourse is to discuss some mathematical aspects of scattering transforms and networks. Motivated by the extraordinary effectiveness of deep convolutional neural networks in classification tasks and the paradigm of geometric deep learning, we will focus on the problem of how to extract meaningful yet simple features from complex signals (e.g., images). The main goal of the course is to discuss how some of the very fundamental principles of harmonic analysis can be ingeniously combined, leading to insightful signal representations in a quite natural way. In particular, we will emphasize how setting as a goal the stability of the features with respect to relevant groups of symmetries eventually leads to a convolution network architecture, which "scatters" input data through along different scales by means of a cascading sequence of wavelet-like transforms and non-linear operators without deteriorating fine details.

The main mathematical properties of such scattering transforms are discussed, as well as some of their applications and limitations.

## Helmut Harbrecht: High dimensional approximation: Sparse grids vs. tensors

This lecture is dedicated to the approximation of high-dimensional functions by sparse grids and low-rank tensor approximation. To this end, we assume that the function to be approximated is defined on a product domain with being sufficiently smooth subdomains. The function spaces under consideration are related isotropic or anisotropic Sobolev spaces.

## Wolfgang Hackbusch: Numerical Tensor Calculus

Tensors appear as multi-indexed quantities, but also as multivariate functions or Kronecker matrices. Usually, the number of entries is far beyond the storage size of computers. This requires special sparse representation techniques. Two classical schemes are the r-term format (or canonical format) and the tensor subspace format (or Tucker format). However, both formats have their disadvantages. A better representation is given by the hierarchical tensor format.

Unfortunately, the nice properties of matrices do not generalise to tensors of order >2. Nevertheless, a generalisation of the singular value decomposition can be used to truncate the tensor rank, which is increased by the tensor operations.

## Christian Kühn: Mean-Field PDEs for Complex Network Dynamics

Network dynamics in complex systems can often only be treated analytically in large-node limits leading to mesoscopic/kinetic or to macroscopic PDE. In my talk, I am going to outline techniques, how to include a broad range of heterogeneous network structures into kinetic Vlasov-type PDEs. In particular, also the cases beyond graphons, for adaptive and hypergraph systems will be covered. From these limit PDEs, a wide range of dynamics can then be proven analytically. A key new idea that will be explained is the use of graphops ("graph operators") for PDEs.

## Andrea Zanoni: Estimating effective equations of multiscale diffusions and stochastic models with colored noise using filtered data

We consider the problem of learning unknown parameters in stochastic differential equations from data. Quite often data are originated from complex dynamics and one is interested in obtaining a reduced, coarse-grained description of the dynamics, i.e., a simplified stochastic model. In this talk, we focus on both multiscale diffusions of the Langevin type and systems driven by colored noise. We aim to infer parameters in the corresponding homogenized and white noise limits, respectively, given observations from the full dynamics. This is a challenging problem because model and observations are not compatible, and therefore classic estimators fail. We bypass traditional subsampling techniques, which have been shown to be not robust and strongly dependent on the subsampling rate, and focus, instead, on filtering the data through either an exponential kernel or a moving average. We consider different approaches: maximum likelihood estimators and stochastic gradient descent in continuous time for continuous-time observations, and martingale estimating functions based on eigenvalues and eigenfunctions of the generator of the effective dynamics for discrete-time observations. We combine these techniques with filtered data and propose novel estimators that allow to efficiently infer parameters in the effective models. We prove theoretically and show through several numerical examples that our estimators are asymptotically unbiased in the limit of infinite data and when either the multiscale parameter or the correlation time of the colored noise vanishes.

## Philipp Trunschke: Using weights for fun and profit: Weighted sparse and low-rank approximation

Many functions of interest exhibit weighted summability of their coefficients with respect to some dictionary of basis functions. This summability enables efficient sparse and low-rank approximation and ensures that the function can be estimated efficiently from samples. This talk presents some fundamental results on the estimation of sparse and low-rank functions, like the weighted Stechkin lemma and the restricted isometry property, and introduces simultaneously sparse and low-rank tensor formats.

## Michael Feischl: Optimality of Adaptive Discretizations

The ultimate goal of any numerical method is to achieve maximal accuracy with minimal computational cost. This is also the driving motivation behind adaptive mesh refinement algorithms to approximate partial differential equations (PDEs). PDEs are the foundation of almost every simulation in computational physics (from classical mechanics to geophysics, astrophysics, hydrodynamics, and micromagnetism) and even in computational finance and machine learning. Without adaptive mesh refinement such simulations fail to reach significant accuracy even on the strongest computers before running out of memory or time. The goal of adaptivity is to achieve a mathematically guaranteed optimal accuracy vs. work ratio for such problems.

We will discuss basic and more recent ideas of how to prove optimality and look at classical PDE examples as well as problems with stochastic noise.

## Patrick Henning: On the computation of ground states of the Gross-Pitaevskii equation

In this talk we discuss the computation of ground states of Bose-Einstein condensates by solving the stationary Gross-Pitaevskii equation (GPE). The stationary GPE can be either seen as a Riemannian optimization problem, or equivalently, a nonlinear eigenvalue problem. We will connect both perspectives through gradient flows and show how these connections can be used in the construction of suitable numerical methods. Special attention is also given to the global and local convergence properties of the methods. Finally, we will give an outlook on current challenges and trends in this field.

## Lubomir Banas: Robust a posteriori estimates for the stochastic Cahn-Hilliard equation with singular noise

We discuss a posteriori error estimates for a fully discrete finite element approximation of the stochastic Cahn-Hilliard equation with singular space-time noise (i.e., noise that is white in space and in time). To deal with the low spatial regularity of the noise we consider a regularized problem with suitable approximation of the space-time white noise. The a posteriori bound for the regularized problem is obtained by a splitting of the equation into a linear stochastic partial differential equation and a nonlinear random partial differential equation and treating the respective problems separately. The estimate for the discretization of the original problem is then obtained by considering suitable probability subsets with high probability. The resulting a posteriori estimate is computable and robust with respect to the interfacial width parameter as well as the noise intensity. We illustrate the theoretical result by numerical simulations.

The talk is based on a joint work with Jean Daniel Mukam (Bielefeld).