# Applied Harmonic Analysis, Massive Data Sets, Machine Learning, and Signal Processing (BIRS)

Arriving in Oaxaca, Mexico on Sunday, October 16, and departing Friday, October 21, 2016

### Organizers

Amit Singer (Princeton University)

Thomas Strohmer (University of California, Davis)

Ronald Coifman (Yale University)

Emmanuel Candes (Stanford University)

### Objectives

**Relevance, timeliness, and importance of the workshop:**

The analysis of massive, high-dimensional, noisy, time-varying data sets has become a critical issue for a large number of scientists and engineers. Major theoretical and algorithmic advances in analyzing massive and complex data are crucial, including methods for exploiting sparsity, clustering and classification, data mining, anomaly detection, and more. In the last decade we have witnessed significant advances in many individual core areas of data analysis, including machine learning, signal processing, statistics, optimization, and of course harmonic analysis. It appears highly likely that the next major breakthroughs will occur at the intersection of these disciplines. What is needed, therefore, is a concerted effort to bring together world-leading experts from all of these areas.

It is therefore the perfect time for a workshop that will feature a group of exceptional scientists at the intersection of the aforementioned disciplines, to present recent developments and to foster new interactions.

This direct interaction of mathematicians with statisticians, engineers, and computer scientists will make for an efficient intellectual feedback loop, which is central to achieving the urgently needed breakthroughs in the area of "Big Data". Many of the proposed participants have expertise in more than one area, and moreover already have extensive experience in interdisciplinary collaborations. We intend the workshop to build upon and expand this existing network of collaborations. We also plan to invite numerous researchers who are new to this network, to further enhance the influx of novel and disruptive ideas. Senior and young scientists will be invited, and we will ensure appropriate representation of female participants.

A synopsis of this proposal was circulated among 32 researchers (+4 organizers) to solicit tentative participation in the proposed workshop. We received 29 (+4) enthusiastic positive responses from internationally renowned mathematicians, statisticians, computer scientists, and engineers, including Robert Calderbank (Duke), David Donoho (Stanford), Yonina Eldar (Technion), Leonid Guibas (Stanford), Stephane Mallat (Paris), Guillermo Sapiro (Duke), and Joel Tropp (Caltech).

This workshop will revolve around the following topics:

- Emerging connections between harmonic analysis and deep learning;
- Understanding the structure of high-dimensional data (for example graphs or text documents) and the construction of data-adaptive efficient representations;
- Inverse problems on complex data sets.

We describe these topics and the associated objectives in more detail below. We emphasize that these three topics overlap strongly.

**Emerging connections between harmonic analysis and deep learning:**

One of the most exciting developments in machine learning in the past five years is the advent of *deep learning*, which is based on multilayer neural networks. Deep neural networks build hierarchical invariant representations by applying a succession of linear and non-linear operators which are learned from training data. Deep neural networks, and in particular the convolutional networks developed by LeCun, have recently achieved state-of-the-art results on several complex object recognition tasks. They learn a huge network of filter banks and non-linearities on large datasets, using both supervised and unsupervised methods.

A major issue is to understand the properties of these networks: what needs to be learned, and what is generic and common to most image classification problems. Theory needs to be developed to guide the search for proper feature extraction models at each layer. Until now, deep learning has acted very much like a black box: algorithms are often based on ad hoc rules without theoretical foundation, the learned representations lack interpretability, and we do not know how to modify deep learning in those cases where it fails. Hence there is an urgent need for theoretical insight into deep learning.

There are promising signs that this theoretical framework could be derived with tools from harmonic analysis. A first breakthrough towards this goal is the scattering transform (developed by Mallat), which has the structure of a convolutional network. Yet, rather than being learned, the scattering network is derived from requirements of invariance, stability, and informativeness. Scattering transforms provide a promising line of attack for developing a theoretical framework for deep learning. By introducing the rich collection of tools from harmonic analysis into deep neural networks in a principled way, we should be able to greatly enhance their efficiency and performance. One objective of this workshop is thus to bridge the gap between harmonic analysis and deep learning.
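To make the scattering idea concrete, the following toy sketch (not Mallat's full construction) computes first-order scattering coefficients of a 1-D signal with an illustrative dyadic Gabor filter bank: wavelet-like convolutions, a modulus nonlinearity, and global averaging yield translation-invariant features without any learning. All filter parameters are assumptions chosen for the example.

```python
import numpy as np

def gabor_bank(n, n_scales=4):
    """Band-pass Gaussian filters at dyadic center frequencies (frequency domain)."""
    freqs = np.fft.fftfreq(n)
    bank = []
    for j in range(1, n_scales + 1):
        xi = 0.5 / 2 ** j                        # dyadic center frequency
        sigma = xi / 2.0                         # illustrative bandwidth choice
        bank.append(np.exp(-((freqs - xi) ** 2) / (2 * sigma ** 2)))
    return np.array(bank)

def scattering_first_order(x, n_scales=4):
    """S0 = mean of x; S1[j] = mean of |x * psi_j|: translation-invariant features."""
    X = np.fft.fft(x)
    bank = gabor_bank(len(x), n_scales)
    u1 = np.abs(np.fft.ifft(X * bank))           # wavelet convolution + modulus
    return x.mean(), u1.mean(axis=1)             # global averaging (low-pass)

t = np.linspace(0, 1, 512, endpoint=False)
x = np.cos(2 * np.pi * 20 * t)
_, s1 = scattering_first_order(x)
_, s1_shifted = scattering_first_order(np.roll(x, 37))
print(np.allclose(s1, s1_shifted))               # True: invariant to (circular) shifts
```

The full scattering transform cascades this modulus-and-filter step to deeper layers, recovering the high-frequency information lost by averaging; the key point here is that the architecture mirrors a convolutional network while its filters are designed rather than trained.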

**Understanding the structure of high-dimensional data:**

The need to analyze massive data sets in Euclidean space has led to a proliferation of research activity, including methods of dimension reduction and manifold learning. In general, understanding large data means identifying intrinsic characteristics of the data and developing techniques to isolate them.

While many of the currently existing tools (such as diffusion maps) show great promise, they rely on the assumption that data are stationary and homogeneous. Yet in many cases, we are dealing with changing and heterogeneous data. For instance, in medical diagnostics, we may want to infer a common phenomenon from data as diverse as MRI, EEG, and ECG. How do we properly fuse and process heterogeneous data to extract knowledge?

In a broad range of natural and real-world dynamical systems, measured signals are controlled by underlying processes or drivers. As a result, these signals exhibit highly redundant representations, while their temporal evolution can often be compactly described by dynamical processes on a low-dimensional manifold. Recently, diffusion maps have been generalized to the setting of a dynamic data set, in which the associated graph changes depending on some set of parameters. The associated "global" diffusion distance allows measuring the evolution of the dynamic data set in its intrinsic geometry. However, this is just a first step. One objective of this workshop is to develop mathematical tools that can detect and capture, in an automatic, unsupervised manner, the inner architecture of large data sets.
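For concreteness, here is a minimal diffusion-maps sketch in the classical static setting (not the dynamic generalization discussed above): a Gaussian affinity kernel is normalized into a Markov matrix, and the leading nontrivial eigenvectors give the low-dimensional embedding. The kernel scale and data are illustrative assumptions.

```python
import numpy as np

def diffusion_map(X, eps, n_coords=2, t=1):
    """Gaussian kernel -> row-stochastic Markov matrix -> leading eigenvectors."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    K = np.exp(-d2 / eps)                                 # Gaussian affinity
    P = K / K.sum(axis=1, keepdims=True)                  # diffusion (Markov) matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # drop the trivial constant eigenvector (eigenvalue 1); weight by lambda^t
    return vals[1:n_coords + 1] ** t * vecs[:, 1:n_coords + 1]

# Noisy circle: the first two diffusion coordinates recover the circular geometry.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.01 * rng.normal(size=(200, 2))
Y = diffusion_map(X, eps=0.1)
print(Y.shape)                                            # (200, 2)
```

The dynamic generalization replaces the single kernel with a parameter-dependent family of graphs, but the eigenvector machinery above is the common core.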

**Construction of data-adaptive efficient representations:**

Processing of signals on graphs is emerging as a fundamental problem in an increasing number of applications. Indeed, in addition to providing a direct representation of a variety of networks arising in practice, graphs serve as an overarching abstraction for many other types of data.

The construction of data-adaptive dictionaries is crucial, even more so in light of the need to analyze data that in the past have not fallen within the boundaries of signal processing, for example graphs or text documents. In fact, such constructions may be considered a bridge between classical signal processing and the new era of processing general data.

Convolutional neural networks have been successful in machine learning problems where the coordinates of the underlying data representation have a grid structure, and the data to be studied in those coordinates have translational equivariance or invariance with respect to this grid. However, data defined on 3-D meshes (such as surface tension or temperature), measurements from a network of meteorological stations, and data coming from social networks or collaborative filtering are all examples of datasets to which one cannot apply standard convolutional networks. Clearly, this is another area where a closer link between deep learning, signal processing, and harmonic analysis would be highly beneficial.

**Inverse problems on complex data sets:**

Inverse problems arising in connection with massive, complex data sets pose tremendous challenges and require new mathematical tools. Consider for instance femtosecond X-ray protein nanocrystallography. There the problem is to uncover the structure of (3-dimensional) proteins from multiple (2-dimensional) intensity measurements. In addition to the huge amount of data and the fact that phase information is lost during the measurement process, we also do not know the proteins' rotations, which change from illumination to illumination. Standard phase retrieval methods fail miserably in this case. Yet, recent advances at the intersection of harmonic analysis, optimization, and signal processing show promise for solving such challenging problems.
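As one example of these recent advances, gradient-descent methods such as Wirtinger flow recover a signal from phaseless quadratic measurements. The sketch below is a simplified real-valued version (so only the sign of the measurements is lost) with a spectral initialization, using generic Gaussian measurements rather than the crystallographic setup; step size and problem sizes are illustrative assumptions.

```python
import numpy as np

def wirtinger_flow(A, y, z0, mu=0.05, iters=3000):
    """Gradient descent on f(z) = (1/4m) * sum_i ((a_i^T z)^2 - y_i)^2."""
    m = len(y)
    z = z0.copy()
    step = mu / np.linalg.norm(z0) ** 2         # step size scaled by signal energy
    for _ in range(iters):
        Az = A @ z
        z = z - step * (A.T @ ((Az ** 2 - y) * Az)) / m
    return z

rng = np.random.default_rng(1)
n, m = 20, 400                                   # 20x oversampling
A = rng.normal(size=(m, n))
x = rng.normal(size=n)
y = (A @ x) ** 2                                 # phaseless: the sign of Ax is lost
# spectral initialization: top eigenvector of (1/m) * A^T diag(y) A
_, V = np.linalg.eigh((A.T * y) @ A / m)
z0 = V[:, -1] * np.sqrt(y.mean())
z = wirtinger_flow(A, y, z0)
err = min(np.linalg.norm(z - x), np.linalg.norm(z + x)) / np.linalg.norm(x)
print(err < 1e-2)                                # recovery up to a global sign
```

The objective is non-convex, yet the combination of a careful initialization and plain gradient descent provably succeeds in regimes like this one; the crystallographic problem adds unknown rotations on top of the lost phases, which is what makes it so much harder.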

Other important inverse problems in this area are tied to heterogeneous data or to the idea of self-calibration. Numerous deep questions arise. How can we utilize ideas of sparsity and minimal information complexity in this context? Is there a unified view of such measures that includes sparsity, low-rankness, and others (such as low entropy) as special cases? This may lead to a new theory that considers an abstract notion of simplicity in general inverse problems. Can we design efficient non-convex algorithms with provable convergence? One objective is the advancement of new theoretical and numerical tools for such demanding inverse problems.
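The parallel between sparsity and low-rankness can already be seen in their proximal operators: the same soft-thresholding shrinkage acts on vector entries in one case and on singular values in the other. A minimal sketch, with thresholds and matrices chosen purely for illustration:

```python
import numpy as np

def prox_l1(x, tau):
    """Soft-thresholding: proximal operator of tau * ||x||_1 (promotes sparsity)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def prox_nuclear(M, tau):
    """Singular-value thresholding: prox of tau * ||M||_* (promotes low rank)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * prox_l1(s, tau)) @ Vt

# The same shrinkage acts on vector entries and on singular values.
print(prox_l1(np.array([3.0, -0.5, 0.2, -4.0]), 1.0))   # small entries become 0

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(5, 3)))
V, _ = np.linalg.qr(rng.normal(size=(5, 3)))
M = U @ np.diag([10.0, 3.0, 0.5]) @ V.T                 # singular values 10, 3, 0.5
print(np.linalg.matrix_rank(prox_nuclear(M, 1.0)))      # 2: the 0.5 direction is removed
```

An abstract theory of "simplicity" of the kind raised above would treat both operators as instances of one shrinkage principle applied to different atomic decompositions.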

From *BIRS*