The inability of Deep Learning to perform compositional learning is one of the main reasons for its most critical limitations, including the need to feed it large amounts of data.

Compositionality is *the algebraic capacity to understand and produce novel combinations from known components* (Loula 2018). While the human brain can easily learn compositionally, Neural Networks (NNs) are not able *to discover and store skills that are common across problems, and to re-combine them in a hierarchical fashion to solve new challenges* (Liška 2018).

The inability of NNs to perform compositional learning is one of the reasons for NNs' most…

Explainable AI (xAI) is the new cool kid on the block, and the xAI approach (build a black box and then explain it) is now the most cherished modus operandi of Machine Learning practitioners. Is this really the best route? Why don't we build an interpretable model right away?

Explainability and interpretability are two different concepts, although many sources mistakenly use the two terms interchangeably. In this blog post, I will base my reasoning on the following definitions [7], which, at least from my viewpoint, seem to be the most widely adopted:

- Explainable ML: using a black box…

How will your Deep Learning system perform on new data (generalize)? How bad can its performance get? Estimating the ability of an algorithm to generalize is necessary to build trust and be able to rely on AI systems.

**TL;DR: Traditional approaches (VC Dimension, Rademacher complexity) fail to provide reliable, useful (tight enough) generalization bounds. What if network compression goes hand in hand with the estimation of generalization bounds? That's a winning lottery ticket!**

Ensuring that an algorithm will perform as expected once it goes live is necessary: the AI system needs to be safe and reliable. …
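
To get a feel for why VC-style bounds can be too loose to be useful, here is a minimal sketch that evaluates the complexity term of one textbook form of Vapnik's bound, sqrt((d(ln(2n/d) + 1) + ln(4/δ)) / n). The function name, the sample size of 60,000, and the VC dimensions are my own illustrative assumptions, not figures from any particular network. Since the error of a classifier cannot exceed 1, a term above 1 tells us nothing.

```python
import math


def vc_generalization_gap(vc_dim, n_samples, delta=0.05):
    """Complexity term of one classical form of the VC bound.

    With probability at least 1 - delta, the true error is at most the
    training error plus this term. Restricting to vc_dim <= n_samples is
    a choice made for this sketch.
    """
    if not 0 < vc_dim <= n_samples:
        raise ValueError("this sketch expects 0 < vc_dim <= n_samples")
    numerator = vc_dim * (math.log(2 * n_samples / vc_dim) + 1) + math.log(4 / delta)
    return math.sqrt(numerator / n_samples)


# Hypothetical training-set size and VC dimensions, chosen for illustration only.
n = 60_000
for d in (100, 10_000, 50_000):
    gap = vc_generalization_gap(d, n)
    verdict = "vacuous" if gap >= 1 else "informative"
    print(f"VC dimension {d:>6}: gap term {gap:.3f} ({verdict})")
```

The term grows quickly with the VC dimension, and the VC dimension of a neural network typically grows with its number of weights, which is the intuition behind the claim that these bounds are not tight enough for deep models.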

Under the manifold assumption, real-world high-dimensional data concentrates close to a non-linear low-dimensional manifold **[2]**. In other words, data lies approximately on a manifold of much lower dimension than the input space, a manifold that can be retrieved/learned **[8]**.

The manifold assumption is crucial in order to deal with the curse of dimensionality: many machine learning problems seem hopeless if we expect the machine learning algorithm to learn functions with interesting variations across a high-dimensional space **[6]**.
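
As a toy illustration of data lying on a manifold of much lower dimension than the input space, the sketch below embeds a circle (a 1-D manifold) in a 10-D space and looks at how variance is distributed across directions. Everything here is an assumption made for the example: the synthetic data, the noise level, the neighborhood size, and the helper names. PCA on a small neighborhood is only a crude stand-in for proper manifold learning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample a circle and embed it in 10-D with a random orthonormal map.
n_points, ambient_dim = 2000, 10
theta = rng.uniform(0.0, 2.0 * np.pi, n_points)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)    # shape (n, 2)
basis, _ = np.linalg.qr(rng.normal(size=(ambient_dim, 2)))   # shape (10, 2)
X = circle @ basis.T + 1e-4 * rng.normal(size=(n_points, ambient_dim))


def variance_ratios(points):
    """Fraction of total variance along each principal direction."""
    centered = points - points.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def nearest_neighbors(points, index, k=30):
    """The k points closest to points[index] (the point itself included)."""
    distances = np.linalg.norm(points - points[index], axis=1)
    return points[np.argsort(distances)[:k]]


global_ratios = variance_ratios(X)
local_ratios = variance_ratios(nearest_neighbors(X, 0))

print("global top-2 variance share:", global_ratios[:2].sum())
print("local  top-1 variance share:", local_ratios[0])
```

Globally the data needs two linear directions, but in a small neighborhood a single direction carries almost all of the variance: locally, the 10-D points behave like a 1-D object.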

Fortunately, it has been shown empirically that ANNs *capture the geometric regularities of commonplace data thanks to their hierarchical, layered structure…*

Despite wide adoption in the industry, our understanding of deep learning is still lagging.

[20], nicely summarized by [21], identifies four research branches:

**Non-Convex Optimization**: we deal with a non-convex function, yet SGD works. Why does SGD even converge?

**Over-parameterization and Generalization**: how can Deep Neural Networks avoid the curse of dimensionality?

Theorists have long assumed networks with hundreds of thousands of neurons and orders of magnitude more individually weighted connections between them should suffer from a fundamental problem: over-parameterization.

[19]

**Role of Depth**: How does depth help a neural network to converge? What is the link between depth and…

In my previous post, while discussing the importance of DSLs in ML and AI, I mentioned the idea of Software 2.0, introduced by Andrej Karpathy:

Software 2.0 is written in neural network weights. No human is involved in writing this code because there are a lot of weights (typical networks might have millions), and coding directly in weights is kind of hard (I tried). Instead, we specify some constraints on the behavior of a desirable program (e.g., a dataset of input output pairs of examples) and use the computational resources at our disposal to search the program space for a…

Domain-Specific Languages make our life easier while developing AI/ML applications in many different ways. Choosing the right DSL for the job might matter more than the choice of the host language.

DSLs are a powerful tool for expressing business logic concisely.

At the same time, ML and AI systems do not come set in stone.

- The underlying models reflect business and working hypotheses that might change over time
- Sensitivity analysis should be performed not only against model (hyper)parameters but also against business and working assumptions
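
To make the second point concrete, here is a minimal sketch of a sensitivity analysis against a business assumption rather than a hyperparameter. The scenario is hypothetical: a binary classifier's scores are simulated, the cost of a false negative is treated as the business assumption, and I look at how the cost-minimizing decision threshold moves as that assumption changes. The data, costs, and function names are all illustrative.

```python
import random

random.seed(0)

# Synthetic labels and scores standing in for a hypothetical classifier's output.
labels = [random.randint(0, 1) for _ in range(5000)]
scores = [random.gauss(0.75 if y else 0.40, 0.20) for y in labels]


def best_threshold(fn_cost, fp_cost=1.0):
    """Threshold on a 0.01 grid in [0, 1] that minimizes total error cost."""
    best_t, best_cost = 0.0, float("inf")
    for step in range(101):
        t = step / 100
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        false_neg = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = fp_cost * false_pos + fn_cost * false_neg
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


# Sweep the business assumption: how expensive is a missed positive?
for fn_cost in (1.0, 5.0, 20.0):
    print(f"false-negative cost {fn_cost:>4}: best threshold {best_threshold(fn_cost):.2f}")
```

The model never changes in this sweep, yet the recommended operating point does. That is the kind of dependency on working assumptions that is easy to miss when sensitivity analysis stops at the hyperparameters.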

Principal Heisenberg Compensator — https://www.linkedin.com/in/mattia-ferrini/