Microsoft & Université de Montréal Researchers Leverage Measure Theory to Reveal the Mathematical Properties of Attention
Attention architectures are pushing the frontier in many machine learning (ML) tasks and have become a building block of many modern neural networks. Our conceptual and theoretical understanding of their power and inherent limitations, however, remains nascent. Researchers from Microsoft and Université de Montréal set out to capture the essential mathematical properties of attention, proposing a new mathematical framework that uses measure theory and…