Published in SyncedReview
Google’s H-Transformer-1D: Fast One-Dimensional Hierarchical Attention With Linear Complexity for Long Sequence Processing

Transformer architectures’ powerful attention mechanisms push SOTA performance across various natural language processing (NLP) tasks. The quadratic run-time and memory complexity of these attention mechanisms, however, has long been a critical bottleneck when processing long sequences.
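To make that bottleneck concrete, here is a minimal sketch of standard scaled dot-product attention in plain NumPy, not the H-Transformer-1D hierarchical scheme itself. The names and shapes are illustrative; the point is the n×n score matrix, which is the source of the quadratic cost in sequence length.

```python
import numpy as np

def vanilla_attention(Q, K, V):
    """Standard scaled dot-product attention for a single head.

    Q, K, V: arrays of shape (n, d) for a sequence of length n.
    The intermediate score matrix has shape (n, n), so run time and
    memory both grow quadratically with n.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                              # (n, n): the quadratic bottleneck
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)             # row-wise softmax
    return weights @ V                                         # (n, d)

# Illustrative sizes: at n = 16,384 tokens the score matrix alone holds
# roughly 268 million entries, which is why long sequences hit a memory wall.
n, d = 1024, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
out = vanilla_attention(Q, K, V)
print(out.shape)  # (1024, 64)
```

H-Transformer-1D avoids materializing this full matrix by attending hierarchically, which is what brings the claimed complexity down to linear in the sequence length.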
