Applying Linearly Scalable Transformers to Model Longer Protein Sequences

Synced
Jul 31, 2020 · 4 min read

In a bid to make transformer models scale to the longer sequences found in real-world applications, researchers from Google, the University of Cambridge, DeepMind and the Alan Turing Institute have proposed a new transformer architecture called "Performer," built on what they call fast attention via orthogonal random features (FAVOR).
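The core idea behind FAVOR-style attention is to avoid ever forming the full L × L softmax attention matrix: queries and keys are mapped through a randomized feature map so that attention can be computed as φ(Q)(φ(K)ᵀV), which scales linearly rather than quadratically with sequence length L. The sketch below illustrates that decomposition in plain NumPy. It is a minimal illustration under simplifying assumptions: the feature map, function names, and parameters here are chosen for clarity and are not the paper's exact FAVOR construction, which uses orthogonal random projections and additional stabilization.

```python
import numpy as np

def random_feature_map(x, projection, scale):
    """Map inputs (L, d) to random features (L, m) approximating the softmax kernel.

    Uses positive exponential random features; in expectation over Gaussian
    projections, phi(q) . phi(k) approximates exp(q . k / sqrt(d)).
    This is an illustrative choice, not the paper's exact construction.
    """
    x = x * scale  # fold the 1/sqrt(d) softmax temperature into q and k
    sq_norm = np.sum(x ** 2, axis=-1, keepdims=True) / 2.0
    return np.exp(x @ projection - sq_norm) / np.sqrt(projection.shape[1])

def linear_attention(q, k, v, num_features=256, seed=0):
    """Approximate softmax attention in O(L) time and memory in sequence length L.

    q, k: (L, d) queries and keys; v: (L, d_v) values.
    """
    d = q.shape[-1]
    rng = np.random.default_rng(seed)
    projection = rng.standard_normal((d, num_features))
    scale = d ** -0.25  # split 1/sqrt(d) evenly between queries and keys

    q_prime = random_feature_map(q, projection, scale)  # (L, m)
    k_prime = random_feature_map(k, projection, scale)  # (L, m)

    # Associativity is the whole trick: computing k'^T v first (m, d_v)
    # means the (L, L) attention matrix is never materialized.
    kv = k_prime.T @ v
    normalizer = q_prime @ k_prime.sum(axis=0, keepdims=True).T  # (L, 1)
    return (q_prime @ kv) / normalizer

# Example: a sequence length where quadratic attention starts to hurt
L, d = 4096, 64
q = np.random.randn(L, d) * 0.1
k = np.random.randn(L, d) * 0.1
v = np.random.randn(L, d)
out = linear_attention(q, k, v)
print(out.shape)  # (4096, 64)
```

Because the expensive product is computed as k'ᵀV before multiplying by q', the cost grows linearly with sequence length, which is what makes attention over very long inputs such as protein sequences tractable.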