Tarique Anwar
Activation function and GLU variants for Transformer models
Characterizing the first week of April 2022 as happening in the field of AI and Deep Learning would be an understatement. Within the same…
Apr 18, 2022