Google & Columbia U’s Mnemosyne: Learning to Train Transformers With Transformers
Training deep and complex machine learning (ML) models involves selecting a suitable optimizer and then manually tuning its hyperparameters, a process that is both computationally intensive and time-consuming. Learning-to-learn (L2L) systems, in which the optimizer's update rule is itself learned, have recently emerged as a more efficient alternative to conventional human-engineered ML optimizers.
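To make the contrast concrete, the sketch below is a minimal, hypothetical illustration of the L2L idea (not Mnemosyne's actual architecture): a hand-engineered optimizer applies a fixed update rule with manually tuned hyperparameters, while a learned optimizer replaces that rule with a small trainable function of the gradient. The function names, the tiny per-parameter MLP, and its meta-parameters `theta` are all assumptions introduced here for illustration; in a real L2L system, `theta` would be meta-trained across many tasks.

```python
import numpy as np

def sgd_update(param, grad, lr=0.1):
    """Hand-engineered rule: the learning rate `lr` is tuned manually."""
    return param - lr * grad

def learned_update(param, grad, theta):
    """Learned rule (illustrative only): a tiny per-parameter MLP with
    meta-parameters `theta` maps each gradient entry to an update step."""
    w1, b1, w2, b2 = theta
    h = np.tanh(grad[:, None] * w1 + b1)   # hidden layer, applied per parameter
    step = (h @ w2 + b2).ravel()           # one scalar update per parameter
    return param - step

# Randomly initialized meta-parameters; a real system would meta-train these.
rng = np.random.default_rng(0)
theta = (rng.normal(scale=0.1, size=(1, 8)),
         np.zeros(8),
         rng.normal(scale=0.1, size=(8, 1)),
         np.zeros(1))

# Toy objective f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x = np.ones(4)
x_sgd = sgd_update(x, x)           # fixed rule: x - 0.1 * x
x_l2l = learned_update(x, x, theta)  # learned rule: shape-preserving update
```

The point of the sketch is only the interface: both optimizers consume a gradient and emit a parameter update of the same shape, so a learned rule can drop in wherever a hand-tuned one is used.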