Deep Learning Reviews
Summaries of important papers
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity…
Review of paper by William Fedus, Barret Zoph, and Noam Shazeer, Google Brain, 2021.
Stan Kriventsov
Feb 10, 2021
AutoDropout: Learning Dropout Patterns to Regularize Deep Networks (paper review)
Review of paper by Hieu Pham¹ ² and Quoc V. Le¹, ¹Google Research and ²Carnegie Mellon University, 2021.
Stan Kriventsov
Jan 21, 2021
Every Model Learned by Gradient Descent Is Approximately a Kernel Machine (paper review)
Review of paper by Pedro Domingos, University of Washington, 2020
Stan Kriventsov
Dec 14, 2020
Point Transformer (paper review)
Review of paper by Hengshuang Zhao¹, Li Jiang², Jiaya Jia², et al, ¹University of Oxford, ²The Chinese University of Hong Kong, 2020.
Stan Kriventsov
Jan 3, 2021
Scaling *down* Deep Learning (paper review)
Review of paper by Sam Greydanus, Oregon State University and the ML Collective, 2020
Stan Kriventsov
Dec 7, 2020
Gradient Starvation: A Learning Proclivity in Neural Networks (paper review)
Review of paper by Mohammad Pezeshki¹ ², Sekou-Oumar Kaba¹ ³, Yoshua Bengio¹ ², et al, ¹Mila, ²Université de Montréal, ³McGill University…
Stan Kriventsov
Nov 30, 2020
AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients (paper review)
Review of paper by Juntang Zhuang¹, Tommy Tang², Yifan Ding³, et al, ¹Yale University, ²University of Illinois at Urbana-Champaign, and ³University of…
Stan Kriventsov
Nov 24, 2020
Big Bird: Transformers for Longer Sequences (paper review)
Review of paper by Manzil Zaheer, Guru Guruganesh, Avinava Dubey et al (Google Research), 2020.
Stan Kriventsov
Oct 21, 2020
Linformer: Self-Attention with Linear Complexity (paper review)
Review of paper by Sinong Wang, Belinda Z. Li, Madian Khabsa et al (Facebook AI Research), 2020
Stan Kriventsov
Oct 21, 2020
Supervised Contrastive Learning (paper review)
Review of paper by Prannay Khosla, Piotr Teterwak, Chen Wang et al (Google Research), 2020
Stan Kriventsov
Oct 21, 2020
ResNeSt: Split-Attention Networks (paper review)
Review of paper by Hang Zhang¹, Chongruo Wu², Zhongyue Zhang¹ et al (¹Amazon and ²UC Davis), 2020
Stan Kriventsov
Oct 21, 2020
ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators (paper review)
Review of paper by Kevin Clark¹, Minh-Thang Luong², Quoc V. Le², and Christopher D. Manning² (¹Stanford University and ²Google Brain), 2020
Stan Kriventsov
Oct 21, 2020
Batch Normalization Biases Deep Residual Networks Towards Shallow Paths (paper review)
Review of paper by Soham De and Samuel L. Smith (DeepMind), 2020
Stan Kriventsov
Oct 21, 2020
Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network…
Review of paper by Jungkyu Lee, Taeryun Won, and Kiho Hong, Clova Vision, NAVER Corp, 2019
Stan Kriventsov
Oct 21, 2020
Deep Learning for Symbolic Mathematics (paper review)
Review of paper by Guillaume Lample and François Charton, Facebook AI Research, 2019.
Stan Kriventsov
Oct 21, 2020
Generative Language Modeling for Automated Theorem Proving (paper review)
Review of paper by Stanislas Polu and Ilya Sutskever (OpenAI), 2020
Stan Kriventsov
Oct 21, 2020
End-to-End Object Detection with Transformers (paper review)
Review of paper by Nicolas Carion, Francisco Massa, Gabriel Synnaeve et al (Facebook AI Research), 2020
Stan Kriventsov
Oct 21, 2020
N-BEATS: Neural Basis Expansion Analysis For Interpretable Time Series Forecasting (paper review)
Review of paper by Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados (Element AI), and Yoshua Bengio (MILA), 2019
Stan Kriventsov
Oct 21, 2020
Synthesizer: Rethinking Self-Attention in Transformer Models (paper review)
Review of paper by Yi Tay, Dara Bahri, Donald Metzler et al (Google Research), 2020
Stan Kriventsov
Oct 21, 2020
Attention Augmented Differentiable Forest for Tabular Data (paper review)
Review of paper by Yingshi Chen, Xiamen University, 2020
Stan Kriventsov
Nov 11, 2020