Evan Pete Walsh

- Scaling up AllenNLP to 11B Parameter Models (Ai2 Blog, Oct 7, 2021). A deep dive into the challenges of large-scale training and the tools we used to get there.
- Python caching in GitHub Actions (Ai2 Blog, Sep 28, 2020). How to speed up slow Python builds in GitHub Actions with effective caching.
- Tutorial: training on larger batches with less memory in AllenNLP (Ai2 Blog, Sep 8, 2020). This is part of a series of mini-tutorials to help you with various aspects of the AllenNLP library.
- Tutorial: How to train with multiple GPUs in AllenNLP (Ai2 Blog, Aug 24, 2020). This is part of a series of mini-tutorials to help you with various aspects of the AllenNLP library.
- Tutorial: How to upload transformer weights and tokenizers from AllenNLP to HuggingFace (Ai2 Blog, Aug 14, 2020). This is the first of a series of mini-tutorials to help you with various aspects of the AllenNLP library.
- Incorporating a copy mechanism into sequence-to-sequence models (Sep 10, 2019). This post explains the details behind the CopyNet model from Gu et al. (1). If you just want to see the code, you can check out my…
- Sequence-to-sequence models with a dash of reinforcement learning (Structurely Engineering, Mar 6, 2019). Practical training techniques for optimizing sequence-level objectives.