Beyond Codex: A Code Generation Model That You Can Train
An overview of and hands-on tutorial for the CodeT5 model
With the recent release of OpenAI's Codex model, code generation is becoming a hot topic in the NLP world, and it is not just hype. If you watch one of the Codex demos, you will see how these models will shape the future of software programming. From a researcher's perspective, however, working with Codex may be out of reach if your needs go beyond trying it via the API, since the pre-trained models are not publicly available. Technically, you could replicate Codex from the published paper, but that requires a large GPU cluster that only a few have access to or can afford. This limitation will, in my opinion, slow down research. Imagine how many fewer BERT downstream applications we would have if its authors had not shared the pre-trained weights. Fortunately, Codex is not the only code generation model out there.
This post gives an overview of CodeT5, an encoder-decoder code generation model with publicly available pre-trained checkpoints that you can try today. It also includes a hands-on tutorial on how to use the model.
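As a taste of what the tutorial covers, here is a minimal sketch of loading CodeT5 through the Hugging Face transformers library. It assumes the Salesforce/codet5-base checkpoint and uses T5's sentinel token <extra_id_0> to have the model fill in a masked span of code; the exact output may vary.

```python
# Minimal sketch: load the publicly released CodeT5 checkpoint from the
# Hugging Face Hub and ask it to fill in a masked span of code.
from transformers import RobertaTokenizer, T5ForConditionalGeneration

# CodeT5 pairs a RoBERTa-style BPE tokenizer with a T5 encoder-decoder.
tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

# <extra_id_0> marks the span the model should predict.
text = "def greet(user): print(f'hello <extra_id_0>!')"
input_ids = tokenizer(text, return_tensors="pt").input_ids

generated_ids = model.generate(input_ids, max_length=8)
# Expected to produce something like "{user}" for the masked span.
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```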
CodeT5 Overview
CodeT5 [1], as the name suggests, is based on the T5 [2] encoder-decoder model. Compared to other code generation models, it…