Announcing AllenNLP 1.0
AllenNLP is a free, open-source natural language processing platform from AI2, designed so researchers can easily build state-of-the-art models. AllenNLP accelerates the translation of ideas into effective models by providing abstractions and APIs built around concepts familiar to researchers, as well as a suite of reference implementations from recent literature. This week, we’re releasing AllenNLP 1.0, unveiling new models, better performance, and fresh resources for the community.
The 1.0 version of AllenNLP is the culmination of several months of work from our engineering team (including over 500 GitHub commits!) and represents an important maturity milestone for the library. We’ve improved almost every corner of the platform, from the documentation to new NLP components to API adjustments that will let the library better serve the community over the long haul.
Launched in 2017, the AllenNLP library provides natural language components that can be easily composed to build novel models. Model architectures can be clearly specified in a high-level configuration language which also provides an easy way for scientists to experiment with different architectures and parameters. Since its inception, AllenNLP has grown to include reference implementations of many models, with interactive demonstrations of over 20 models. The library has been used by over 800 open-source projects on GitHub and it’s been cited hundreds of times in academic publications. To learn more about the AllenNLP platform, read the whitepaper, or check out our new guide.
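The configuration language works by mapping a `type` name in the config to a registered class and passing the remaining keys as constructor arguments. The following is a toy sketch of that registry idea, deliberately simplified and not AllenNLP’s actual API (the class and function names here are hypothetical):

```python
# Toy sketch of config-driven model construction, loosely inspired by
# AllenNLP's registration pattern. Simplified for illustration only.
from typing import Callable, Dict

MODEL_REGISTRY: Dict[str, Callable] = {}

def register(name: str):
    """Associate a config `type` name with a model class."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator

@register("bag_of_words")
class BagOfWordsClassifier:
    def __init__(self, embedding_dim: int = 50, num_classes: int = 2):
        self.embedding_dim = embedding_dim
        self.num_classes = num_classes

def build_from_config(config: dict):
    """Look up the class named by `type`; pass the other keys as kwargs."""
    params = dict(config)
    cls = MODEL_REGISTRY[params.pop("type")]
    return cls(**params)

# Swapping architectures or hyperparameters is just a config change:
model = build_from_config({"type": "bag_of_words", "embedding_dim": 100})
print(model.embedding_dim)  # 100
```

Because experiments are described as data rather than code, trying a different architecture or parameter setting is a one-line edit to the config file rather than a code change.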
To stay relevant, the platform engineers work closely with AI2’s research scientists, who are innovating at the cutting edge of NLP and AI more broadly. One such advance is ELMo, described in the paper “Deep Contextualized Word Representations,” which first demonstrated how language models could yield significant gains across a variety of tasks. (To learn more about such models and their impact, see Contextual Word Representations: Putting Words into Computers.) The AllenNLP platform is designed to speed up new research that takes advantage of general-purpose modules like ELMo, and others developed since.
What’s included in v1?
Key highlights of the 1.0 release include:
- Several new models, including TransformerQA, an improved coreference model, the NMN reading comprehension model, and RoBERTa models for textual entailment
- The new AllenNLP Guide, an interactive resource that provides a comprehensive introduction to our library and experiment framework
- Performance improvements across the library, including switching to native PyTorch data loading, enabling support for 16-bit floating-point training through Apex, and increasing the efficiency of multi-GPU training
- Splitting models into a separate model repository (allennlp-models) to give a clean core library with fewer dependencies
- Decoupling the experiment framework from core library components, making it easier to use the library without the experiment framework, and simplifying the config files in the process
What’s next for AllenNLP?
Now that 1.0 is out, the whole team is planning a long, long vacation — just kidding! We’re actually growing our platform team so we can do an even better job of providing what research scientists need to build state-of-the-art NLP models. If you’re interested in joining the AllenNLP team, you can find our current openings here.
We plan to continue to invest in performance improvements and in infrastructure to make it easier to build up a broad library of demos, as well as to work closely with AI2 research scientists to make sure the library keeps up with their latest research. We’re grateful to our users for their helpful feedback and contributions to the library so far, and we hope to see even more community engagement in the future.
Edit 6/17/20: You can listen to AI2 senior research scientist Matt Gardner discuss the origins, challenges, and future plans for the AllenNLP library along with what to expect with the new release of v1.0 in this special episode of the NLP Highlights podcast.