This is the first of a series of mini-tutorials to help you with various aspects of the AllenNLP library.
If you’re new to AllenNLP, consider first going through the official guide, as these tutorials will be focused on more advanced use cases.
Please keep in mind that these tutorials are written for version 1.0 or greater of AllenNLP and may not be relevant for older versions.
One way AllenNLP is commonly used is for fine-tuning transformer models on specific tasks. We host several of these models on our demo site, such as a BERT model applied to the SQuAD v1.1 question answering task, and a RoBERTa model applied to the SNLI textual entailment task.
You can find the code and configuration files used to train these models in the AllenNLP Models repository.
This tutorial will show you how to take a fine-tuned transformer model, like one of these, and upload the weights and/or the tokenizer to HuggingFace’s model hub.
Note that we are talking about uploading only the transformer part of your model, not including any task-specific heads that you’re using.
First of all, you’ll need to know how a transformer model and tokenizer are actually integrated into an AllenNLP model.
This is usually done by providing your dataset reader with a
PretrainedTransformerTokenizer and a matching
PretrainedTransformerIndexer, and then providing your model with the corresponding PretrainedTransformerEmbedder.
If your dataset reader and model are already general enough to accept any type of tokenizer / token indexer and token embedder, respectively, then the only thing you need to do to use a pretrained transformer is tweak your training configuration file.
With the RoBERTa SNLI model, for example, the “dataset_reader” part of the config would look like this:
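Here is a sketch of what that might look like. The exact fields and model name are assumptions loosely based on the SNLI configs in the AllenNLP Models repository, so check the repo for the authoritative version:

```json
"dataset_reader": {
    "type": "snli",
    "tokenizer": {
        "type": "pretrained_transformer",
        "model_name": "roberta-large",
        "add_special_tokens": false
    },
    "token_indexers": {
        "tokens": {
            "type": "pretrained_transformer",
            "model_name": "roberta-large"
        }
    }
}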
While the “model” part of the config would look like this:
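Again as a sketch, with field names and dimensions assumed from the AllenNLP Models SNLI configs rather than copied verbatim:

```json
"model": {
    "type": "basic_classifier",
    "text_field_embedder": {
        "token_embedders": {
            "tokens": {
                "type": "pretrained_transformer",
                "model_name": "roberta-large"
            }
        }
    },
    "seq2vec_encoder": {
        "type": "cls_pooler",
        "embedding_dim": 1024
    }
}
```

Note that the "model_name" in the token embedder should match the one given to the tokenizer and token indexer, so that the vocabulary and weights line up.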
Once you’ve trained your model, just follow these three steps to upload the transformer part of your model to HuggingFace.
Step 1: Load your tokenizer and your trained model.
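As a minimal sketch, assuming your trained model was archived to a local `model.tar.gz` and uses `roberta-large` (both hypothetical here, substitute your own path and model name):

```python
from allennlp.models.archival import load_archive
from allennlp.data.tokenizers import PretrainedTransformerTokenizer

# Load the trained model from its serialized archive.
# The path is hypothetical -- point this at your own archive.
archive = load_archive("model.tar.gz")
model = archive.model

# Re-create the tokenizer that was used during training, using the same
# model name as in your training config.
tokenizer = PretrainedTransformerTokenizer(model_name="roberta-large")
```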
If you get a
ConfigurationError during this step that says something like “foo is not a registered name for bar”, it just means you need to import any other classes that your model or dataset reader uses so that they get registered.
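For example, if your model uses a dataset reader defined in your own package, importing that module is enough, since importing runs the registration decorators (the package name below is hypothetical):

```python
# Importing the module executes its @DatasetReader.register(...) /
# @Model.register(...) decorators, adding the classes to AllenNLP's registry.
import my_project.dataset_readers  # noqa: F401
import my_project.models  # noqa: F401
```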
Step 2: Serialize your tokenizer and just the transformer part of your model using the HuggingFace transformers library.
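Continuing the sketch from step 1, the attribute path used to reach the underlying transformer below is an assumption based on a basic_classifier with a single pretrained_transformer token embedder; your model's structure may differ:

```python
# Pull out just the transformer (a HuggingFace model) from inside the
# AllenNLP token embedder, and serialize it in HuggingFace format.
transformer = model._text_field_embedder._token_embedders["tokens"].transformer_model
transformer.save_pretrained("serialization_dir/")

# The AllenNLP tokenizer wraps a HuggingFace tokenizer, which can be
# serialized the same way, into the same directory.
tokenizer.tokenizer.save_pretrained("serialization_dir/")
```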
Step 3: Upload the serialized tokenizer and transformer to the HuggingFace model hub.
Finally, just follow the steps from HuggingFace’s documentation to upload your cool new transformer with their CLI.
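The upload roughly looks like the following; the exact CLI commands change over time, so treat this as a sketch and defer to HuggingFace's current documentation:

```shell
# Authenticate with your HuggingFace account, then upload the directory
# containing the serialized transformer and tokenizer from step 2.
transformers-cli login
transformers-cli upload serialization_dir/
```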
That’s it! Happy NLP-ing!
If you find any issues with this tutorial please leave a comment or open a new issue in the AllenNLP repo and give it the “Tutorials” tag: