Trapper: An NLP library for transformer models

Cemil Cengiz
Published in Codable · Nov 9, 2021

Trapper (Transformers wRAPPER) is an NLP library that aims to make it easier to train transformer-based models on downstream tasks. It wraps HuggingFace's 🤗 Transformers library to provide transformer model implementations and training mechanisms, and it defines abstractions with base classes for common tasks encountered while using transformer models. Additionally, it provides a dependency-injection mechanism and allows defining training and/or evaluation experiments via configuration files. This way, you can replicate an experiment with different models, optimizers, etc. by changing only their values inside the configuration file, without writing new code or modifying existing code. These features foster code reuse, reduce boilerplate, and enable repeatable, well-documented training experiments, which are crucial in machine learning.

Homepage

https://github.com/obss/trapper

Installation

You can install Trapper using pip as follows.

pip install trapper

Why You Should Use Trapper

Trapper combines the best features of Transformers and AllenNLP:
  • You have been a Transformers user for quite some time now. However, you started to feel that some computation steps could be standardized through new abstractions. You wish to easily reuse the scripts you write for data processing, post-processing, etc. with different models/tokenizers. You would like to separate the code from the experiment details, mix and match components through configuration files while keeping your codebase clean and free of duplication.
  • You are an AllenNLP user who is really happy with the dependency-injection system, well-defined abstractions and smooth workflow. However, you would like to use the latest transformer models without having to wait for the core developers to integrate them. Moreover, the Transformers community is scaling up rapidly, and you would like to join the party while still enjoying an AllenNLP touch.
  • You are an NLP researcher/practitioner, and you would like to give a shot to a library aiming to support state-of-the-art models along with datasets, metrics and more in unified APIs.

Key Features

Compatibility with HuggingFace 🤗 Transformers

Trapper extends Transformers!

While implementing the components of Trapper, we try to reuse the classes from the Transformers library as much as we can. For example, Trapper uses the models and the trainer directly from Transformers. This makes it easy to use models trained with Trapper in other projects or libraries that depend on Transformers (or PyTorch in general).

We strive to keep Trapper fully compatible with Transformers, so you can always use some of our components to write a script for your own needs without being forced to use the full pipeline (e.g. for training).

Dependency Injection and Training Based on Configuration Files

We use the registry mechanism of AllenNLP to provide dependency injection and enable reading the experiment details from configuration files in json or jsonnet format. You can look at the AllenNLP guide on dependency injection to learn more about how the registry system and dependency injection work, as well as how to write configuration files. In addition, we strongly recommend reading the remaining parts of the AllenNLP guide to learn more about its design philosophy, the importance of abstractions, etc. (especially Part 2: Abstraction, Design and Testing). As a warning, please note that we do not use AllenNLP's abstractions and base classes in general, which means you cannot mix and match Trapper's and AllenNLP's components. Instead, we just use the class registry and dependency-injection mechanisms and only adapt a very limited set of its components, first by wrapping and then registering them as Trapper components. For example, we use the optimizers from AllenNLP since we can conveniently do so without hindering our full compatibility with Transformers.
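To make the registry idea concrete, here is a minimal, self-contained sketch of AllenNLP-style registration and config-driven construction. This is illustrative only: the names (`Registrable`, `register`, `by_name`, `AdamWOptimizer`) mirror AllenNLP's pattern but are written from scratch here, not taken from Trapper's or AllenNLP's actual code.

```python
# Minimal sketch of the registry + dependency-injection idea (not the real
# AllenNLP/Trapper implementation).

class Registrable:
    """Base class keeping a per-base-class registry of named implementations."""
    _registry = {}

    @classmethod
    def register(cls, name):
        def decorator(subclass):
            cls._registry.setdefault(cls, {})[name] = subclass
            return subclass
        return decorator

    @classmethod
    def by_name(cls, name):
        return cls._registry[cls][name]


class Optimizer(Registrable):
    pass


@Optimizer.register("adamw")
class AdamWOptimizer(Optimizer):
    def __init__(self, lr=1e-3):
        self.lr = lr


# A config entry like {"type": "adamw", "lr": 3e-5} can now be resolved to a
# concrete class and constructed without any code changes:
config = {"type": "adamw", "lr": 3e-5}
optimizer_cls = Optimizer.by_name(config.pop("type"))
optimizer = optimizer_cls(**config)
```

Swapping the optimizer in an experiment then amounts to changing the `"type"` value in the configuration file, which is the code-reuse benefit the registry system provides.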

Full Integration with HuggingFace 🤗 Datasets

In Trapper, we officially use the dataset format of the HuggingFace `datasets` library and provide full integration with it. You can directly use any dataset published on the HuggingFace Hub without doing any extra work. You can write the dataset name and additional loading arguments (if there are any) in your training config file, and Trapper will automatically download the dataset and pass it to the trainer. If you have a local or private dataset, you can still use it after converting it to the HuggingFace `datasets` format by writing a dataset loading script, as explained in the `datasets` documentation.
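As a rough sketch of what this looks like in a config file, a dataset section might reference a Hub dataset by name. Note that the key names below are assumptions for illustration, not Trapper's real schema; check the example projects' config files for the actual keys.

```jsonnet
// Illustrative only -- key names are hypothetical, see Trapper's examples
// folder for the real configuration schema.
{
  dataset_loader: {
    path: "squad",  // any dataset name from the HuggingFace Hub
    // additional loading arguments (e.g. a subset name) would also go here
  },
}
```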

Support for Metrics Through Jury

Trapper supports common NLP metrics through Jury, an NLP library dedicated to providing metric implementations by adopting and extending the `datasets` library. For metric computation during training, you can configure Jury-style metrics in your Trapper configuration file to compute them on the fly on the eval dataset at a specified eval_steps interval. If your desired metric is not yet available in Jury or `datasets`, you can still create your own by extending trapper.Metric and utilizing either jury.Metric or datasets.Metric to handle a larger set of prediction cases.
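The `datasets`- and Jury-style metrics share a common shape: a `compute`-like callable that takes predictions and references and returns a dict of scores. The sketch below illustrates that interface in plain Python (a real Trapper metric would extend trapper.Metric instead; this standalone function is just for illustration).

```python
# Plain-Python sketch of the predictions/references metric interface used by
# datasets- and jury-style metrics (not Trapper's actual base class).

def accuracy(predictions, references):
    """Fraction of positions where the prediction matches the reference."""
    assert len(predictions) == len(references), "length mismatch"
    correct = sum(p == r for p, r in zip(predictions, references))
    return {"accuracy": correct / len(predictions)}

score = accuracy(predictions=[0, 1, 1, 0], references=[0, 1, 0, 0])
print(score)  # {'accuracy': 0.75}
```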

Abstractions and Base Classes

Following AllenNLP, we implemented our own registrable base classes to abstract away the common operations for data processing and model training. Below, you can see the main parts of the library.

[Figure: Basic components of Trapper]

Usage

To use Trapper, you need to choose the common NLP formulation of the problem you are tackling and decide on its input representation, including the special tokens.

Modeling the Problem

The first step in using Trapper is to decide how to model the problem. You need to cast your problem as one of the common modeling tasks in NLP, such as seq-to-seq, sequence classification, etc. We stick with the way Transformers divides tasks into common categories, as reflected in its AutoModelFor... classes. To stay compatible with Transformers and reuse its model factories, Trapper formalizes the tasks by wrapping the AutoModelFor... classes and matching each to a name that represents a common NLP task. For example, the natural choice for POS tagging is to model it as a token classification (i.e. sequence labeling) task. On the other hand, for question answering you can directly use the question answering formulation, since Transformers already supports that task.
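The wrapping idea can be pictured as a mapping from task names to the corresponding Transformers auto-model classes. The task names below are illustrative, not Trapper's registered names, and the mapping here uses class-name strings rather than the classes themselves for simplicity (the AutoModelFor... class names are real Transformers classes).

```python
# Illustrative task-name -> Transformers auto-model mapping (the task-name
# keys are hypothetical; Trapper's real registry maps its own names to the
# actual AutoModelFor... classes).

TASK_TO_AUTOMODEL = {
    # POS tagging is modeled as token classification
    "token_classification": "AutoModelForTokenClassification",
    "question_answering": "AutoModelForQuestionAnswering",
    "seq2seq": "AutoModelForSeq2SeqLM",
    "sequence_classification": "AutoModelForSequenceClassification",
}

def automodel_for(task: str) -> str:
    """Resolve a task name to its Transformers model-factory class name."""
    return TASK_TO_AUTOMODEL[task]

print(automodel_for("token_classification"))  # AutoModelForTokenClassification
```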

Modeling the Input

You need to decide how to represent the input, including common special tokens such as BOS and EOS. This formulation is directly used while creating the input_ids value of the input instances. As a concrete example, you can represent a sequence classification input in the format BOS ... actual_input_tokens ... EOS. Moreover, some tasks require extra task-specific special tokens as well. For example, in conditional text generation, you may need to prompt the generation with a special signaling token. In tasks that utilize multiple sequences, you may need to use segment embeddings (via token_type_ids) to label the tokens according to their sequence.
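As a toy illustration of such a formulation, the sketch below builds input_ids and token_type_ids for a hypothetical two-sequence input. The special-token ids and the BOS/SEP/EOS layout are made up for the example; a real tokenizer supplies its own ids and conventions.

```python
# Toy sketch of an input formulation for a two-sequence task.
# All token ids here are hypothetical, chosen only for illustration.

BOS, EOS, SEP = 0, 2, 3  # hypothetical special-token ids

def build_input(seq_a, seq_b):
    """Format: BOS seq_a SEP seq_b EOS, with segment ids per sequence."""
    input_ids = [BOS] + seq_a + [SEP] + seq_b + [EOS]
    # Tokens up to and including SEP belong to segment 0, the rest to segment 1
    token_type_ids = [0] * (len(seq_a) + 2) + [1] * (len(seq_b) + 1)
    return {"input_ids": input_ids, "token_type_ids": token_type_ids}

enc = build_input([101, 102], [201, 202, 203])
print(enc["input_ids"])       # [0, 101, 102, 3, 201, 202, 203, 2]
print(enc["token_type_ids"])  # [0, 0, 0, 0, 1, 1, 1, 1]
```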

Examples for Using Trapper as a Library

We created an examples folder that includes example projects to help you get started with Trapper. Currently, it includes a POS tagging project using the CoNLL-2003 dataset and a question answering project using the SQuAD dataset. The POS tagging example shows how to use Trapper on a task that does not have direct support from Transformers. It implements all the custom components and provides a complete project structure, including tests. On the other hand, the question answering example shows how to use Trapper on a task that Transformers already supports. We implemented it to demonstrate how Trapper can still be helpful thanks to configuration-file-based experiments.

Training a POS Tagging Model on CONLL2003

Since Transformers lacks direct support for POS tagging, we added an example project that trains a transformer model on the CoNLL-2003 POS tagging dataset and performs inference with it. It is a self-contained project with its own requirements file, so you can copy the folder into another directory and use it as a template for your own project. Please follow its README to get started.

Training a Question Answering Model on SQuAD Dataset

You can use the notebook in the example QA project (examples/question_answering/question_answering.ipynb) to follow the steps of training a transformer model on SQuAD v1.

We are happy to release this package and excited for the future contributions from the open-source community.

Acknowledgement

Special thanks to Devrim Çavuşoğlu, Sinan Onur ALTINUÇ and Fatih Cagatay Akyon for their valuable feedback.
