PADL: portable PyTorch pipelines facilitating deep-learning model use

Duncan Blythe · Published in PyTorch · Apr 7, 2022

PADL (“Pipeline Abstractions for Deep Learning”) is a recently published open-source framework for PyTorch which allows users to build flexible pipelines that include PyTorch layers. In this article we show how PADL facilitates the use and lifecycle of deep-learning models well beyond training, filling a significant gap in the deep-learning tooling landscape.

Introduction

An old adage in programming is:

Programs are read more often than they are written.

Likewise, in deep learning, we can say that:

Models are used more often than they are trained.

The PyTorch ecosystem, and the deep-learning ecosystem in general, abounds with tools for training models and for squeezing the best performance out of computational resources while doing so. In the life cycle of a model, however, this is only the beginning of the journey. Once a model has been trained, it will be shared and used in a multitude of contexts, often on a daily basis, in operations, evaluation, comparison and experimentation by data scientists. Using the trained model is how value is extracted from the resultant weights. Despite this important fact, support for using deep-learning models has so far been very thin in the PyTorch ecosystem and beyond. PADL is a tool which fills this void.

Using deep learning models

Let’s unpack what we mean by “use” here. Using a deep learning model can mean many things:

  • serving model predictions
  • experimentation with pre-trained models in Jupyter notebooks and interactive sessions
  • inspecting and visualizing intermediate model features
  • evaluating and monitoring model performance on metrics and incoming test data
  • computing model outputs as a preprocessing step for further data science tasks

In these tasks two things are critical:

  1. SIMPLE USE: A paramount requirement is that using a deep learning model should be as easy as possible.
  2. CLEAR LINEAGE: Subsidiary to (1), it should be possible and straightforward, when using a model, to know where it came from and how it works.

“Simple use” needs little extra justification: if there is something we do often, and useful models are used often, it should be as easy as possible. It should be possible to get started with a trained model with as few commands and as little environment setup as possible. It shouldn’t be necessary to mess around with complex configuration systems, or to trace a cumbersome paper trail of lineage back to the code base which created the model.

“Clear lineage” is important because the time between training and using a model can be substantial, and the teams training and using the models may be distinct. In those cases, we would like to avoid the model existing only as a blob of compiled code, and equally to avoid opaque code lineage.

To the best of our knowledge, “simple use” has enjoyed surprisingly little attention in the open-source space. PADL aims to fill this void.

On the other hand “clear lineage” is the focus of many great projects in the MLOps space — for example:

  • Data Version Control allows users to track exactly which data was present in connection with a git project.
  • Gin Config and Hydra allow users to build models based on a compositional configuration system, making clear what parameters created a model.
  • PyTorch Lightning allows users to structure their code in such a way that the important details of training will predictably live in certain key methods.

PADL takes a slightly different approach to lineage from these projects, one which we think can simplify the life of deep-learning developers, as we will see below.

Using vanilla PyTorch models

Let’s first look at how you would normally use and test a previously trained model in PyTorch.

Usually we have trained some model and saved the weights, as recommended in standard PyTorch usage, in some directory, say models/mymodel/weights.pt. Now we’d like to use these weights with the associated model in another context. To do this, we would need to reinstantiate the model we trained, and then load the weights back into it with torch.load and .load_state_dict.
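As a minimal sketch of this workflow (MyModel, its constructor argument and the import path are hypothetical placeholders):

import torch

# the training code must be importable to rebuild the architecture;
# MyModel and hidden_size are hypothetical placeholders
from my_project.models import MyModel

model = MyModel(hidden_size=256)  # must match the training configuration exactly
state_dict = torch.load("models/mymodel/weights.pt")
model.load_state_dict(state_dict)
model.eval()  # switch off dropout, batch-norm updates, etc.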

For this to be possible, we need access to the exact same classes and functions which we used during training; that means our training code must be easily importable. This could be implemented, for example, using pip and versioning, or through some git logging system, so that we can reliably load the exact code used in training. These systems need to be set up, configured and maintained, which is not always a simple matter.

In addition, we would need to do one of the following:

  • (not recommended) remember by heart which classes and functions were necessary
  • (better) have logged these somehow during training
  • (most flexible) use some type of configuration system, such as Hydra or Gin Config, to record which functions and classes compositionally led to the model

That would allow us to reinstantiate the model with pre-trained weights loaded. But it doesn’t stop there. If we want to be sure that we are using the model correctly, we need to make sure that the prepared tensors obey the same logic as in training, but with augmentation switched off. That means we need to build logic into our code base or configuration system to correctly invoke the evaluation- and inference-time tensor preparation, paying attention to how this differs from, yet still correctly reflects, the training preparation.

Finally, once we have obtained correct tensor outputs from our pre-trained model, we still need to apply logic to get usable predictions. For instance, in classification we would need to maintain a label dictionary or lookup table, or, in more complex cases, specify inference hyper-parameters more precisely (think beam search, with beam width etc.). Here again, the way this is done isn’t reflected a priori in the training process, so it needs to be configured during training or specified at load time.
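Continuing the sketch above, the post-processing for a classifier might look as follows (the label list is a hypothetical stand-in for the real lookup table, and input_tensor is assumed to be a correctly prepared input):

import torch

# post-processing that must be kept in sync with how the model was trained;
# the label list is a hypothetical stand-in for the real lookup table
labels = ["negative", "neutral", "positive"]

with torch.no_grad():
    logits = model(input_tensor.unsqueeze(0))  # add a batch dimension

prediction = labels[logits.argmax(dim=1).item()]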

For the most common examples, especially in computer vision applications, these steps are manageable when executed manually or explicitly, but as soon as we get into the realm of multimodal, complex and branching models, this can become very unwieldy and cumbersome.

Simple use with PADL

This is where PADL comes in. PADL makes the following possible (how to get there is covered in the sections below):
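A minimal sketch of what this looks like (the directory name and the input are illustrative):

import padl

# everything needed to run the model lives inside the .padl directory
pl = padl.load("my_model.padl")

# single-sample inference: preprocessing, forward pass and
# post-processing all happen inside the pipeline
prediction = pl.infer_apply("an example input")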

The model pl (called a “pipeline”, more on this below) has been defined in such a way that absolutely everything necessary to use the PyTorch model in practice is included in pl. No additional bits of subsidiary data or code are required to make these lines work: no weights files, no JSON files, no config files, no data blobs, or otherwise. That means you can simply pass around the .padl directory if you want to share the model so that other developers can use it. Using the model involves a one-line load command, padl.load. The loaded object includes preprocessing, forward pass and post-processing. Think:

A PADL pipeline has an inbuilt data loader, and contains all lookups, data blobs, PyTorch layers and output preparation.

Without PADL, using PyTorch models can require serious overhead.

Clear lineage with PADL

In developing PADL we wanted to make model lineage as transparent as possible, i.e. we wanted to clarify how the model was constructed in a very direct way. The solution we arrived at is that:

Saving a model means saving the code and saving the data artifacts

The PADL-saved output contains all the code, and only the code, variables, functions, classes and data artifacts necessary to recreate the model.

The PADL saver tracks and extracts only the absolutely essential components of the code base and saves these into a small Python module inside the my_model.padl directory. This is a very direct approach to model lineage.

PADL in more depth

How does PADL achieve this? PADL uses a functional API to connect the dots between the objects implementing pre-processing, forward pass and post-processing, and the actual lines of code used to define those objects. The PADL functional API is easy to work with and has many benefits beyond saving and loading.

Let’s take a closer look.

PADL has two key concepts: transforms and pipelines.

Transforms are the basic building blocks in PADL. These are callables defined using standard Python functions and classes, as well as PyTorch layers, and wrapped with the @padl.transform decorator. Here are some examples:

  • Any function decorated with @padl.transform is a transform.
  • Any callable class, including PyTorch layers, decorated with @padl.transform becomes a transform when instantiated.
  • Any class instance or callable wrapped with callable_ = padl.transform(callable_) is a transform (including lambda functions).
  • There are also some handy helpers, such as padl.same.lower(), which gives the same result as padl.transform(lambda x: x.lower()) and which also supports indexing, as in, for example, padl.same[:, 0].
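In code, these patterns look roughly as follows (all names are illustrative):

import padl
import torch

# a plain function becomes a transform via the decorator
@padl.transform
def lower_case(text):
    return text.lower()

# a callable class becomes a transform when instantiated
@padl.transform
class PrependToken:
    def __init__(self, token):
        self.token = token

    def __call__(self, text):
        return f"{self.token} {text}"

# wrapping existing callables, including PyTorch layers and lambdas
linear = padl.transform(torch.nn.Linear(16, 8))
strip = padl.transform(lambda x: x.strip())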

Pipelines consist of transforms and other sub-pipelines chained together as a directed acyclic graph, which may include PyTorch layers, batch data loading based on preprocessing, and post-processing.

Here is an example pipeline pl:
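The transforms and layer shapes below are illustrative toys, standing in for real preprocessing, layers and post-processing:

import padl
import torch

@padl.transform
def text_process(text):
    # toy preprocessing: a fixed-length tensor of character codes
    ids = [float(ord(c) % 128) for c in text[:16]]
    return torch.tensor(ids + [0.0] * (16 - len(ids)))

layer_1 = padl.transform(torch.nn.Linear(16, 8))
layer_2 = padl.transform(torch.nn.Linear(8, 4))

best_class = padl.transform(lambda scores: int(scores.argmax()))
confidence = padl.transform(lambda scores: float(scores.max()))
to_label = padl.transform(lambda i: ["negative", "neutral", "positive", "other"][i])
keep = padl.transform(lambda x: x)

pl = (
    text_process                  # preprocessing, applied per sample
    >> padl.batch                 # everything above feeds the data loader
    >> layer_1                    # PyTorch layers form the forward pass
    >> layer_2
    >> padl.unbatch               # everything below is mapped over samples
    >> (best_class + confidence)  # rollout: two views of the scores
    >> (to_label / keep)          # parallel: label lookup / pass-through
)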

What do we see here? pl is a model pipeline: the model is written in such a way as to reflect graphically the way it operates. The input enters the pipeline at the top with text_process and is passed through from top to bottom.

How the inputs are passed through the pipeline is determined by which operators are used. In this example we have:

  • >> “compose”: the output of the left-hand side is passed to the right-hand side (in the printed form, the upper transform’s output is passed to the lower).
  • + “rollout”: the operands are applied or “rolled out” over a single input producing a tuple of outputs.
  • / “parallel”: the operands are applied in “parallel” to a tuple of inputs, producing a tuple of outputs.

The padl.batch transform is a special transform which signals to PADL how to construct a PyTorch data loader out of the preceding steps. The padl.unbatch transform signifies the inverse: all computations after it are mapped serially over the batched elements of the preceding step. Any PyTorch layers you would like to include may be wrapped in exactly the same way as other transforms and included in the pipeline without further ado.

After defining the pipeline like this, we have some nice features available. If the user is in an interactive session, they may view the object graphically, with illustrative ASCII art:
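In a REPL or notebook, this is simply the object’s printed representation:

print(pl)  # renders the pipeline graph as ASCII art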

This output allows users who are using the model for the first time to get a nice intuition about what happens in the pipeline.

Iterating through data (and so training) with this model is now easy:
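A minimal sketch, assuming train_data is a list of raw input samples and that the task-specific loss is computed from the forward outputs which train_apply yields batch by batch:

import torch

optimizer = torch.optim.Adam(pl.pd_parameters(), lr=1e-3)

for output in pl.train_apply(train_data, batch_size=32):
    loss = compute_loss(output)  # hypothetical task-specific loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()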

Using the model with gradients switched off may be done for single data points (“infer”) or in batch mode (“eval”):
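For example, assuming test_data is a list of raw samples:

# single data point: no gradients, no batching required
prediction = pl.infer_apply("an example input")

# batched evaluation over many samples, still without gradients
for outputs in pl.eval_apply(test_data, batch_size=32):
    ...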

When you’re done, you can save, load and use the model with minimal effort.

Let’s save the pipeline example above. The line below works in interactive sessions, in Jupyter notebooks (even with variables and definitions spanning multiple cells, see a full example here), as well as in regular Python programs.
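With the pipeline pl from above in scope, saving is a single call:

padl.save(pl, "my_model.padl")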

The model directory has the structure displayed below. Data artifacts (in this case the weights of two PyTorch layers) are labelled according to the position that the transform they belong to occupies in the pipeline:

my_model.padl
|__13.pt
|__14.pt
|__requirements.txt
|__transform.py

The transform.py defines the pipeline object. For the example above it looks like this:
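For our toy pipeline, the saved module would contain essentially the definitions from above, ending in the full pipeline expression (a rough sketch; the exact generated layout may differ, and PADL restores the numbered .pt weight files into the layers on load):

import padl
import torch

@padl.transform
def text_process(text):
    ids = [float(ord(c) % 128) for c in text[:16]]
    return torch.tensor(ids + [0.0] * (16 - len(ids)))

layer_1 = padl.transform(torch.nn.Linear(16, 8))  # weights restored from a numbered .pt file
layer_2 = padl.transform(torch.nn.Linear(8, 4))   # weights restored from a numbered .pt file

best_class = padl.transform(lambda scores: int(scores.argmax()))
confidence = padl.transform(lambda scores: float(scores.max()))
to_label = padl.transform(lambda i: ["negative", "neutral", "positive", "other"][i])
keep = padl.transform(lambda x: x)

pipeline = (
    text_process
    >> padl.batch
    >> layer_1
    >> layer_2
    >> padl.unbatch
    >> (best_class + confidence)
    >> (to_label / keep)
)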

As you can see, the pipeline code exposes exactly which components went into the pipeline, how these are defined, and what their dependencies are. The requirements.txt file is created automatically when the model is saved. It lists only the packages and versions needed by the transform.py module, as well as the Python version used. This is critical information when loading and reusing the pipeline in a new environment.

# created with python-3.9.10
padl==0.2.5
torch==1.10.2

We’ve seen that PADL makes defining a pipeline using PyTorch layers extremely straightforward. The user is also able to produce saved outputs which make using the pipeline easy and transparent. The pipeline can be trained with minimal boilerplate in PyTorch Lightning, can be served in one line with TorchServe, and interacts well with the entire PyTorch ecosystem (for example Hugging Face).

Happy PADL-ling!

Would you like to know more about PADL? Then you might like to explore the project’s documentation and examples.
