PyTorchPipe: a framework for training neural networks from IBM

Vlad Tămaș

PyTorchPipe (PTP) is a framework that simplifies building and maintaining a neural network training system. PTP is split into blocks that correspond to the stages of training a neural network, and these blocks are connected into a single system.

PTP blocks represent the different steps of training a neural network, from data preprocessing to model testing, and they interact with each other through data streams. Each pipeline can consist of several components: a task that supplies the data, any number of trainable components (models), and additional components for data processing and calculations.
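The idea of components talking through data streams can be illustrated with a minimal sketch in plain Python. This is not PTP's API: the `Component` class, the `run_pipeline` function and the stream names are hypothetical, chosen only to show how each block might read some named streams and publish new ones.

```python
from typing import Any, Callable, Dict, List


class Component:
    """Hypothetical pipeline block (not a PTP class).

    It reads the streams listed in `inputs` and writes the streams
    listed in `outputs` into a shared dictionary.
    """

    def __init__(self, name: str, inputs: List[str], outputs: List[str],
                 fn: Callable[..., Any]) -> None:
        self.name = name
        self.inputs = inputs
        self.outputs = outputs
        self.fn = fn

    def __call__(self, streams: Dict[str, Any]) -> None:
        args = [streams[key] for key in self.inputs]
        results = self.fn(*args)
        # Normalise a single return value to a tuple so zip() works below.
        if len(self.outputs) == 1:
            results = (results,)
        for key, value in zip(self.outputs, results):
            streams[key] = value


def run_pipeline(components: List[Component],
                 streams: Dict[str, Any]) -> Dict[str, Any]:
    """Run the components in order over one batch of streams."""
    for component in components:
        component(streams)
    return streams


# Two toy processing components: a tokenizer and a length counter.
tokenizer = Component("tokenizer", ["sentences"], ["tokens"],
                      lambda sents: [s.lower().split() for s in sents])
counter = Component("counter", ["tokens"], ["lengths"],
                    lambda toks: [len(t) for t in toks])

# The "task" part of the pipeline is reduced here to a single batch.
batch = {"sentences": ["PyTorchPipe builds pipelines",
                       "Streams connect components"]}
result = run_pipeline([tokenizer, counter], batch)
print(result["lengths"])
```

Because each component only names the streams it consumes and produces, the tokenizer could be swapped for another component that publishes a `tokens` stream without touching the counter.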

As a result, the training and testing procedures are no longer tied to a specific task or model architecture. PTP has built-in mechanisms for checking that new data is compatible with an existing pipeline. The system is designed to make it easier to develop integrated pipelines and to test models.
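One way to picture such a compatibility check is to verify, before any training starts, that every input stream a component expects is provided either by the task or by an earlier component. The sketch below is an assumption about how this could work, not PTP's actual mechanism; `check_streams` and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Set


def check_streams(task_outputs: Set[str],
                  components: List[Dict]) -> List[str]:
    """Return one error message per input stream that nothing provides.

    `components` is an ordered list of dicts with the hypothetical keys
    "name", "inputs" and "outputs".
    """
    available = set(task_outputs)
    errors = []
    for comp in components:
        for key in comp["inputs"]:
            if key not in available:
                errors.append(f"{comp['name']}: missing input stream '{key}'")
        # Streams published by this component are visible to later ones.
        available.update(comp["outputs"])
    return errors


pipeline = [
    {"name": "encoder", "inputs": ["images"], "outputs": ["features"]},
    {"name": "classifier", "inputs": ["features"], "outputs": ["predictions"]},
    {"name": "loss", "inputs": ["predictions", "targets"], "outputs": ["loss"]},
]

# A task that provides both images and targets fits the pipeline.
ok_errors = check_streams({"images", "targets"}, pipeline)

# A task without targets is rejected before any computation runs.
bad_errors = check_streams({"images"}, pipeline)
print(ok_errors, bad_errors)
```

A real check would likely compare tensor shapes and types as well as stream names; the sketch covers names only.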

PyTorchPipe is built on PyTorch, which it also uses to distribute computations across CPU and GPU resources. A PTP tutorial is also available.

Datasets

PTP currently contains basic datasets for tasks from three areas:

Architecture

What is usually called a model is called a pipeline in PTP. A pipeline consists of many integrated components, including one or more models. Models are the trainable components of the pipeline.

PTP has built-in models for tasks from four areas:

  • computer vision;
  • natural language processing;
  • general purpose models;
  • visual question answering.

For some models, you can select different options.

In addition to the models, components for working with data are available in PTP:

  • Text preprocessing methods;
  • Loss functions and statistics;
  • Data format transformations;
  • Data visualization.

Workers in PTP are Python scripts that are agnostic to the tasks, models and pipelines they work with. The current version of the framework provides three workers: ptp-offline-trainer, ptp-online-trainer and ptp-processor. They control how the training process runs.
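To make "agnostic" concrete, here is a rough sketch of a worker that knows nothing about the task or the model: it only receives a source of batches and a step function. The names `run_trainer` and `toy_step` are hypothetical and do not come from PTP, and the toy update rule stands in for a real pipeline.

```python
from typing import Any, Callable, Dict, Iterable, List


def run_trainer(batches: Callable[[], Iterable[Any]],
                step: Callable[[Any], float],
                epochs: int) -> List[float]:
    """Hypothetical task-agnostic worker.

    Calls `step` on every batch and records the mean loss per epoch.
    Assumes `batches()` yields at least one batch.
    """
    history = []
    for _ in range(epochs):
        losses = [step(batch) for batch in batches()]
        history.append(sum(losses) / len(losses))
    return history


# Toy "pipeline": fit a single weight w to the target values by
# gradient descent on the squared error (w - x)^2.
state: Dict[str, float] = {"w": 0.0}


def toy_step(x: float) -> float:
    error = state["w"] - x
    state["w"] -= 0.1 * 2 * error  # learning rate 0.1, gradient 2 * error
    return error ** 2


history = run_trainer(lambda: [1.0, 1.0, 1.0], toy_step, epochs=3)
print(history)
```

The same `run_trainer` would work unchanged with any other batch source and step function, which is the property the workers are described as having.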

A detailed description of the tool is available in the official repository on GitHub.

