PyTorchPipe (PTP) is a framework that facilitates building and maintaining neural network training systems. PTP is organized into blocks, one per stage of neural network training, that are connected into a single system.
The blocks cover the steps of training a neural network, from data preprocessing to model testing, and interact with each other through data streams. Each stream can consist of several components: a task that supplies the data, any number of trainable components (models), and additional components for data processing and calculations.
As a result, the procedure for training and testing models is no longer tied to a specific task or model architecture. PTP has built-in mechanisms for checking whether new data is compatible with an existing pipeline. The system is designed to simplify the development of integrated pipelines and the testing of models.
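The idea of components exchanging data through named streams, together with a compatibility check, can be illustrated with a minimal sketch. This is not PTP code: the `Component` and `Pipeline` classes, the `check` method and the stream names are hypothetical and exist only to show the principle in plain Python.

```python
# Hypothetical sketch: Component, Pipeline and the stream names are
# illustrative and do not correspond to PTP's actual classes or API.

class Component:
    """A processing step that reads named streams and writes named streams."""

    def __init__(self, name, inputs, outputs, fn):
        self.name = name
        self.inputs = inputs      # names of the streams this component reads
        self.outputs = outputs    # names of the streams this component writes
        self.fn = fn              # callable: dict of inputs -> dict of outputs

    def __call__(self, streams):
        result = self.fn({key: streams[key] for key in self.inputs})
        streams.update({key: result[key] for key in self.outputs})


class Pipeline:
    """An ordered list of components sharing one dictionary of streams."""

    def __init__(self, components):
        self.components = components

    def check(self, task_streams):
        """Return a list of wiring errors; an empty list means compatible."""
        available = set(task_streams)
        errors = []
        for component in self.components:
            for key in component.inputs:
                if key not in available:
                    errors.append(f"{component.name}: missing input stream '{key}'")
            available.update(component.outputs)
        return errors

    def run(self, batch):
        streams = dict(batch)     # copy so the caller's batch is not modified
        for component in self.components:
            component(streams)
        return streams


tokenizer = Component(
    "tokenizer", ["sentences"], ["tokens"],
    lambda d: {"tokens": [s.split() for s in d["sentences"]]},
)
counter = Component(
    "counter", ["tokens"], ["lengths"],
    lambda d: {"lengths": [len(t) for t in d["tokens"]]},
)

pipeline = Pipeline([tokenizer, counter])

errors_ok = pipeline.check(["sentences"])   # data provides what the pipeline needs
errors_bad = pipeline.check(["images"])     # data of a different kind
result = pipeline.run({"sentences": ["a b c", "d e"]})
```

In this sketch the check runs before any data is processed, so a mismatch between the data and the pipeline is reported as a list of missing streams rather than as a failure in the middle of a run.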
PTP currently contains basic datasets for tasks from three areas:
- computer vision (MNIST, CIFAR);
- natural language processing (WiLY, WikiText, ANKI);
- visual question answering (CLEVR, GQA, ImageCLEF VQA).
What is usually called a model is called a pipeline in PTP. A pipeline consists of many connected components, including one or more models. Models are the trainable components of the pipeline.
PTP has built-in models for tasks from four areas:
- computer vision;
- natural language processing;
- general-purpose models;
- visual question answering.
Some models can be configured with different options.
In addition to the models, components for working with data are available in PTP:
- text preprocessing methods;
- loss functions and statistics;
- data format transformations;
- data visualization.
Workers in PTP are Python scripts that are generic: they are not tied to the particular tasks, models and pipelines they work with. The current version of the framework provides three workers: ptp-offline-trainer, ptp-online-trainer and ptp-processor. They control how the training process runs.
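What it means for a worker to be independent of the pipeline it runs can be shown with a small sketch. The `run_worker` function and the two step functions below are hypothetical and are not taken from PTP; the assumption is only that a worker drives a loop over batches and delegates everything task-specific to whatever it is given.

```python
# Hypothetical sketch: run_worker and the step functions are illustrative,
# not part of PTP.

def run_worker(pipeline_step, batches, num_epochs):
    """Generic loop: apply the supplied step to every batch, for several passes.

    Returns the mean value returned by the step for each pass over the data.
    """
    history = []
    for _ in range(num_epochs):
        total = 0.0
        count = 0
        for batch in batches:
            total += pipeline_step(batch)
            count += 1
        history.append(total / count)
    return history


# Two unrelated stand-ins for "pipelines"; the worker knows nothing about them.
def sum_step(batch):
    return float(sum(batch))


def len_step(batch):
    return float(len(batch))


batches = [[1, 2, 3], [4, 5]]
sum_history = run_worker(sum_step, batches, num_epochs=2)
len_history = run_worker(len_step, batches, num_epochs=2)
```

The same loop serves both step functions unchanged, which is the property the text attributes to workers: the control flow lives in the worker, and the task-specific logic lives in the pipeline.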
A detailed description of the tool is available in the official repository on GitHub.