Pinball: Building workflow management
Pawel Garbacki | Pinterest engineer
Almost every data-driven company depends on a workflow management system. At Pinterest, we built Pinball, our own customizable platform for creating workflow managers, and then constructed our own manager on top of it.
The birth of project Pinball
Hadoop is the technology of choice for large-scale distributed data processing, Redis for in-memory key-value stores, and ZooKeeper for synchronizing distributed processes. So why is there no comparable standard for workflow management?
What makes workflow management special? The workflow manager operates at a higher level than other systems. It usually comes as a package including a workflow configuration language, a UI, executors with adapters for specific computing platforms (e.g., the bare OS, Hadoop, Hive, Pig), a scheduler, and so on. This broad scope makes it challenging to come up with a one-size-fits-all solution, and flexibility is a desired virtue of a well-designed workflow manager.
Many of the workflow managers available in the open source space fail to satisfy the requirements of the problem domain. Built on fixed skeletons, they often follow a monolithic design, and adding custom components requires drilling into the system core. Merging local changes with external releases becomes challenging, and pushing local changes to the external repository is often not possible.
Tokens get the (Pin)ball rolling
The key to flexibility is abstraction. In Pinball, the finest piece of system state that is atomically updated is the token. Tokens may be owned for a limited time, and only the owner is allowed to modify a token's content. Tokens are versioned with identifiers unique across time and space: a version is updated every time a token is modified and is never reused, even across different tokens. Version numbers are included in modification requests, effectively implementing transactional updates.
To keep track of a token's identity across modifications, a token is assigned an immutable name (an identifier unique across the space) at creation time. At any given point in time, only one token may use a given name.
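To make the versioning scheme concrete, here is a minimal sketch of version-checked token updates. The names (`Token`, `Master`, `update`) and the in-memory dict standing in for the persistent store are illustrative assumptions, not Pinball's actual API:

```python
import itertools
from dataclasses import dataclass
from typing import Optional

# Monotonically increasing version source; versions are never reused,
# even across different tokens.
_versions = itertools.count(1)

@dataclass
class Token:
    name: str                    # immutable, unique at any point in time
    data: str = ""
    version: int = 0             # assigned by the master on every commit
    owner: Optional[str] = None  # current lease holder, if any

class Master:
    def __init__(self):
        self._tokens = {}        # name -> latest committed Token

    def update(self, token: Token) -> Token:
        """Commit a change only if the caller saw the latest version."""
        current = self._tokens.get(token.name)
        if current is not None and current.version != token.version:
            raise ValueError("version mismatch: token modified concurrently")
        token.version = next(_versions)  # fresh version, never reused
        self._tokens[token.name] = token
        return token
```

A writer holding a stale version gets a rejection and must re-read the token, which is what makes each modification request an all-or-nothing transactional update.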
One master to rule them all
The Pinball core is built on top of the master-worker paradigm. It's generic, scalable, resilient, and above all simple, which makes the system's concepts easier to grasp, debug, and build on top of.
Workers periodically contact the master to claim tokens and perform the tasks they represent. Every state (token) change goes through the master and is committed to the persistent store before the worker's request returns. Consequently, any component, including the master, can die at any time and recover without compromising state consistency. Expiring token leases take care of disappearing workers. Workers can be added and removed at will, practically at any point in time, which becomes useful when dynamic, load-based resizing is an objective.
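The lease mechanism above can be sketched in a few lines. This is an assumption-laden illustration (the `try_claim` helper and the 60-second lease are hypothetical); in the real system the claim goes through the master and is committed to persistent storage before the worker's request returns:

```python
LEASE_SECONDS = 60.0  # hypothetical lease length

def try_claim(token: dict, worker_id: str, now: float) -> bool:
    """Claim a token if it is unowned or its previous owner's lease expired.

    `token` is a plain dict here for brevity. A worker that disappears
    simply stops renewing its lease, so its tokens become claimable again
    once `expiration` passes.
    """
    if token.get("owner") is not None and token.get("expiration", 0.0) > now:
        return False  # still leased to a (presumably live) worker
    token["owner"] = worker_id
    token["expiration"] = now + LEASE_SECONDS
    return True
```

Because a crashed worker's lease expires on its own, no failure detector is needed: another worker eventually claims the orphaned token and picks up the task.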
Newly created tokens are labeled as active and kept in the master's memory for efficient access and updates. Irrelevant tokens are either deleted or archived. The transition from the active to the archived state is one-way: archived tokens become read-only and are pushed out of memory. Consequently, workers can read archived tokens directly from persistent storage, bypassing the master, which greatly improves system scalability.
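The active/archived split can be sketched as follows; the class and method names are illustrative assumptions, with a plain dict standing in for the persistent store:

```python
class ArchivingMaster:
    """Sketch of the one-way active -> archived transition (hypothetical
    interface, not Pinball's actual one)."""

    def __init__(self, store: dict):
        self._active = {}    # active tokens, kept in memory
        self._store = store  # persistent store (a dict standing in for a DB)

    def archive(self, name: str) -> None:
        # One-way transition: the token becomes read-only and leaves memory.
        self._store["archived/" + name] = self._active.pop(name)

# Workers read archived tokens straight from the store, bypassing the master:
def read_archived(store: dict, name: str):
    return store.get("archived/" + name)
```

Since archived tokens never change, such reads need no coordination with the master, so historical queries add no load to the operational path.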
Workflow layer and the master-worker paradigm
The master-worker paradigm is applied to workflow management not as a horizontal extension but as a vertical layer on top of the core described in the previous section. The following components constitute the Pinball workflow management framework.
The configuration parser converts a workflow description into a set of tokens. Typically, tokens are defined at the granularity of the individual jobs forming the workflow topology.
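A parser of this kind might look like the sketch below. The token naming scheme and payload layout are assumptions for illustration, not Pinball's actual format; `jobs` maps each job name to the list of upstream jobs it depends on:

```python
def parse_workflow(workflow: str, jobs: dict) -> list:
    """Turn a workflow description into one token per job.

    Hypothetical example: each job in the workflow topology becomes a
    token whose name encodes the workflow it belongs to, and whose data
    records the job's upstream dependencies.
    """
    return [
        {
            "name": f"/workflow/{workflow}/job/{job}",
            "data": {"job": job, "inputs": list(deps)},
        }
        for job, deps in jobs.items()
    ]
```

Because the rest of the system only ever sees tokens, swapping in a different configuration language means swapping in a different parser producing the same token shape.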
Workers impersonate application-specific clients. Some workers, called executors, are responsible for running jobs or delegating execution to an external system (such as a Hadoop cluster) and monitoring progress. The scheduler worker makes sure workflows are instantiated at predefined time intervals.
The UI, which visualizes execution progress and provides access to job logs, is also a worker. The UI may need to visit arbitrarily old workflows, introducing the need to keep tokens of finished jobs around. This is where the token archival mechanism and direct reads from the persistent store come in handy: the UI can go as far back in history as needed without putting load on the operational components.
Any component can be replaced without affecting the remaining ones. For instance, supporting a new workflow definition language is as simple as replacing the workflow parser that translates the configuration into a set of tokens. Similarly, adding a new type of executor does not require changes in other parts of the system. Out of the box, Pinball comes with:
- Python-based workflow definition language and an accompanying parser, and an alternative, full-UI workflow and job editor
- Github-backed storage of workflow configurations
- Job executors interfacing with local OS, Hive, and Hadoop
- Workflow visualization and tracking UI
- Job log explorer
- Auto-retries of failed jobs
- Email notifications
- User authentication
- Scheduler governing workflow execution timelines and supporting various overrun policies
- Ability to retry failed jobs, abort running workflows, drain the system, and resume execution from where it left off
- No-downtime releases, as the system can be upgraded to a new software version without disrupting running workflows
- Dynamic resizing, so workers can be added or removed without taking the system down
Each of these features can be removed, altered, or replaced with minimal effort and without the need to understand all the intricate details of the system core.
We’ll be open sourcing Pinball soon. Keep an eye on this blog and our Pinterest Engineering Facebook Page for updates.
Pawel Garbacki is a software engineer at Pinterest.