Data Contract

PayPal’s New Open-Source Initiative: The Data Contract Template

Data Contracts — agreements that standardise the exchange of information between parties

Mark Craddock
3 min read · Aug 6, 2023


Data is an invaluable asset for businesses today. How this data is managed, shared, and maintained between its producers and consumers is a significant challenge. Enter data contracts — agreements that standardise the exchange of information between these parties. PayPal, being a global fintech leader, recognises this need and has recently introduced an open-source ‘Data Contract Template’. Let’s dive deeper into this initiative.

Understanding the Data Contract Template

On the surface, the template is a simple repository under PayPal’s GitHub account. Named data-contract-template, the project is public and available under the Apache 2.0 license, which fosters open collaboration. Recent activity suggests the template is under active development.

What is a Data Contract?

For the uninitiated, a data contract is essentially an agreement between a data producer and its consumer, detailing:

  • Fundamentals: Basic tenets of the contract.
  • Schema: Definition of the data.
  • Data Quality: Assurance of data’s authenticity and accuracy.
  • Service-Level Agreement (SLA): Details of service availability and other guarantees.
  • Security & Stakeholders: Who can access the data, and who is responsible for it?
  • Custom Properties: Any additional terms or specifics related to the dataset.

A visual illustration also showcases how a data contract functions, highlighting its contributors, sections, and how it’s employed.
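
To make the six sections above a little more concrete, here is a minimal Python sketch of a contract held as a plain dictionary, with two small checks a consumer might run. The section keys, field names, and the example dataset are all hypothetical and chosen for illustration; they are not the key names used in PayPal’s template, which is written in YAML and should be consulted directly for the real structure.

```python
# Hypothetical section names, loosely mirroring the six areas listed above.
# These are illustrative only and NOT the keys used by PayPal's template.
REQUIRED_SECTIONS = (
    "fundamentals",
    "schema",
    "data_quality",
    "sla",
    "security_and_stakeholders",
    "custom_properties",
)


def missing_sections(contract: dict) -> list:
    """Return the required sections that are absent from the contract."""
    return [name for name in REQUIRED_SECTIONS if name not in contract]


def schema_violations(record: dict, contract: dict) -> list:
    """Return a message for each non-nullable column that is missing or None."""
    problems = []
    for column in contract["schema"]:
        name = column["column"]
        if record.get(name) is None and not column["nullable"]:
            problems.append(f"{name} must not be null")
    return problems


# An invented example contract; every value here is an assumption.
example_contract = {
    "fundamentals": {"name": "payments_daily", "version": "1.0.0"},
    "schema": [{"column": "txn_id", "type": "string", "nullable": False}],
    "data_quality": [{"column": "txn_id", "rule": "unique"}],
    "sla": {"freshness_hours": 24},
    "security_and_stakeholders": {
        "owner": "data-team@example.com",
        "readers": ["analytics"],
    },
    "custom_properties": {},
}

if __name__ == "__main__":
    print(missing_sections(example_contract))
    print(schema_violations({"txn_id": None}, example_contract))
```

The point of the sketch is that a contract is machine-readable: a producer publishes it once, and a consumer can check both the contract’s completeness and incoming records against it automatically.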

PayPal and Data Contracts

PayPal is no stranger to managing vast quantities of data. As mentioned, the data contract plays a pivotal role in the implementation of Data Mesh at PayPal. A PayPal Technology blog article provides more insights into how these contracts are leveraged within the company.

UPDATE 20/11/23: Please see https://github.com/AIDAUserGroup/open-data-contract-standard, as the contract template has evolved into an open standard called the Open Data Contract Standard. AIDA User Group is the non-profit hosting the project.

Community Engagement and Collaboration

Embracing the spirit of open source, PayPal encourages contributions to the project. Guidelines can be found in the CONTRIBUTING.md file. PayPal’s internal team members are directed to a dedicated Slack channel, #rosewall-help.

Moreover, the project has been getting attention from various corners of the tech community. Several articles, blogs, and discussions have emerged since its inception. Some noteworthy mentions include:

  • Data Engineering Weekly #130, which discusses ‘Data Contract in the Wild with PayPal’s Data Contract Template’.
  • A discussion by Jonathan Neo on Reddit.
  • Announcements confirming PayPal’s move to open source their data contract template.

Final Thoughts

It’s commendable to see industry leaders like PayPal not only adopting best practices but also sharing them with the community. Their Data Contract Template serves as a beacon for organisations aiming to streamline their data processes. With 321 stars, 17 watchers, and 36 forks, it’s evident that the tech community acknowledges the value of this initiative.

Whether you’re a budding developer, a data engineer, or just someone intrigued by the dynamics of data management, the paypal/data-contract-template is worth a look.

Disclaimer: The information is based on the data available as of August 2023 on GitHub. For more details and updates, please visit the official repository.


Mark Craddock

Techie. Built VH1, G-Cloud, Unified Patent Court, UN Global Platform. Saved UK Economy £12Bn. Now building AI stuff #datascout #promptengineer #MLOps #DataOps