Improve user experience: solving core data inconsistencies at Pinterest

Zhihuang Chen | Software Engineer, Core-Services

Background

Challenges naturally arise with Pinterest’s rapid growth. As a Pinner, you might have noticed instances where your data doesn’t look “correct,” and you may have had a negative experience because of it. For example, the “Pin count” on your profile may show the wrong number of Pins, as shown in the left picture.

We call these referential integrity problems “data inconsistencies” because data stored in one backend doesn’t match the data stored in another. These inconsistencies directly affect Pinners’ experience because we end up showing them outdated content. They also add significantly to our team’s maintenance cost: if a Pinner finds that their data is wrong, they might report it to our operations team, and when the operations team can’t solve the problem with regular operations, they file a bug ticket for us. To improve the user experience and reduce our team’s maintenance work, we set out to build a tool that automatically detects and fixes these inconsistencies.

Core Models

We use MySQL as our primary datastore for content created by Pinners. To store billions of Pins, boards, and other data for hundreds of millions of Pinners, many MySQL database instances form a cluster, which is divided into logical shards so the data can be managed and served more efficiently. All data is distributed across these shards.

Below is a simplified view of the relationships among the three core models (Pinners, boards, and Pins) that we store in MySQL:

Challenges

Our databases are sharded and some writes are asynchronous, which means we sometimes can’t perform an update in a single transaction. For example, creating a Pin should insert rows into the Pins table, the Pinner-Pin relationship table, and the board-Pin relationship table. Ideally we would do these three writes in one transaction, but sometimes the three tables don’t live on the same shard, or the three writes happen asynchronously. An inconsistency arises when the write to one table succeeds but the others fail.
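
The failure mode can be illustrated with a toy sketch. The dictionaries below stand in for three tables that may live on different shards. All names (`pins`, `pinner_pins`, `board_pins`, `create_pin`) and the injected failure are hypothetical; this is not Pinterest’s actual schema or write path.

```python
# Toy model: three "tables" that may live on different shards, so the
# three writes cannot share a single transaction. All names are hypothetical.
pins = {}          # pin_id -> pin data
pinner_pins = {}   # pinner_id -> set of pin_ids
board_pins = {}    # board_id -> set of pin_ids


class ShardWriteError(Exception):
    """Raised when a write to one shard fails (illustrative only)."""


def create_pin(pin_id, pinner_id, board_id, fail_board_write=False):
    # Each write is independent: an earlier write is NOT rolled back
    # when a later one fails.
    pins[pin_id] = {"pinner_id": pinner_id, "board_id": board_id}
    pinner_pins.setdefault(pinner_id, set()).add(pin_id)
    if fail_board_write:
        raise ShardWriteError("board-Pin relationship write failed")
    board_pins.setdefault(board_id, set()).add(pin_id)


def is_consistent(pin_id):
    # A Pin is consistent when all three tables agree that it exists.
    pin = pins.get(pin_id)
    if pin is None:
        return False
    return (pin_id in pinner_pins.get(pin["pinner_id"], set())
            and pin_id in board_pins.get(pin["board_id"], set()))


create_pin(1, pinner_id=10, board_id=100)
try:
    create_pin(2, pinner_id=10, board_id=100, fail_board_write=True)
except ShardWriteError:
    pass

print(is_consistent(1))  # True
print(is_consistent(2))  # False: the Pin exists but its board does not list it
```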

Solution

Architecture

We want to solve this problem by building a tool that automatically detects and resolves data inconsistencies. The core of this tool consists of two kinds of jobs:

  1. Existence validation jobs, shown in the pink box, check whether data exists in the database. These existence checking jobs are triggered on every write to the three core objects (Pinners, boards, and Pins).
  2. Stats validation jobs, shown in the orange box, check the accuracy of stats. One of the stats checking jobs checks user stats, which store the number of public Pins a Pinner has. The other checks board stats, which store the number of Pins a board has.

Existence checking jobs are lighter than stats checking jobs because they only need to query whether a row is in a table. Stats checking jobs involve more computation because they validate a specific number and therefore have to pull the exact data. For example, to check board stats we need to fetch all Pins of the board in order to compute the count. All of these jobs are asynchronous and run on Pinlater, Pinterest’s in-house job scheduling and execution tool. Compared with the other job scheduling tools we have, Pinlater is the most lightweight and provides high throughput and an adjustable dequeue rate. In addition, its enqueue-to-dequeue latency is low enough to be near real time.
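
The difference in cost between the two kinds of checks can be sketched as follows. The in-memory dictionaries and the function names are hypothetical stand-ins for the MySQL tables and the real job logic, which the article does not show.

```python
# Hypothetical in-memory stand-ins for the board-Pin relationship table
# and the stored board stats.
board_pins = {100: {1, 2, 3}}   # board_id -> pin_ids on that board
board_stats = {100: 5}          # board_id -> stored Pin count (stale here)


def pin_exists_on_board(board_id, pin_id):
    # Existence check: a single membership lookup.
    return pin_id in board_pins.get(board_id, set())


def check_board_stats(board_id):
    # Stats check: needs every Pin of the board to recompute the count,
    # then compares it with the stored value.
    actual = len(board_pins.get(board_id, set()))
    stored = board_stats.get(board_id, 0)
    return stored == actual, stored, actual


print(pin_exists_on_board(100, 2))  # True
print(check_board_stats(100))       # (False, 5, 3)
```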

As shown in the diagram above, the whole flow is:

  1. The tool is triggered when our service detects a write to a core object. It enqueues a proxy job with parameters such as the unique object ID, the operation type, and any additional parameters.
  2. The proxy job then enqueues one of the existence checking jobs.
  3. Some operations also affect stats: if a Pin is created, board stats should increase by one, and if a Pin is deleted, board stats should decrease by one. For these operations the proxy job may enqueue stats checking jobs too.
  4. Once these async jobs are dequeued and executed, the job logic checks databases and caches to make sure the data is consistent.
  5. If a job detects an inconsistency, it fixes it and re-enqueues itself to check again and confirm that the data is now consistent.
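
The five steps above can be sketched with an in-memory queue. This is a minimal illustration under stated assumptions: a `deque` stands in for Pinlater, dictionaries stand in for MySQL, and the job names, operation names, and repair rules are all hypothetical. In particular, how the real tool decides which side is correct when it repairs data is not described in the article.

```python
from collections import deque

# Hypothetical stand-ins: a deque in place of Pinlater, dicts in place of MySQL.
job_queue = deque()
board_pins = {100: {1, 2}}   # Pin 3 was created, but this write was lost
board_stats = {100: 2}


def enqueue_proxy_job(object_id, operation, **params):
    # Step 1: the service saw a write to a core object.
    job_queue.append(("proxy", object_id, operation, params))


def run_proxy_job(object_id, operation, params):
    # Step 2: always enqueue an existence check.
    job_queue.append(("existence", object_id, operation, params))
    # Step 3: Pin creation or deletion can change board stats.
    if operation in ("pin_create", "pin_delete"):
        job_queue.append(("board_stats", params["board_id"], operation, params))
    return "enqueued"


def run_existence_job(pin_id, operation, params):
    # Step 4: check the data. Step 5: fix and re-enqueue if it is wrong.
    members = board_pins.setdefault(params["board_id"], set())
    should_exist = operation == "pin_create"   # assumed repair rule
    if (pin_id in members) != should_exist:
        if should_exist:
            members.add(pin_id)
        else:
            members.discard(pin_id)
        job_queue.append(("existence", pin_id, operation, params))
        return "fixed"
    return "consistent"


def run_board_stats_job(board_id, operation, params):
    actual = len(board_pins.get(board_id, set()))
    if board_stats.get(board_id) != actual:
        board_stats[board_id] = actual
        job_queue.append(("board_stats", board_id, operation, params))
        return "fixed"
    return "consistent"


HANDLERS = {
    "proxy": run_proxy_job,
    "existence": run_existence_job,
    "board_stats": run_board_stats_job,
}


def drain_queue():
    # Simple worker loop: run jobs in FIFO order until the queue is empty.
    results = []
    while job_queue:
        kind, object_id, operation, params = job_queue.popleft()
        results.append((kind, HANDLERS[kind](object_id, operation, params)))
    return results


enqueue_proxy_job(3, "pin_create", board_id=100)
results = drain_queue()
for kind, outcome in results:
    print(kind, outcome)
print(board_pins[100], board_stats[100])
```

Each job that repairs something runs once more and only stops when its second pass finds the data consistent, which mirrors the re-check in step 5.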

Deferred and Limited Job Execution

Another important part is limited execution of stats jobs. What does that mean? As mentioned above, board stats change whenever a Pin is created or deleted, so every Pin creation or deletion can trigger an enqueue of the board stats checking job. If you delete a board containing 100 Pins, the board stats job for that board_id will be enqueued 100 times. This wastes computing resources and puts extra load on our services, because each job queries data online. To solve this, we use memcache to record the IDs we have already enqueued, and we check whether an ID is in the cache before enqueueing. If it is in the cache, a job is already enqueued and we don’t need to enqueue another one. When a stats checking job is dequeued and executed, it deletes the cache entry so that the job can be enqueued again next time. Even if the delete fails, the cache entry has a TTL, so the ID is removed once it expires.
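
A minimal sketch of this deduplication, assuming a tiny in-memory TTL cache in place of memcache. The `TTLCache` class, the key format, and the 300-second TTL are hypothetical choices made for illustration; the article does not give the real values or client code.

```python
import time
from collections import deque


class TTLCache:
    """Tiny in-memory stand-in for memcache (hypothetical, not a real client)."""

    def __init__(self, clock=time.monotonic):
        self._expiry = {}      # key -> expiry timestamp
        self._clock = clock

    def add(self, key, ttl_seconds):
        # Store the key only if it is absent or expired; report whether we stored it.
        now = self._clock()
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False
        self._expiry[key] = now + ttl_seconds
        return True

    def delete(self, key):
        self._expiry.pop(key, None)


cache = TTLCache()
job_queue = deque()
DEDUP_TTL_SECONDS = 300  # assumed value, for illustration only


def maybe_enqueue_board_stats_job(board_id):
    # Enqueue only if no job for this board is already pending.
    if cache.add(f"board_stats_job:{board_id}", DEDUP_TTL_SECONDS):
        job_queue.append(board_id)
        return True
    return False


def run_next_board_stats_job():
    board_id = job_queue.popleft()
    # Clear the marker so later writes can enqueue a fresh check.
    cache.delete(f"board_stats_job:{board_id}")
    # ... recompute and validate the board stats here ...
    return board_id


# Deleting a board with 100 Pins triggers 100 enqueue attempts.
enqueued = sum(maybe_enqueue_board_stats_job(42) for _ in range(100))
print(enqueued, len(job_queue))  # 1 1
```

The sketch combines the "is it in the cache?" check and the write into one `add` call so that two concurrent writers cannot both pass the check. Memcached offers an `add` command with the same store-only-if-absent behavior, though whether the real tool uses it is not stated in the article.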

Results

Acknowledgments

Pinterest Engineering Blog

Inventive engineers building the first visual discovery engine, 200 billion ideas and counting.


Pinterest Engineering

Written by

https://medium.com/pinterest-engineering | Inventive engineers building the first visual discovery engine https://careers.pinterest.com/
