Pinvestigator: A tool for exploring experiment-related changes
Alexandra Cong | Pinterest Engineering
As an intern on the Product Science team, I experienced firsthand the importance of understanding metric changes related to A/B experiments. Here I’ll detail building Pinvestigator, a tool to explore those changes quickly by easing access to data about experiments and identifying their impact on Pinner growth. You’ll also get a peek into what it’s like to be an intern at Pinterest (spoiler alert, it’s pretty fantastic).
Getting started by looking at metrics
Our most important business metrics reflect the number of active Pinners, while we also have countless other charts about Pinner behavior, tracking everything from board creation rate to interest follow rate to Pin closeup rate. Often, we’ll take a look at a chart like the one below (or notice a change from an anomalous values email) and ask ourselves, “What happened there?!”
The above chart shows the percentage of new Pinners who’ve repinned something during their first time using the product. Evidently, something drastic happened between March 9 and 10. As it turned out, this steep drop was the result of an A/B experiment. We run a lot of these experiments, and because there are so many experiments and metrics, it can be challenging to determine which experiment affected which metrics in what way. Getting to the bottom of this particular case involved one person noticing the change, working with a group to identify which experiment was the cause, and turning that experiment off. It’s a long and involved process, and it relies on an uncomfortable amount of luck. As the company grows, it becomes even harder to keep track of running experiments. The alternative is to dig into the data manually, which can be time-consuming and difficult.
To address this issue, we built Pinvestigator, a tool for exploring experiment-related changes in a fast, easy way.
Finding a method
The first challenge was figuring out what such a tool should do and how it should work, including how to make it general enough to be used across various use cases, yet still specific enough to find helpful information. I probably should have been more daunted, but I was a bright-eyed, bushy-tailed intern on my first day — I could take on anything! We thought the best way to move forward was to first try looking for experiments that had caused changes, and then determine if there was a way to generalize.
I began by looking into several changes in metrics where we knew which experiments were the causes. These experiments generally fell into one of two categories: either the experiment’s treatment group showed a significant change in Pinner behavior compared to the control, or the experiment had a large change in the number of Pinners (or sometimes both). The first is what you might think of when considering experiments that move metrics (after all, the control group is supposed to remain unaffected, right?), while the second catches experiments that can’t be found using the first method.
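These two categories can be checked mechanically. Below is a minimal Python sketch, not Pinvestigator’s actual implementation: the function names, thresholds, and input counts are assumptions for illustration. The first category compares the rate of an action between control and treatment with a two-proportion z-test; the second flags a large swing in the experiment’s Pinner count.

```python
import math

def two_proportion_z(actors_a, total_a, actors_b, total_b):
    """z-statistic for the difference between two action rates."""
    p_a, p_b = actors_a / total_a, actors_b / total_b
    pooled = (actors_a + actors_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (p_b - p_a) / se

def flag_experiment(control_actors, control_total,
                    treatment_actors, treatment_total,
                    prev_treatment_total,
                    z_threshold=3.0, growth_threshold=0.5):
    """Classify an experiment into the two categories described above."""
    flags = []
    # Category 1: treatment behavior differs significantly from control.
    z = two_proportion_z(control_actors, control_total,
                         treatment_actors, treatment_total)
    if abs(z) >= z_threshold:
        flags.append("behavior_change")
    # Category 2: large change in the number of Pinners in the experiment.
    growth = abs(treatment_total - prev_treatment_total) / prev_treatment_total
    if growth >= growth_threshold:
        flags.append("population_change")
    return flags
```

Ranking experiments by either signal gives a shortlist of candidates worth a closer look.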
At this point, I was feeling great. A tool was possible! I should’ve had a sneaking suspicion that implementing Pinvestigator might be hard, but at least we had a vision. We wanted an interactive web tool where users could enter inputs describing the metric change (what day it was, what action, etc.), after which they would get lists for both categories of results, one of experiments with a treatment that caused a change in Pinner behavior and one of experiments that had a large change in Pinners.
With an idea of what to do with the data, the main question was where to get it. Hive queries had allowed flexibility, including filtering the data by Pinners who’d joined on a certain day or by the number of actions performed. However, each query might take 30 minutes, which doesn’t make for a very responsive web tool. With Hive, the user would be sent an email with the results, which wasn’t a bad experience, but it also wasn’t the best. The other option was HBase, which is less flexible than Hive, but much faster — possibly fast enough to be a responsive web tool. That sounded enticing.
In the end, I thought, why not both? I first built a fast, interactive version of Pinvestigator that used precomputed data stored in HBase for the most common use cases, such as a change in the count of a certain action or a change in the number of Pinners who took an action on a given day. I could then add extra functionality to run a Hive job and email the user the results for more specific queries.
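The precomputed-lookup idea can be sketched roughly like this, with a plain Python dict standing in for the HBase table (the row-key layout and the numbers are hypothetical, not Pinvestigator’s actual schema). Keying rows by action and day turns the common queries into a single key lookup instead of a long Hive scan.

```python
# Hypothetical row-key layout: "<action>|<date>" -> per-group Pinner counts,
# precomputed by a daily batch job and written into HBase.
precomputed = {
    "repin|2015-03-10": {
        "exp_a:control": 5000, "exp_a:treatment": 3100,
        "exp_b:control": 4200, "exp_b:treatment": 4150,
    },
}

def lookup(action, date):
    """Fetch precomputed per-experiment counts for one action on one day."""
    return precomputed.get(f"{action}|{date}", {})
```

The trade-off is exactly the one described above: only queries the batch job anticipated can be answered, but those come back in milliseconds.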
Unfortunately, like life, software engineering rarely goes as planned. I first attempted to use data that already existed in HBase, but querying it was too slow to beat the 90-second web timeout on our internal analytics site. Having gotten used to the idea of a fast web tool, though, I didn’t want to rely only on Hive. A colleague suggested precomputing data into HBase, which would sacrifice even more flexibility but would presumably be much faster. However, even after implementing the “fast” solution, it would often still take three minutes or more to return results in production. Disaster! Eventually, I decided that if I was going to send results emails for the slower, Hive-backed part of Pinvestigator, I could do the same for the HBase part. Crisis was averted! I finally had a tool to ship.
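The timeout-plus-email fallback can be sketched as follows. This is a simplified, hypothetical version using only Python’s standard library; the real tool ran Hive and HBase jobs and sent email, which are stubbed out here as plain callables.

```python
import threading
import queue

def run_with_fallback(query_fn, timeout_s, email_fn):
    """Run query_fn in the background; return its result if it finishes
    within timeout_s, otherwise return None and email the result when
    the job eventually completes."""
    results = queue.Queue(maxsize=1)
    timed_out = threading.Event()

    def worker():
        result = query_fn()
        if timed_out.is_set():
            email_fn(result)     # too late for the web response: email it
        else:
            results.put(result)  # fast enough: return inline

    threading.Thread(target=worker, daemon=True).start()
    try:
        return results.get(timeout=timeout_s)
    except queue.Empty:
        timed_out.set()
        return None  # caller tells the user to expect an email
```

Fast queries still feel interactive, while slow ones degrade gracefully to the email experience instead of hitting a timeout error.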
Does it work? Why do we care?
Remember that chart from earlier? I tested Pinvestigator on that metric change (the repin rate of new Pinners on iPhone dropped on March 10) and the first experiment it returned was indeed the one that caused the drop. No emails, and no need to manually dig into data! Pinvestigator isn’t perfect, though. There will always be some experiment that doesn’t show up, or some situation where a user wants to look at data that isn’t stored in Pinvestigator. However, Pinvestigator is a good tool for identifying some experiments that might be worth looking into further, and for showing that an experiment might not have been the cause of a given change.
Data is tremendously powerful, but can also be difficult. Pinvestigator empowers any employee who cares about crazy metric jumps to quickly access data to find the source. Ultimately, we use this data to better the Pinterest experience, which makes all the obstacles worthwhile.
Acknowledgements: Thank you to my mentor, Dan Frankowski, my manager, Andrea Burbank, and Chunyan Wang and Bryant Xiao on the Data Engineering team, who all helped me tremendously along the way. I couldn’t have done it without you!