Looking inside the technology that powers Pinterest
By Vanja Josifovski | CTO
Pinterest began as a small startup and has grown into a company that serves more than 175 billion Pins to more than 250 million people. While we have one of the largest datasets online, we have only 600+ engineers, so our technology needs to be accessible enough for each person to play their part in efficiently building and scaling this visual discovery engine.
Underlying all of this growth is technology built into systems that organize and serve massive amounts of data on both clients and servers. In a relatively short period of time, our journey has forced us to build and rebuild our systems multiple times — sometimes with urgency. Therefore, many of our technical implementations are as much a result of historical circumstances and urgency to deliver as they are of rigorous design.
As with all organically grown artifacts, several parts of our technical systems didn’t develop in a predictable and logically pre-designed manner. As a result, different parts of the stack sometimes collide and conflict with each other. Other times, there has been significant overlap and duplication between systems. These issues largely resulted from the velocity and frugality of our efforts, but that same velocity and frugality have also fueled our company’s growth to date.
To continue scaling, last year we began optimizing for velocity and effectiveness by clarifying and codifying the direction of our technical foundation’s development, resulting in a set of technical strategies for key portions of our stack.
Unifying these individual strategies is a framework for technology at Pinterest that defines their mission, context and key underlying principles.
Our technical strategy’s mission matches our company’s mission: to help people discover and do things they love. We will not develop technology just for the sake of developing technology. Instead, we will develop technology with a purpose — to support Pinterest’s mission.
Let’s examine the context of product engineering at Pinterest and see what kinds of systems we need in order to provide the greatest possible experience for Pinners, as defined by our mission. Here are the key parameters of our technical environment:
- Complex product: Pinterest as a product is rather complex. The amount of data we generate, consume, process and serve is enormous. We have different surfaces requiring specific approaches to selecting presented Pins. For instance, the home feed requires personalization fundamentally different from Search and Related Pins in a Pin closeup. We have heterogeneous data — products and Rich Pins are very different artifacts than regular Pins. Finally, every surface has both organic and Promoted Pins, which are also fundamentally different in terms of corpus size, lifetime and user interaction patterns.
- Resources: As a growing company building a product for hundreds of millions of people, every engineering team at Pinterest could always use more resources. We are a small organization that covers a lot of territory. Headcount allocation is one of the hardest decisions for the leadership team as there are always several different areas where added resources can improve the outcome for Pinners, Partners and Pinterest (always prioritized in that order).
- Ambition: As we grow our global user base, our systems must be geared toward achieving a long-term growth trajectory and not focus on small, incremental improvements.
Here are the key strategy principles defining how we approach technological advancement in general:
- Simplicity & Velocity: In alignment with our engineering principles, we focus on starting simple and then iterating. Our technical strategy is critical to ensuring we have a clear, well-thought-out, yet flexible north star for our engineering teams. This provides just the right amount of direction while giving us sufficient freedom to leverage the latest innovations. Everything we build needs to support rapid iteration.
- Scale: We build systems for impact at Pinterest scale, looking ahead to billions of users and trillions of Pins. Our technical strategies ensure long-term thinking and allow us to anticipate and build for future scale instead of making short-sighted decisions that may require expensive rework later.
- Ownership: By keeping our strategies directional rather than strictly prescriptive, we encourage engineers to own local decisions about balancing velocity and quality. Additionally, the strategies themselves are developed with broad input from engineers across the company to ensure all teams have a real stake in charting the technical future of our stack.
Additionally, we have identified the following principles to inform our strategic vision more specifically:
- Reusability: While focusing on the problem at hand, we look to see if there are other systems that could satisfy our needs. We join forces to build together, almost always crossing internal boundaries and often crossing corporate boundaries by using and contributing to open source projects. We think hard about how to bring use cases to a maximal common denominator. We consider many different ways of reuse and find the right balance with velocity.
- Focused complexity: We are experts in our areas and need to be well-versed in technology outside of Pinterest. This lets us choose the right cutting-edge technology and identify the key areas where a complex solution is warranted. We accept complexity very deliberately and with a deep understanding of the tradeoffs.
We have embodied these principles in the individual technical strategies below. We’re developing these strategies with appropriate input from technical leaders while avoiding unnecessary disruption to engineering teams. In the spirit of iteration, technical strategies are living artifacts and will continually evolve over time.
Here are the current strategies that are either completed or in progress:
- Machine Learning: Machine learning is critical to the operation of many teams and technologies at Pinterest. A unified strategy helps us maximize the velocity of model experimentation. From this strategy, we developed a single model training and serving pipeline that powers the majority of both our organic and ads use cases. The technology developed from this strategy reduces the barrier to building new ML-based production applications at Pinterest.
- Content Distribution Infrastructure: Most core use cases at Pinterest serve organic or paid content based on user input. Common to all of these are technical challenges like achieving low latency, supporting huge scale and avoiding system fragmentation. As such, we have built a strategy based on a common set of building blocks and usage patterns that allow the right mix of reuse and customization. Among others, these building blocks include inverted indices with incremental updates, key-value stores, scatter-gather layers and graph traversal infrastructure.
- Data Management: Pinterest has a strong culture of data-driven decision-making via real-world experimentation. To ensure the continued success of our engineering and technical decision-making, we reinforce and maintain trust in our business-critical metrics, improve developer productivity, and increase ROI on data pipelines. The data management strategy track focuses on areas like data governance, quality, discovery, and encoding.
- Data Processing: Building off the broader data management strategy above, this track provides strategic direction on specific logging, query processing, programming frameworks and data processing systems.
- Experimentation: Our iteration speed depends on the pace at which we can run experiments. Under the experimentation strategy, we have defined the evolution of our experimentation infrastructure and methodology to increase experiment throughput tenfold.
- Cloud: The cloud strategy initiative aims to document our strategic approach to our foundational cloud infrastructure with a 2–4 year horizon. Clear infrastructure direction will ensure Pinterest remains highly available, resilient, performant, well-utilized, cost effective and predictable.
- Core Client Platform: This track charts out a strategic approach to building client technologies that produce a fast Pinner experience, align with platform conventions, take advantage of native device capabilities and quickly respond to changing experiments, network conditions, and server responses.
- API: The API track provides a coherent and consistent direction for all APIs and API endpoints at Pinterest. This includes both internal APIs for serving product features to first-party clients and external APIs for partners, third-party application developers and third-party product integrations.
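To make two of the building blocks named under Content Distribution Infrastructure more concrete, here is a toy sketch of an inverted index with incremental updates sitting behind a scatter-gather layer. It is an illustration only and not a description of our production systems: the class names `IndexShard` and `ScatterGather`, the modulo sharding scheme and the in-memory storage are all assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class IndexShard:
    """One shard of a toy in-memory inverted index (term -> set of pin ids)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> pin ids containing it
        self.terms_by_pin = {}            # pin id -> terms, needed for updates

    def upsert(self, pin_id, terms):
        """Incremental update: replace one pin's postings without a rebuild."""
        self.delete(pin_id)
        self.terms_by_pin[pin_id] = set(terms)
        for term in terms:
            self.postings[term].add(pin_id)

    def delete(self, pin_id):
        """Remove a pin from every posting list it appears in."""
        for term in self.terms_by_pin.pop(pin_id, set()):
            self.postings[term].discard(pin_id)

    def search(self, term):
        """Return a copy of the pin ids on this shard that match the term."""
        return set(self.postings.get(term, set()))


class ScatterGather:
    """Routes writes to one shard and fans reads out to all shards."""

    def __init__(self, num_shards):
        # Assumes num_shards >= 1 and integer pin ids (hypothetical scheme).
        self.shards = [IndexShard() for _ in range(num_shards)]

    def _shard_for(self, pin_id):
        return self.shards[pin_id % len(self.shards)]

    def upsert(self, pin_id, terms):
        self._shard_for(pin_id).upsert(pin_id, terms)

    def delete(self, pin_id):
        self._shard_for(pin_id).delete(pin_id)

    def search(self, term):
        # Scatter: query every shard in parallel. Gather: merge the partials.
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = list(pool.map(lambda shard: shard.search(term), self.shards))
        return sorted(set().union(*partials))


index = ScatterGather(num_shards=3)
index.upsert(1, ["recipe", "vegan"])
index.upsert(2, ["recipe", "dessert"])
index.upsert(4, ["travel"])
index.upsert(2, ["dessert"])      # pin 2 is re-indexed and drops "recipe"
print(index.search("recipe"))     # [1]
print(index.search("dessert"))    # [2]
```

The point of the sketch is the shape of the reuse: a write touches a single shard and only the affected posting lists, while a read is the same fan-out and merge regardless of which surface issues it.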
The strategies are live and evolving documents. They are used as a reference when starting new efforts as well as to onboard new engineers. The strategies also help drive internal clarity on technical issues that might span multiple organizations. Finally, the strategy documents are used to communicate with stakeholders outside engineering on the approaches used and the level of funding needed to support our core engineering goals.
Now that you know more about our technical framework, check out our open engineering roles and join us!