Open-sourcing KingPin, building blocks for scaling Pinterest
Shu Zhang | Pinterest engineer, Infrastructure
When we first started building Pinterest, we used Python as our development language, which helped us build quickly and reliably. Over the years we built many tools around Python, including Pinball, MySQL_utils and pymemcache, as well as a set of libraries used daily for service communication and configuration management. Today we’re releasing this toolset, KingPin, as our latest open-source package.
KingPin contains some of the best practices we learned when scaling Pinterest, including:
- A local daemon to deal with ZooKeeper’s single point of failure (SPOF) problem. The daemon runs on ~20K hosts and delivers configuration data in less than 10 seconds.
- A Python Thrift client wrapper for enhanced functionality. We send hundreds of thousands of requests per second via this Python client across Pinterest.
- A configuration management framework. We have over 400 configurations being updated and consumed through this framework.
KingPin use cases
You may want to try out KingPin in any of the following cases:
- Your stack is also Python-oriented and running on AWS.
- You want to make your ZooKeeper cluster more robust and resilient.
- You’re building a configuration system and want your configurations to support a rich set of data structures like lists, maps, sets and JSON.
- You want to use S3 to store some of the most critical metadata.
- You’re using Thrift and looking for a more reliable client library.
KingPin has the following components working together:
- Kazoo Utils: A wrapper for Kazoo that implements the utils we use for the RPC framework, service discovery and some enhancements of native Kazoo APIs.
- Thrift Utils: A greenlet-safe wrapper for Python Thrift client with error handling, retry handling, load balancing and connection pool management built in.
- Config Utils: A system that stores configuration on S3 and uses ZooKeeper as the notification system to broadcast updates to subscribers. (See our previous blog post for additional details.)
- ZK Update Monitor: A local daemon and server that syncs subscribed configurations and serversets to local disk from ZooKeeper and S3. This is a key part of how we make our use of ZooKeeper fault-tolerant. (For more on this design, check out this blog post.)
- Decider: A utility we use to control online logic flow. One typical use case is experiment control: deciders are set so every A/B testing experiment can be turned on or off in real time without any code deploy. Decider is built on top of Config Utils.
- Managed Data Structures: A convenient map/list data structure abstraction in Python built on top of Config Utils.
- MetaConfig Manager: A system that manages all configurations/serversets and dependencies (subscriptions), built on top of Config Utils.
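To make the Decider idea from the list above concrete, here is a minimal sketch of a percentage-based switch. It is not KingPin's actual Decider API: the class name `SimpleDecider`, its methods and the 0-100 percentage convention are all assumptions made for illustration.

```python
import random


class SimpleDecider:
    """Toy percentage-based switch (hypothetical, not KingPin's Decider)."""

    def __init__(self, values, rng=None):
        # values maps an experiment name to an integer percentage in [0, 100].
        self._values = dict(values)
        self._rng = rng if rng is not None else random.Random()

    def update(self, values):
        # Swap in a fresh mapping, e.g. after the configuration changes.
        self._values = dict(values)

    def decide(self, name):
        # Unknown experiments default to 0, which means "always off".
        percentage = self._values.get(name, 0)
        # randrange(100) yields 0..99, so 0 is never on and 100 is always on.
        return self._rng.randrange(100) < percentage
```

Because the mapping is plain data, flipping an experiment only requires a new mapping to reach the process, with no code deploy.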
Real-time configuration management and deployment
An additional use case of KingPin is managing configurations in real-time. For example, engineers might create a new configuration via MetaConfig Manager and add it to a subscription we call “Dependencies.”
Configuration content is stored in S3 as the ground truth, and ZooKeeper is used to track and propagate updates. For a subscribed configuration to be downloaded properly, ZK Update Monitor must be running on the subscriber machine.
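The flow described here (content in a blob store, a lightweight notification, a local daemon writing to disk) can be sketched roughly as follows. This is an illustration under stated assumptions, not ZK Update Monitor's code: `ConfigSyncer`, `on_version` and `atomic_write` are hypothetical names, the `fetch` callable stands in for an S3 download, and the idea that the notification carries a version marker is an assumption.

```python
import os
import tempfile


def atomic_write(path, data):
    # Write to a temp file in the same directory, then rename it over the
    # target so readers never observe a half-written file.
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


class ConfigSyncer:
    """Hypothetical subscriber-side syncer for one configuration."""

    def __init__(self, fetch, local_path):
        self._fetch = fetch            # callable returning bytes (S3 stand-in)
        self._local_path = local_path
        self._seen_version = None

    def on_version(self, version):
        # Stand-in for a notification callback: download only when the
        # advertised version differs from the one already on disk.
        if version == self._seen_version:
            return False
        atomic_write(self._local_path, self._fetch())
        self._seen_version = version
        return True
```

Keeping the payload out of the notification path means the notifier only has to broadcast a small marker, while the larger content is pulled from the store on demand.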
Applications can then read the local file, decode it into a Python object and perform CRUD operations on it through a variety of APIs.
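On the application side, reading a synced file can be as simple as the sketch below. It assumes the configuration is stored as JSON at a known local path; the helper names `load_config` and `is_blocked` are hypothetical and are not part of KingPin's Managed Data Structures API.

```python
import json


def load_config(path, default=None):
    """Return the decoded contents of a local JSON config file."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # The syncing daemon may not have written the file yet.
        return default


def is_blocked(domain, blocked_domains):
    # Membership test against a set built from the decoded list.
    return domain.lower() in blocked_domains
```

A caller might build the set once with `set(load_config(path, default=[]))` and reload it whenever the file changes.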
We rely on KingPin to move towards SOA (service-oriented architecture) inside Pinterest. An essential building block for SOA is service discovery. A service client needs to know the addresses of the service endpoints in order to connect and send requests to them. KingPin provides a script for service endpoints to register themselves with ZooKeeper so the endpoint list (“serverset”) can be consumed by service clients.
Similarly, ZK Update Monitor downloads the serverset from ZooKeeper and puts it into a local file. Serversets change dynamically when server nodes join or leave.
Using the Mixin provided in Thrift Utils, a Thrift client reads the local serverset file and talks to the endpoints using any HostSelector algorithm. The Mixin also manages the connection pool and allows users to set various timeouts and retry policies according to specific use cases.
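As a rough illustration of the client-side pattern described above, the sketch below parses a serverset, picks an endpoint at random and retries on connection failure. It does not use Thrift Utils: the one-`host:port`-per-line file format, `parse_serverset`, `RandomHostSelector` and `call_with_retries` are all assumptions made for this example.

```python
import random


def parse_serverset(text):
    """Parse 'host:port' lines into (host, port) tuples (assumed format)."""
    endpoints = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        host, _, port = line.rpartition(":")
        endpoints.append((host, int(port)))
    return endpoints


class RandomHostSelector:
    """Hypothetical selector that picks a random endpoint."""

    def __init__(self, endpoints, rng=None):
        self._endpoints = list(endpoints)
        self._rng = rng if rng is not None else random.Random()

    def pick(self, exclude=()):
        # Prefer endpoints that have not failed yet; fall back to all of them.
        candidates = [e for e in self._endpoints if e not in exclude]
        candidates = candidates or self._endpoints
        if not candidates:
            raise RuntimeError("serverset is empty")
        return self._rng.choice(candidates)


def call_with_retries(selector, request, max_attempts=3):
    """Call request(endpoint), moving to another endpoint on ConnectionError."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    failed = []
    last_error = None
    for _ in range(max_attempts):
        endpoint = selector.pick(exclude=failed)
        try:
            return request(endpoint)
        except ConnectionError as exc:
            failed.append(endpoint)
            last_error = exc
    raise last_error
```

Other selection strategies, such as round-robin, would only need to provide the same `pick` method, which is the kind of flexibility a pluggable host selector is meant to give.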
We use KingPin across various parts of our infrastructure. For example, ZK Update Monitor runs on every box at Pinterest to deploy the latest configurations and serversets in real time. Managed data structures are used for serving write-rare-read-frequent configuration data, such as a blacklist of domains we use to filter spam. The Python service framework is used by every Thrift client. There are hundreds of deciders controlling online logic and turning various experiments on and off.
Acknowledgements: KingPin is a joint effort across Pinterest engineering and has significantly evolved over the years. Contributors include Xiaofang Chen, Tracy Chou, Dannie Chu, Pavan Chitumalla, Steve Cohen, Jayme Cox, Michael Fu, Jiacheng Hong, Xun Liu, Yash Nelapati, Aren Sandersen, Aleksandar Veselinovic, Chris Walters, Yongsheng Wu and Shu Zhang. Thanks to Jon Parise for his support during the open-sourcing effort.