Nick DeChant & Kynan Lalone | Pinterest engineers, Video and Image Platform & Traffic
Pinterest engineers are tasked with showing Pinners the best ideas related to their interests, which includes taking into account signals like location. As part of these efforts, we built a video and image platform that provides robust infrastructure, new features and tools for media content on Pinterest. In this post we’ll highlight one of the most recent tools we built: new geo-blocking APIs.
Geo-blocking is an industry-standard practice that restricts which content is served to a user based on their location. It lets us adhere to local laws and protections while keeping the experience seamless everywhere else. We designed these APIs to be like LEGO pieces: easily usable by all our teams, so they can move fast and deliver products to Pinners instead of spending time learning new tools. Our goal was to make the APIs as straightforward as possible and to document the process, making it simple to keep track of which action was performed on which piece of media. We also designed them for speed, leveraging both AWS S3 APIs and our CDN provider’s fast purge APIs.
On the API side, when the video and image platform receives a geo-block or unblock request, a few things initially happen:
- Upon successful validation, we log that the piece of media is “in progress” with specified countries in Kafka.
- We then create a job specific to geo-blocking or geo-unblocking (both follow a similar flow). The job locates all variants of the piece of media stored in S3 using the video and image platform’s locate APIs.
- Next, we get S3 ObjectMetadata and add or remove custom UserMetadata indicating the countries in which the media shouldn’t be served.
- The job calls the platform’s CDN Purge API to purge the media on our CDN providers so that they fetch the media objects again with the updated metadata.
- Finally, the media is logged as “successful” in Kafka with the countries specified.
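The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the platform’s actual code: the `GeoBlockJob` class, the `geo-blocked-countries` metadata key, and the in-memory dictionaries and lists standing in for S3, the CDN Purge API, and Kafka are all assumptions made for the example. (Against real S3, user metadata can’t be edited in place; it is typically rewritten by copying the object over itself with replaced metadata.)

```python
from dataclasses import dataclass, field

# Hypothetical metadata key; the post does not name the real one.
BLOCKED_KEY = "geo-blocked-countries"


def update_blocked_countries(user_metadata, countries, block):
    """Return a copy of user_metadata with countries added (block) or removed (unblock)."""
    current = {c for c in user_metadata.get(BLOCKED_KEY, "").split(",") if c}
    requested = {c.strip().upper() for c in countries}
    updated = current | requested if block else current - requested
    result = dict(user_metadata)
    if updated:
        # Store as a comma-delimited list of country codes.
        result[BLOCKED_KEY] = ",".join(sorted(updated))
    else:
        # No countries left: drop the key entirely.
        result.pop(BLOCKED_KEY, None)
    return result


@dataclass
class GeoBlockJob:
    # In-memory stand-ins (assumptions) for S3, the CDN purge API, and Kafka.
    objects: dict  # variant key -> user metadata dict
    purged: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def locate_variants(self, media_id):
        # Stand-in for the platform's locate APIs: keys prefixed by the media id.
        return sorted(k for k in self.objects if k.startswith(media_id + "/"))

    def run(self, media_id, countries, block=True):
        self.log.append((media_id, "in progress", tuple(countries)))
        for key in self.locate_variants(media_id):
            # Add or remove the blocked countries on every variant...
            self.objects[key] = update_blocked_countries(
                self.objects[key], countries, block
            )
            # ...then purge it so the CDN refetches it with the new metadata.
            self.purged.append(key)
        self.log.append((media_id, "successful", tuple(countries)))
```

Because blocking and unblocking differ only in whether countries are added or removed, a single job type with a `block` flag covers both flows in this sketch.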
Our Traffic team manages the request path from client to origin, and our CDN platforms permit us to install logic at the edge. When a client requests an image from the CDN, the edge node first attempts to fetch the object from the CDN cache. Since the Purge API has invalidated the object key, the CDN must do an origin fetch. The origin response now carries the S3 metadata in an HTTP response header, and the CDN can cache this object for future client requests. Finally, a geo lookup is performed on the client’s connecting IP address, and the result is compared against the comma-delimited list of ISO 3166-1 alpha-2 country codes in the cached object’s response header. If a match is found, the client is given an HTTP 451 (Unavailable For Legal Reasons) status code.
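The edge decision described above boils down to a membership check. The sketch below is illustrative only: real edge logic runs in the CDN provider’s own configuration language, and the function name `edge_status` is hypothetical. It assumes the geo lookup has already resolved the client’s IP address to a country code.

```python
def edge_status(blocked_header, client_country):
    """Return the HTTP status the edge would serve for this client.

    blocked_header: comma-delimited ISO 3166-1 alpha-2 codes from the cached
        object's response header, or None if the header is absent.
    client_country: country code from the geo lookup on the client's IP.
    """
    blocked = {
        code.strip().upper()
        for code in (blocked_header or "").split(",")
        if code.strip()
    }
    # 451 = Unavailable For Legal Reasons; otherwise serve the object normally.
    return 451 if client_country.strip().upper() in blocked else 200
```

For example, `edge_status("DE,FR", "de")` evaluates to 451, while an object with no blocking header is served with 200 to every country.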
If specific URLs are blocked in certain countries, we want to be transparent, so Pinners will see a short explanation of why they can’t access the content, with reasons ranging from copyright to licensing issues. This is more useful than an ambiguous 404 response.
Looking ahead, we’re taking on new technical challenges and improvements to our video and image infrastructure, focusing on putting Pinners first and making their experience of discovering and doing the things they love seamless. Stay tuned as we share our progress!