Serving 1x1 pixels from AWS Lambda endpoints

A no-headache guide to serving 1x1 pixels in a serverless, Pythonic world

Tracking pixels are the small building blocks of modern web analytics. In a nutshell, analytics services (such as Google Analytics) have the browser make a GET request to their servers for a small GIF (invisible to the end user), so that client data is transmitted with a standard web request, no different from loading an image (see for example StackOverflow for more details).

In the world of AWS Lambda (and similar services on Azure and Google Cloud), a lot of applications designed under the micro-services paradigm are going fully serverless to take advantage of “infinite” scalability with no maintenance (not to mention rapid design, a natural infrastructure-as-code approach, etc.). It’s no surprise that when we started prototyping Tooso “V3” APIs and our new JavaScript library, the first design choice was API Gateway + AWS Lambda, all deployed and managed through Serverless.

Since we could not find a clear, concise, end-to-end example of how to build a Python, serverless endpoint for modern web analytics, we decided to share a toy project with all the essential ingredients.

In what follows, we are going to build a small analytics endpoint that processes the parameters in the query string and returns a 1x1 pixel as a perfectly valid GIF image. The original code is shared in the companion GitHub repo.

Python 101 and basic familiarity with Serverless (i.e. nothing more than a successful setup, as explained here) are all you need to follow this blog post.

Let’s get started.

Project setup and basic endpoint structure

The anatomy of our serverless project is pretty straightforward.

The yml file contains the details needed to deploy the endpoint to AWS. The file is as standard as a Serverless configuration can get (plus a flag allowing CORS): a pixel function is declared with handler.get_pixel as its handler, and http GET requests as its triggering events.

The handler Python file contains the code doing the actual processing. The code is heavily commented but it’s worth quickly reviewing the logic behind it:

  • first, we read the query string parameters (if any) as they are passed to the Lambda function by the AWS infrastructure (i.e. through the event dictionary in the Lambda function);
  • to simulate a real use case, we wrap the data we just received into a containing object and assign a unique id to the event;
  • we pass the wrapped data to a (fake) function that simulates a possible workflow for a web-analytics platform, i.e. dropping the event to a message broker for further processing downstream (for example, at Tooso we use AWS managed services like Kinesis to do this job); remember that all objects that should live across multiple requests (for example, a Kinesis/Postgres/S3 client) need to be declared outside the function, like the (fake) client in our toy project;
  • finally, we serve the response through a dedicated function, return_pixel_through_gateway. The response is pretty straightforward, but it’s different from the standard JSON response you may have seen before with Lambda: we set a Content-Type header, we use the base64 encoding of a 1x1 transparent GIF as the body, and, finally, we explicitly specify the encoding.
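The four steps above can be sketched as follows. This is a minimal sketch, not the code from the companion repo: get_pixel and return_pixel_through_gateway are the names used in this post, while FakeBrokerClient, wrap_event and the field names inside the wrapped event are hypothetical, and we assume the API Gateway Lambda proxy response format (statusCode, headers, body, isBase64Encoded).

```python
import base64
import json
import time
import uuid

# Base64 encoding of a 1x1 transparent GIF (42 bytes once decoded).
PIXEL_B64 = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"


class FakeBrokerClient:
    """Hypothetical stand-in for a real client (e.g. a Kinesis client)."""

    def __init__(self):
        self.sent = []

    def put_record(self, record):
        # A real client would ship the record to the message broker here.
        self.sent.append(record)


# Declared outside the handler so that warm invocations can reuse it.
broker_client = FakeBrokerClient()


def wrap_event(params):
    """Wrap the raw query parameters and attach a unique event id."""
    return {
        "event_id": str(uuid.uuid4()),
        "received_at_ms": int(time.time() * 1000),
        "data": params,
    }


def return_pixel_through_gateway():
    """Build a proxy-style response carrying the pixel as a base64 body."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "image/gif"},
        "body": PIXEL_B64,
        "isBase64Encoded": True,
    }


def get_pixel(event, context):
    # The key is present but null when the request has no query string.
    params = event.get("queryStringParameters") or {}
    wrapped = wrap_event(params)
    print(json.dumps(wrapped))  # ends up in the function's logs
    broker_client.put_record(wrapped)
    return return_pixel_through_gateway()
```

Calling get_pixel({"queryStringParameters": {"page": "home"}}, None) locally is a quick way to check the shape of the response before deploying.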

Code deployment with Serverless and API Gateway console

Now that the function logic has been modeled, it’s time to deploy. As usual with Serverless, deploying is as easy as:

  • open a terminal and cd into the project folder;
  • type serverless deploy and wait for the framework to do its job.
First deployment with Serverless

Since it’s our first deployment, it may take a few minutes to build all the infrastructure (API Gateway + Lambda). When it’s all done, make sure to write down the URL of our newly created endpoint before closing the terminal.

All done? Not yet, unfortunately, as there is one last (manual) step to be completed to make sure the code works end to end: if we hit the endpoint now, the resulting gif will actually be a corrupted image. What is missing is that we need to tell our API how to properly serve binary content — it’s tedious, but it’s a one-minute fix.
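To see why the image comes out corrupted, it helps to compare the two payloads a client may receive. The sketch below assumes (our reading of the behaviour, not something verified against the deployed endpoint) that without binary support the base64 string is passed through as plain text; looks_like_gif is a hypothetical helper.

```python
import base64

# Base64 encoding of a 1x1 transparent GIF.
PIXEL_B64 = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"


def looks_like_gif(payload: bytes) -> bool:
    """Cheap check: GIF signature at the start, trailer byte (0x3B) at the end."""
    return payload[:6] in (b"GIF87a", b"GIF89a") and payload.endswith(b"\x3b")


# What the client sees if the base64 text is served as-is.
as_text = PIXEL_B64.encode("ascii")
# What the client should see: the decoded binary image.
as_binary = base64.b64decode(PIXEL_B64)

print(looks_like_gif(as_text))    # False: the payload starts with b"R0lGOD"
print(looks_like_gif(as_binary))  # True
print(len(as_binary))             # 42
```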

First, go to your Amazon API Gateway console, find the pixel API and click on Binary Support:

API Gateway console: add binary support

Add “image/gif” to the supported binary media types and save:

Make sure “image/gif” is added to binary media types

Go back to the main pixel API window, click on Actions -> Deploy API to deploy the changes we just made.

Re-deploy /pixel GET API to propagate the changes
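If you would rather not click through the console, the same two changes can probably be scripted. The sketch below is an alternative to the manual steps, not what we did for this post: it assumes a REST API, the boto3 apigateway client methods update_rest_api and create_deployment, and placeholder values for the API id and stage name. enable_binary_support and escape_media_type are hypothetical helpers.

```python
def escape_media_type(media_type: str) -> str:
    """Escape a media type for use in a patch path ("/" becomes "~1")."""
    return media_type.replace("~", "~0").replace("/", "~1")


def enable_binary_support(client, rest_api_id, stage_name, media_type="image/gif"):
    """Add a binary media type to the API, then redeploy the stage.

    `client` is expected to behave like boto3.client("apigateway").
    """
    client.update_rest_api(
        restApiId=rest_api_id,
        patchOperations=[
            {"op": "add", "path": "/binaryMediaTypes/" + escape_media_type(media_type)}
        ],
    )
    # Changes to the API only go live after a new deployment.
    return client.create_deployment(restApiId=rest_api_id, stageName=stage_name)


# Usage (placeholders, requires boto3 and AWS credentials):
#   import boto3
#   enable_binary_support(boto3.client("apigateway"), "YOUR_API_ID", "dev")
```

Passing the client in as an argument keeps the function easy to try out with a stub before pointing it at a real account.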

Deployment is now done: let’s see if everything works properly.

Sanity checks

If you open your browser and paste the URL of the newly created endpoint, everything should work as expected, i.e. a very tiny transparent GIF should be served in your window:

If you inspect the response with your developer tools, you’ll see the expected content-type.

If you want to download the pixel instead, you can open Terminal and type:

curl -H "Accept:image/gif" https://{LAMBDA_URL}/{LAMBDA_ENV}/pixel > pixel.gif

You should get the 1x1 gif to your local computer for inspection.
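Since a 1x1 transparent image is hard to inspect by eye, a few lines of Python can confirm that the downloaded file really is a 1x1 GIF. This is a minimal sketch that only reads the fixed 10-byte start of the file (signature, version, width, height); inspect_gif and inspect_gif_file are hypothetical helpers, and pixel.gif is the file written by the curl command above.

```python
import struct


def inspect_gif(data: bytes) -> dict:
    """Parse signature/version plus the little-endian width and height."""
    if len(data) < 10:
        raise ValueError("too short to be a GIF")
    header, width, height = struct.unpack("<6sHH", data[:10])
    if header not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF, file starts with %r" % header)
    return {
        "version": header[3:].decode("ascii"),
        "width": width,
        "height": height,
        "size_bytes": len(data),
    }


def inspect_gif_file(path: str) -> dict:
    """Read a file from disk and inspect it."""
    with open(path, "rb") as f:
        return inspect_gif(f.read())


# Usage: print(inspect_gif_file("pixel.gif"))
```

If binary support is not configured correctly, the file will most likely contain base64 text and the function will raise a ValueError instead of reporting a width and height of 1.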

Given the print statements we included in the function, if you’re curious you can also check the CloudWatch logs and make sure the data logged there is what you expect:

CloudWatch records our debugging info

Finally, the project comes with a hello_pixel HTML page that shows the basic functionality of a pixel: if you put the right GET request as the src property of an HTML image, the browser, when loading the image, will transmit to the server all the client data bundled together in the request.
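To make the idea concrete, here is a minimal sketch of how the src of such an image could be assembled. The endpoint URL and the parameter names (page, ref) are placeholders, and build_pixel_url and build_img_tag are hypothetical helpers, not part of the companion repo.

```python
from html import escape
from urllib.parse import urlencode


def build_pixel_url(endpoint: str, params: dict) -> str:
    """Append the client data to the endpoint as a URL-encoded query string."""
    return endpoint + "?" + urlencode(params)


def build_img_tag(endpoint: str, params: dict) -> str:
    """Return an <img> tag whose src triggers the GET request when loaded."""
    src = escape(build_pixel_url(endpoint, params), quote=True)
    return '<img src="{}" width="1" height="1" alt="">'.format(src)


# Placeholder endpoint and parameters, for illustration only.
print(build_img_tag("https://example.com/dev/pixel",
                    {"page": "/pricing", "ref": "newsletter"}))
```

On the Lambda side, these parameters are what shows up in the query string parameters of the event dictionary.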

See you, space cowboys

In this quick tutorial we barely scratched the surface of how to design a modern, scalable web analytics infrastructure in the cloud. Possibilities are endless: for example, an alternative to a data pipeline powered by a message broker would be a purely Lambda-based solution, in which request data is dumped to an S3 bucket and the bucket triggers a second Lambda function doing ETL/data processing, and so on. In any case, whether you decide to remain fully serverless or not, knowing how to seamlessly serve pixels from your stack is a very important first step!
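As a rough illustration of the purely Lambda-based variant, the second function could start from something like the sketch below. It assumes the usual shape of an S3 event notification (a Records list with the bucket name and a URL-encoded object key); extract_s3_objects and process_dumped_events are hypothetical names, and the actual ETL step is left out.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event: dict) -> list:
    """Collect (bucket, key) pairs from an S3 notification event."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name")
        key = s3_info.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in the notification.
            objects.append((bucket, unquote_plus(key)))
    return objects


def process_dumped_events(event, context):
    """Hypothetical handler for the second Lambda, triggered by the bucket."""
    objects = extract_s3_objects(event)
    for bucket, key in objects:
        # A real function would fetch the object and run the ETL here.
        print("would process s3://{}/{}".format(bucket, key))
    return {"processed": len(objects)}
```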

If you want to share your experience and best practices with modern web analytics and JavaScript libraries, please reach out to us directly, and don’t forget to follow us on LinkedIn, Twitter and Instagram.