Santa’s going Serverless

Philipp Pahl
Published in DataScienceJob
Dec 19, 2018

Last summer, Santa C. approached us because he wanted to replace his data infrastructure, which had become pretty messy over the last decades. The system includes a database of all people in the world (especially the children), their holiday wishes and wish lists, and a record of people’s behavior over the course of the year, including an API that lets parents submit data about their children.

Santa and his SysAdmin elves were running a bunch of servers and load balancers, with traffic peaking during the holiday weeks. For the rest of the year the computing resources sat more or less idle. So the decision was made to move the whole infrastructure to the Cloud.

The architecture we came up with is a completely serverless approach based on Microservices and functions.

We introduced the following Microservices:

  • gcd (global children DB): contains all children of the world
  • behavior: tracks the behavior of the children, using third-party information from buddhism.org's Karma-API
  • wish_lists: the lists of wishes
  • products: pool of information on available items and how they relate to the items on the wish lists
  • presents: rates certain products based on behavior and further information about a person

The following figure depicts the flow that picks a person and rates the products listed on her or his wish list. People are picked at random from the DB, one at a time, until all of them have been processed. For each person, the behavior information and the products matching the wish list are queried. From this information a score is calculated for each product, and the item with the highest score is chosen.

As a proof of concept we picked AWS Lambda and Python to implement the service. The following code snippet depicts the helper functions which invoke the Lambdas. We make use of Futures from Python’s concurrent.futures module, so that Lambda functions are invoked in parallel and their results are only waited on when they are actually needed.
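A minimal sketch of what such helpers could look like, not the original code. It assumes boto3's Lambda client for the invocation and a shared thread pool for the Futures; the names invoke_lambda and invoke_async, the pool size and the JSON payload shape are hypothetical choices made for illustration.

```python
import json
from concurrent.futures import Future, ThreadPoolExecutor

# Shared thread pool; the size is an arbitrary choice for this sketch.
_executor = ThreadPoolExecutor(max_workers=16)


def invoke_lambda(function_name, payload, client=None):
    """Invoke a Lambda synchronously and return its decoded JSON result."""
    if client is None:
        # Imported lazily so the helpers can be loaded (and tested with a
        # fake client) on machines without boto3 or AWS credentials.
        import boto3
        client = boto3.client("lambda")
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function's result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # The response payload is a stream; read it and decode the JSON body.
    return json.loads(response["Payload"].read())


def invoke_async(function_name, payload, client=None) -> Future:
    """Start the invocation on the pool and return a Future immediately."""
    return _executor.submit(invoke_lambda, function_name, payload, client)
```

Calling invoke_async("behavior", {"child_id": 42}) returns at once; the caller only blocks when it asks the Future for its result(), which is what lets several Lambdas run side by side.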

These 30 lines of code allow us to retrieve the best ranked present for a certain child:

  1. The next “unprocessed” child is picked
  2. Wishes of the child are retrieved
  3. Each wish is asynchronously mapped to the best fitting product
  4. At the same time information on the behavior is retrieved
  5. A score for each item is calculated and the one with the highest score is chosen
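The five steps above can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: the Microservices are replaced by in-memory stand-ins with made-up data, every function name is hypothetical, the score is a simple product of a product rating and a behavior value, and the child is taken in list order instead of at random to keep the example deterministic.

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-ins for the Microservices; all data here is made up.
CHILDREN = [{"id": 1, "name": "Ada", "processed": False}]
WISHES = {1: ["train", "robot"]}
PRODUCTS = {
    "train": {"sku": "wooden-train", "base_score": 0.6},
    "robot": {"sku": "robot-kit", "base_score": 0.9},
}
BEHAVIOR = {1: 0.8}


def next_child():
    """gcd: return the next unprocessed child, or None if all are done."""
    for child in CHILDREN:
        if not child["processed"]:
            return child
    return None


def get_wishes(child_id):
    """wish_lists: return the wishes of one child."""
    return WISHES[child_id]


def best_product(wish):
    """products: map a wish to the best fitting product."""
    return PRODUCTS[wish]


def get_behavior(child_id):
    """behavior: return a behavior value for one child."""
    return BEHAVIOR[child_id]


def score(product, behavior):
    """presents: rate a product for a child (hypothetical formula)."""
    return product["base_score"] * behavior


def pick_present(executor):
    child = next_child()                                  # step 1
    if child is None:
        return None
    wishes = get_wishes(child["id"])                      # step 2
    # Steps 3 and 4 are submitted together and run concurrently.
    product_futures = [executor.submit(best_product, w) for w in wishes]
    behavior_future = executor.submit(get_behavior, child["id"])
    behavior = behavior_future.result()
    products = [f.result() for f in product_futures]
    best = max(products, key=lambda p: score(p, behavior))  # step 5
    child["processed"] = True
    return child["name"], best["sku"]
```

To try it, create a ThreadPoolExecutor and call pick_present(executor) repeatedly until it returns None. In a real deployment each stand-in function would be a Lambda invocation instead of a dictionary lookup.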

The bottom line:
The aim of this article is to encourage you to “think functions”. This is obviously only a toy example, but imagine a future technology that lets us implement Microservices and their functions in this manner. Concerns are clearly separated, functions can be called from anywhere in your code, and you don’t need to think about scaling or maintenance.

Clearly, the example above also exposes the performance shortcomings of this approach: very high latencies and long execution times. In a real-world scenario the costs would be pretty significant as well.

This post was inspired by a presentation from the team at Binaris. We love their ideas and approach, will definitely follow their progress, and plan to share our experiences with their platform, as they are going to release a public beta soon.
