Python, Memcached, & Kubernetes: Caching in Distributed Cloud Native Platforms

Developing and deploying your own caching system with memcached

Siôn Abraham
6 min read · Apr 18, 2020

Author's Note: The tutorial parts of this article are for POSIX systems such as Linux and macOS! The theory holds true for all systems!

The tutorial assumes you are familiar with, and have installed, the tools used throughout: Python with pip, and, for the later sections, a Kubernetes cluster with Helm.

When developing any kind of application, the speed at which you can present and access data is incredibly important. We’ve all experienced long loading times, be it from recomputing data or querying a database, and caching is an effective way of giving your application a performance boost.

Python comes natively with a toolset that enables caching, with data structures such as dictionaries. By definition, though, these are local to that instance of the software, which is less than ideal for large-scale, distributed platforms.

This is where Memcached comes in! Memcached is essentially a secondary server running alongside your application that stores and retrieves data as extremely fast key-value pairs. It runs in addition to your SQL database, effectively acting as a layer between the data and the code provisioning that data, such as an API.

Therefore we can say that Memcached’s goals are:

Simplicity: Designed to hold data in simple key-value pairs in a large hash store, with a basic API to set and get data.

Speed: Holds data exclusively in RAM, making data acquisition orders of magnitude faster than HDD or even SSD storage.

Shows the process flow for a simple cache with an API. Made via Draw.io.

Installing Memcached

Memcached has an open GitHub repo with instructions for installing:

  • For macOS you will need to install Homebrew, and run brew install memcached in your terminal.
  • If you are running on Linux, you can install it using the pre-built package with apt-get install memcached or yum install memcached in your terminal depending on which flavour of Linux you are running.

Interacting with Memcached via the Python pymemcache Library

This tutorial uses pymemcache, though there are other options available such as pylibmc, python-memcached, and memcache_client. I don’t intend to go through the pros and cons of each library; however, this PyPI page covers each very well.

The first thing we need to do is get our library installed! This is done easily with pip install pymemcache. Once this has completed, we also have to remember to start our instance of Memcached by running memcached in our terminal. Don’t worry, we will be covering Kubernetes later, but we’re going to start with the basics first.
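Below is a minimal sketch of that basic interaction, assuming a local Memcached instance on the default port 11211 (the key names and sample values are purely illustrative):

import json
from pymemcache.client.base import Client

# Connect to the memcached instance we just started locally.
client = Client(("localhost", 11211))

# Plain string values can be stored and read back directly.
client.set("greeting", "hello world")
print(client.get("greeting"))  # b'hello world'

# Complex structures such as dictionaries must be serialised first,
# for example to a JSON string, and deserialised when read back.
user = {"id": 42, "name": "Ada"}
client.set("user:42", json.dumps(user))

cached = client.get("user:42")
if cached is not None:
    print(json.loads(cached)["name"])  # Ada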

The snippet above shows the basic way Memcached can interact with your Python application. The only major note to make is that keys and values are always stored as strings (and returned as bytes). As such, you will need to convert any complex data structure, such as a dictionary, into a JSON string before inserting it into your cache.

Data Expiry Times

Now that you have the basics covered, you will likely start raising concerns about the size of your cache. Especially in Python programmes, RAM is a scarce resource which we want to devote to the needs of our application. This means that we cannot allow our cache to grow indefinitely, and we will have to do some resource management.

In caches this is done with a time-to-live counter, or TTL for short, by which we give our data a time limit after which it is automatically deleted. When that happens, we have to query our database once again to return that data into the cache. We can do this by adding the expire parameter to our set method, an integer representing the time in seconds. There is no exact number for what to set this TTL to, as it is heavily dependent on your application and on factors such as how frequently you update the data in your database.
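As a rough sketch, setting a TTL with pymemcache could look like the following (the 300-second value and key name are arbitrary examples):

from pymemcache.client.base import Client

client = Client(("localhost", 11211))

# This entry is evicted automatically 300 seconds after it is set.
client.set("report:latest", "expensive-to-compute result", expire=300)

# After expiry, get() returns None and we fall back to the database.
value = client.get("report:latest")
if value is None:
    value = "freshly queried result"  # placeholder for a real database query
    client.set("report:latest", value, expire=300)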

Concurrency and Consistency

If you are running under heavy load, or with a heavily utilised API, you may run into some concurrency issues with your cache. This happens when multiple clients try to update the same key at the same time. To overcome this, Memcached provides a Check And Set (CAS) operation, which at its fundamental level is simply a version token that changes each time a specific key is updated.
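To see why this matters, consider a naive counter like the sketch below; the key number_of_calls comes from the scenario described next, while the rest of the code is an assumed illustration:

from pymemcache.client.base import Client

client = Client(("localhost", 11211))
client.set("number_of_calls", "1")

def record_call():
    # Read-modify-write with no coordination: two callers running this at
    # the same time can both read the same value and overwrite each other.
    current = int(client.get("number_of_calls"))
    client.set("number_of_calls", str(current + 1))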

In the code above, imagine a scenario in which this method is called by two clients simultaneously. The first would read the number 1 and set number_of_calls to 2; however, the second call would also read the number 1 and set number_of_calls to 2. This is incorrect, as the correct answer would be 3.
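A sketch of the CAS-based fix might look like the following, using pymemcache’s gets()/cas() pair with deliberately simple retry logic:

from pymemcache.client.base import Client

client = Client(("localhost", 11211))
client.set("number_of_calls", "1")

def record_call_safely():
    while True:
        # gets() returns the value together with its CAS token.
        value, cas_token = client.gets("number_of_calls")
        new_value = str(int(value) + 1)

        # cas() only stores the value if the token still matches the one
        # held by the server; otherwise another writer got there first.
        if client.cas("number_of_calls", new_value, cas_token):
            return int(new_value)
        # Token mismatch: loop around and retry with a fresh read.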

Here we are handed the CAS token along with the value, and we pass that token back when writing. If it still matches the token on the server, client.cas() returns True and we can be sure the value stored is correct; if it returns False, another writer got there first and we know we have to retry. Almost all modern caching systems have mechanisms built in to prevent these kinds of concurrency issues, but it is important to be aware that they exist and to use them where possible, as such issues can be very difficult to debug.

Deploying Multiple Memcached Servers in Kubernetes

In my honest opinion, the best and fastest way to get Memcached up and running in your cluster is to follow the official Helm chart installation instructions. The rest of this article covers the important elements and manifests of that installation, to give some insight and clarity into what exactly is happening within your cluster.

To get the chart up and running you can create a release using the following command:

helm install --name my-release stable/memcached

The README in the GitHub repository does an amazing job of explaining all the configurable parameters, but we will just work with the default setup. By default, the Helm chart deploys four resources into your cluster, which can be found under the templates folder: a PodDisruptionBudget, a ServiceMonitor, a StatefulSet, and a Service. We are only really interested in the StatefulSet and the Service, as the other two manifests describe DevOps management resources, such as how to handle a disruption and how to monitor pod statistics.

We will start with the cache itself, which is deployed as a StatefulSet. A StatefulSet is a form of deployment intended for stateful and distributed systems. We usually consider a regular Deployment to be a stateless instance, meaning it does not rely on any external state to operate correctly. A StatefulSet gives us unique network identifiers, persistent storage, and graceful deployment and scaling, which are perfect for our cache: each cache instance keeps a stable identity, so we can select a specific cache by its unique identifier when we want to communicate with it.

High-Level Kubernetes Architecture Overview. Made via Draw.io.

We can control all the communication to and from this set with a load balancer, which is specified within the Service manifest. The diagram above shows how this communication takes place: all requests go through the Service, which decides which of our StatefulSet cache instances the traffic will be directed to.
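The snippet below is a rough sketch of that connection code. The service name my-release-memcached is an assumption based on the Helm release created earlier, and the chart’s headless Service is assumed to resolve to every memcached pod IP:

import socket
from pymemcache.client.hash import HashClient

# Inside the cluster, the headless Service name resolves to all pod IPs.
SERVICE_NAME = "my-release-memcached"

# gethostbyname_ex returns (hostname, aliases, ip_addresses).
_, _, ips = socket.gethostbyname_ex(SERVICE_NAME)

# One consistent-hashing client spread across every memcached pod.
client = HashClient([(ip, 11211) for ip in ips])

client.set("greeting", "hello from kubernetes", expire=60)
print(client.get("greeting"))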

If you have a pod with Python enabled in your cluster, you can take advantage of the new cache with a snippet like the one above. The most important step is resolving the IPs behind our Service, which is done with socket.gethostbyname_ex. As we do not need to access anything outside the cluster, we can pass the Service name directly to the method and the cluster’s DNS will resolve it for us.

Closing Remarks

I hope this rough guide gives you a better understanding of caching, and I really appreciate you taking the time to read it! Please leave a comment with any questions you may have, or any improvements that could be made to this article. I will endeavour to reply as soon as I have the time!
