Python Cache - The Must-Read Tips for Code Performance

codeforests
Sep 11 · 6 min read
Photo by Tim Carey on Unsplash

Introduction

Most of us have encountered scenarios where we need to implement computationally expensive logic, such as recursive functions, or read from I/O or the network multiple times. These operations typically require more resources and longer CPU time, and can cause performance problems if handled carelessly. Once you have completed the functional requirements, you should pay special attention to such cases, since the extra runtime cost in resources and time can eventually hurt the user experience. In this article, I will share how we can use a cache mechanism (aka memoization) to improve code performance.

Assume we have a function that inquires a rate from an online API (install the requests package first with pip install requests):

import requests

def inquire_rate_online(dimension):
    result = requests.get(f"https://postman-echo.com/get?dim={dimension}")
    if result.status_code == requests.codes.OK:
        data = result.json()
        return data["args"]["dim"]
    return ''

Implement cache with global dictionary

For the above example, the most straightforward way to implement a cache is to store the arguments and results in a dictionary, and every time we check this dictionary to see if the key exists before calling the external API. We can implement this logic in a separate function as per below:

cached_rate = {}

def cached_inquire(dim):
    if dim in cached_rate:
        print(f"cached value: {cached_rate[dim]}")
        return cached_rate[dim]
    cached_rate[dim] = inquire_rate_online(dim)
    print(f"result from online : {cached_rate[dim]}")
    return cached_rate[dim]
%time cached_inquire(1)
result from online : 1 
Wall time: 1.22 s
%time cached_inquire(1)
cached value: 1 
Wall time: 997 µs
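The same check-then-store logic can be packaged as a reusable decorator, so any expensive function can be memoized without repeating it. Below is a minimal sketch of this idea; the slow_square function and the 0.1 s sleep are just stand-ins for an expensive network or I/O call:

```python
import time

def dict_cache(func):
    """Memoize func's results in a plain dict keyed by its positional args."""
    cache = {}
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)  # compute once, then reuse
        return cache[args]
    return wrapper

@dict_cache
def slow_square(x):
    time.sleep(0.1)  # simulate a slow network or I/O call
    return x * x

slow_square(3)                       # first call pays the 0.1 s cost
start = time.perf_counter()
result = slow_square(3)              # second call is served from the dict
elapsed = time.perf_counter() - start
print(result, elapsed < 0.05)        # the cached call returns almost instantly
```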

Implement cache with weakref

Python provides the weakref module to create weak references to objects; the garbage collector is then free to destroy those objects whenever needed in order to reclaim their memory.

class Rate:
    def __init__(self, dim, amount):
        self.dim = dim
        self.amount = amount

    def __str__(self):
        return f"{self.dim} , {self.amount}"

def inquire_rate_online(dimension):
    result = requests.get(f"https://postman-echo.com/get?dim={dimension}")
    if result.status_code == requests.codes.OK:
        data = result.json()
        return Rate(float(data["args"]["dim"]), float(data["args"]["dim"]))
    return Rate(0.0, 0.0)
import weakref

wkrf_cached_rate = weakref.WeakValueDictionary()

def wkrf_cached_inquire(dim):
    if dim in wkrf_cached_rate:
        print(f"cached value: {wkrf_cached_rate[dim]}")
        return wkrf_cached_rate[dim]
    result = inquire_rate_online(dim)
    print(f"result from online : {result}")
    wkrf_cached_rate[dim] = result
    return wkrf_cached_rate[dim]
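The weak-reference behaviour can be observed directly: an entry in a WeakValueDictionary survives only as long as some strong reference to its value exists. Below is a small sketch, reusing a Rate-like class; the gc.collect() call is just a safety net, since CPython drops the object as soon as its last strong reference goes away:

```python
import gc
import weakref

class Rate:
    def __init__(self, dim, amount):
        self.dim = dim
        self.amount = amount

cache = weakref.WeakValueDictionary()

rate = Rate(1.0, 1.0)            # strong reference keeps the object alive
cache[1.0] = rate
present_before = 1.0 in cache    # True: the entry is present

del rate                         # drop the last strong reference
gc.collect()                     # force collection (immediate in CPython anyway)
present_after = 1.0 in cache     # False: the entry vanished with the object

print(present_before, present_after)
```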
  • Sometimes you do not want garbage collection to decide which values stay cached longer than others. Instead, you may want the cache to follow a specific eviction policy, for instance keeping entries ordered from the most recently used to the least recently used, aka LRU (least recently used).

Cache with lru_cache

The lru_cache decorator accepts two arguments:

  • maxsize — the maximum number of entries to keep; once the cache is full, the least recently used entries are discarded. Setting it to None lets the cache grow without bound.
  • typed — when set to True, arguments of different types will be cached separately, e.g. wkrf_cached_inquire(1) and wkrf_cached_inquire(1.0) will be cached as different entries
from functools import lru_cache

@lru_cache(maxsize=None)
def inquire_rate_online(dimension):
    result = requests.get(f"https://postman-echo.com/get?dim={dimension}")
    if result.status_code == requests.codes.OK:
        data = result.json()
        return Rate(float(data["args"]["dim"]), float(data["args"]["dim"]))
    return Rate(0.0, 0.0)
inquire_rate_online.cache_info() 
#CacheInfo(hits=1, misses=1, maxsize=None, currsize=1)
inquire_rate_online.cache_clear()
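The effect of both arguments is easy to verify on a toy function. The sketch below uses a hypothetical square function with typed=True, so the int 1 and the float 1.0 land in separate cache slots, and cache_info() exposes the hit/miss counters:

```python
from functools import lru_cache

@lru_cache(maxsize=None, typed=True)
def square(x):
    return x * x

square(1)     # miss: first time the int 1 is seen
square(1.0)   # miss: typed=True caches the float separately
square(1)     # hit: served from the existing int entry

info = square.cache_info()
print(info)   # hits=1, misses=2, currsize=2
```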

Limitations:

Let’s also talk about the limitations of the solutions we discussed above:

  • Caching only makes sense for deterministic functions. A function like the one below returns a different result on every call, so caching it would keep returning the same stale first result:

import random

def random_x(x):
    return x * random.randint(1, 1000)

  • It only works for arguments of immutable (hashable) data types.
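Both limitations can be demonstrated with lru_cache. Caching a non-deterministic function freezes its first result, and passing an unhashable argument such as a list raises a TypeError. The cached_random_x and total helper functions below are illustrative only:

```python
import random
from functools import lru_cache

@lru_cache(maxsize=None)
def cached_random_x(x):
    return x * random.randint(1, 1000)

# The first result is frozen: repeated calls never re-roll the random number
first = cached_random_x(5)
repeat = cached_random_x(5)
print(first == repeat)   # True, even though the function is non-deterministic

@lru_cache(maxsize=None)
def total(values):
    return sum(values)

try:
    total([1, 2, 3])     # lists are mutable, hence unhashable
except TypeError as err:
    print("unhashable argument:", err)
```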

Conclusion

In this article, we have discussed different ways of creating a cache to improve code performance whenever you have computationally expensive operations or heavy I/O or network reads. Although the lru_cache decorator provides an easy and clean solution for creating a cache, it is still better to understand the underlying implementation of caching before simply taking it and using it.

The Startup

Medium's largest active publication, followed by +730K people. Follow to join our community.

codeforests

Written by

Resources and tutorials for python, data science and automation solutions
