Data cache to speed up execution of PHP scripts

John Mejia
Dec 15, 2023

From time to time it pays to help PHP generate responses faster and more efficiently than it already does. In my humble opinion, the best way to achieve this is to implement your own cache system to manage your data or (the boring option) reuse an existing one. In this article we will walk through the implementation of a class that manages a data cache.

Background image courtesy of pixabay.com

Notice: This article contains programming examples in PHP, although the concepts explained here apply to scripts written in any other programming language.

To illustrate what we are trying to accomplish, let’s look at the following not-so-hypothetical case: a web application uses a weather API. The data provided by this API is the same for all users and changes only once an hour. However, each query takes 15 seconds on average. This means that each user has to wait those 15 seconds to see the result (plus the time it takes to render the page and run any other processes the web application requires). That’s a long time to wait! This time can be reduced if the response to the first user’s query is cached and the other users read from that cache instead of making their own query; after all, the response will not change for a while. This way, only the first user has to wait those 15 seconds, and for everyone else the response is much faster. An hour later, the process repeats with a new “first user”.
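The effect can be simulated in a few lines. This is only a sketch: the closure `$fetchWeather` is a hypothetical stand-in for the slow API (it counts calls instead of waiting 15 seconds), and an in-memory variable plays the role of the cache file.

```php
<?php
$apiCalls = 0;

// Hypothetical stand-in for the slow API: counts calls instead of sleeping.
$fetchWeather = function () use (&$apiCalls): array {
    $apiCalls++;
    return ['temp_c' => 21];
};

$cached = null;   // in-memory stand-in for the cache file
$expiresAt = 0;   // timestamp until which $cached is considered valid

// Serve one request at time $now, calling the API only when the cache is stale.
$serve = function (int $now) use (&$cached, &$expiresAt, $fetchWeather): array {
    if ($cached === null || $now > $expiresAt) {
        $cached = $fetchWeather();
        $expiresAt = $now + 3600; // valid for one hour
    }
    return $cached;
};

$start = 1700000000; // fixed timestamp so the run is repeatable

// Four users arrive within the same hour: only the first one triggers the API.
foreach ([0, 60, 600, 3000] as $offset) {
    $serve($start + $offset);
}
$callsFirstHour = $apiCalls;

// A user arriving after the hour has passed becomes the new "first user".
$serve($start + 3700);
$callsTotal = $apiCalls;

echo "API calls in the first hour: $callsFirstHour, in total: $callsTotal\n";
```

With a 15-second API, the four requests of the first hour would cost about 60 seconds of combined waiting without a cache and about 15 seconds with it.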

I bet you’re thinking of a way to spare even that poor first user the long wait, right? Something like a scheduled task that runs the API request every hour and generates the data cache file. This, while valid, would call the API all day even when no user needs the data, consuming resources that are unnecessary in some cases; something to consider, especially if each API query has to be paid for. In any case, the web application should always be able to call the API itself in case this scheduled task fails or does not run in time.
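A scheduled refresh of this kind could look roughly like the sketch below. Everything in it is an assumption made for illustration: `fetch_weather_stub()` stands in for the real API request, `warm_cache()` is a hypothetical helper, and the file name and scheduler entry are placeholders.

```php
<?php
// Hypothetical stand-in for the slow weather API request.
function fetch_weather_stub(): array {
    return ['temp_c' => 21, 'fetched_at' => time()];
}

/**
 * Fetch fresh data and store it together with its expiration timestamp.
 * Meant to be called from a scheduled script, not from a page request.
 */
function warm_cache(string $file, callable $fetch, int $ttl): bool {
    $payload = ['data' => $fetch(), 'maxtime' => time() + $ttl];
    // LOCK_EX keeps two overlapping runs from writing at the same time
    return file_put_contents($file, serialize($payload), LOCK_EX) !== false;
}

// Example scheduler entry (the path is a placeholder):
//   0 * * * * php /path/to/warm_cache.php
$file = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'cache-warmup-demo';
$ok = warm_cache($file, 'fetch_weather_stub', 3600);

echo $ok ? "cache refreshed\n" : "refresh failed, the application must call the API itself\n";
```

The failure branch is the important part: when the refresh does not succeed, the web application still needs its own path to the API.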

Returning to our case, the code that uses the weather API and validates the data cache would look something like this:

$data_clima = false;
// Creation of the cache object
$cache = new DataCache();
// Check if the cache file is valid
if ($cache->read('api-clima')) {
    // Recovers cache data
    $data_clima = $cache->getData();
}
else {
    // Use of the external API
    $data_clima = get_api_clima();
    // Generate cache file
    $cache->save($data_clima);
}

At the end of the block, the variable $data_clima will contain the returned value, either from the external API (represented by the get_api_clima() function in this example) or from the data stored in the cache, which was generated earlier after retrieving the data from the external API. Easy, right? But how do we make sure it is refreshed every hour, as the requirement above demands? For this, we add to our class a property that limits the validity of the cache, set before saving it.

$data_clima = get_api_clima();
// Set the cache duration
$cache->duration(3600);
// Generate cache file
$cache->save($data_clima);

In this way, we tell the cache that the data will be valid for one hour (3600 seconds) from now.
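The expiration arithmetic behind this can be sketched on its own. The helper `cache_is_fresh()` is hypothetical, and it assumes that a stored limit of zero means "no limit":

```php
<?php
/**
 * Decide whether cached data is still valid at time $now.
 * Assumption for this sketch: a limit of 0 means the data never expires.
 */
function cache_is_fresh(int $maxtime, int $now): bool {
    return $maxtime === 0 || $now <= $maxtime;
}

$savedAt = 1700000000;       // fixed timestamp so the run is repeatable
$maxtime = $savedAt + 3600;  // what a one-hour duration amounts to

var_dump(cache_is_fresh($maxtime, $savedAt + 1800)); // bool(true): half an hour later
var_dump(cache_is_fresh($maxtime, $savedAt + 3601)); // bool(false): just past one hour
var_dump(cache_is_fresh(0, $savedAt + 999999));      // bool(true): no limit set
```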

Another improvement to consider is to avoid keeping two copies of the recovered data in memory: one copy in the $cache object and another in the variable $data_clima. To correct this and reduce the risk of failures due to unnecessary memory consumption (especially if multiple cached queries are handled at the same time), we can implement a method that hands over the data and releases the space it occupied. So we replace the getData() call with something like exportInto().

if ($cache->read('api-clima')) {
    // Recovers cache data
    $cache->exportInto($data_clima);
}
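The hand-over itself can be shown in isolation. `BufferDemo` below is a hypothetical minimal class written only for this illustration; it is not the cache class, it just holds a value and gives it away by reference.

```php
<?php
// Hypothetical minimal holder, used only to illustrate the hand-over.
final class BufferDemo {
    private mixed $data;

    public function __construct(mixed $data) {
        $this->data = $data;
    }

    // Pass the value to the caller's variable and forget it internally.
    public function exportInto(mixed &$target): void {
        $target = $this->data;
        $this->data = false;
    }

    public function isEmpty(): bool {
        return $this->data === false;
    }
}

$buffer = new BufferDemo(['temp_c' => 21]);
$weather = null;
$buffer->exportInto($weather);

var_dump($weather);           // the array now lives in the caller's variable
var_dump($buffer->isEmpty()); // the holder no longer keeps it
```

After the call, the object no longer holds on to the data, so the memory can be reclaimed as soon as the caller is done with its own variable.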

Now, with all these considerations, the implementation of the DataCache class would be as follows:

class DataCache {

    private string $filename = ''; // Cache file name
    private mixed $data = false;   // Data to be cached
    private int $maxtime = 0;      // Cache expiration timestamp (0 = no limit)

    /**
     * Read cache file.
     *
     * @param string $name Name associated with the data to be cached
     * @return bool TRUE if the data could be recovered.
     */
    public function read(string $name): bool {

        $this->maxtime = 0;
        $this->data = false;
        $this->filename = '';

        $result = false;

        if (trim($name) !== '') {
            // Automatically use the temporary system directory.
            // You can customize this part using your own directory.
            // Use md5() to encode the name of the file to use and
            // prevent problems if the name contains characters not
            // valid for filenames.
            $this->filename = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'cache-' . md5(strtolower($name));
            // Read stored data
            if (file_exists($this->filename)) {
                $this->data = unserialize(file_get_contents($this->filename));
                $result = (is_array($this->data) && array_key_exists('data', $this->data) && array_key_exists('maxtime', $this->data));
                if ($result && $this->data['maxtime'] > 0) {
                    // A limit was stored: the data is valid only until then
                    $result = (time() <= $this->data['maxtime']);
                }
                if (!$result) {
                    // Failed to recover the data
                    $this->data = false;
                }
                else {
                    // Separate components
                    $this->maxtime = $this->data['maxtime'];
                    $this->data = $this->data['data'];
                }
            }
        }

        return $result;
    }

    /**
     * Returns the recovered data from the cache.
     *
     * @return mixed Data recovered or FALSE if not available.
     */
    public function getData(): mixed {
        return $this->data;
    }

    /**
     * Assign the recovered data from the cache to a variable and then
     * remove it from this object.
     *
     * @param mixed $data Variable in which the data is returned.
     */
    public function exportInto(mixed &$data): void {
        $data = $this->data;
        $this->data = false;
    }

    /**
     * Generates a new cache file with the data indicated.
     * Requires a previous call to read(), which sets the file name.
     *
     * @param mixed $data Data to be saved in the cache file.
     * @return bool TRUE if the data could be saved to the file.
     */
    public function save(mixed $data): bool {

        $result = false;

        if ($this->filename !== '') {
            $bytes = file_put_contents(
                $this->filename,
                serialize(array('data' => $data, 'maxtime' => $this->maxtime))
            );
            // file_put_contents() returns FALSE on failure
            if ($bytes !== false) {
                $this->data = $data;
                $result = true;
            }
        }

        return $result;
    }

    /**
     * Time in seconds for which the data in the cache are valid.
     * Must be greater than or equal to zero. Value of zero removes limit.
     * Call it after read(), which resets the limit.
     *
     * @param int $seconds
     */
    public function duration(int $seconds): void {
        if ($seconds >= 0) {
            // Zero is stored as 0 ("no limit"), not as the current time,
            // otherwise the cache would expire immediately
            $this->maxtime = ($seconds > 0) ? time() + $seconds : 0;
        }
    }
}

That’s it. We now have our own class for managing cached data. Keep in mind, however, that this is a basic example, and several situations need to be addressed in a production environment, such as:

  • Failures caused by multiple users opening the same cache file at the same time.
  • Read/write performance when extremely large data is stored in the cache file.
  • Protection or encryption of the cache file contents when the application handles sensitive data.
  • Removal of cache files that have not been used for a long time, to keep disk usage healthy (suggestion: implement an external task that performs this cleanup).
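Two of these points, concurrent access and cleanup of old files, can be sketched as follows. Both helpers are hypothetical and not part of the class above. The first writes to a temporary file and renames it over the target, which on POSIX filesystems replaces the file in one step so readers never see a half-written cache. The second deletes files whose modification time is older than a given age. The demo runs in a throwaway directory, which is an assumption made so it cannot touch real cache files.

```php
<?php
/**
 * Write a cache file by writing to a temporary file in the same directory
 * and renaming it over the target.
 */
function cache_write_atomic(string $file, mixed $payload): bool {
    $tmp = tempnam(dirname($file), 'tmp-');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, serialize($payload)) === false || !rename($tmp, $file)) {
        @unlink($tmp); // do not leave the temporary file behind
        return false;
    }
    return true;
}

/**
 * Delete files in $dir whose name starts with $prefix and that have not
 * been modified in the last $maxAge seconds. Returns the number removed.
 */
function cache_purge(string $dir, int $maxAge, string $prefix = 'cache-'): int {
    clearstatcache(); // make sure modification times are read fresh
    $removed = 0;
    foreach (glob($dir . DIRECTORY_SEPARATOR . $prefix . '*') ?: [] as $path) {
        if (is_file($path) && (time() - filemtime($path)) > $maxAge && unlink($path)) {
            $removed++;
        }
    }
    return $removed;
}

// Demo in a throwaway directory
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'cache-demo-' . uniqid();
mkdir($dir);
$file = $dir . DIRECTORY_SEPARATOR . 'cache-' . md5('api-clima');

$written = cache_write_atomic($file, ['data' => ['temp_c' => 21], 'maxtime' => time() + 3600]);
touch($file, time() - 7200);         // pretend the file is two hours old
$removed = cache_purge($dir, 3600);  // remove anything older than one hour
rmdir($dir);

echo "written: " . var_export($written, true) . ", removed: $removed\n";
```

Note that tempnam() creates the file with restrictive permissions, so a real implementation may need to adjust them if several system users share the cache.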

I hope this article is useful and/or inspires you to implement your own solutions using a data cache. I invite you to share in the comments your suggestions for improving the proposed code, as well as scenarios in which this kind of home-grown cache can help reduce resource consumption and speed up your PHP scripts.

This article was originally published in Spanish; if you are curious, you can read it on my own blog (along with other interesting articles) or on LinkedIn.

John Mejia

Engineer, programmer, writer, penciller and dreamer.