Build custom PHP object storage for a 46% memory saving

Alin Pintilie
Feb 6, 2023

--

As we know, the iterator is a versatile and powerful tool that can be used in many scenarios. One use case is building a custom collection. Building our own collection has some advantages, such as restricting the stored elements to a single type, but it brings no advantage when it comes to memory usage. However, we can find a workaround for this particular case.

Initial iterator

class InvoiceCollection implements IteratorAggregate
{
    private array $invoices;

    public function __construct()
    {
        $this->invoices = array();
    }

    public function add(Invoice $invoice): void
    {
        $this->invoices[] = $invoice;
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->invoices);
    }
}

// adding to the collection ($invoices is assumed to be an existing list of Invoice objects)
$invoiceCollection = new InvoiceCollection();
foreach ($invoices as $invoice) {
    $invoiceCollection->add($invoice);
}

// consuming the collection
foreach ($invoiceCollection as $invoice) {
    doSomethingWithInvoice($invoice);
}

As you can see, this collection is built on top of an array. There are a few ways to reduce its memory usage. We could switch to a different data structure, but depending on the PHP version this might not bring any major improvement. Instead, we can build our own object storage using the Iterator interface.

Let’s start from the idea that storing an object inside an array takes more memory than storing a string (explaining why is beyond the scope of this article). So if we can turn an object into a string, we have the beginning of a solution. This is simple: we can use the serialize() function. We also need a way to reverse the serialization, and PHP has a function for that too: unserialize().
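A minimal sketch of that round trip is shown below. The Invoice class here is a hypothetical stand-in with two made-up properties, since the article does not show its definition.

```php
<?php
// Hypothetical Invoice class, assumed for illustration only.
class Invoice
{
    public function __construct(
        public int $id,
        public float $amount
    ) {
    }
}

$invoice = new Invoice(1, 99.5);

// serialize() turns the object into a plain string.
$asString = serialize($invoice);

// unserialize() rebuilds an object of the same class with the same property values.
$restored = unserialize($asString);

var_dump(is_string($asString));   // true
var_dump($restored == $invoice);  // true: same class, same property values
var_dump($restored === $invoice); // false: it is a new instance, not the original
```

Note that the restored object is equal to the original but is a separate instance.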

The other half is to call serialize() every time we add an element to the array (so we store a string representation of the object) and to call unserialize() whenever we read an element from the array (so we turn the string back into an object).

Of course, we will solve this mystery with the help of an iterator :)

class ObjectStorage implements Iterator
{
    // Serialized (string) representations of the stored objects.
    private array $objects = array();
    private int $index = 0;

    public function add(object $object): void
    {
        $serializedObject = serialize($object);
        $this->objects[] = $serializedObject;
    }

    // Returns a fresh object rebuilt from the stored string.
    public function current(): mixed
    {
        return unserialize($this->objects[$this->key()]);
    }

    public function next(): void
    {
        $this->index++;
    }

    public function key(): mixed
    {
        return $this->index;
    }

    // isset() states the intent directly: is there an element at this position?
    public function valid(): bool
    {
        return isset($this->objects[$this->key()]);
    }

    public function rewind(): void
    {
        $this->index = 0;
    }
}
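One consequence worth keeping in mind: current() unserializes on every call, so each iteration yields a fresh copy, and changes made to that copy are not written back to the storage. The sketch below illustrates this with a condensed copy of the class above and a hypothetical Invoice that has a single paid flag (both are assumptions made for illustration).

```php
<?php
// Condensed copy of the ObjectStorage idea, repeated so the example runs on its own.
class ObjectStorage implements Iterator
{
    private array $objects = [];
    private int $index = 0;

    public function add(object $object): void
    {
        $this->objects[] = serialize($object);
    }

    public function current(): mixed
    {
        return unserialize($this->objects[$this->index]);
    }

    public function next(): void
    {
        $this->index++;
    }

    public function key(): mixed
    {
        return $this->index;
    }

    public function valid(): bool
    {
        return isset($this->objects[$this->index]);
    }

    public function rewind(): void
    {
        $this->index = 0;
    }
}

// Hypothetical Invoice class, assumed for illustration only.
class Invoice
{
    public bool $paid = false;
}

$storage = new ObjectStorage();
$storage->add(new Invoice());

// Each iteration receives a freshly unserialized copy...
foreach ($storage as $invoice) {
    $invoice->paid = true;
}

// ...so the change above is not visible on the next pass.
$paidFlags = [];
foreach ($storage as $invoice) {
    $paidFlags[] = $invoice->paid;
}

var_dump($paidFlags); // the single stored invoice is still unpaid
```

If the stored objects need to be modified, the modified copy would have to be serialized and stored again, which this class does not do.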

Now that we have the magic ObjectStorage, it is time to change the implementation of our InvoiceCollection.

class InvoiceCollection implements IteratorAggregate
{
    private ObjectStorage $invoices;

    public function __construct()
    {
        $this->invoices = new ObjectStorage();
    }

    public function add(object $invoice): void
    {
        $this->invoices->add($invoice);
    }

    public function getIterator(): Traversable
    {
        return $this->invoices;
    }
}

We changed the type of the $invoices attribute to ObjectStorage, and getIterator now returns the ObjectStorage itself instead of a new ArrayIterator. Its declared return type stays Traversable: because ObjectStorage implements Iterator, it is also a Traversable. We also changed the implementation of the add function to use the ObjectStorage interface.

How much memory do we save?

I ran the script on PHP 8 with 100,000 elements (Invoice objects). In the first case, with plain objects stored, memory usage was 52.85 MB. After the implementation of ObjectStorage, it was 28.44 MB. We managed to save 24.41 MB (46%).
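The measurement script is not shown here. A rough way to run this kind of comparison is sketched below using memory_get_usage(). The Invoice class, its properties, and the measure() and makeInvoice() helpers are all assumptions for illustration, and the numbers you get will depend on the PHP version and on the size and shape of your objects.

```php
<?php
// Hypothetical Invoice class, assumed for illustration only.
class Invoice
{
    public function __construct(
        public int $id,
        public string $customer,
        public array $lines
    ) {
    }
}

// Hypothetical helper that builds one invoice with some per-instance data.
function makeInvoice(int $i): Invoice
{
    return new Invoice($i, "Customer $i", [
        ['item' => "Item $i", 'price' => 10.0 + $i],
        ['item' => "Extra $i", 'price' => 5.0 + $i],
    ]);
}

// Hypothetical helper: returns the growth in memory_get_usage() while the
// data produced by $build is still held in memory.
function measure(callable $build): int
{
    $before = memory_get_usage();
    $data = $build();
    $after = memory_get_usage();

    return $after - $before;
}

$count = 10000;

// Case 1: plain objects stored in an array.
$objectBytes = measure(function () use ($count): array {
    $items = [];
    for ($i = 0; $i < $count; $i++) {
        $items[] = makeInvoice($i);
    }
    return $items;
});

// Case 2: serialized strings stored in an array.
$serializedBytes = measure(function () use ($count): array {
    $items = [];
    for ($i = 0; $i < $count; $i++) {
        $items[] = serialize(makeInvoice($i));
    }
    return $items;
});

printf("objects: %d bytes, serialized: %d bytes\n", $objectBytes, $serializedBytes);
```

Running it with your own classes is the only reliable way to know whether the approach pays off, since small objects with few properties may gain much less than large ones.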

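This is not a perfect solution for every case: it trades CPU time for memory, since every element is serialized when it is added and unserialized each time it is read. Still, there are situations where the memory limit could be reached if no optimization is done, for example when running a script on a large dataset. Feel free to adapt it to your own situation.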

Thank you.
