PHP generators, or how to simplify iterators

Did you know that a PHP function can return multiple times?

Alin Pintilie
Feb 26, 2023

Although a function normally returns a single value when called, PHP has a keyword that lets a function produce a whole sequence of values, in effect “simulating” multiple returns: “yield”. If the term is unfamiliar to you, it was to me as well, but it is good to know that it exists. The PHP documentation calls it “the heart of a generator function”. And what exactly is a generator function? The same documentation tells us that “any function containing yield is a generator function”. The explanation is short, and the concept itself is not complicated either, as we will see below.

Implementation of a generator function

Yield

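As the definition above states, a function becomes a generator function as soon as its body contains “yield”. In the simplest case, we use “yield” where we would otherwise collect values and “return” them. Calling such a function does not run its body; it returns a Generator object that produces each yielded value as we iterate over it.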

function collectInvoices(): Generator
{
    $invoiceService = new InvoiceService();
    $invoices = $invoiceService->getAllInvoices();
    foreach ($invoices as $invoice) {
        // do something with invoice
        yield $invoice;
    }
}
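To see the “multiple returns” in action without any service classes, here is a minimal self-contained sketch. The function name countTo and the $log array are made up for illustration. The point is that calling the function only creates a Generator object, and the body runs piece by piece as the loop asks for values.

```php
<?php

// Hypothetical helper: yields 1..$limit and records when its body actually runs.
function countTo(int $limit, array &$log): Generator
{
    $log[] = 'body started';
    for ($i = 1; $i <= $limit; $i++) {
        yield $i; // execution pauses here until the consumer asks for the next value
    }
    $log[] = 'body finished';
}

$log = [];
$numbers = countTo(3, $log); // nothing has run yet, so $log is still empty
$logBeforeLoop = $log;

$seen = [];
foreach ($numbers as $number) {
    $seen[] = $number;
}
```

After the loop, $seen holds the three yielded values, while $logBeforeLoop is still an empty array because the body had not started when it was copied.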

Yield from

Yield from, also called generator delegation, is another statement that many PHP programmers are unfamiliar with, but it is just as simple to use. It allows you to yield values from another generator, a Traversable object, or an array by using the yield from keyword.

function collectInvoices(): Generator
{
    $invoiceService = new InvoiceService();
    yield from $invoiceService->getAllInvoices();
}
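A small sketch of delegation with made-up data (the function names and invoice numbers are hypothetical). One detail worth knowing is that yield from keeps the keys of the inner source, so when several sources are combined their keys can collide if the result is turned into an array.

```php
<?php

// Hypothetical sources: one generator and one plain array.
function paidInvoices(): Generator
{
    yield 'INV-001';
    yield 'INV-002';
}

function allInvoices(): Generator
{
    yield from paidInvoices();         // delegate to another generator
    yield from ['INV-003', 'INV-004']; // delegate to an array
}

// Both sources use the keys 0 and 1, so we ask for the values only
// (second argument false) to avoid later entries overwriting earlier ones.
$invoices = iterator_to_array(allInvoices(), false);

// With keys preserved, the array's entries replace the generator's.
$withKeys = iterator_to_array(allInvoices(), true);
```

Iterating with foreach is not affected by the duplicate keys; the collision only matters when the values are collected into an array by key.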

What is the actual benefit of a generator?

Lazy functions

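Generators are actually Iterators: the object returned by a generator function is an instance of the built-in Generator class, which implements the Iterator interface. If you’re not familiar with this concept, you can take a look at the article where I presented it. One of the advantages of an Iterator is lazy loading, meaning that each value is produced only when it is requested, and this also applies to generators. In short, we can use generators whenever we want a lazy loading mechanism.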

Using a generator is similar to using an Iterator, meaning that we can iterate over it.

$invoices = $invoiceService->collectInvoices();
foreach ($invoices as $invoice) {
    $storingService->store($invoice);
}
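To make the link with Iterator concrete, the sketch below (with a made-up letters() function) checks the interface and then does by hand what foreach does for us behind the scenes.

```php
<?php

// Hypothetical generator function used only for this illustration.
function letters(): Generator
{
    yield 'a';
    yield 'b';
    yield 'c';
}

$generator = letters();

// Generator implements the built-in Iterator interface.
$isIterator = $generator instanceof Iterator;

// The same steps foreach performs, written out explicitly.
$collected = [];
while ($generator->valid()) {
    $collected[] = $generator->current(); // value of the current yield
    $generator->next();                   // resume the body until the next yield
}
```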

When it comes to memory efficiency, the difference is significant: for 41814 elements retrieved from the database and loaded into memory all at once, 19.14 MB of memory was used. For the same number of elements using a generator, only 0.55 MB was used.

Improve space complexity

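A very important aspect to mention here is that, by using a generator, we can reduce the space complexity (in big O notation) of the iteration from O(n) to O(1). This means that, regardless of the size of the input, the amount of memory used stays roughly constant. This holds only if the underlying source also delivers elements one at a time and the consumer does not accumulate them; a generator that wraps an array already loaded in full still costs O(n).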
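The effect can be reproduced on a small scale with plain integers instead of database rows. This is only a rough sketch: eagerNumbers(), lazyNumbers() and measure() are hypothetical helpers, and the exact byte counts depend on the PHP version, so no figures are quoted here.

```php
<?php

// Eager version: builds the whole list in memory before returning it.
function eagerNumbers(int $count): array
{
    $numbers = [];
    for ($i = 0; $i < $count; $i++) {
        $numbers[] = $i;
    }
    return $numbers;
}

// Lazy version: hands out one number at a time.
function lazyNumbers(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield $i;
    }
}

// Hypothetical helper: sums whatever $makeSource produces and tracks the
// largest amount of extra memory observed while iterating over it.
function measure(callable $makeSource): array
{
    $before = memory_get_usage();
    $source = $makeSource();

    $sum = 0;
    $peak = 0;
    foreach ($source as $number) {
        $sum += $number;
        $peak = max($peak, memory_get_usage() - $before);
    }
    return ['sum' => $sum, 'bytes' => $peak];
}

$eager = measure(fn () => eagerNumbers(100000));
$lazy  = measure(fn () => lazyNumbers(100000));

printf("array: %d bytes, generator: %d bytes\n", $eager['bytes'], $lazy['bytes']);
```

Both runs compute the same sum, but the array version has to hold all 100000 values at once, while the generator version only ever holds the current one.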

Using inside Laravel Collection

In Laravel, a Collection is an iterable wrapper around an array (it implements IteratorAggregate rather than Iterator itself). Below is a simple example where we initialize a Collection.

use Illuminate\Support\Collection;

$collection = new Collection([1, 2, 3, 4]);
$collection->push(5);

In the case above, an array with the respective elements is stored in memory. For a handful of elements, this is of course not a problem. However, there are situations where the number of elements that need to be stored in memory is large enough to have a significant impact.

A more concrete example would be when we want to retrieve data from a database.

$invoices = Invoice::orderBy('invoice_date')->take(100000)->get();

foreach ($invoices as $invoice) {
    // do something with invoice
}

To avoid problems caused by memory overload, we can use Lazy Collections. There are several ways to obtain a lazy collection:

cursor()

$invoices = Invoice::orderBy('invoice_date')->take(100000)->cursor();

foreach ($invoices as $invoice) {
    // do something with invoice
}

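This method returns an object of type Illuminate\Support\LazyCollection and executes only a single database query. The models are hydrated one at a time, as we iterate over them, so only one Eloquent model is held in memory at any given moment.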

lazy()

$invoices = Invoice::orderBy('invoice_date')->take(100000)->lazy();

foreach ($invoices as $invoice) {
    // do something with invoice
}

This method also returns an object of type Illuminate\Support\LazyCollection, but the mechanism behind it is similar to the one implemented by the chunk() method, meaning that it executes the query in chunks. Because it returns a LazyCollection, the chunking is invisible to the caller, and we can interact with the results as a single stream.
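The idea of “chunks behind a single stream” can be sketched in plain PHP with a generator. This is not Laravel's implementation: fetchPage() is a hypothetical stand-in for a paginated query, and an in-memory array plays the role of the table.

```php
<?php

// Hypothetical stand-in for a paginated query: returns up to $size rows
// starting at $offset and counts how many "queries" were issued.
function fetchPage(array $table, int $offset, int $size, int &$queries): array
{
    $queries++;
    return array_slice($table, $offset, $size);
}

// Pulls rows chunk by chunk but exposes them as one continuous stream.
function lazyRows(array $table, int $chunkSize, int &$queries): Generator
{
    $offset = 0;
    do {
        $rows = fetchPage($table, $offset, $chunkSize, $queries);
        foreach ($rows as $row) {
            yield $row; // the consumer never sees where a chunk ends
        }
        $offset += $chunkSize;
    } while (count($rows) === $chunkSize); // a short chunk means we reached the end
}

$table = range(1, 10); // pretend this is a database table with 10 rows
$queries = 0;

$seen = [];
foreach (lazyRows($table, 4, $queries) as $row) {
    $seen[] = $row;
}
```

With 10 rows and a chunk size of 4, the loop receives all 10 rows in order while three page fetches happen in the background.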

Yield

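use Illuminate\Support\LazyCollection;

$invoices = LazyCollection::make(function () {
    $file = new SplFileObject('import_data.ndjson');

    while (!$file->eof()) {
        $line = trim($file->fgets());
        if ($line === '') {
            continue; // skip blank lines, such as the one after a trailing newline
        }
        yield json_decode($line);
    }
});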

foreach ($invoices as $invoice) {
    // do something with invoice
}

To initialize a Lazy Collection, we will call the make() function which takes a generator function (that yields the values) as a parameter.
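A likely reason for passing a function rather than an already created generator is that a generator can be consumed only once, whereas a function can be called again to get a fresh one. The sketch below illustrates this with ReplayableSequence, a hypothetical and heavily simplified class that is not Laravel's LazyCollection.

```php
<?php

// Hypothetical stand-in for a lazy collection: it stores the generator
// *function* and calls it anew every time it is iterated.
final class ReplayableSequence implements IteratorAggregate
{
    private Closure $source;

    public function __construct(Closure $source)
    {
        $this->source = $source;
    }

    public function getIterator(): Generator
    {
        return ($this->source)(); // fresh generator for each loop
    }
}

$makeNumbers = function (): Generator {
    yield 1;
    yield 2;
    yield 3;
};

// A bare generator can be traversed only once.
$generator = $makeNumbers();
$firstPass = iterator_to_array($generator, false);
$secondPassFailed = false;
try {
    foreach ($generator as $value) {
        // never reached: the generator is already closed
    }
} catch (Exception $e) {
    $secondPassFailed = true;
}

// The wrapper can be looped over as often as needed.
$sequence = new ReplayableSequence($makeNumbers);
$passOne = iterator_to_array($sequence, false);
$passTwo = iterator_to_array($sequence, false);
```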

As we have seen in these examples, a generator can be a much simpler way of implementing an iterator. Its best-known benefit is lazy loading, which can significantly reduce memory usage.

Thank you!
