Why use generators in PHP?

I’ve heard many co-workers, friends, and colleagues at meetups acknowledge that generators exist in PHP without understanding why they would use them, or giving them much thought when planning an implementation. In this article, I’m going to show, with concrete examples, when it is a good idea to use a generator.

In case you don’t know what a generator is, you can check PHP’s documentation on the subject for a good synopsis.

So why would we want to use generators? Any method that returns a list of items, and needs additional logic to keep fetching more of them, is a great candidate for a generator. Let’s take a look at a simple method that returns a MySQL result cursor (a PDOStatement):

class WidgetFactory
{
    public function getAllWidgets()
    {
        return $this->pdo->query('select id, name from widgets');
    }
}

$factory = new WidgetFactory();
$widgets = $factory->getAllWidgets();
while ($row = $widgets->fetch()) {
    echo "{$row['name']}\n";
}

We’ve all seen code like this many times. Without thinking about it, we hand-write the iteration at the call site and, in doing so, couple every caller to PDOStatement and its fetch() method. What happens if this method is used in many places and the underlying storage engine changes to something other than MySQL?

We can refactor the above code using generators like this:

class WidgetFactory
{
    public function getAllWidgets()
    {
        $stmt = $this->pdo->query('select id, name from widgets');
        while ($row = $stmt->fetch()) {
            yield $row;
        }
    }
}

$factory = new WidgetFactory();
foreach ($factory->getAllWidgets() as $widget) {
    echo "{$widget['name']}\n";
}

At first glance it doesn’t look all that different. The method still iterates through the PDOStatement with fetch(), but the consumer of getAllWidgets() now knows nothing about MySQL and no longer has to call fetch() itself. Now suppose we retrieve the widgets from a RESTful service that pages its results:

class WidgetFactory
{
    public function getAllWidgets()
    {
        $next = null;
        do {
            list($next, $widgets) = $this->service->getWidgets($next);
            foreach ($widgets as $widget) {
                yield $widget;
            }
        } while (!empty($next));
    }
}

We were able to completely change the mechanism of retrieving widgets without having to change our consumers in any way. We successfully abstracted the iteration in a very simple way.
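To try the paged version without a real service, here is a minimal runnable sketch. InMemoryWidgetService is a hypothetical stand-in that I made up for illustration; it assumes the service returns a pair of [next cursor, widgets] and uses null to mean "no more pages", which is only one way such an API could behave.

```php
<?php

// Hypothetical stand-in for the paged RESTful service.
// getWidgets() returns [nextCursor, widgets]; a null nextCursor means "last page".
final class InMemoryWidgetService
{
    private array $pages;

    public function __construct(array $pages)
    {
        $this->pages = array_values($pages);
    }

    public function getWidgets(?int $cursor): array
    {
        $index = $cursor ?? 0; // a null cursor asks for the first page
        $next = isset($this->pages[$index + 1]) ? $index + 1 : null;

        return [$next, $this->pages[$index] ?? []];
    }
}

final class WidgetFactory
{
    private InMemoryWidgetService $service;

    public function __construct(InMemoryWidgetService $service)
    {
        $this->service = $service;
    }

    public function getAllWidgets(): Generator
    {
        $next = null;
        do {
            // Fetch one page, then yield its widgets one at a time.
            [$next, $widgets] = $this->service->getWidgets($next);
            foreach ($widgets as $widget) {
                yield $widget;
            }
        } while (!empty($next));
    }
}

$factory = new WidgetFactory(new InMemoryWidgetService([
    [['id' => 1, 'name' => 'sprocket'], ['id' => 2, 'name' => 'gear']],
    [['id' => 3, 'name' => 'cog']],
]));

// The consumer is the same foreach as before; paging is invisible to it.
$names = [];
foreach ($factory->getAllWidgets() as $widget) {
    $names[] = $widget['name'];
}
echo implode("\n", $names), "\n";
```

The foreach at the bottom is identical in shape to the one used with the PDO-backed version, which is the whole point: only the inside of getAllWidgets() knows that pages exist.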

Other Uses

There are, of course, more reasons to use generators. For example, if your code is asynchronous (using amphp, say), generators let your consumers consume values as the async process resolves them. Generators also make coroutines possible, much like Node.js’s co library, which is also built on generators.
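The building block behind generator-based coroutines is that yield works in both directions: it hands a value out to the caller, and the caller can resume the generator with a value of its own through Generator::send(). The sketch below shows only that mechanism. runningTotal() is a hypothetical function of mine, and this is not how amphp or co expose their APIs.

```php
<?php

// Hypothetical coroutine: receives numbers via send() and keeps a running total.
function runningTotal(): Generator
{
    $total = 0;
    while (true) {
        // yield hands the current total to the caller, then evaluates to
        // whatever the caller passes to send().
        $received = yield $total;
        if ($received === null) {
            return $total; // sending null ends the coroutine
        }
        $total += $received;
    }
}

$coroutine = runningTotal();
$coroutine->current();               // run up to the first yield
$afterFirst = $coroutine->send(5);   // resume with 5; returns the next yielded total
$afterSecond = $coroutine->send(10); // resume with 10
$coroutine->send(null);              // ask the coroutine to finish
$final = $coroutine->getReturn();

echo "$afterFirst $afterSecond $final\n"; // prints "5 15 15"
```

An async library builds on the same pause-and-resume behaviour: the coroutine yields something it is waiting on, and a scheduler resumes it once the result is ready.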

Gotchas in Unit Tests

One key thing to remember is that the body of a generator method does not start executing until something attempts to iterate it. Let’s suppose we are writing a test for our widget factory example:

// pseudocode...
$results = $factory->getAllWidgets();
$this->expectsToBeCalled([$factory->pdo, 'query']);

The above code would fail the assertion that $factory->pdo->query() was called, because it hasn’t been called yet. We have to iterate the generator before its body starts executing:

// pseudocode...
$results = iterator_to_array($factory->getAllWidgets());
$this->expectsToBeCalled([$factory->pdo, 'query']);
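The lazy start can be observed without a mocking framework. In this minimal sketch, RecordingStore is a hypothetical stand-in for the mocked PDO object; it only counts how many times query() has run and returns a hard-coded array instead of a statement.

```php
<?php

// Hypothetical stand-in for a mocked PDO: it records how often query() is called.
final class RecordingStore
{
    public int $queryCalls = 0;

    public function query(): array
    {
        $this->queryCalls++;

        return [['id' => 1, 'name' => 'sprocket'], ['id' => 2, 'name' => 'gear']];
    }
}

function getAllWidgets(RecordingStore $store): Generator
{
    // Nothing in this body runs until the generator is iterated.
    foreach ($store->query() as $row) {
        yield $row;
    }
}

$store = new RecordingStore();

$widgets = getAllWidgets($store);            // creates the generator only
$callsBeforeIteration = $store->queryCalls;  // still 0

$rows = iterator_to_array($widgets);         // iterating runs the body
$callsAfterIteration = $store->queryCalls;   // now 1

echo "before: $callsBeforeIteration, after: $callsAfterIteration\n";
```

In a real test the same rule applies: drain or at least start iterating the generator before asserting on anything its body is supposed to have called.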

This sort of abstraction has been possible with the Iterator interface since PHP 5.0. However, it is inconvenient to implement that interface in a separate class and then use it in the caller, especially if the same class needs several ways of iterating over a similar data set. What I hope this article conveys is that generators let you abstract iteration away from your consumers, which simplifies future changes.