Creating strictly typed arrays and collections in PHP
One of the language features introduced in PHP 5.6 was the “…” token, which denotes that a function or method accepts a variable number of arguments.
Something I rarely see mentioned is that it’s possible to combine this feature with type hints to essentially create typed arrays.
For example, we could have a Movie class with a method to set an array of air dates that only accepts DateTimeImmutable objects:
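A minimal sketch of such a class, assuming a private $airDates property and a matching getter (both names are illustrative):

```php
<?php

class Movie
{
    /** @var DateTimeImmutable[] */
    private $airDates = [];

    // The "..." token collects all passed arguments into one array,
    // and the type hint is checked against every single argument.
    public function setAirDates(DateTimeImmutable ...$airDates)
    {
        $this->airDates = $airDates;
    }

    public function getAirDates()
    {
        return $this->airDates;
    }
}
```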
We can now pass a variable number of separate DateTimeImmutable objects to the setAirDates() method:
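A call could look roughly like this. The Movie class is repeated in condensed form so the snippet runs on its own, and the dates are made up for illustration:

```php
<?php

class Movie
{
    private $airDates = [];

    public function setAirDates(DateTimeImmutable ...$airDates)
    {
        $this->airDates = $airDates;
    }

    public function getAirDates()
    {
        return $this->airDates;
    }
}

$movie = new Movie();

// Each argument is a separate DateTimeImmutable object.
$movie->setAirDates(
    new DateTimeImmutable('2017-01-28'),
    new DateTimeImmutable('2017-02-22')
);
```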
If we were to pass anything other than a DateTimeImmutable, a string for example, PHP would raise a fatal error (as of PHP 7 this takes the form of an uncaught TypeError):
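The sketch below assumes PHP 7 or later, where the failed type check surfaces as a TypeError that can be caught. The condensed Movie class and the $rejected flag exist only for this illustration:

```php
<?php

class Movie
{
    private $airDates = [];

    public function setAirDates(DateTimeImmutable ...$airDates)
    {
        $this->airDates = $airDates;
    }

    public function getAirDates()
    {
        return $this->airDates;
    }
}

$movie = new Movie();
$rejected = false;

try {
    // The second argument is a string, so the whole call is refused
    // before the method body runs.
    $movie->setAirDates(new DateTimeImmutable('2017-01-28'), '2017-02-22');
} catch (TypeError $e) {
    $rejected = true;
}
```

Left uncaught, the same TypeError would end the script with a fatal error.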
If we instead already had an array of DateTimeImmutable objects that we wanted to pass to setAirDates(), we could again use the “…” token, but this time to unpack them:
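A short sketch of unpacking, again with a condensed Movie class and illustrative dates:

```php
<?php

class Movie
{
    private $airDates = [];

    public function setAirDates(DateTimeImmutable ...$airDates)
    {
        $this->airDates = $airDates;
    }

    public function getAirDates()
    {
        return $this->airDates;
    }
}

$dates = [
    new DateTimeImmutable('2017-01-28'),
    new DateTimeImmutable('2017-02-22'),
];

$movie = new Movie();

// At the call site "..." spreads the array into separate arguments,
// so every element still goes through the type check.
$movie->setAirDates(...$dates);
```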
If the array were to contain a value that is not of the expected type, we would still get the fatal error mentioned earlier.
Additionally, we can use scalar types the same way starting from PHP 7. For example, we can add a method to set a list of ratings as floats on our Movie class:
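A possible version of these methods is sketched below. The $ratings property name and the choice to return 0.0 for an unrated movie are assumptions made for this example:

```php
<?php

class Movie
{
    /** @var float[] */
    private $ratings = [];

    // Scalar type declarations require PHP 7 or later.
    public function setRatings(float ...$ratings)
    {
        $this->ratings = $ratings;
    }

    public function getAverageRating(): float
    {
        if (empty($this->ratings)) {
            return 0.0; // assumption: no ratings means an average of 0
        }

        // Safe to do math here: every element is guaranteed to be a float.
        return array_sum($this->ratings) / count($this->ratings);
    }
}

$movie = new Movie();
$movie->setRatings(4.0, 5.0, 3.0);
$average = $movie->getAverageRating();
```

Note that without declare(strict_types=1), PHP will silently convert integers and numeric strings to floats here instead of rejecting them.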
Again, this ensures that the ratings property will always contain floats without us having to loop over all the contents to validate them. So now we can easily do some math operations on them in getAverageRating(), without having to worry about invalid types.
Problems with this kind of typed array
One of the downsides of using this feature for typed arrays is that we can only define one such array per method, because a function can have only one variadic parameter and it has to come last. Let’s say we wanted to have a Movie class that expects a list of air dates together with a list of ratings in the constructor, instead of setting them later via optional methods. This would be impossible with the approach used above.
Another problem is that when using PHP 7, the return types of our get() methods would still have to be “array”, which is often too generic.
Solution: Collection classes
To fix both problems, we can simply inject our typed arrays inside so-called “collection” classes. This also improves our separation of concerns, because we can now move the calculation method for the average rating to the relevant collection class:
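A sketch of what such a collection could look like, assuming a class named Ratings with a getAverage() method (the empty-collection behaviour is an assumption):

```php
<?php

class Ratings
{
    /** @var float[] */
    private $ratings;

    public function __construct(float ...$ratings)
    {
        $this->ratings = $ratings;
    }

    // The calculation now lives with the data it operates on.
    public function getAverage(): float
    {
        if (empty($this->ratings)) {
            return 0.0; // assumption: an empty collection averages to 0
        }

        return array_sum($this->ratings) / count($this->ratings);
    }
}

$ratings = new Ratings(4.0, 5.0, 3.0);
$average = $ratings->getAverage();
```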
Notice how we’re still using a list of typed arguments with a variable length in our constructor, which saves us the trouble of looping over each rating to check its type.
If we wanted the ability to use this collection class in foreach loops, we’d simply have to implement the IteratorAggregate interface:
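One way to do this is to wrap the internal array in an ArrayIterator, as in the sketch below (the average calculation is left out to keep it short):

```php
<?php

class Ratings implements IteratorAggregate
{
    /** @var float[] */
    private $ratings;

    public function __construct(float ...$ratings)
    {
        $this->ratings = $ratings;
    }

    // foreach calls this method to obtain something it can iterate over.
    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->ratings);
    }
}

$seen = [];
foreach (new Ratings(4.0, 5.0) as $rating) {
    $seen[] = $rating;
}
```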
Moving on, we can also create a collection for our list of air dates:
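This collection could follow the same pattern. In the sketch below the class name AirDates matches the one used later in the article, while the $dates property name is an assumption:

```php
<?php

class AirDates implements IteratorAggregate
{
    /** @var DateTimeImmutable[] */
    private $dates;

    public function __construct(DateTimeImmutable ...$dates)
    {
        $this->dates = $dates;
    }

    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->dates);
    }
}

$airDates = new AirDates(
    new DateTimeImmutable('2017-01-28'),
    new DateTimeImmutable('2017-02-22')
);
```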
Putting all the pieces of the puzzle together in the Movie class, we can now inject two separately typed collections in our constructor. Additionally we can define more specific return types than “array” on our get methods:
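A condensed sketch of how the pieces might fit together. The two collection classes are reduced to their constructors here so the example stays self-contained:

```php
<?php

class AirDates
{
    private $dates;

    public function __construct(DateTimeImmutable ...$dates)
    {
        $this->dates = $dates;
    }
}

class Ratings
{
    private $ratings;

    public function __construct(float ...$ratings)
    {
        $this->ratings = $ratings;
    }
}

class Movie
{
    private $airDates;
    private $ratings;

    // Two typed lists in one signature, each wrapped in its own collection.
    public function __construct(AirDates $airDates, Ratings $ratings)
    {
        $this->airDates = $airDates;
        $this->ratings = $ratings;
    }

    // The return types are now more specific than "array".
    public function getAirDates(): AirDates
    {
        return $this->airDates;
    }

    public function getRatings(): Ratings
    {
        return $this->ratings;
    }
}

$movie = new Movie(
    new AirDates(new DateTimeImmutable('2017-01-28')),
    new Ratings(4.0, 5.0)
);
```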
Using value objects for custom validation
If we wanted to add extra validation to our ratings we could still go one step further, and define a Rating value object with some custom constraints. For example, a rating could be limited between 0 and 5:
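Such a value object might look like the sketch below. The exception type and the getValue() accessor are assumptions for this example:

```php
<?php

class Rating
{
    /** @var float */
    private $value;

    public function __construct(float $value)
    {
        // Reject anything outside the allowed range at construction time.
        if ($value < 0 || $value > 5) {
            throw new InvalidArgumentException('A rating must be between 0 and 5.');
        }

        $this->value = $value;
    }

    public function getValue(): float
    {
        return $this->value;
    }
}

$rating = new Rating(4.5);
```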
Back in our Ratings collection class, we would only have to do some minor alterations to use these value objects instead of floats:
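The changes could be as small as swapping the type hint and unwrapping the values before averaging. The sketch below includes a condensed Rating class (with an assumed getValue() accessor) so it runs on its own:

```php
<?php

class Rating
{
    private $value;

    public function __construct(float $value)
    {
        if ($value < 0 || $value > 5) {
            throw new InvalidArgumentException('A rating must be between 0 and 5.');
        }
        $this->value = $value;
    }

    public function getValue(): float
    {
        return $this->value;
    }
}

class Ratings
{
    /** @var Rating[] */
    private $ratings;

    // Only the type hint changes: Rating objects instead of floats.
    public function __construct(Rating ...$ratings)
    {
        $this->ratings = $ratings;
    }

    public function getAverage(): float
    {
        if (empty($this->ratings)) {
            return 0.0; // assumption: an empty collection averages to 0
        }

        // Unwrap each value object to its float before doing the math.
        $values = array_map(function (Rating $rating) {
            return $rating->getValue();
        }, $this->ratings);

        return array_sum($values) / count($values);
    }
}

$ratings = new Ratings(new Rating(4.0), new Rating(5.0), new Rating(3.0));
$average = $ratings->getAverage();
```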
This way we get additional validation of individual collection members, still without having to loop over each injected object.
Typing out these separate collection classes and value objects may seem like a lot of work, but they have several advantages over generic arrays and scalar values:
- Easy type validation in one place. We never have to manually loop over an array to validate the types of our collection members;
- Wherever we use these collections and value objects in our application, we know that their values have always been validated upon construction. For example, any Rating will always be between 0 and 5;
- We can easily add custom logic per collection and/or value object. For example the getAverage() method, which we can re-use throughout our whole application;
- We get the possibility to inject multiple typed lists in a single function or method, which we cannot do using the “…” token without injecting the values in collection classes first;
- The odds of mixing up arguments in method signatures are significantly reduced. For example, when we want to inject both a list of ratings and a list of air dates, the two could easily get swapped by accident upon construction when using generic arrays.
What about edits?
By now you might be wondering how you could make changes to the values of your collections and value objects after initial construction.
While we could add methods to facilitate edits, this would quickly become cumbersome because we would have to duplicate most methods on each collection to keep the advantage of type hints. For example, an add() method on Ratings should only accept a Rating object, while an add() method on AirDates should only accept a DateTimeImmutable object. This makes interfacing and/or re-use of these methods very hard.
Instead, we could simply keep our collections and value objects immutable, and convert them to their primitive types when we need to make changes. After we’re done making changes, we can simply re-construct any necessary collections or value objects with the updated values. Upon (re-)construction all types would be validated again, along with any extra validation we might have defined.
For example, we could add a simple toArray() method to our collections, and make changes like this:
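A sketch of that round trip, using a condensed AirDates collection. The cutoff date and variable names are made up for illustration:

```php
<?php

class AirDates
{
    /** @var DateTimeImmutable[] */
    private $dates;

    public function __construct(DateTimeImmutable ...$dates)
    {
        $this->dates = $dates;
    }

    public function toArray(): array
    {
        return $this->dates;
    }
}

$airDates = new AirDates(
    new DateTimeImmutable('2017-01-28'),
    new DateTimeImmutable('2017-07-01')
);

// Convert to a plain array and keep only the dates after the cutoff.
$cutoff = new DateTimeImmutable('2017-06-30');
$filtered = array_filter(
    $airDates->toArray(),
    function (DateTimeImmutable $date) use ($cutoff) {
        return $date > $cutoff;
    }
);

// array_filter() preserves keys, so re-index before unpacking.
// Re-constructing the collection runs the type checks again.
$airDates = new AirDates(...array_values($filtered));
```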
This way we can also re-use existing array functionality like array_filter().
If we really needed to do edits on the collection objects themselves, we could add the necessary methods on a need-to-have basis wherever they are required. But keep in mind that most of those will also have to do type validation of the given argument(s), so it’s hard to re-use them across all different collection classes.
Re-using generic methods
As you may have noticed, we are still getting some code duplication across our collection classes by implementing both toArray() and getIterator() on all of them. Luckily these methods are generic enough to move to a generic parent class, as they both simply return the injected array:
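A sketch of such a parent class. The name Collection is an assumption, and getIterator() wraps the array in an ArrayIterator rather than returning it directly:

```php
<?php

abstract class Collection implements IteratorAggregate
{
    /** @var array Filled by the constructor of each child class. */
    protected $values = [];

    public function toArray(): array
    {
        return $this->values;
    }

    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->values);
    }
}
```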
All we would be left with in our collection class would be type validation in the constructor, and any optional extra logic that is specific to that collection, like this:
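The sketch below shows a Ratings collection built on top of a hypothetical Collection parent. The parent and the Rating value object are repeated in condensed form so the example runs on its own:

```php
<?php

abstract class Collection implements IteratorAggregate
{
    protected $values = [];

    public function toArray(): array
    {
        return $this->values;
    }

    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->values);
    }
}

class Rating
{
    private $value;

    public function __construct(float $value)
    {
        if ($value < 0 || $value > 5) {
            throw new InvalidArgumentException('A rating must be between 0 and 5.');
        }
        $this->value = $value;
    }

    public function getValue(): float
    {
        return $this->value;
    }
}

class Ratings extends Collection
{
    // Type validation is the only thing the constructor has to do.
    public function __construct(Rating ...$ratings)
    {
        $this->values = $ratings;
    }

    // Logic that is specific to this collection.
    public function getAverage(): float
    {
        if (empty($this->values)) {
            return 0.0; // assumption: an empty collection averages to 0
        }

        $floats = array_map(function (Rating $rating) {
            return $rating->getValue();
        }, $this->values);

        return array_sum($floats) / count($floats);
    }
}

$ratings = new Ratings(new Rating(4.0), new Rating(5.0), new Rating(3.0));
$average = $ratings->getAverage();
```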
Optionally we could make our collection final, to prevent any child classes from messing with the values property in ways that could undo our type validation.
While still far from perfect, it has steadily been getting easier to work with type validation in collections and value objects with recent releases of PHP.
Ideally we’d get some form of generics in a future version of PHP to further facilitate the creation of re-usable collection classes.
A feature that would greatly improve the usage of value objects would be the ability to cast an object to different primitive types, in addition to string. This could easily be implemented by adding extra magic methods comparable to __toString(), like __toInt(), __toFloat(), etc.
Luckily there are some RFCs in progress to possibly implement both features in later versions, so fingers crossed! 🤞
- Generics: https://wiki.php.net/rfc/generics
- Generic arrays: https://wiki.php.net/rfc/generic-arrays
- Casting object to scalar: https://wiki.php.net/rfc/class_casting_to_scalar