Array Oriented Programming
If you work with PHP long enough you’ll inevitably find a class method that looks something like this:
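A minimal sketch of such a method, reconstructed from the breakdown that follows. The ApiClient parent class, its stubbed request method, the exception type, and the choice to default 'name' to the username are all assumptions made for illustration.

```php
<?php

// Hypothetical parent class; the article only tells us it exposes request().
class ApiClient
{
    protected function request(array $params): object
    {
        // Stand-in for a real API call: echo the params back as an object.
        return (object) ['query' => $params];
    }
}

class User extends ApiClient
{
    public function search(array $params): array
    {
        // Validate: 'username' is required.
        if (empty($params['username'])) {
            throw new InvalidArgumentException('A username is required.');
        }

        // Massage the array into the shape request() expects.
        // Assumption: 'name' falls back to the username.
        if (!isset($params['name'])) {
            $params['name'] = $params['username'];
        }

        $response = $this->request($params);

        // Convert the response object to an array before returning it.
        return (array) $response;
    }
}
```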
There’s nothing particularly wrong with the above code; it gets the job done. Let’s break down those dozen lines in the search method:
- It first validates the $params array, ensuring the required 'username' key exists; otherwise it throws an exception,
- modifies the $params array to the format the request method expects by adding a 'name' key if it does not exist,
- calls the parent request method, passing the $params along, and
- finally converts the response to an array before returning it.
I’ve intentionally kept the example brief, but code written in this style tends to grow quickly as new params are added or new business rules are introduced. PHP arrays are open to ad-hoc modifications, so any code that works with the array ends up littered with empty() checks, null coalescing operators, or other validations.
That’s because the author of this code (yours truly) has assigned implied context to a freeform data structure. The array represents parameters for a search, and there are rules about how those parameters should be composed.
Just Use Objects
Let’s see if we can clean up the example with an object instead of an array of params. First we need a new class to represent the search parameters:
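As a starting point, such a class might be nothing more than a bag of public properties. This is only a sketch; the property names mirror the array keys used by search and are otherwise an assumption.

```php
<?php

// Hypothetical first pass: a plain data holder with no validation at all.
class SearchParams
{
    /** @var string|null */
    public $username;

    /** @var string|null */
    public $name;
}
```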
Now the original search function can be written like this:
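One way the slimmed-down method could look, assuming a plain SearchParams class with public username and name properties. The ApiClient parent and its stubbed request method are hypothetical stand-ins so the snippet runs on its own.

```php
<?php

// Hypothetical stand-in for the parent class; only request() is known from the text.
class ApiClient
{
    protected function request(array $params): object
    {
        return (object) ['query' => $params];
    }
}

// Assumed shape of the parameter object: public properties, no validation.
class SearchParams
{
    public $username;
    public $name;
}

class User extends ApiClient
{
    public function search(SearchParams $params): array
    {
        // The object is taken as-is; it is flattened to an array only
        // because the parent request() still expects one.
        $response = $this->request(get_object_vars($params));

        return (array) $response;
    }
}
```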
So the search method is now pretty simple. On the other hand, to use the method we would have to do this:
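For instance, every caller might have to repeat something like the following before calling search. This is a sketch: the $input array is made up, and the rule that 'name' falls back to the username is an assumption.

```php
<?php

// Assumed shape of the unvalidated parameter object.
class SearchParams
{
    public $username;
    public $name;
}

// Raw input, for example from a form submission (hypothetical values).
$input = ['username' => 'jane'];

// The validation has not gone away; it has simply moved to the caller.
if (empty($input['username'])) {
    throw new InvalidArgumentException('A username is required.');
}

$params = new SearchParams();
$params->username = $input['username'];
$params->name = $input['name'] ?? $input['username'];

// The object is now ready to hand to the search method:
// $results = $user->search($params);
```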
So far, we haven’t gained much. The search function is short and sweet, but now the data validation must be done everywhere the SearchParams class is created and used. We don’t want to force a user of this interface to validate their own data.
Encapsulate
Let’s rewrite the SearchParams class to encapsulate the validation of its params:
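A minimal sketch of what such a class could look like. The method names fromArray and toArray, the exception type, and the 'name' fallback are assumptions; only the private constructor and the static factory come from the text.

```php
<?php

final class SearchParams
{
    /** @var string */
    private $username;

    /** @var string */
    private $name;

    // Private: instances can only be created through the factory below.
    private function __construct(string $username, string $name)
    {
        $this->username = $username;
        $this->name = $name;
    }

    // Static factory: the single place where the validation rules live.
    public static function fromArray(array $params): self
    {
        if (empty($params['username'])) {
            throw new InvalidArgumentException('A username is required.');
        }

        return new self(
            $params['username'],
            $params['name'] ?? $params['username']
        );
    }

    // Array view for code that still expects an array of params.
    public function toArray(): array
    {
        return ['username' => $this->username, 'name' => $this->name];
    }
}
```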
Notice the private constructor and static factory method. This ensures that the only way to create a SearchParams object is to pass the data through the factory, which enforces the validation rules.
Now it can be used like this:
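A usage sketch under the same assumptions: a SearchParams class with a static fromArray factory and a toArray method (both names hypothetical), plus a stubbed ApiClient parent. The classes are compact stand-ins so the snippet runs on its own.

```php
<?php

// Compact stand-in for a validated parameter object with a static factory.
final class SearchParams
{
    private $data;

    private function __construct(array $data)
    {
        $this->data = $data;
    }

    public static function fromArray(array $params): self
    {
        if (empty($params['username'])) {
            throw new InvalidArgumentException('A username is required.');
        }

        return new self([
            'username' => $params['username'],
            'name'     => $params['name'] ?? $params['username'],
        ]);
    }

    public function toArray(): array
    {
        return $this->data;
    }
}

// Hypothetical parent class with a stubbed request().
class ApiClient
{
    protected function request(array $params): object
    {
        return (object) ['query' => $params];
    }
}

class User extends ApiClient
{
    public function search(SearchParams $params): array
    {
        // No validation here: a SearchParams instance is valid by construction.
        return (array) $this->request($params->toArray());
    }
}

// Callers hand over raw input once; invalid input fails at the factory.
$user = new User();
$results = $user->search(SearchParams::fromArray(['username' => 'jane']));
```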
Profit
At this point we have some real benefits to discuss. Data validation and search param manipulation have been moved outside of the search itself. We no longer have a freeform structure that we cannot trust; the code in the User class can forget about ensuring the data looks a particular way.
However, we still have some arrays floating around. The parent class has a request method that expects an array. If we’re in control of that class we should consider refactoring it as well.
Another Example
A common array-oriented pattern involves building an array from several operations. Take a look at this painful example:
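A sketch of the kind of method being described, pieced together from the walkthrough that follows. The Subscription class name, the default values, the direction of the key mapping, and the stubbed create, getProcessor, and process bodies are all assumptions.

```php
<?php

class Subscription
{
    public function subscribe(array $post): array
    {
        // 1. Set defaults (with key remapping), then merge in whatever was passed.
        $transaction = array_merge([
            'amount'   => $post['total'] ?? 0,
            'currency' => $post['currency_code'] ?? 'USD',
            'status'   => 'new',
        ], $post);
        $transaction = $this->create($transaction);

        // 2. Look up processor data and bolt it onto the same array.
        $transaction['processor'] = $this->getProcessor($transaction);

        // 3. "Process" the data and record the outcome on the same array.
        $transaction['status'] = $this->process($transaction);

        // 4. Strip sensitive keys before returning.
        unset($transaction['card-number'], $transaction['ccv']);

        return $transaction;
    }

    // Stubs standing in for the real implementations.
    private function create(array $transaction): array
    {
        $transaction['id'] = 1;

        return $transaction;
    }

    private function getProcessor(array $transaction): array
    {
        return ['name' => 'example-gateway'];
    }

    private function process(array $transaction): string
    {
        return 'processed';
    }
}
```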
The code above is a mess; it suffers from the same problems as the first example. However, there’s another anti-pattern lurking in here that I want to highlight. The $transaction array is modified several times, each time growing or shrinking into a different bag of parameters before being passed to other methods to do some work.
- Default values are initially set. There’s some annoying, extraneous mapping here between total and amount, currency and currency_code, etc. Then we merge in whatever was originally passed, handing the array over to the create method.
- Some of the data from the $transaction array is used to retrieve some processor data, which is then added back to the original $transaction array.
- The data is “processed” and the status of that operation is added to the original array.
- Some sensitive keys are deleted from the array before returning it.
I have a lot of questions.
- What is the initial state of this “parameter bag” when it was passed to the subscribe function?
- Does the process method need all the keys of that array to be set, or just some of them?
- Superfluous data is likely being passed to create, getProcessor, and process; for example, "total", "subtotal", and "currency_code" were optionally passed in by the original call. Are any of those keys used in each method?
- Were the "card-number" and "ccv" keys passed in with the original call, or were they added to the array by reference somewhere along the way?
You cannot look at this code and get a good picture of what is happening under the hood without tracing the life of $transaction and keeping many code paths in your head. The “transaction” in this case is smeared across many methods. It’s easier to unintentionally write complicated code like this when everything is an array.
The solution here is to ditch the arrays.
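A sketch of how subscribe might read once each stage has its own type. Only the names PostParams, Transaction, and Processor come from the text; the Subscription class, the fields, the validation rules, and the stubbed method bodies are assumptions, included so the snippet runs.

```php
<?php

class Subscription
{
    public function subscribe(PostParams $params): Transaction
    {
        $transaction = $this->create($params);
        $processor = $this->getProcessor($transaction);
        $this->process($transaction, $processor, $params);

        // The Transaction never held card data, so there is nothing to unset.
        return $transaction;
    }

    // Stubs standing in for the real implementations.
    private function create(PostParams $params): Transaction
    {
        return new Transaction($params->amount(), $params->currency());
    }

    private function getProcessor(Transaction $transaction): Processor
    {
        return new Processor('example-gateway');
    }

    private function process(Transaction $transaction, Processor $processor, PostParams $params): void
    {
        // A real implementation would charge $params->cardNumber() through $processor.
        $transaction->markProcessed();
    }
}

// Validated input; keys and rules are assumptions for illustration.
final class PostParams
{
    private $amount;
    private $currency;
    private $cardNumber;
    private $ccv;

    private function __construct(int $amount, string $currency, string $cardNumber, string $ccv)
    {
        $this->amount = $amount;
        $this->currency = $currency;
        $this->cardNumber = $cardNumber;
        $this->ccv = $ccv;
    }

    public static function fromArray(array $post): self
    {
        foreach (['total', 'card-number', 'ccv'] as $key) {
            if (empty($post[$key])) {
                throw new InvalidArgumentException("Missing required key: {$key}");
            }
        }

        return new self(
            (int) $post['total'],
            $post['currency_code'] ?? 'USD',
            (string) $post['card-number'],
            (string) $post['ccv']
        );
    }

    public function amount(): int { return $this->amount; }
    public function currency(): string { return $this->currency; }
    public function cardNumber(): string { return $this->cardNumber; }
    public function ccv(): string { return $this->ccv; }
}

final class Processor
{
    private $name;

    public function __construct(string $name) { $this->name = $name; }

    public function name(): string { return $this->name; }
}

final class Transaction
{
    private $amount;
    private $currency;
    private $status = 'new';

    public function __construct(int $amount, string $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    public function amount(): int { return $this->amount; }
    public function currency(): string { return $this->currency; }
    public function status(): string { return $this->status; }
    public function markProcessed(): void { $this->status = 'processed'; }
}
```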
Without diving into the individual implementations, we can see that we’ve created three new objects: PostParams, Transaction and Processor. This means we had to write these implementations and create a bit more code than we did when using a simple array, but the above code is much easier to read and reason about. The implementations of create, getProcessor, and process can be refactored to accept these objects, and will likely also become cleaner, much like our original search example.
Keep it OOP
Passing around humble arrays looks simple on the surface, and it is easy to write since we’re not creating so many classes, but it causes code to become subtly more complex since we cannot trust the structure of the array. Swapping implied context for explicit classes allows us to trust our code; it makes our code simpler, more robust, and easier to understand.