How to DRY-up complex conditionals

“Don’t Repeat Yourself” is a principle of good abstraction. Here’s a demo of how I DRY complex conditionals.

Berkana
Bits and Pixels
Aug 4, 2015

One of the best practices in software engineering is to not repeat yourself. The mnemonic DRY (Don’t Repeat Yourself) vs. WET (Write Everything Twice) is a cute way to remember this. When you find repeated patterns in your code and encapsulate them in an appropriately named function, the code becomes easier to understand and maintain.

When DRYing code, it is not always clear how to isolate the common structure if there are small variations scattered throughout two pieces of code that are structurally similar. The differences can really get in the way of isolating the commonalities if you are not aware of the tools your programming language gives you to isolate them.

Let me illustrate the DRYing of a complex conditional. Behold the following example. Don’t get distracted by worrying about what function it is from or what it does; look at the structure. Regardless of what the following chunk of code is for, we’re going to abstract out the operation that we see repeated:

Here are some observations:

  1. There are two objects that are used to track values, one for each of the for loops.
  2. The first for-loop scans through the array minMax from beginning to end, whereas the second one scans through the array from end to beginning.
  3. The first conditional tests a property called min, and the second tests a property called max, and the comparison operator is reversed between them; where the greater-than operator (>) is tested in one, the less-than operator (<) is used in the other.

The first step in removing duplicated structures is to merge the two loops into one. A for-loop can process multiple iterators at once, as long as the end condition is the same. In this case, whether we count up using the iterator variable k or count down using l, the number of iterations is the same, so we’ll just use one condition for continuing the loop: k < minMaxLength.
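As a minimal sketch of the merged loop, assuming a plain array stands in for minMax (its contents and the `visited` array are made up for illustration), a single for statement can declare and update two counters while sharing one end condition:

```javascript
// Hypothetical stand-in data; the real minMax holds richer objects.
const minMax = ['a', 'b', 'c', 'd'];
const minMaxLength = minMax.length;
const visited = [];

// k counts up from the start while l counts down from the end.
// Both take the same number of steps, so one end condition is enough.
for (let k = 0, l = minMaxLength - 1; k < minMaxLength; k++, l--) {
  visited.push([minMax[k], minMax[l]]);
}

console.log(visited);
```

Each pass pairs an element read from the front with one read from the back, which is what the two separate loops did between them.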

Next, let’s use a function to generate the two identical objects (left and right) that track the minimum and maximum values.
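A small factory function is one way to do this. In the sketch below, the name `makeAccumulator` and the starting values are assumptions; only the min and max properties and the names leftAcc and rightAcc come from the article.

```javascript
// Hypothetical factory. Starting at Infinity / -Infinity is an assumption
// made so that any real value replaces the initial one.
function makeAccumulator() {
  return { min: Infinity, max: -Infinity };
}

const leftAcc = makeAccumulator();
const rightAcc = makeAccumulator();

// Each call returns a fresh object, so updating one leaves the other alone.
leftAcc.min = 3;
console.log(leftAcc.min, rightAcc.min);
```

Returning a new object from each call matters: assigning the same object literal to both variables would make the two accumulators share state.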

Now, let’s compose the signature of the function that captures the structure we see repeating: namely, the if-else block that appears four times inside the loop. The method for isolating the structure is as follows: each item that differs between the if-else blocks should become a parameter of the function. In the following code block, the items that change from block to block are shown in bold in the first two if-else blocks.

(Note on variable names: We will be classifying the min and max properties of the accumulators with the term “extrema”, since minimum and maximum values are the extreme values in the sequence that is being examined. Again, ignore the details; the key point is how we abstract the structure of this.)
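To make the "every difference becomes a parameter" step concrete, here is a simplified sketch. It is not the original code: the data, the function name `updateExtrema`, and the body of the conditional are assumptions chosen to keep the example short. What it does illustrate is the five-parameter shape (count, accumulator, direction, extrema, comparison).

```javascript
// Hypothetical data in the nested shape the article describes.
const minMax = [
  { left: { min: 4, max: 9 }, right: { min: 2, max: 7 } },
  { left: { min: 1, max: 5 }, right: { min: 3, max: 8 } },
];
const leftAcc = { min: Infinity, max: -Infinity };
const rightAcc = { min: Infinity, max: -Infinity };

const lessThan = (a, b) => a < b;
const greaterThan = (a, b) => a > b;

// Everything that differed between the repeated blocks is now a parameter.
function updateExtrema(count, acc, direction, extrema, comparison) {
  const item = minMax[count][direction][extrema];
  if (comparison(item, acc[extrema])) {
    acc[extrema] = item; // a new extreme value was found
  }
}

for (let k = 0, l = minMax.length - 1; k < minMax.length; k++, l--) {
  updateExtrema(k, leftAcc, 'left', 'min', lessThan);
  updateExtrema(k, leftAcc, 'left', 'max', greaterThan);
  updateExtrema(l, rightAcc, 'right', 'min', lessThan);
  updateExtrema(l, rightAcc, 'right', 'max', greaterThan);
}

console.log(leftAcc, rightAcc);
```

The four calls still repeat a pattern among themselves, and that remaining repetition is what the next simplification targets.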

Using our function already simplifies the structure of this for-loop immensely. However, the function signature is overly complex, with five parameters that follow additional patterns we can simplify away. We can eliminate parameters in three ways:

  1. Each pair of if-else blocks alternates between using .min and .max, so instead of handing the property in as a parameter for a single pair of if-else blocks, we’ll just process both the min and max properties of each accumulator, using a for-in loop.
  2. The extrema min is always used with comparison lessThan in the initial conditional, and greaterThan in the else-if conditional, and max is always used with the opposite arrangement. Because of this, we can get rid of the comparison parameter, and select the correct comparison using an object within the function we’re going to define.
  3. leftAcc is always correlated with the direction left, and rightAcc is always correlated with the direction right. We can pair these by using the string “left” and “right” to select these accumulators from an object.

If we simplify the function signature following the above three simplifications, the usage will look like this:

Observe how simple this is. The key feature of JavaScript that we will be using is JavaScript’s bracket notation for selecting properties of an object:

The big advantage of bracket notation over dot notation is that any variable inside the bracket will be evaluated before it is used to select for a property. This lets us isolate the structure that we are trying to DRY up. Dot notation cannot be used for this because dot notation requires that the name following the dot be the name of a property. Knowing this, let us begin to isolate parts out of the repeated structure to put into our function.
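A quick illustration of the difference, using a made-up object named `reading`:

```javascript
// Hypothetical object used only to contrast the two notations.
const reading = { min: 2, max: 11 };
const extrema = 'max';

// Dot notation looks for a property literally named "extrema".
const viaDot = reading.extrema; // undefined

// Bracket notation evaluates the expression first, then uses the result
// as the property name.
const viaBracket = reading[extrema]; // 11
const viaExpression = reading['m' + 'in']; // 2

console.log(viaDot, viaBracket, viaExpression);
```

Because the bracket accepts any expression, a function parameter can decide which property is read, and that is what lets a single function body stand in for several near-identical blocks.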

The next target of our simplification is the portion that looks like minMax[k].left.min or minMax[l].right.max. Here, the property left / right is the direction, min / max is the extrema, and k / l is the count, so the abstract expression of this, using bracket notation, is minMax[count][direction][extrema].

Instead of accessing these properties each time, we’ll access them once and save the value as a variable.
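A sketch of that lookup, with hypothetical data and a hypothetical helper named `readItem`; only the parameter names count, direction, and extrema follow the article:

```javascript
// Hypothetical data in the nested shape the article describes.
const minMax = [
  { left: { min: 4, max: 9 }, right: { min: 2, max: 7 } },
  { left: { min: 1, max: 5 }, right: { min: 3, max: 8 } },
];

function readItem(count, direction, extrema) {
  // The three-step lookup happens once; later code refers to `item`
  // instead of repeating the whole chain.
  const item = minMax[count][direction][extrema];
  return item;
}

// Equivalent to minMax[0].left.min and minMax[1].right.max respectively.
console.log(readItem(0, 'left', 'min'), readItem(1, 'right', 'max'));
```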

Also, notice that testLeftward and testRightward in the original structure vary along with whether the direction is left or right. We can use this to our advantage by making an object that stores each method name according to the direction it corresponds to:
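One possible shape for that lookup object is sketched below. The `tester` object, its method bodies, and the names `testMethod` and `runTest` are all invented for illustration; only the method names testLeftward and testRightward appear in the article.

```javascript
// Hypothetical object that owns the two methods. The bodies are placeholders.
const tester = {
  testLeftward(value) { return 'leftward:' + value; },
  testRightward(value) { return 'rightward:' + value; },
};

// Maps each direction string to the name of the method that goes with it.
const testMethod = { left: 'testLeftward', right: 'testRightward' };

function runTest(direction, value) {
  // Bracket notation selects the method by name, then calls it on tester.
  return tester[testMethod[direction]](value);
}

console.log(runTest('left', 1), runTest('right', 2));
```

Calling the method through `tester[...]()` keeps `this` bound to `tester`, which would matter if the real methods relied on it.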

Furthermore, we can store the result of comp[extrema](record, item) once and replace the else-if with a plain else, instead of testing the inverse comparison a second time. Even though the opposite of > is <= rather than <, the difference is inconsequential in this case, so there’s no point in testing a second condition that is just the inverse of the first.
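Here is a sketch of the idea, assuming `comp` maps each extrema name to a comparison that answers "does item beat record?". The argument semantics, the helper `classify`, and its return strings are assumptions for this example.

```javascript
// Assumed semantics: each function returns true when `item` should
// replace `record` for that kind of extreme.
const comp = {
  min: (record, item) => item < record,
  max: (record, item) => item > record,
};

function classify(extrema, record, item) {
  // Evaluate the comparison once and reuse the stored result.
  const isNewExtreme = comp[extrema](record, item);
  if (isNewExtreme) {
    return 'new ' + extrema;
  } else {
    // Covers both "equal" and "on the other side"; no second test needed.
    return 'unchanged';
  }
}

console.log(classify('min', 5, 3), classify('min', 5, 5), classify('max', 5, 7));
```

The equal case falls into the else branch, which is the situation where the difference between < and <= turns out not to matter.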

Once this simplification is inserted into the function, it should look like this:
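The sketch below puts the pieces together in simplified form. It is a reconstruction under assumptions, not the original function: the data, the `accumulators` object, the name `updateExtrema`, the counter names `up` and `down`, and the body of the conditional are all illustrative.

```javascript
// Hypothetical data in the nested shape the article describes.
const minMax = [
  { left: { min: 4, max: 9 }, right: { min: 2, max: 7 } },
  { left: { min: 1, max: 5 }, right: { min: 3, max: 8 } },
];

// Accumulators are selected by direction string instead of being passed in.
const accumulators = {
  left: { min: Infinity, max: -Infinity },
  right: { min: Infinity, max: -Infinity },
};

// The comparison is selected by extrema name instead of being passed in.
const comp = {
  min: (record, item) => item < record,
  max: (record, item) => item > record,
};

function updateExtrema(count, direction) {
  const acc = accumulators[direction];
  // Process both min and max rather than taking the extrema as a parameter.
  for (const extrema in acc) {
    if (!Object.prototype.hasOwnProperty.call(acc, extrema)) continue;
    const record = acc[extrema];
    const item = minMax[count][direction][extrema];
    if (comp[extrema](record, item)) {
      acc[extrema] = item;
    }
  }
}

for (let up = 0, down = minMax.length - 1; up < minMax.length; up++, down--) {
  updateExtrema(up, 'left');
  updateExtrema(down, 'right');
}

console.log(accumulators);
```

With two parameters left, each call site states only what varies: which position to read and which direction it belongs to.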

And there we have it. The DRY version vs. the WET version, again, for comparison, this time with the count variables re-named to be more indicative of what they’re for:

To be sure, the DRY version has a lot of lines of code in the function that abstracts out the common structure, but with appropriate variable naming, it is more apparent what the DRY version is doing at a conceptual level. Communicating the intention of the code is crucial for others to understand and maintain it. It is fine that the code is more complex inside the functions; consider that remote controls, cameras, and mechanical watches are simple from the outside but incredibly complicated inside. That is the whole point: meaningful complexity is encapsulated so that it is hidden from the outside, and the pieces are arranged so that they are simple at the high level. This makes what the code is doing readily apparent from the outside, and how it is doing it apparent from the inside.

Summary and conclusion

To DRY up code where you see a repeated structure:

  • Find the parts that vary, and write up a function signature where the parts that vary are specified as parameters.
  • Find the parameters that vary together, and use a string parameter to eliminate the redundancy; the string should be used to select the corresponding variations as properties of objects that record the various options. Remember: you can select properties of objects using bracket notation, inserting expressions that evaluate to strings. This is very helpful for capturing correlated variations.
  • Write the function using the repeated structure as a template, replacing the parts that vary using the parameters and selections.
  • If any of the repetition comes from operating on various similar object properties, use a for-in loop to cycle through those properties (but be aware that this will loop through all enumerable properties, including inherited ones; if there are any properties you don’t want to loop through, make the appropriate arrangements). This can be done internally within the function you write to encapsulate the repeated structure.
  • Lastly, complicated and multi-stage object property selections that are used to retrieve values should have those values stored in variables if they are used repeatedly. (I have a suspicion that this also improves performance because object property access involves hash table operations, but I’m not sure.) For example, compare the for-loop I wrote above with one where I don’t store the object property values in a variable. Which one is messier and more difficult to understand?
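On the for-in caveat in the list above, one common arrangement is to skip properties that the object does not own. The sketch below uses a made-up prototype to show the difference; the names `base`, `acc`, `all`, and `own` are hypothetical.

```javascript
// Hypothetical prototype carrying an enumerable property we do not want.
const base = { label: 'inherited' };
const acc = Object.create(base);
acc.min = 1;
acc.max = 9;

// A bare for-in visits inherited enumerable properties too.
const all = [];
for (const key in acc) {
  all.push(key);
}

// Guarding with hasOwnProperty keeps only the object's own properties.
const own = [];
for (const key in acc) {
  if (Object.prototype.hasOwnProperty.call(acc, key)) {
    own.push(key);
  }
}

console.log(all, own);
```

`Object.keys(acc)` returns the same own enumerable names as the guarded loop, so iterating over that array is another option.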
