JavaScript Regular Expression Abstraction

Terrill Dent
2 min read · Apr 20, 2018


Working with regular expressions written by others is difficult. This article covers techniques that can help you set up your coworkers for success when they encounter your work in the future.

We’ve been refactoring code to handle new regions, and while working on a class that handles phone number behaviour I was reminded of some nuances of regular expressions.

Naming

Naming based on the visual appearance can be tempting when working with regular expressions:

const numberStartingWithFourFour = /^\+?44/;

Instead, name things based on their intent and context. This makes the usage obvious, allowing adjustments in the future that stay within the intent.

const hasSmsUkCountryPrefix = /^\+?44/;
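As a minimal sketch of how the intent-based name reads at a call site: the wrapper function isUkSmsNumber and the sample phone numbers below are hypothetical, made up for illustration only.

```javascript
const hasSmsUkCountryPrefix = /^\+?44/;

// Hypothetical wrapper: the call site reads as a question about intent,
// not as a description of what the pattern looks like.
function isUkSmsNumber(phoneNumber) {
  return hasSmsUkCountryPrefix.test(phoneNumber);
}

const ukResult = isUkSmsNumber("+447700900123");    // true
const otherResult = isUkSmsNumber("15195550100");   // false
```

Because callers depend on the name rather than on the pattern, the pattern can change later without the name becoming misleading.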

Global regular expressions & re-use

Some regular expressions are useful in many places. We have a class that holds the most commonly re-used ones. When a refactor moves multiple instances of a regular expression into one central place, there are two important considerations: retained state and usage hinting.

Retained State
A shared or global-scope regular expression retains state when it is created with the g flag. Both test and exec store the position where the last match ended in the regex's lastIndex property, and the next call resumes searching from there. Consider what happens when test is called twice:

let nonAlphaNumeric = /[^0-9a-z]/gi;
nonAlphaNumeric.test("519-"); // true
nonAlphaNumeric.test("519-"); // false !?
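The surprising second result comes from the regex's lastIndex property. The sketch below repeats the same two calls and records lastIndex after each one; the variable names are only for illustration.

```javascript
const nonAlphaNumeric = /[^0-9a-z]/gi;

// First call: the search starts at index 0 and matches "-" at index 3,
// so lastIndex moves to just past the match.
const firstResult = nonAlphaNumeric.test("519-");    // true
const indexAfterFirst = nonAlphaNumeric.lastIndex;   // 4

// Second call: the search starts at index 4, the end of the string,
// so nothing matches and lastIndex is reset to 0.
const secondResult = nonAlphaNumeric.test("519-");   // false
const indexAfterSecond = nonAlphaNumeric.lastIndex;  // 0
```

A third call would start from index 0 again and return true, which is why the bug often shows up as results that alternate.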

nonAlphaNumeric was intended for use with match or replace, which reset the search position before they run and so return the same result every time:

"519-".match(nonAlphaNumeric); // ["-"]
"519-".match(nonAlphaNumeric); // ["-"]
"519-364-".match(nonAlphaNumeric); // ["-", "-"]

Usage Hinting
Labelling regular expressions that use the g flag with “all”, “global” or “any” hints that they are intended to return data, and makes misuse read awkwardly.

let allNonAlphaNumeric = /[^0-9a-z]/gi;
// Reads awkwardly, like it should return an array
allNonAlphaNumeric.test("519-");

Likewise, “is” and “has” prefixes work better for regular expressions that answer a boolean or decision-based question.

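As a rule, calling RegExp.prototype.test() or exec() on a shared regular expression that has the g flag is unsafe, unless the caller resets lastIndex first or is deliberately looping over successive matches.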
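One way to follow this rule is to keep two separately named regular expressions, one per kind of question, and to hide them behind small functions. This is a sketch under that assumption, not necessarily how the class mentioned above is organised; the function names containsNonAlphaNumeric and stripNonAlphaNumeric are hypothetical.

```javascript
// No g flag: test ignores lastIndex, so repeated calls agree.
const hasNonAlphaNumeric = /[^0-9a-z]/i;

// g flag: intended only for match and replace, which reset lastIndex.
const allNonAlphaNumeric = /[^0-9a-z]/gi;

// Hypothetical helpers that expose intent instead of the raw regex.
function containsNonAlphaNumeric(text) {
  return hasNonAlphaNumeric.test(text);
}

function stripNonAlphaNumeric(text) {
  return text.replace(allNonAlphaNumeric, "");
}

const firstCheck = containsNonAlphaNumeric("519-");   // true
const secondCheck = containsNonAlphaNumeric("519-");  // true
const stripped = stripNonAlphaNumeric("519-364-");    // "519364"
```

Callers never touch the regular expressions directly, so no one can accidentally call test on the global one.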

Testability and pure functions

Testing can be a motivating factor when moving from inline instantiation to an external regex. These simple changes of scope can result in the retained state issue above. The important thing is to follow through and update or write tests that exercise the shared regex after the refactor.

When writing tests for regular expressions, remember to add repeated calls that check whether functions are pure: when called with the same input, they return the same output without modifying state.

assert.equal(formattedNumberWithThousands('4000'), '4,000');
assert.equal(formattedNumberWithThousands('4000'), '4,000');
assert.equal(formattedNumberWithThousands('4000'), '4,000');
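The article does not show formattedNumberWithThousands itself, so the implementation below is a hypothetical stand-in, as is the assertRepeatable helper. The sketch shows a repeated-call check passing for a function that only uses its shared regex through replace, and failing for one that calls test on a shared regex with the g flag.

```javascript
// Hypothetical implementation: a shared regex with the g flag, used only
// through replace, which resets lastIndex before it searches.
const allThousandsBoundaries = /\B(?=(\d{3})+(?!\d))/g;

function formattedNumberWithThousands(value) {
  return String(value).replace(allThousandsBoundaries, ",");
}

// Hypothetical test helper: call fn several times with the same input
// and throw if any call returns something other than the expected value.
function assertRepeatable(fn, input, expected, calls = 3) {
  for (let call = 1; call <= calls; call++) {
    const actual = fn(input);
    if (actual !== expected) {
      throw new Error(`call ${call}: expected ${expected}, got ${actual}`);
    }
  }
}

assertRepeatable(formattedNumberWithThousands, "4000", "4,000");
assertRepeatable(formattedNumberWithThousands, "1234567", "1,234,567");

// Counter-example: test on a shared g-flag regex is not repeatable.
const nonAlphaNumeric = /[^0-9a-z]/gi;
const leakyHasSymbol = (text) => nonAlphaNumeric.test(text);

let caughtMessage = null;
try {
  assertRepeatable(leakyHasSymbol, "519-", true);
} catch (error) {
  caughtMessage = error.message; // "call 2: expected true, got false"
}
```

A single assertion would have passed for leakyHasSymbol; only the repeated call exposes the retained state.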

Subtle meaningful change

I enjoy reading small articles that highlight common pitfalls and tweaks we can make to our workflow to avoid them. If you have additional experiences with regular expressions please share them!


Terrill Dent

Creator. Engineer at Square. Web/Server developer. Passion for elegant code, JavaScript, good design, security, and economics.