
Regexes: Our Not-So-Regular Expressions
You may have heard of this funky little term in our title before and thought, “What is that thing?? Is it even Ruby? Is it even English?!”
When new programmers (like myself) who are unfamiliar with the term and its application run into one of these on a resource such as StackOverflow, we most likely end up either copying it without understanding it or, if we’re the honest and enterprising programmers we hope and aim to be, we just do this:

And then skip to the next response, praying that one will be something we know.
But why? What’s the big deal? What are we so afraid of? Well, let’s be completely honest here: to most people looking at it for the first time, that thing they’re seeing makes no sense whatsoever. It’s intimidating as hell! But that thing there is just a Regex and it’s basically a string — or is it?
Well, keep reading to find ou-- no, no, sorry. I’m just kidding -- it’s far more than that; it’s a small language unto itself! So if I still have your attention and you haven’t skipped to the next article praying that one will be something you know, let’s proceed with a quick crash course:
1. So What Is It? How Do You Even Pronounce It?
There. First question answered. That wasn’t so bad, was it? Next!
2. Are Regexes Language-Specific?
No! While different regex engines (also sometimes known as flavors) can be found, you will find that, overall, Regexes are ubiquitous and most flavors descend from the same base: Perl. Perl-style regex syntax is arguably the most common thanks to its flexibility and expressiveness; flavors modeled on it can be found in the following coding languages and tools:
- Java
- JavaScript
- Python
- Ruby
- Microsoft’s .NET Framework
- XML Schema
In short, once learned, most of what we know about Regexes carries over to nearly any coding language or text editor, with only minor differences between flavors!
3. Just what do we mean by Regular Expression?
I mean… there’s nothing regular about it… Let’s not get bogged down in the etymology. “Regular Expression” comes from a mathematical term that took hold in programming culture in 1968 for two specific uses: pattern matching in text editors and lexical analysis in a compiler.
A regex is a text string that describes a pattern that a regex engine uses in order to find text (or positions) in a body of text, typically for the purposes of validating, finding, replacing or splitting. — RexEgg.com
Basically a regex is an indicator of a textual pattern — so if we think about it even words are regexes. Yes, a word is a regular expression. Kind of obvious when taken as a literal face-value, right? The keyword there is literal. ‘Foo’ is a regex in that it is a pattern of regular characters with a literal meaning.
In fact, you’ve probably seen and used a Regex in Ruby before and not even realized it: Grep! “Grep” comes from the command for regex searching in an editor: g/re/p
It stands for "Global search for Regular Expression and Print matching lines". WHOA!
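As a quick illustration of grep in Ruby, here is a minimal sketch using Enumerable#grep, which keeps only the elements that match a pattern. The word list is made up for the example.

```ruby
# Hypothetical sample data, just for illustration.
words = ["prism", "regex", "Buddhism", "ruby"]

# Enumerable#grep returns every element for which `pattern === element` is true.
ism_words = words.grep(/ism/)

puts ism_words.inspect  # ["prism", "Buddhism"]
```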

4. How does it work?
In Ruby a Regex is a bit simpler than what we’d find in many other languages and engines. This is simply a testament to Ruby’s flexibility, readability and intuitive nature.
In Regex we have two types of characters: literal and special. To start off, any literal character placed between a Regex’s forward slashes can be compared against a string. So, for instance, in the below example from RubyLearning.com:
This regular expression matches the string “a”, as well as any string containing the letter “a”.
/a/
The special characters include the following:
- ^ (Negates a Character Class or, if a caret (^) is at the beginning of the entire regular expression, it matches the beginning of a line.)
- $ (If a dollar sign is at the end of the entire regular expression, it matches the end of a line.)
If an entire regular expression is enclosed by a caret and dollar sign (^like this$), it matches an entire line!
- ? (Tells the engine: the preceding token in the regular expression is optional)
Nov(ember)? #matches both 'Nov' and 'November'.
- . (The Wildcard Character)
The Wildcard Character is placed within a Regex to stand in for any single character (other than a newline, by default).
/.ism/
This would match against strings like “Buddhism”, “prism”, “kismet”, and “charisma”, since each contains some character followed by “ism”. Obviously, if you’re matching against a long string this can get to be a pretty extensive list, so we use things like character classes to narrow these lists down!
- [ ] (Character Class, see below)
The Character Class ([]) is typically seen in a range such as “/[a-z]/” or “/[0-9]/” and these ranges can even be combined. The below example checks for a hexadecimal digit such as one you might find in HTML or CSS to represent colors:
/[A-Fa-f0-9]/
Much like how the “!” bang operator in Ruby can be placed before something to indicate “not” that thing, in Regex we have something similar: the caret “^” placed at the beginning of a character class, just inside the opening bracket, will negate it, so the Regex matches any character that is “not” in the class:
/[^A-Fa-f0-9]/
- \ (Escape Character)
Special characters have special meanings. To escape their meaning and interpret them literally, we need to precede them with a backslash. The backslash tells the Regex engine to ignore the character’s special meaning and interpret it literally:
/\?/
- { } (The Limiter/Quantifier. Tells the engine: repeat {min,max} times. Omitting both the comma and max tells the engine to repeat the token exactly min times. Using a comma after min with no max sets a minimum and no upper limit)
- ( ) (Tells the engine to group that part of the Regex. This allows you to apply a Quantifier to the group or restrict any alternation)
- + (Repeater. Tells the engine: attempt to match the preceding token once or more)
- * (Repeater. Tells the engine: attempt to match the preceding token zero times or more)
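To see the special characters above in action, here is a small sketch that can be pasted into irb. It uses String#match?, available in Ruby 2.4 and later; the constant names and sample strings are made up for illustration.

```ruby
# Anchors: ^ and $ pin the pattern to the start and end of a line.
puts "like this".match?(/^like this$/)      # true
puts "not like this".match?(/^like this$/)  # false

# ? makes the preceding token (here, the group "ember") optional.
MONTH = /^Nov(ember)?$/
puts "Nov".match?(MONTH)       # true
puts "November".match?(MONTH)  # true
puts "Novel".match?(MONTH)     # false

# . matches any single character, so /.ism/ needs one character before "ism".
puts "prism".match?(/.ism/)  # true
puts "ism".match?(/.ism/)    # false

# A character class, and a negated class with ^ just inside the brackets.
HEX_DIGIT     = /[A-Fa-f0-9]/
NOT_HEX_DIGIT = /[^A-Fa-f0-9]/
puts "c".match?(HEX_DIGIT)      # true
puts "z".match?(HEX_DIGIT)      # false
puts "z".match?(NOT_HEX_DIGIT)  # true

# A backslash escapes a special character so it is read literally.
puts "Really?".match?(/\?/)  # true

# Quantifiers: {min,max}, + (one or more), * (zero or more), applied to a token or a group.
puts "aaa".match?(/^a{2,3}$/)  # true
puts "a".match?(/^a{2,3}$/)    # false
puts "".match?(/^a+$/)         # false
puts "".match?(/^a*$/)         # true
puts "abab".match?(/^(ab)+$/)  # true
```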
We also have special escape sequences. To name a few:
\d matches any digit
\w matches any digit, alphabetical character, or underscore (_).
\s matches any whitespace character (space, tab, newline).
To negate one of these sequences, we merely capitalize it:
/\S/
#this evaluates as not whitespace
So just a few minutes ago, looking at something like /\?/ might have been intimidating. Now, we see it’s just a question mark.
Nice! We’ve got this.
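The escape sequences above can be tried out the same way. This is a minimal sketch with a made-up sample string; String#scan returns every match of a pattern as an array.

```ruby
# Hypothetical sample text for illustration.
sample = "Room 42, floor_3"

digits     = sample.scan(/\d/)   # every single digit
words      = sample.scan(/\w+/)  # runs of letters, digits, or underscores
non_spaces = sample.scan(/\S+/)  # runs of anything that is not whitespace

puts digits.inspect      # ["4", "2", "3"]
puts words.inspect       # ["Room", "42", "floor_3"]
puts non_spaces.inspect  # ["Room", "42,", "floor_3"]
```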
5. When do we use a Regex?
The typical way in which a basic Regex is used is to match text patterns to strings. In Ruby we see this most often with the #match method. If there’s no match, the return value is nil. If there is a match, the return value is an instance of the class MatchData. Most times we don’t actually want that instance, so the match method is typically called within a conditional statement, where a MatchData object counts as true (it is “truthy”) and nil counts as false. Let’s check out the below example from regular-expressions.info:
print(/\w+/.match("test"))
This will print test, because the MatchData object converts to the matched text in a string context. In a boolean context, for example a conditional statement, the MatchData object simply counts as true. In newer versions of Ruby (2.4 and later), if we do not want a MatchData object and only want a boolean return value, the method #match? is preferred.
We can also compare a Regex to a string directly using the ‘threequals’ operator (===), which returns true or false, or the =~ operator, which returns the position in the string of the start of the match, or nil if no match was found.
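Here is a short side-by-side sketch of these matching tools, so the different return values are easy to compare. The sample string and variable names are assumptions for illustration.

```ruby
text = "hello world"

# #match returns a MatchData object on success, or nil on failure.
found   = /\w+/.match(text)
missing = /\d+/.match(text)
puts found[0]         # hello
puts missing.inspect  # nil

# MatchData is truthy and nil is falsy, so #match works in a conditional.
puts "found a word" if /\w+/.match(text)

# #match? (Ruby 2.4+) returns a plain true or false.
puts text.match?(/world/)  # true

# === also returns true or false.
puts(/world/ === text)  # true

# =~ returns the index where the match starts, or nil if there is none.
puts(text =~ /world/)          # 6
puts((text =~ /xyz/).inspect)  # nil
```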
We can also find and replace any matches using Regexes:
subject.gsub(/before/, "after")
Now you’re probably thinking, “Yeah, but I can do all of these things using #find or #replace, etc…” Yes, this is true. But what happens with the following examples from The Bastards Book of Ruby:
Replace all occurrences of “NYC” with “New York City”.
With a regular expression, you can do the same find-and-replace action but catch “N.Y.C”, “N.Y.”, “NY, NY”, “nyc” and any other slight variations in spelling and capitalizations, all in one go.
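As a rough sketch of that idea, the snippet below catches a few spellings of “NYC” in one pass. The pattern and the expand_nyc helper are my own assumptions, not taken from The Bastards Book of Ruby, and the pattern deliberately handles only “NYC”, “nyc”, and “N.Y.C”-style variants, not “N.Y.” or “NY, NY”.

```ruby
# Assumed pattern: N, Y, C with an optional period after N and after Y.
# The trailing i makes it case-insensitive; \b keeps it from matching inside longer words.
NYC_PATTERN = /\bN\.?Y\.?C\b/i

# Hypothetical helper that replaces every variant with the full name.
def expand_nyc(text)
  text.gsub(NYC_PATTERN, "New York City")
end

puts expand_nyc("I moved to NYC last year.")
# I moved to New York City last year.
puts expand_nyc("nyc and N.Y.C are the same place.")
# New York City and New York City are the same place.
```

A plain string replacement would need one call per spelling; here a single pattern covers them all.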
That’s the least you can do with a regular expression. In web development, regular expressions are used to detect if the email, phone number, city/state, etc. fields contain valid input. Likewise, they are extremely powerful for data cleaning.
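For a taste of input validation, here is a minimal sketch that checks a phone number field. The format (three digits, three digits, four digits, separated by hyphens) and the valid_phone? helper are assumptions chosen for simplicity; real-world phone numbers come in many more shapes.

```ruby
# Assumed format for this sketch: 555-123-4567.
# \A and \z anchor to the start and end of the whole string
# (in Ruby, ^ and $ work line by line, so they are less strict for validation).
PHONE_FORMAT = /\A\d{3}-\d{3}-\d{4}\z/

# Hypothetical helper: true only when the entire input fits the format.
def valid_phone?(input)
  input.match?(PHONE_FORMAT)
end

puts valid_phone?("555-123-4567")   # true
puts valid_phone?("555-1234")       # false
puts valid_phone?("call me maybe")  # false
```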
Awesome! You’re now well on your way to becoming a coding superhero for your team!!

