Making Regular Expressions Not So Scary for Beginners With 3 Examples

Tyler J Funk
Published in The Startup
7 min read · Sep 11, 2020

When I went to Flatiron School, I was learning all sorts of brand new things about coding. There were a lot of foreign concepts I had to tackle and figure out, both on my own and with the help of others. However, nothing looked scarier or more overwhelming to me than regular expressions, commonly called “regex”. Let’s look at an example regex from a really great resource, RegexBuddy (found here):

/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}\b/gi

According to RegexBuddy, this regex is code that searches for and matches email addresses. To almost every developer just starting out, it looks like a foreign language understood only by machines and super-genius coders. There are symbols and slashes of both varieties all over the place, and the only patterns someone might recognize are A-Z and 0-9; outside of that, there are no words at all.

It seems overwhelming, but if we look past the complexities of a long-winded regex like the one above and focus on the basics, we can find a lot of applications for regular expressions, even for beginner developers who are scared or deterred like I was.

In this post, I will use JavaScript to walk through and break down 3 simple regexes, demonstrate that regular expressions are only as complicated as you make them, and show how they might be more useful to new software developers than those developers think. This post will not cover every feature of regular expressions, but by the end you should feel more comfortable with them and understand more about how practical they are.

Just to kick things off: in JavaScript, a regular expression literal is written between two forward slashes, and it can have flags (single letters we add) at the end, after the second forward slash. For example:

const regex = /Hello World/i

In general, what we put between the slashes is like a search query. In this case, I’m searching for the phrase “Hello World”, and the ‘i’ is a flag that means we don’t care whether the letters are uppercase or lowercase (the ‘i’ stands for ignore case, or case-insensitive). In other words, the string “hellO WOrlD” would pass our regex test.
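To see the flag's effect, here is a small sketch that runs the same pattern with and without the ‘i’ flag. The variable names are made up for illustration, and the `RegExp` constructor at the end is just another standard way to build the same regex from strings.

```javascript
// Same pattern, with and without the case-insensitive flag
const strictRegex = /Hello World/
const relaxedRegex = /Hello World/i

const strictResult = strictRegex.test("hellO WOrlD") // false
const relaxedResult = relaxedRegex.test("hellO WOrlD") // true

// The RegExp constructor builds an equivalent regex from strings
const builtRegex = new RegExp("Hello World", "i")
const builtResult = builtRegex.test("hellO WOrlD") // true
```

The literal form is usually easier to read; the constructor form is handy when the pattern is only known at runtime.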

Now let’s get into some examples!

Example 1

Let’s say you are given a large string of characters, like genetic code, which is all just A’s, T’s, C’s and G’s. Maybe it looks something like this:

const geneticCode = "ATCGGCACTAGCATTATAGCTATATCGGCGCGCGATATCGATCGCGTATCAGTCGTAGATCGATCGACGATCGTATGCTGTCAGCATTAGCTAGCTAGCTGATTGTGTGTACTAGCTAGCTACGTTATATTCGATCGTGCTAGTACGATCGTAGCTACTACTAGCTAGTCATCGATCGTAGCTGATCGTGCTAGTCGCTAGCATGATCGTACGTGACGATCGTACTCACTATCATCGATACTGACATCATCGATCGATCGATCGTAGCATATCGATCGATCGACGTACGTAGCTGACTGACTATCGACGAACATTAGTGATGACTACGACGATCGAGC"

And then let’s pretend we are looking for a specific pattern of letters that indicates whether someone is at high risk for a certain disease. We’ll say that pattern is “CATTAG”, in an effort to lighten up this example.

Our job is to find out if “CATTAG” appears in this genetic code, and if it does, also find out how many times. So let’s search through our string with a regular expression and get some useful information back using a couple different methods:

const dnaRegex = /CATTAG/g
dnaRegex.test(geneticCode) // true
const matches = geneticCode.match(dnaRegex) // ["CATTAG", "CATTAG"]
const totalMatches = matches.length // 2

So what happened here? We made a regex that matches the “CATTAG” pattern, and added a ‘g’ flag at the end, which stands for global and means the search covers every instance of “CATTAG” in a string instead of stopping at the first one.

On the next line, we used the “test” method on our regex, and passed it the genetic code string. The “test” method looks through the string for any matches, and if it finds at least one, it returns ‘true’, and obviously if it finds none it returns ‘false’.
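One detail worth knowing when combining “test” with the ‘g’ flag: a global regex remembers where its last match ended (in its `lastIndex` property), so calling “test” repeatedly on the same regex can give alternating answers. The sketch below uses a made-up one-match string to show this; it is an illustration, not part of the original example.

```javascript
const sample = "CATTAG"
const globalRegex = /CATTAG/g

const firstCall = globalRegex.test(sample) // true, and lastIndex moves to 6
const secondCall = globalRegex.test(sample) // false, the search resumed at index 6
const thirdCall = globalRegex.test(sample) // true, lastIndex was reset to 0 after the miss

// Without the g flag, test() always starts from the beginning
const plainRegex = /CATTAG/
const plainFirst = plainRegex.test(sample) // true
const plainSecond = plainRegex.test(sample) // true
```

If all you need is a yes/no answer, a regex without the ‘g’ flag avoids this surprise.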

Similar to the “test” method, the “match” method looks through the string for matches, except it returns all matches in an array, instead of a boolean. In this case, it returned two matches in an array.
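To make the role of the ‘g’ flag in “match” more concrete, here is a small sketch on a shortened, made-up string. With the flag we get every match; without it we get only the first match, along with extra details such as the position where it was found.

```javascript
const shortCode = "GGCATTAGTTCATTAGAA"

const allMatches = shortCode.match(/CATTAG/g) // ["CATTAG", "CATTAG"]
const firstOnly = shortCode.match(/CATTAG/) // only the first match, plus details

const firstMatchText = firstOnly[0] // "CATTAG"
const firstMatchIndex = firstOnly.index // 2
```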

The last line should look familiar to anyone who has worked with JavaScript. We are checking the length of the array created from the “match” method, effectively counting how many matches are in it.
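One caveat: when nothing matches, “match” returns null instead of an empty array, so reading `.length` directly would throw an error. A minimal sketch of a safer count is below; `countMatches` is a hypothetical helper name, not a built-in.

```javascript
// Hypothetical helper: counts matches and treats "no match" (null) as zero
function countMatches(text, regex) {
  const found = text.match(regex)
  return found === null ? 0 : found.length
}

const rawResult = "ATCGATCG".match(/CATTAG/g) // null
const noneFound = countMatches("ATCGATCG", /CATTAG/g) // 0
const twoFound = countMatches("GGCATTAGTTCATTAGAA", /CATTAG/g) // 2
```

The helper assumes it is given a regex with the ‘g’ flag; without it, the count would never exceed 1.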

And now we have the information we were asked to retrieve; this genetic code has two matches for the pattern “CATTAG” which we’ll say makes this candidate a low risk for the disease we’re testing for.

Example 2

This time we are given a book’s worth of text (in this case, we’ll use a paragraph’s worth of our pretend book for demonstration purposes). The author wants to change the name of a certain character, which means finding and changing every single mention of that name throughout the entire book. Our job is to change the name “Matthew” to “Ethan”, except that the character is sometimes also referred to as “Matt”. Regular expressions are our friend, and we can use one here to easily replace this character’s name, nickname included.

Here is our text (Lorem ipsum text with “Matt” and “Matthew” sprinkled in):

const bookText = "Lorem ipsum dolor sit Matthew, consectetur adipiscing elit. Ut euismod erat ante, at viverra tortor vestibulum vitae. In suscipit arcu magna, vel auctor Matthew varius eu. Ut placerat id libero ac sodales. Matt ipsum nisi, ullamcorper at tempus sed, mollis eu libero. Praesent faucibus tortor risus, a rutrum urna mollis at. Integer accumsan orci vitae condimentum facilisis. Sed eu fringilla Matt. Cras sapien ipsum, sollicitudin non sem sed, convallis feugiat quam. Nam Matthew diam vitae odio euismod elementum. Matthew molestie tempor libero sed consequat. Suspendisse mattis commodo lobortis. Sed dapibus aliquet tincidunt. Morbi ut hendrerit nisl. Phasellus tempus nulla pellentesque erat efficitur, ut rhoncus Matthew vestibulum."

And here is how we are going to use our regex:

const matthewRegex = /Matthew|Matt/g
const newTextWithEthan = bookText.replace(matthewRegex, "Ethan")
// "Lorem ipsum dolor sit Ethan, ..."

A few new things happened this time. In the regex, I used a pipe symbol (‘|’), which in regex represents “or”. In other words, our regex matches every instance in the text of the name “Matthew” or “Matt”.
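The order of the two names around the pipe matters, because the regex tries the alternatives from left to right. Here is a small sketch on a made-up sentence showing what happens when the shorter name comes first, plus an optional group as one alternative way to write the same pattern.

```javascript
const sentence = "Matthew met Matt."

// Longer name first: both forms are replaced cleanly
const goodOrder = sentence.replace(/Matthew|Matt/g, "Ethan") // "Ethan met Ethan."

// Shorter name first: "Matt" wins inside "Matthew" and "hew" is left behind
const badOrder = sentence.replace(/Matt|Matthew/g, "Ethan") // "Ethanhew met Ethan."

// An optional group, (hew)?, is another way to express "Matt or Matthew"
const optionalGroup = sentence.replace(/Matt(hew)?/g, "Ethan") // "Ethan met Ethan."
```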

Then, our second constant uses the “replace” method, which takes two arguments: the first is the pattern you wish to find, and the second is the text you want to replace it with.
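The ‘g’ flag is doing real work here too. As a quick sketch on a made-up sentence: without the flag, “replace” swaps only the first match, and in both cases it returns a new string rather than changing the original.

```javascript
const line = "Matt called. Matt left."

const firstOnlyReplaced = line.replace(/Matt/, "Ethan") // "Ethan called. Matt left."
const allReplaced = line.replace(/Matt/g, "Ethan") // "Ethan called. Ethan left."

// replace returns a new string; the original is untouched
const originalStill = line // "Matt called. Matt left."
```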

And we have successfully changed the character’s name to “Ethan” with the help of regular expressions.

Example 3

As a slightly more complicated example, let’s say we are hired by a company that wants its employees to create usernames for its new internal systems. The company has asked that every username fit some guidelines: it must start with 4 to 8 lowercase letters, end with exactly 4 digits, and contain no symbols or spaces. Our job is to write some code that ensures each employee’s username fits within the company’s guidelines:

const validUsername = "smith1992"
const invalidUsername1 = "ted3333" // not enough letters
const invalidUsername2 = "frankie12345" // not exactly 4 numbers
const invalidUsername3 = "alex-4321" // includes a symbol
const usernameRegex = /^[a-z]{4,8}\d{4}$/
usernameRegex.test(validUsername) // true
usernameRegex.test(invalidUsername1) // false
usernameRegex.test(invalidUsername2) // false
usernameRegex.test(invalidUsername3) // false

Okay, this regular expression looks a little more complex, so let’s break it down bit by bit to decipher how it works:

^ : this symbol indicates that the string we are testing should start with what comes after it, in this case [a-z]

[a-z] : the brackets allow us to use ranges like a-z, which matches any single lowercase letter

{4,8} : this is a quantifier where we have set the lower limit to 4 and the upper limit to 8, meaning we are looking for between 4 and 8 lowercase letters

\d : this is a metacharacter, and the ‘d’ represents digit, so by using it we are essentially saying [0-9], a single digit between 0 and 9

{4} : once again, we are using a quantifier to say how many digits to look for. Last time, we wanted a range of 4 to 8; this time, we want exactly 4 digits, so we don’t need an upper and lower limit

$ : the opposite of ^, the dollar sign indicates that the string we are testing should end with what comes right before it, in this case 4 digits
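To see why the ^ and $ anchors matter, here is a small sketch comparing the regex with a version that leaves them out. The second regex and the “alexander1234” username are made up for illustration; without anchors, the pattern only has to appear somewhere inside the string.

```javascript
const anchoredRegex = /^[a-z]{4,8}\d{4}$/
const unanchoredRegex = /[a-z]{4,8}\d{4}/

// Five digits: rejected with anchors, accepted without them
const anchoredResult = anchoredRegex.test("frankie12345") // false
const unanchoredResult = unanchoredRegex.test("frankie12345") // true, finds "frankie1234" inside

// Nine letters: without ^, the last 8 letters plus the digits still fit
const anchoredLong = anchoredRegex.test("alexander1234") // false
const unanchoredLong = unanchoredRegex.test("alexander1234") // true
```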

Now we can confidently use this regex in our code to test any username an employee tries to create and get a quick result telling the employee whether they need to fix their username or whether it fits the parameters set by their employer. With that, we have done the job asked of us with one to two lines of code, a regular expression!
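As a rough idea of how that check might be wired into real code, here is a minimal sketch. The `checkUsername` function and its messages are hypothetical, not something the company in our example specified.

```javascript
const usernamePattern = /^[a-z]{4,8}\d{4}$/

// Hypothetical helper: returns a message the employee could be shown
function checkUsername(username) {
  if (usernamePattern.test(username)) {
    return "Username accepted"
  }
  return "Username must be 4 to 8 lowercase letters followed by exactly 4 digits"
}

const okMessage = checkUsername("smith1992") // "Username accepted"
const badMessage = checkUsername("alex-4321") // the error message
```

Because this regex has no ‘g’ flag, it can safely be reused for every username that comes in.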

Conclusion

Hopefully these examples have successfully demonstrated that there are multiple, simple uses for regular expressions, despite how complex they can become. As you can see, it depends on how complicated you choose to make them, or how many conditions you need a particular pattern to meet before it returns any matches.

This article doesn’t show every single aspect of regular expressions, but there are a lot of resources out there; here’s a cheat sheet that I found especially useful. Remember, even if you are a beginner, regex can help you! It just takes a little time to learn the syntax, but once you do, from what I understand you can use it in most major coding languages, so it’s a versatile developer tool that can be used by any level of developer. If this article helps even just one new dev realize regular expressions are nothing to shy away from, I’ve met my goal. Happy coding!

Tyler J Funk
The Startup

Full Stack Web Developer && Creative Thinker && Flatiron School Grad && Continuous Learner