What’s a Regex Pt 1.

Joshua Lacey
5 min read · Nov 14, 2017


This past weekend I went to a great workshop by Anthony Ferrara about Regular Expressions (regex). Rather than just keep my notes from the talk, I thought it would be better to translate them into a few blog posts. There’s a ton of stuff to cover, so today I’m going to limit this post to a few JavaScript regex methods and just a little bit of regex. Keep in mind that Ruby and PHP have methods similar to the ones I’m describing below, so with some minor adjustments this could apply to those languages as well.

String.prototype and RegExp.prototype methods

There are a lot more methods than I can cover today, but I’m going to highlight a few that I’ve found pretty useful.

.match()

Try this in your console: 'happy happy joy joy'.match(/happy/)

This will return: ['happy', index: 0, input: 'happy happy joy joy']

What does this tell us? The first element in the array is the first match found in the string, the index property tells us at what index of the string that match starts, and the input property tells us what the original string was. Note that if we had run 'happy happy joy joy'.match(/happy/g) we would get back ['happy', 'happy']. Why? Because the g ‘flag’ at the end of the regex tells the match method to return every occurrence of ‘happy’ in the string.
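Here’s a small sketch of the difference the g flag makes. The variable names are just ones I made up for illustration. One extra thing worth knowing: when nothing matches, .match() gives back null rather than an empty array.

```javascript
// Same string as in the example above; the names here are illustrative.
const song = 'happy happy joy joy';

const first = song.match(/happy/);  // no g flag: details about the first match only
const all = song.match(/happy/g);   // g flag: every match, but no index or input
const none = song.match(/sad/);     // nothing matches: null, not an empty array

console.log(first[0], first.index); // happy 0
console.log(all);                   // [ 'happy', 'happy' ]
console.log(none);                  // null
```

So if you plan to loop over the result, check for null first.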

.replace()

Try this out: 'I like apples'.replace(/apple/, 'orange')

You should have gotten 'I like oranges'. Pretty cool, right?

Use case: I was having an issue where I had copied elements from the DOM and saved them as a string. I was trying to rerender this string onto the DOM, however JSX in React doesn’t accept a string as the value of style, and unfortunately when I saved the DOM I got a fun little added style="" inside my HTML string. I used DomString.replace(/style=""/g, '') to remove all of these annoying little guys. The g at the end of /style=""/ tells the replace method to replace every instance instead of just the first one it finds. More on g in part 2.
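To see what the g flag changes here, this is a rough sketch with a made-up HTML string standing in for the one I copied from the DOM (domString and its contents are hypothetical, not my real markup):

```javascript
// Hypothetical stand-in for an HTML string copied from the DOM.
const domString = '<p style="">Hi</p><span style="">there</span>';

const firstOnly = domString.replace(/style=""/, '');  // no g: only the first hit goes
const everyOne = domString.replace(/style=""/g, '');  // g: every hit goes

console.log(firstOnly); // <p >Hi</p><span style="">there</span>
console.log(everyOne);  // <p >Hi</p><span >there</span>
```

Notice the space left behind inside each tag. It’s harmless in HTML, but you could include the leading space in the pattern if it bothers you.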

.split()

I was totally unaware until this weekend that you could pass a regex into the split method and get it to split at every match. For example, say we wanna split at every punctuation mark so that we get just the sentences back.

var annoyingString='I am! am I? Who am I.'
annoyingString.split(/[!?\.]\s*/)
=> ["I am", "am I", "Who am I", ""]

See below for an explanation of what the heck /[!?\.]\s*/ means.

Let’s break this down. [!?\.] is a character class, which matches any one of the characters !, ?, or . (note that I escaped the period with \ because outside a character class . by itself matches any character except line terminators; inside the brackets the escape is optional, but it doesn’t hurt). Next is \s, which says match any whitespace character. But hold up, the last period doesn’t have any whitespace after it, so how are we gonna match that? That’s where the * quantifier comes in; it tells the regex to match zero or more whitespace characters. You’ll notice that the last element in the array above is an empty string. Remember that .split() cuts the string at every match, including the final . of annoyingString, and what comes after that last period? Nothing. So we get back an empty string.
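If that trailing empty string gets in your way, one way to drop it (just a sketch, and not the only option) is to filter it out after splitting. The names pieces and sentences are mine:

```javascript
const annoyingString = 'I am! am I? Who am I.';

// Split at each !, ? or . plus any whitespace that follows it.
const pieces = annoyingString.split(/[!?\.]\s*/);

// Throw away the empty string left over after the final period.
const sentences = pieces.filter(piece => piece !== '');

console.log(pieces);    // [ 'I am', 'am I', 'Who am I', '' ]
console.log(sentences); // [ 'I am', 'am I', 'Who am I' ]
```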

.test()

The test method is really similar to the .match() method, except that you call it on the regex and it returns true or false. For this one let’s try it out with the .filter() method; remember, filter returns every element of the array for which the given condition returns true. Let’s try it out on our annoyingArray.

var annoyingArray = annoyingString.split(/[!?\.]\s*/)
annoyingArray
=> ["I am", "am I", "Who am I", ""]
annoyingArray.filter(string => /^I/.test(string))
=> ["I am"]

So what does /^I/ mean? Well, the ^ in this case means match only at the beginning of the string, so /^I/ matches any string that begins with a capital I. Since we are running test on each string, it returns true only for the first string in annoyingArray, which means that the filter method will only spit out that first string.
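To make the test versus match difference concrete, here’s a quick side-by-side sketch on the same four strings (startsWithI and phrases are names I picked for the example):

```javascript
const startsWithI = /^I/;
const phrases = ['I am', 'am I', 'Who am I', ''];

// .test() lives on the regex and answers with a plain boolean.
const testResults = phrases.map(phrase => startsWithI.test(phrase));

// .match() lives on the string and answers with a match array or null.
const matchResults = phrases.map(phrase => {
  const found = phrase.match(startsWithI);
  return found === null ? null : found[0];
});

console.log(testResults);  // [ true, false, false, false ]
console.log(matchResults); // [ 'I', null, null, null ]
```

When all you need is a yes or no, like inside .filter(), test keeps things simple.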

Regex Anchors

/^a/ matches any string that begins with 'a'
['abc', 'bca', 'aba'].filter(str => /^a/.test(str))
=> ['abc', 'aba']

/a$/ matches any string that ends with 'a'
['abc', 'bca', 'aba'].filter(str => /a$/.test(str))
=> ['bca', 'aba']

/^a$/ matches any string that begins and ends with the same 'a'
['abc', 'bca', 'aba', 'a'].filter(str => /^a$/.test(str))
=> ['a']

Wait a minute, why didn’t that last one return 'aba'?! Well, /^a$/ tells the regex that the very same 'a' has to sit at both the beginning and the end of the string, so the whole string can only be 'a'. In 'aba' the first and last characters are two different a’s. Let’s see how we can match both kinds in one go:

['abc', 'bca', 'aba', 'a', 'abba'].filter(str => /^a$|^a.*a$/.test(str))
=> ['aba', 'a', 'abba']

There are probably a couple of things we don’t recognize in that bit. First, the character |, which looks really similar to the || we use in a lot of programming languages for saying ‘or’; here it means match either the first bit or the second bit. The next thing is the . by itself. Remember how we said before that it will match any character except for a line terminator? I’ve combined it with the * quantifier because, ‘you are the Regex Queen’. The second bit, ^a.*a$, says give me back anything that starts and ends with two different a’s and has zero or more characters in between. That’s how we get 'aba' and 'abba', while the first bit gives us 'a'.
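If it helps, here’s a sketch that runs each side of the | on its own and then the two together, so you can see which branch is responsible for which result (the variable names are mine):

```javascript
const words = ['abc', 'bca', 'aba', 'a', 'abba'];

// Left side of the |: the whole string is a single 'a'.
const singleA = words.filter(str => /^a$/.test(str));

// Right side of the |: an 'a', anything (or nothing) in between, then another 'a'.
const twoAs = words.filter(str => /^a.*a$/.test(str));

// Both sides joined with |: a string passes if either side matches.
const either = words.filter(str => /^a$|^a.*a$/.test(str));

console.log(singleA); // [ 'a' ]
console.log(twoAs);   // [ 'aba', 'abba' ]
console.log(either);  // [ 'aba', 'a', 'abba' ]
```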

In my next post I’ll dive a little more in depth into regex, now that we’re comfortable with the different JavaScript methods we can use with them. Below is a list of things we’ll go over, and I’ll give you a few more examples of how they work in Pt 2.

Character 'Classes':
[...] list of acceptable characters; matches any one of them
[^...] list of unacceptable characters; matches any one character not in the list
[0-9] is the same as [0123456789]
[a-z] any lowercase letter
[A-Z] any capital letter
[a-zA-Z] any letter, lowercase or capital

Quantifiers:
* matches 0 or more occurrences
? matches 0 or 1 occurrence
+ matches 1 or more occurrences
{4} matches exactly 4 occurrences
{2,4} matches between 2 and 4 occurrences

Flags:
g global match, find all matches
i ignore the case of the matches
m multiline match, ^ and $ match at the start and end of each line
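As a quick preview, here’s a rough sketch that pokes at a few items from the list using the methods from this post. The patterns and sample strings are ones I made up for illustration:

```javascript
// Character class + quantifier: exactly four digits from start to end.
const fourDigits = /^[0-9]{4}$/;
const isYear = fourDigits.test('2017');    // true
const tooShort = fourDigits.test('17');    // false

// {2,4}: between two and four lowercase letters.
const shortWord = /^[a-z]{2,4}$/;
const joyFits = shortWord.test('joy');     // true
const happyFits = shortWord.test('happy'); // false, five letters

// ?: the 'u' may appear zero or one time.
const optionalU = /^colou?r$/;
const american = optionalU.test('color');  // true
const british = optionalU.test('colour');  // true

// g and i together: every match, ignoring case.
const anyCase = 'Happy happy'.match(/happy/gi); // [ 'Happy', 'happy' ]

// m: ^ and $ also match at line breaks inside the string.
const withM = /^joy$/m.test('happy\njoy');   // true
const withoutM = /^joy$/.test('happy\njoy'); // false

console.log(isYear, tooShort, joyFits, happyFits, american, british);
console.log(anyCase, withM, withoutM);
```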

Check out some of the references below. I’d highly recommend Regex101 if you’d like to play around.

Continue to Part 2 >

References:

Anthony Ferrara

Joshua Lacey

Fullstack Webdeveloper: Javascript, Node.js, React.js, Ruby on Rails