Why Every Marketer Should Know How to Use Regular Expressions (RegEx)

Note: this is a post from 2013 that was on a now-defunct blog that I’ve been asked to bring back to life…

Most Marketers Can’t Express Themselves Regularly

As Marketers are becoming more accountable, there are more and more of us who are sifting through numbers, charts and data in various analytical tools ranging from Google Analytics to Salesforce.com to good ol’ Excel. We need to be able to review vast amounts of data, recognize patterns, test hypotheses and even (sigh) put together some reports to satisfy the HiPPOs. We filter, sort, sift and pivot around all of this information looking for ways to gain insight and improve what our companies are doing and how we spend our money.

For many marketers, this growing need to crunch data is an additional requirement to their job and doesn’t replace all of the other tasks and projects that they need to manage. So, as you can imagine, we find ourselves needing to do more in a very limited amount of time. To get this all done, we must be efficient in pulling, aggregating and analyzing the data.

That is where Regular Expressions (aka RegEx) come into play. RegEx is a very powerful tool that most marketers have never tried, or even heard of, yet it can make their jobs much easier.

Not Knowing RegEx is Like Bringing a Knife to an Analytics Gunfight

Most people who have ever used a PC or Microsoft Office are familiar with the wildcard, symbolized by an asterisk “*”. It lets you search for files or text that match a certain pattern, and the asterisk in the pattern means that anything could appear in its place.

For example, if I want to find any CSV file with my name in it I could search in Windows Explorer “michael*.csv” and that would find any file that matches that pattern such as:

michael-freeman-monthly-traffic.csv
michael-new-leads.csv
michael-really-helpful-data-2013.csv
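
For readers who want to try the wildcard idea outside of Windows Explorer, here is a minimal sketch in Python. It uses the standard library's fnmatch module as a stand-in for Explorer's search box (an assumption for illustration, not how Explorer works internally). The last two file names are made up to show what does not match.

```python
from fnmatch import fnmatchcase

# The first three names come from the list above; the last two are
# hypothetical extras added to show non-matches.
filenames = [
    "michael-freeman-monthly-traffic.csv",
    "michael-new-leads.csv",
    "michael-really-helpful-data-2013.csv",
    "sarah-new-leads.csv",   # does not start with "michael"
    "michael-notes.txt",     # does not end with ".csv"
]

# "*" stands for any run of characters, including none at all.
matches = [name for name in filenames if fnmatchcase(name, "michael*.csv")]

for name in matches:
    print(name)
```

Only the three "michael…csv" files are kept; the other two fail the pattern at the start or the end of the name.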

Now, that is pretty handy, but a regular expression takes the concept of the wildcard and turns it up to eleven. It is a powerful “language” that allows sophisticated pattern matching. This “regular” language lets you combine many different criteria that encompass many different types of wildcards as well as include multiple optional criteria in a pattern. That may sound more complicated than it really is. Once you know just a few of the rules, you can start tapping into the power of RegEx. In fact, most online marketers are already using a regular expression tool and they don’t even know it. The default filter box in Google Analytics reports accepts regular expressions.

Google Analytics’ filter uses RegEx by default

The above example, in the traffic sources report, tells GA to show only results whose Source/Medium field contains “google” OR “bing”. So, that simple little pattern takes my original list of 23 different sources and narrows it to 5:

And BAM! With simple RegEx you get easy but powerful filtering

You might be used to a lot of tools that force you to use rules like “begins with” or “contains” or “does not contain”, and then you need to combine multiple rules to get the data that you want. Or worse, you need to filter each set of information for each term/value, copy and paste it into another sheet and then repeat those steps for each different value you need to grab. In the above example, I was able to do all of that simply by adding the “|” symbol (known as the pipe). In RegEx, it acts like an OR statement. That is AWESOME!
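
To see the pipe at work outside of GA, here is a small sketch in Python. The pattern google|bing is my reading of the filter described above, and the Source/Medium values are invented for illustration (they are not the 23 sources from the report). Python's re module is also a different RegEx flavor than GA's, though a simple OR like this behaves the same way.

```python
import re

# Hypothetical Source/Medium values, not the data from the GA report.
sources = [
    "google / organic",
    "google / cpc",
    "bing / organic",
    "(direct) / (none)",
    "twitter.com / referral",
    "google.com / referral",
    "yahoo / organic",
]

# The pipe means OR: keep a value if it contains "google" or "bing".
pattern = re.compile(r"google|bing")

# search() looks for the pattern anywhere in the text, like a "contains" rule.
filtered = [s for s in sources if pattern.search(s)]

for s in filtered:
    print(s)
```

One short pattern replaces two separate “contains” rules and the copy-and-paste work of merging their results.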

As you learn more about RegEx you can start to write more powerful patterns to help you find the needle in the haystack or get to exactly the data you need. Here’s a more complicated example that is saying the following:

Show me anything that starts with either “google” or “bing” followed by “.com / referral” AND the region must be from “California”, “New York” or “Illinois” (notice how illinois isn’t even fully written)

Turning up the filtering (and analysis) to eleven
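
The same idea can be sketched in Python. The two patterns below are my reconstruction from the description above, not a copy of the exact patterns used in GA, and the rows are hypothetical. The caret anchors the match to the start of the text, the backslash makes the dot a literal dot, and “Ill” is enough to catch “Illinois” because a partial match counts.

```python
import re

# Hypothetical (source/medium, region) rows for illustration only.
rows = [
    ("google.com / referral", "California"),
    ("bing.com / referral", "Illinois"),
    ("google.com / referral", "Texas"),
    ("google / organic", "New York"),
    ("news.google.com / referral", "New York"),
    ("bing.com / referral", "New York"),
]

# ^ means "starts with"; \. is a literal dot rather than "any character".
source_pattern = re.compile(r"^(google|bing)\.com / referral")

# "Ill" matches "Illinois" because search() accepts partial matches.
region_pattern = re.compile(r"California|New York|Ill")

# A row is kept only when BOTH patterns match (the AND in the description).
matched = [
    (source, region)
    for source, region in rows
    if source_pattern.search(source) and region_pattern.search(region)
]

for source, region in matched:
    print(source, "|", region)
```

Note that “news.google.com / referral” is dropped because it does not start with “google”, and the Texas row is dropped because the region pattern does not match.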

Just imagine all of the cool ways you could filter and combine your data.

The possibilities are really endless and only limited by your imagination (and the quality of the data, of course). Sadly, many marketing tools still do not support RegEx (ahem, HubSpot, please add it soon). Nevertheless, GA alone is a compelling enough reason to learn RegEx. Also, there are multiple ways to enable RegEx support in Excel (by default, Excel doesn’t support RegEx; that is pathetic, MSFT!). And since pretty much every other tool ever created can export to Excel/CSV, you can then unleash regular expressions on your data.
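
If you are comfortable with a little scripting, an exported CSV can also be filtered with RegEx outside of Excel. This is only a sketch using Python's standard library; the column names and rows are invented, and in practice the data would come from a file exported by your tool rather than from text typed into the script.

```python
import csv
import io
import re

# Hypothetical export. In practice you would open a real CSV file instead.
exported = io.StringIO(
    "email,lead_source\n"
    "ann@example.com,Google AdWords\n"
    "bob@example.com,Trade Show\n"
    "cat@example.com,bing ads\n"
    "dan@example.com,Webinar\n"
)

# IGNORECASE lets "google" match "Google" and "bing" match "Bing".
pattern = re.compile(r"google|bing", re.IGNORECASE)

# Keep only the rows whose lead_source column matches the pattern.
kept = [
    row for row in csv.DictReader(exported)
    if pattern.search(row["lead_source"])
]
emails = [row["email"] for row in kept]

print(emails)
```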

Go. Get on it already!

Start Learning RegEx Today — You’ll Thank Me For It

With all of the analysis we need to do these days, if you don’t know RegEx, you are working with one hand (at least) tied behind your back. As with learning other languages, it does take some practice, but you’ll soon find that you can do your analysis faster and more easily.

Spend 30 minutes, learn the basics, and then just practice a bit in Google Analytics.

You may even find yourself having new ideas for analysis that hadn’t even occurred to you before. I agree with Avinash when he said “no good analyst can live without RegEx!” In my opinion, RegEx and Excel should be required for anyone who wants to be in Marketing.

There are multiple online resources that can teach you the rules, options and syntax for writing regular expression patterns. Instead of trying to recreate them, take a look at those yourself. Here are some of my favorite resources on RegEx:

  • RegEx One — They do a very good job of breaking up the topic into very digestible free lessons and have interactive tools for you to practice writing different types of patterns. They also have a good list of commonly used patterns (website addresses, phone number patterns, etc…)
  • Regular Expressions — This is probably the most complete RegEx site out there (and one of the first I ever found), with countless examples and tutorials
  • The RegEx Coach — While there are other online RegEx testing tools, this downloadable, free app is very handy for writing and testing more complicated patterns (WINDOWS)
  • yRegex — This is my favorite MacOS RegEx testing tool
  • Helpful patterns to use in Google Analytics — This is an older post (e.g. keyword data was complete back then in the pre-“not provided” era) that has multiple examples of uses of regular expressions in Google Analytics
  • SEOTools — Excel Plugin (Windows only) — One of my favorite Excel plugins that includes regular expression support. Besides loads of other SEO friendly tools and functions, this plugin adds RegEx support and builds in some custom functions to do advanced RegEx based search as well as find and replace
  • Regular Expression Cheatsheet — A very handy resource showing the main rules as well as some common RegEx patterns such as finding only numbers, dates or email addresses, etc…

Do you already use RegEx? What are some of your favorite uses or patterns? Share them in the comments.