Match the Pattern

Athira Lonappan
Fields Data
Published in
3 min readAug 9, 2022

Whenever you are set to watch a movie, you have expectations. Of course you want the cinematography to be amazing, but more importantly, you want the cast to be perfect and the conclusion of the storyline to blow your mind. Similar is the case with dashboards. Of course you want the visualisations to grab your attention, but it’s the key parameters and the conclusion drawn from it that actually matter.

Suppose you create a dashboard for an e-commerce company. Instead of displaying how many products were sold this year, it would be more meaningful to compare this year’s sales with that of the previous years to make decisions accordingly. The process of creating such impactful dashboards would be a piece of cake if you understand the power of Regular expressions (RegEx).

RegEx is basically a sequence of characters that represent a pattern to be matched in a text.

It can be used to locate and replace a pattern as needed. In Google Data Studio, RegEx can be used to transform data in a way that helps to create filters and manipulate dimensions as required.

Before we even start writing RegEx formulas, it is important to understand the use of the following metacharacters:

Let’s take a few use cases to understand the implementation of RegEx.

  1. REGEXP_REPLACE()

Use Case : To replace text from a field

Syntax : REGEXP_REPLACE(Field Or Expr, Replace regex, Replacement string)

Output : Text

2. REGEXP_EXTRACT()

Use Case : To extract text and create modified dimensions

Syntax : REGEXP_EXTRACT(Field Or Expr, Extract regex pattern)

Output : Text

Example: Consider that the ‘Location’ field has all country names in uppercase, and you wish to format it such that only the first letter is capitalised (LAMWO → Lamwo). The following RegEx will be used:

CONCAT(REGEXP_EXTRACT(Location, “(^.)” ),LOWER(REGEXP_REPLACE(Location,”^.”, “”)))

This might be a little overwhelming at first, but let me break it down for you.

REGEXP_EXTRACT(Location, “(^.)” )

The first letter of each value in the Location field is extracted (L)

REGEXP_REPLACE(Location,”^.”, “”)

This expression replaces the first letter of the Location field with a blank character (AMWO)

LOWER(REGEXP_REPLACE(Location,”^.”, “”)))

The resulting expression is then changed into lowercase (amwo)

CONCAT(REGEXP_EXTRACT(Location, “(^.)” ),LOWER(REGEXP_REPLACE(Location,”^.”, “”)))

The expressions obtained from the above RegEx are concatenated to receive the final result (Lamwo)

3. REGEXP_MATCH()

Use Case : To check whether a text pattern matches with the values in the field

Syntax : REGEXP_MATCH(X, Reg Expr)

Output : Boolean

Example: Suppose you wish to represent a region instead of a country, the following RegEx will help us find the correct match.

CASE

WHEN REGEXP_MATCH(Country,”Angola|Ethiopia|Kenya|Zimbabwe”) THEN “Africa”

WHEN REGEXP_MATCH(Country,”Nepal|Pakistan|Myanmar”) THEN “Asia”

WHEN REGEXP_MATCH(Country,”Brazil, Chile”) THEN “South America”

ELSE “Other”

END

If the values in the Country field match with any of the countries mentioned in the RegEx, then they can be categorised into a specific region.

The examples above only scratch the surface of the power of RegEx. Regular expressions widen the possibilities of how we can visualise important data in a more meaningful way, by excluding unnecessary data and giving our dashboards an edge over others. Though RegEx might seem a little intimidating at first, it makes life a lot easier by saving plenty of time and adding value to our dashboard’s storyline.

So apply your logic, create your own RegEx formulas and let’s see if your dashboard can connect the dots!

--

--

Athira Lonappan
Fields Data

Penning down my learnings as I start out as a Data Engineer