Regular Expression 101

Haitian Wei
2 min read · Mar 6, 2019


Introduction

I have been using regular expressions for a while, but I never got really familiar with them. I think that’s because regular expressions contain so many symbols and rules that they are almost like a language. If you don’t speak it often, you get rusty!

A Tool to Help

Luckily, I found an online tool, https://regexr.com/, and it has been very helpful. To start with, you can enter a regular expression and the tool will parse it and give a detailed explanation, along with the matches it finds in your sample text.

For example, I entered [^a-zA-Z0-9], which is commonly used in NLP to normalize text. The explanation panel tells me that [^ ] means a negated set (match any character not listed inside the brackets) and that a-z means a range. I have to say, this is so good, just the thing I wanted!
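To make the negated set concrete, here is a minimal Python sketch of that kind of normalization. The function name `normalize` and the choice to lowercase and collapse whitespace are my own assumptions for illustration, not something RegExr prescribes.

```python
import re

# Negated set: matches any single character that is NOT an ASCII letter or digit.
NON_ALNUM = re.compile(r"[^a-zA-Z0-9]")

def normalize(text: str) -> str:
    """Replace non-alphanumeric characters with spaces, collapse runs of
    whitespace, and lowercase the result (hypothetical helper)."""
    cleaned = NON_ALNUM.sub(" ", text)
    return " ".join(cleaned.split()).lower()

print(normalize("Hello, World! It's 2019."))  # hello world it s 2019
```

Note that the apostrophe in "It's" is also replaced, so whether this pattern fits depends on the text you are cleaning.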

This is like an online teacher to me. The next time I come across a regular expression, I won’t need to check a cheat sheet and spend minutes searching and guessing.

Learn with Examples

The next thing I noticed is the left column, which outlines the general classes of regular expression tokens. For example, there is a big class called Escaped characters, which the tool describes as:

Escape sequences can be used to insert reserved, special, and unicode characters. All escaped characters begin with the \ character.

And if I want to know which characters are reserved and need the ‘\’ escape, all I need to do is click, and here comes the answer:

The following characters have special meaning, and should be preceded by a \ (backslash) to represent a literal character:

+*?^$\.[]{}()|/

Within a character set, only \, -, and ] need to be escaped.
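The escaping rules above can be tried out in Python with a short sketch. The sample strings are made up for illustration. The list above also includes /, which I believe is there because it delimits JavaScript regex literals; in Python it needs no escape.

```python
import re

price = "Total: $3.50 (approx.)"

# Escape by hand: "$" and "." are reserved, so each gets a backslash.
by_hand = re.search(r"\$3\.50", price)

# Or let re.escape add the backslashes for an arbitrary literal string.
auto = re.search(re.escape("$3.50"), price)

# Inside a character set, \, - and ] are the characters escaped here.
set_chars = re.findall(r"[\\\-\]]", r"a\b-c]d")

print(by_hand.group(), auto.group(), set_chars)
```

Without the backslashes, "." would match any character and "$" would anchor to the end of the string, so the unescaped pattern would not find the price.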

Immediate Patterns

Another interesting feature is patterns. You can either save your own or use community patterns. For example, there is a community pattern that finds IP addresses.
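As an illustration of what such a pattern can look like, here is a minimal Python sketch for IPv4 addresses. This is my own pattern written under the assumption that each octet must be 0 to 255; it is not the community pattern from RegExr, and the names `OCTET`, `IPV4`, and `is_ipv4` are hypothetical.

```python
import re

# One octet: 250-255, 200-249, 100-199, or 0-99 (no leading zero on two digits).
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"

# Four octets separated by literal dots, bounded by word boundaries.
IPV4 = re.compile(rf"\b{OCTET}(?:\.{OCTET}){{3}}\b")

def is_ipv4(text: str) -> bool:
    """Return True if the whole string is a dotted-quad IPv4 address."""
    return IPV4.fullmatch(text) is not None

log_line = "Server 192.168.0.1 replied, 10.0.0.254 timed out"
print(IPV4.findall(log_line))
print(is_ipv4("256.1.1.1"))
```

Pasting the assembled pattern into RegExr is a good way to see how each alternative in the octet group is explained.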

With this amazing tool I believe it would be much easier for you to get familiar with regular expressions!
