Fall in love with Regex

— Why don’t you marry them ?

Sarvagya Sagar
Jul 4 · 11 min read

Getting bored? Or just scrolling through a boring news feed? Don’t worry, I have something interesting for you. Slow down your car if you’re driving, or just pull the brake, because today I’m going to talk about “R e g e x”, or “Regular Expressions”. I don’t know everything about regex, but I’ll try to guide you through what I have learned. So grab a cup of coffee and some snacks, because this session is a lil bit long. Let’s move on to regex without wasting more time.

— I fell in love with “R e g e x”, but why? : I’m new to infosec and I learn something new every day. A few days ago I started learning JavaScript, and after a few basics I came across the chapter “Regular Expressions in JavaScript”. So I searched Google for “Getting started quickly in Regular Expression” and, a few minutes later, found “Rexegg”, which is pretty cool, well written and has good references too. Wait, my search did not end there! I searched again and found an article where a few people were saying in the comments, “immike’s blog is the best reference for learning regex, but he deleted that post or blog”. That comment made me curious to find immike’s blog. After a few minutes I got the idea to search the web archive (the Wayback Machine), and after 15 minutes of struggling I found it. Here it is: “immike’s blog”. It is a very old blog post, dated 21–06–2007. I read this post and a few other posts from immike’s regex reference, and that is where,

“I fell in love with R e g e x.”

— WTF is “Regex” : Regex is pretty cool, and I am amazed at how useful it is in my daily learning life. In your infosec career, you’ll surely fall in love with regex too. Alright?

A regular expression is a sequence of characters and symbols that describes a specific pattern to look for in a text or string. A regex pattern is matched against the text from left to right.
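To make the “left to right” part concrete, here is a minimal sketch in Python (my choice of language for the examples; the sample sentence is made up). It only uses the standard `re` module and shows that a search reports the leftmost occurrence of the pattern.

```python
import re

# Made-up sample text with the word "regex" appearing twice.
text = "coffee first, regex second, regex third"

# re.search scans from the left and stops at the first place the pattern fits.
match = re.search(r"regex", text)

found = match.group(0)    # the text that matched
start = match.start()     # offset of the match inside `text`
print(found, start)
```

Because the scan goes left to right, `start` points at the first “regex”, not the second one.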

Regex can be used to solve countless real-world problems. Let’s move on.

Say you’re building the signup form of a web application in PHP. When a user enters their email address, you want to enforce a fixed set of rules: the address may contain only lowercase letters, digits, underscores and periods, and the TLD part must be two or more lowercase letters. Alright?

Here is a short pattern to validate an email address. “Starts with” and “ends with” are both anchors; we will learn more about them in the Anchors section.
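The exact pattern is not reproduced in this text, so the following is only a sketch of one pattern that fits the rules listed above; the pattern string, the name `EMAIL_RULE` and the sample addresses are my assumptions, not necessarily what was used originally. It is far too strict for real-world email validation and is meant purely to illustrate the rules.

```python
import re

# Assumed pattern: lowercase letters, digits, underscore and period on both
# sides of the @, then a dot and a TLD of two or more lowercase letters.
# ^ and $ are the "starts with" / "ends with" anchors.
EMAIL_RULE = re.compile(r"^[a-z0-9._]+@[a-z0-9._]+\.[a-z]{2,}$")

def is_allowed(address):
    """Return True when the whole address follows the signup rules."""
    return EMAIL_RULE.search(address) is not None

print(is_allowed("sagar_01@example.com"))   # follows every rule
print(is_allowed("Sagar@example.com"))      # uppercase letter is rejected
print(is_allowed("sagar@example.c"))        # TLD is too short
```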

— Play with Metacharacters : A metacharacter is a character that has a special meaning; metacharacters are used to define the search criteria and any text manipulation. They are the building blocks of regex. There are lots of metachars, like ^.*?\|[](){}+$ Don’t worry about this, we will learn the metacharacters one by one.

— First meetup with Period and Character Class :

Note : In this article, each regex pattern is block-quoted inside double quotes, “ R e g e x ”, and highlighted below; the characters it matches are highlighted as well. Don’t forget to visit the result link, alright?

Suppose you want to match the word “week”, but you also want to find the word “weak”. A character class lets you match both. Here, let me show you:

“ we[ea]k ” : I would like to read this article twice in a week but my English is weak .

After visiting the result link, you will see that the regex we[ea]k is interpreted as “w, followed by the letter e, followed by either e or a (the choices enclosed in the character set), followed by k”. Understood? Let’s move on to the period metacharacter “ . ”.
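If you cannot open the result link, the same check can be sketched in Python with the standard `re` module, using the sample sentence from above:

```python
import re

text = "I would like to read this article twice in a week but my English is weak ."

# [ea] is a character class: exactly one character, either e or a.
matches = re.findall(r"we[ea]k", text)
print(matches)  # ['week', 'weak']
```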

“ .all ” : ball call fall hall mall small taller all roll wall troll

The regex .all is interpreted as “any single character, followed by a, followed by l, followed by l”. It matches any one character that is followed by “all”.
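Here is a small Python sketch of the same pattern on the sample words. One detail worth noticing: the period really means any character, so for the standalone word “all” the match is the space in front of it plus “all”.

```python
import re

text = "ball call fall hall mall small taller all roll wall troll"

# The dot consumes exactly one character of any kind (except a newline).
matches = re.findall(r".all", text)
print(matches)
# ['ball', 'call', 'fall', 'hall', 'mall', 'mall', 'tall', ' all', 'wall']
```

“small” contributes “mall”, “taller” contributes “tall”, and “roll” and “troll” do not match at all.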

— The Quantifiers +,*,?,{} : The four metacharacters plus, asterisk, question mark and braces are known as quantifiers. Basically, they are used when you don’t know in advance how many characters need to be matched. Let’s understand them one by one.

“ al* ” : call tall taaalllk all al a maaaaallllll l

If you look at the result, you will understand it well. The regex al* is interpreted as “lowercase letter a, followed by zero or more repetitions of the preceding letter l”. As you can see, the last single letter “l” is not matched, because it does not follow a letter “a”. Now, see an example of the plus quantifier.
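A quick Python sketch of the asterisk, on a shortened version of the sample text (shortened by me so the output is easy to read):

```python
import re

text = "call tall all al a l"

# a, then as many l's as are available, including none at all.
matches = re.findall(r"al*", text)
print(matches)  # ['all', 'all', 'all', 'al', 'a']
```

The lone “a” still matches because zero l’s are allowed, while the lone “l” has no “a” in front of it and is skipped.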

“ he+y ” : heeeeey hey heyyyyy hhhhey hhhhheeeeyyy hhhhheyyyy

When you look at the result, the regex he+y is interpreted as “lowercase letter h, followed by one or more repetitions of the preceding letter e, followed by the letter y”. Understood? OK, now on to the other quantifiers.
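The plus quantifier can be tried the same way. This sketch runs the pattern over the sample words and also checks “hy” (a word I added) to show that at least one e is required.

```python
import re

text = "heeeeey hey heyyyyy hhhhey hhhhheeeeyyy hhhhheyyyy"

# One h, then one or more e, then one y. Extra h's before or extra y's after
# are simply not part of the match.
matches = re.findall(r"he+y", text)
print(matches)

# With + the e is mandatory, so "hy" does not match.
no_e = re.search(r"he+y", "hy")
print(no_e)  # None
```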

Suppose you want to match both the American English and the British English spelling of the word “color”. You can use this regex:

“ colou?r ” : color of my lappy is white and my desktop is black in colour .

The regex colou?r means “lowercase letter c, followed by o, followed by l, followed by o, followed by an optional u, followed by the letter r”.
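In Python, the optional u can be checked like this, using the sample sentence above:

```python
import re

text = "color of my lappy is white and my desktop is black in colour ."

# u? means the u may appear once or not at all.
matches = re.findall(r"colou?r", text)
print(matches)  # ['color', 'colour']
```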

Suppose you’re creating a login page and you’re on the password section. When a user chooses a password, you want to enforce a fixed set of rules: the password may contain lowercase letters, digits, periods or underscores, and it must be between 8 and 16 characters long. Here you can use this regex pattern:

“ [a-z0-9._]{8,16} ” : m1tthu_sa.gar , M1tthu-sagarrr , mitthu , abcd_ef.g0

The regex [a-z0-9._]{8,16} is interpreted as “characters in the range of lowercase letters a-z, digits 0-9, dot or underscore, repeated at least 8 and at most 16 times”. Okay! That’s the end of the quantifiers; now let’s move on to the rest of the metacharacters.
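Here is a sketch of how this rule could be checked in Python. Using `re.fullmatch` is my choice: it makes the pattern cover the whole password, which matters for the upper limit, because an unanchored search would happily find 16 valid characters inside a longer string. The 20-character sample is made up.

```python
import re

PASSWORD_RULE = r"[a-z0-9._]{8,16}"

candidates = ["m1tthu_sa.gar", "M1tthu-sagarrr", "mitthu", "abcd_ef.g0"]

# fullmatch succeeds only if the entire string fits the pattern.
results = {pw: re.fullmatch(PASSWORD_RULE, pw) is not None for pw in candidates}
print(results)

# Why the whole string must be checked: 20 valid characters are too many,
# but a plain search still finds a 16-character piece that fits.
too_long = "a" * 20
partial = re.search(PASSWORD_RULE, too_long)
whole = re.fullmatch(PASSWORD_RULE, too_long)
print(len(partial.group(0)), whole)
```

“M1tthu-sagarrr” fails because of the uppercase M and the hyphen, and “mitthu” fails because it is shorter than 8 characters.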

— Escape Metachar “ \ ” : The backslash escapes the metachar that follows it, removing its special meaning so that reserved characters can be matched literally. The reserved characters are “ ^+*$/[]\{}|().? ” (the slash is reserved only where it delimits the pattern, as in JavaScript or PHP). If you want to match a reserved character, put the escape metachar before it; for example, \. matches a literal period.
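A minimal Python sketch of escaping, with made-up sample strings. It also shows `re.escape`, a standard-library helper that adds the backslashes for you.

```python
import re

# Unescaped, the dot matches any character, so "3514" slips through.
loose = re.search(r"3.14", "3514")

# Escaped, the dot must be a literal period.
strict_miss = re.search(r"3\.14", "3514")
strict_hit = re.search(r"3\.14", "pi is 3.14")

# re.escape puts a backslash in front of the reserved characters.
escaped = re.escape("a+b")
print(loose is not None, strict_miss, strict_hit.group(0), escaped)
```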

— Parenthesis Metacharacters “ ( ) ” : Parentheses are used to capture a group of sub-patterns, that is, whatever is inside the parentheses ( xyz ).

Let’s move on to alternation, and then we’ll look at different types of examples.

— Alternation Metacharacters “ | ” : The vertical bar, or pipe character, is used as “OR”. Alternation is like an OR statement: it allows you to combine multiple expressions in a single expression. Take a look at an example of alternation!

“ (f|g)ood ” : wood good mood food

The regex (f|g)ood means “either lowercase letter f or letter g, followed by lowercase letter o, followed by o, followed by d”. Here the parentheses are used to capture the pattern of the alternation. A pair of parentheses is also known as a “capturing group”.
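As a small Python sketch, each match object gives you both the whole match and the part captured by the parentheses:

```python
import re

text = "wood good mood food"

whole = []          # the full matched words
first_letters = []  # what the (f|g) group captured

for m in re.finditer(r"(f|g)ood", text):
    whole.append(m.group(0))          # group 0 is always the entire match
    first_letters.append(m.group(1))  # group 1 is the first pair of parentheses

print(whole, first_letters)  # ['good', 'food'] ['g', 'f']
```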

Capturing Group : “ (m|r|s|f)at ” : mat bat rat cat fat hat sat vat

Non Capturing Group : “ (?:m|r|s|f)at ” : mat bat rat cat fat hat sat vat

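No difference between the capturing and the non-capturing group? Both match the same words, but when you look at the results you will find the difference: the capturing group also remembers which letter matched inside the parentheses, while the non-capturing group (?: ) only groups the alternatives without storing anything. Okay!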
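The difference is easiest to see in code. In this Python sketch both patterns find the same four words, but `findall` returns the captured letters for the first pattern and the whole words for the second, and the compiled pattern reports how many capturing groups it has.

```python
import re

words = "mat bat rat cat fat hat sat vat"

capturing = re.compile(r"(m|r|s|f)at")
non_capturing = re.compile(r"(?:m|r|s|f)at")

# Both patterns match exactly the same words.
same_words = [m.group(0) for m in capturing.finditer(words)]
also_same = [m.group(0) for m in non_capturing.finditer(words)]

# findall returns the group contents when the pattern has a capturing group.
captured = capturing.findall(words)
uncaptured = non_capturing.findall(words)

print(captured)    # ['m', 'r', 'f', 's']
print(uncaptured)  # ['mat', 'rat', 'fat', 'sat']
print(capturing.groups, non_capturing.groups)  # 1 0
```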

— Negated Character Class : This is used to negate a character set and is denoted by [^]. When the caret is typed right after the opening square bracket, it has a different meaning: the set matches any one character except the ones listed. OK, see now:

“ go[^o]d ” : good goad gold

The regex go[^o]d means “lowercase letter g, followed by o, followed by any one character except o, followed by d”, so it matches goad and gold but not good.
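Where the negated class sits inside the pattern matters a lot. This Python sketch compares two placements on the sample words: g[^o]od, with the negated class in second position, and go[^o]d, with it in third position. The extra word “glod” is made up just to give the first pattern something to match.

```python
import re

words = "good goad gold"

# Second letter must NOT be o: every sample word has o there, so nothing matches.
second_position = re.findall(r"g[^o]od", words)

# Third letter must NOT be o: goad and gold qualify, good does not.
third_position = re.findall(r"go[^o]d", words)

# A made-up word whose second letter is not o.
made_up = re.findall(r"g[^o]od", "glod")

print(second_position, third_position, made_up)
# [] ['goad', 'gold'] ['glod']
```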

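Anchors ^$ : Anchors are used to check whether the match sits at the start or at the end of the string. There are two types of anchors: the caret ^, which matches at the start of the string, and the dollar $, which matches at the end.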

without caret : “ (l|L)aptop ” : Laptop : I have two laptops , one of the laptop is red in Color but other laptop is white in color .

with caret : “ ^(l|L)aptop ” : Laptop : I have two laptops , one of the laptop is red in Color but other laptop is white in color .

with dollar : “ (c|C)olor.$ ” : Laptop : I have two laptops , one of the laptop is red in Color but other laptop is white in color.
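The three anchor cases can be sketched in Python as follows. I am assuming the sample sentence ends with a single period right after “color”, since the last pattern needs exactly one more character before the end of the string; `whole_matches` is just a helper name I made up.

```python
import re

# Assumption: the sentence ends with "color." (one character after "color").
text = ("Laptop : I have two laptops , one of the laptop is red in Color "
        "but other laptop is white in color.")

def whole_matches(pattern, s):
    """Return the full text of every match, ignoring capture groups."""
    return [m.group(0) for m in re.finditer(pattern, s)]

anywhere = whole_matches(r"(l|L)aptop", text)   # no anchor: every occurrence
at_start = whole_matches(r"^(l|L)aptop", text)  # caret: only at the very start
at_end = whole_matches(r"(c|C)olor.$", text)    # dollar: only at the very end

print(anywhere)  # ['Laptop', 'laptop', 'laptop', 'laptop']
print(at_start)  # ['Laptop']
print(at_end)    # ['color.']
```

Note that “Color” in the middle of the sentence is ignored by the last pattern, because it is not at the end of the string.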

— Shorthand Character Classes : Many character sets are used very often, so regex provides shorthand character sets for them. The shorthand charsets are …

Shorthand character sets are easy to understand, which is why I’m not showing examples for them.
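For readers who want to try them anyway, here is a minimal Python sketch of three widely supported shorthands: \d for a digit, \w for a word character (letter, digit or underscore) and \s for whitespace. The article’s own list is not reproduced here, so treat this selection and the sample strings as my assumption.

```python
import re

text = "Room 42, floor 7"

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of letters, digits or underscores

# \s matches spaces, tabs and newlines, which makes it handy for splitting.
pieces = re.split(r"\s+", "a  b\tc")

print(digits)  # ['42', '7']
print(words)   # ['Room', '42', 'floor', '7']
print(pieces)  # ['a', 'b', 'c']
```

The uppercase versions \D, \W and \S match the opposite of their lowercase counterparts.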

— Life hack with Regex : Want more on regex? Then head to the References section, alright?

Regex is cool, and it can help you a lot in solving real-life problems, as the examples above show. Regex can also be boring to learn, but don’t be put off by that. I know regex is a lil boring, but if you dig deeper you will be amazed at how useful it is in your daily programming life. It will help you with real-life problems like web scraping, parsing, renaming lots of files in a few seconds and many more. OK, now I’ll give you a task:

Task : Make a regex pattern to validate the format of an IPv4 address. Ex :

If you solve this task, DM me your regex pattern on Twitter : Twitter Handle . If you want more tasks, you’re free to ping me on Twitter. Now on to the references and cheatsheet section … Here are a few references, cheatsheets and tools

— Time to say Good Bye : That’s all for today. See you in the next article. If there are any mistakes, please forgive me, because I’m a pure newbie in infosec. I hope you all understand my English. Thanks to everyone who is supporting me, and special thanks to NullCrowd. If you want more articles on regex, you can DM me on Twitter.

Thank You ,

Contact : Twitter Handle *
