Regular Expression

Regular expressions, or "regex", are used widely in the programming world, and almost every programming language supports them. A regular expression is used to search for a specific pattern in a string. Most languages have built-in methods for finding one string inside another, but those only handle straightforward, literal searches. With a regular expression we can search for any kind of pattern, simple or complex.

Today I am going to explain how regular expressions work and how to use them in different programming languages like JavaScript, PHP, Python and Java. Along with this I will share some of the most commonly used regular expressions, with explanations and examples.

Syntax

/pattern/modifier

Modifier

Modifiers change how the search is performed, for example to make it case-insensitive or global.

i : case-insensitive match

g : global match (find all matches instead of stopping after the first one)

m : multi-line match (^ and $ match at the start and end of every line)
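
The /pattern/modifier form is what JavaScript and PHP use; Python passes the modifiers as flags instead. The following is a minimal sketch of the three modifiers in Python, assuming a made-up two-line sample string. Python has no g flag, so re.findall stands in for the global search here.

```python
import re

# Sample text invented for this illustration
text = "Hello Nazish\nnazish says hi"

# i : case-insensitive match (re.IGNORECASE)
first = re.search(r"nazish", text, re.IGNORECASE)
print(first.group(0))   # Nazish

# g : no global flag in Python; findall returns every match
all_matches = re.findall(r"nazish", text, re.IGNORECASE)
print(all_matches)      # ['Nazish', 'nazish']

# m : with re.MULTILINE, ^ matches at the start of every line
line_starts = re.findall(r"^nazish", text, re.MULTILINE)
print(line_starts)      # ['nazish']
```

Without re.MULTILINE the last search finds nothing, because the string as a whole starts with "Hello".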

Quantifiers

If n is a character or a group, then

n+ : matches any string that contains at least one n

n* : matches any string that contains zero or more occurrences of n

n? : matches any string that contains zero or one occurrence of n

n{X} : matches any string that contains a sequence of exactly X n’s

n{X,Y} : matches any string that contains a sequence of X to Y n’s

n{X,} : matches any string that contains a sequence of at least X n’s

^n : matches any string that starts with n (^ is an anchor rather than a quantifier)

n$ : matches any string that ends with n ($ is also an anchor)
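
To see the quantifiers in action, here is a small sketch in Python. The test strings are invented for illustration, and re.fullmatch is used so that the whole string has to fit the pattern, which makes the effect of each quantifier easier to see.

```python
import re

# Each entry: (pattern, string, whether the whole string should match)
cases = [
    (r"a+", "aaa", True),        # one or more a
    (r"a+", "", False),          # + needs at least one
    (r"a*", "", True),           # zero or more a
    (r"ab?", "a", True),         # b is optional
    (r"ab?", "abb", False),      # but at most one b
    (r"a{3}", "aaa", True),      # exactly three a
    (r"a{2,3}", "aaaa", False),  # two to three a, four is too many
    (r"a{2,}", "aaaa", True),    # at least two a
]

results = [bool(re.fullmatch(p, s)) for p, s, _ in cases]
for (p, s, _), got in zip(cases, results):
    print(f"{p!r} on {s!r}: {got}")

# Anchors: ^ ties the match to the start, $ to the end
starts = bool(re.search(r"^He", "Hello"))  # True
ends = bool(re.search(r"He$", "Hello"))    # False
```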

Metacharacters

. : Find any single character, except a newline or other line terminator.

\w : Find a word character (a letter, a digit or an underscore).

\W : Find a non-word character.

\d : Find a digit character.

\D : Find a non-digit character.

\s : Find a white space character.

\S : Find a non-white space character.

\0 : Find a null character.

\n : Find a new line character.

\t : Find a tab character.
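
A short Python sketch of a few of these metacharacters follows. The sample string is made up for the illustration; it contains spaces, a tab and a trailing newline so that each class has something to find.

```python
import re

# Invented sample: contains spaces, one tab and a trailing newline
sample = "Order 66:\tship 3 boxes\n"

digits = re.findall(r"\d", sample)       # every single digit
numbers = re.findall(r"\d+", sample)     # runs of digits
words = re.findall(r"\w+", sample)       # runs of word characters
spaces = re.findall(r"\s", sample)       # spaces, tab and newline
tabs = re.findall(r"\t", sample)         # tab characters only
line = re.search(r".+", sample).group(0) # . does not cross the newline

print(digits)   # ['6', '6', '3']
print(numbers)  # ['66', '3']
print(words)    # ['Order', '66', 'ship', '3', 'boxes']
print(len(spaces), len(tabs))  # 5 1
print(repr(line))  # 'Order 66:\tship 3 boxes'
```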

Brackets

[abc] : Find any one of the characters between the brackets

[^abc] : Find any character that is not between the brackets

[0-9] : Find any digit from 0 to 9 (same as \d for ASCII digits)

(a|b) : Find any of the alternatives specified (either a or b)
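
The bracket forms can be tried out with a sketch like the one below, written in Python with an invented sample string. re.search is used for the (a|e) case because re.findall would return only the text captured by the parentheses.

```python
import re

# Invented sample string
text = "gray grey groy 2024"

ae_variants = re.findall(r"gr[ae]y", text)      # [ae]: a or e in the middle
non_digit_runs = re.findall(r"[^0-9 ]+", text)  # [^0-9 ]: anything except digits and spaces
digit_runs = re.findall(r"[0-9]+", text)        # [0-9]: digits only
first_alt = re.search(r"gr(a|e)y", text)        # (a|e): either alternative

print(ae_variants)         # ['gray', 'grey']
print(non_digit_runs)      # ['gray', 'grey', 'groy']
print(digit_runs)          # ['2024']
print(first_alt.group(0))  # gray
print(first_alt.group(1))  # a
```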

Regular Expression in different languages

Note: in the examples below I have used nazish as the pattern and i as the modifier.

JavaScript

var pattern = /nazish/i;
var str = "Hello nazish";
str.search(pattern);
// Returns the index of the first match in str, i.e. 6 (indexes start at 0)
// Returns -1 if the pattern is not found

PHP

$pattern = "/nazish/i";
$str = "Hello nazish";
preg_match($pattern, $str);
// Returns 1 if the pattern matches the string, 0 if it does not, and FALSE on error

Python

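import re

pattern = "nazish"  # Python patterns are plain strings, without / delimiters
text = "Hello nazish"
m = re.search(pattern, text, re.IGNORECASE)  # re.IGNORECASE replaces the i modifier
m.group(0)
# Returns the matched string, i.e. "nazish"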

Java

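import java.util.regex.*;

String pattern = "nazish";  // Java patterns are plain strings, without / delimiters
String str = "nazish";
Pattern p = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);  // flag replaces the i modifier
Matcher m = p.matcher(str);
boolean b = m.matches();
// b is true only if the entire string matches the pattern; use m.find() to search inside a longer string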
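
One detail worth checking in every language is whether a method looks for the pattern anywhere in the string or requires the whole string to match. JavaScript's search looks anywhere, while Java's matches() needs the entire string to fit. As a rough sketch, Python exposes both behaviours on a compiled pattern:

```python
import re

pattern = re.compile(r"nazish", re.IGNORECASE)
text = "Hello nazish"

found = pattern.search(text)         # looks anywhere in the string
at_start = pattern.match(text)       # only tries at index 0
whole = pattern.fullmatch(text)      # the whole string must match
whole_ok = pattern.fullmatch("Nazish")

print(found.start())   # 6
print(at_start)        # None
print(whole)           # None
print(bool(whole_ok))  # True
```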

Example

  • Pattern for Name
/[a-zA-Z ]{3,}/
Matches all strings that contain at least 3 consecutive letters or blank spaces
Matched strings "and", "andrew", "Nazish Fraz"
Unmatched strings "I", "an", "$@8"
  • Pattern for UserName
/^[a-z0-9_-]{4,20}$/
^[a-z0-9_-] begins with one of the characters inside the square brackets.
The pattern says a username may contain lowercase letters, numbers, underscores or minus signs, and its length can be 4 to 20.
Matched strings "nfraz007", "nazish_1234"
Unmatched strings "NFRAZ007", "123"
  • Pattern for mobile number (India)
/(\+91|0)?[789][\d]{9}/
(\+91|0)? the prefix +91 or 0 can come 0 or 1 times
[789] the first digit can only be 7, 8 or 9
[\d]{9} followed by exactly 9 more digits
  • Pattern for Password
/^[a-z0-9_-]{4,12}$/
A password can contain a-z, 0-9, underscore or minus, and its length can be 4 to 12.
  • Pattern for hex value
/^#?([a-f0-9]{6}|[a-f0-9]{3})$/
^#? starts with #, which can come 0 or 1 times
([a-f0-9]{6}|[a-f0-9]{3})$ ends with either of the two alternatives
[a-f0-9]{6} : alternative 1 : a-f, 0-9 of length 6
[a-f0-9]{3} : alternative 2 : a-f, 0-9 of length 3
  • Pattern for Email
/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/
^([a-z0-9_\.-]+) starts with any characters from a-z, 0-9, underscore, dot (escaped with a backslash) or minus, which can come 1 or more times
followed by @
([\da-z\.-]+) any digit, a-z, dot or minus can come 1 or more times
followed by a dot (escaped with a backslash)
([a-z\.]{2,6})$ ends with a-z or dot, with a length of 2 to 6
  • Pattern for URL
/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/
^(https?:\/\/)? starts with http (s can come 0 or 1 times), then :, then // (each slash escaped with a backslash). This whole group can come 0 or 1 times
([\da-z\.-]+) digit, a-z, dot or minus can come 1 or more times
followed by a dot (escaped with a backslash)
([a-z\.]{2,6}) any a-z or dot, with a length of 2 to 6
([\/\w \.-]*)* then a slash, word character, space, dot or minus can come 0 or more times, and the whole group can come 0 or more times
\/?$ ends with /, which can come 0 or 1 times
  • Pattern for Vehicle Number
/^[a-zA-Z]{2}[0-9]{2}[a-zA-Z0-9]{1,2}[0-9]{4}$/
^[a-zA-Z]{2} starts with a-z or A-Z of length 2
[0-9]{2} followed by 0-9 of length 2
[a-zA-Z0-9]{1,2} followed by alphanumeric characters of length 1 or 2
[0-9]{4}$ then ends with 4 digits
  • Pattern for Aadhaar Number
/^\d{4}\s?\d{4}\s?\d{4}$/
It contains 4 digits, then a blank space which can come 0 or 1 times, then the same again, and ends with 4 more digits.
  • Pattern for PAN Number
/[a-z]{5}[0-9]{4}[a-z]{1}/
[a-z]{5} starts with 5 letters
[0-9]{4} then 4 digits
[a-z]{1} then 1 letter
[a-z] only matches lowercase letters, so add the i modifier if the number may be written in uppercase.
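
Several of the patterns above can be collected into a small validator. The sketch below is in Python; the PATTERNS table and the is_valid helper are hypothetical names chosen for this illustration, and apart from the names and usernames the sample inputs are made up. It uses re.fullmatch, so the whole input has to match, which is stricter than the unanchored patterns in the list.

```python
import re

# Patterns copied from the examples above, without the / delimiters
PATTERNS = {
    "name": r"[a-zA-Z ]{3,}",
    "username": r"^[a-z0-9_-]{4,20}$",
    "mobile": r"(\+91|0)?[789][\d]{9}",
    "hex": r"^#?([a-f0-9]{6}|[a-f0-9]{3})$",
    "email": r"^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$",
    "vehicle": r"^[a-zA-Z]{2}[0-9]{2}[a-zA-Z0-9]{1,2}[0-9]{4}$",
}


def is_valid(kind: str, value: str) -> bool:
    """Return True if the whole value matches the pattern stored for kind."""
    return re.fullmatch(PATTERNS[kind], value) is not None


# Illustrative inputs only
samples = [
    ("name", "Nazish Fraz"), ("name", "an"),
    ("username", "nfraz007"), ("username", "NFRAZ007"),
    ("mobile", "+919876543210"), ("mobile", "1234567890"),
    ("hex", "#a3c113"), ("hex", "#afafah"),
    ("email", "nazish@example.com"), ("email", "nazish@example"),
    ("vehicle", "MH12AB1234"), ("vehicle", "M12AB1234"),
]

for kind, value in samples:
    print(f"{kind:8} {value!r:22} {is_valid(kind, value)}")
```

In each pair of samples the first value is accepted and the second is rejected.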

Practice

It is always a good idea to practice all the examples yourself. For this I know a pretty good website on which you can practice these examples.

You can download a regex.txt file which I have created for practice. Check out my GitHub repository.
