Regular Expressions are easy
“Give a man a regular expression and he’ll match a string… but by teaching him how to create them, you’ve given him enough rope to hang himself”
— Johann Wolfgang von Goethe
Regular expressions can be hard to understand at first, and writing your own regular expressions (RegExp) takes a lot of practice, but once you start using them you will never stop.
Regular expressions exist in many languages and share the same concepts, with small differences in syntax. The examples in this article use JavaScript, but after reading it you should be able to understand how they work and translate them into the syntax of your preferred language.
Definition
A Regular Expression is simply a type of object that is used to match character combinations in strings.
Creating Regex
A RegExp can be created in two different ways: either as a literal written between forward slashes, or with the constructor method, i.e. RegExp().
var re = /\s/;
var re = new RegExp('\\s');
Both of the above expressions match the first whitespace character that appears in the string. In the constructor form the pattern is passed as a string, so the backslash has to be doubled.
Note: When the pattern is constant, use the literal form with slashes; when the pattern has to be built at runtime, use the RegExp constructor.
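The two creation styles can be compared side by side. This is a minimal sketch: the helper name makeWordMatcher and the sample strings are made up for illustration, and the helper assumes the word contains no special RegExp characters.

```javascript
// Literal form: the pattern is fixed when the script is parsed.
const literal = /\s/;

// Constructor form: the pattern is a string, so the backslash is doubled.
const built = new RegExp('\\s');

// Both describe the same pattern.
console.log(literal.source === built.source); // true

// The constructor is handy when the pattern depends on a runtime value.
// makeWordMatcher is a hypothetical helper, not a built-in function.
function makeWordMatcher(word) {
  return new RegExp('\\b' + word + '\\b', 'i');
}

const matcher = makeWordMatcher('regex');
console.log(matcher.test('Learning Regex is fun')); // true
console.log(matcher.test('regexes are fun'));       // false
```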
Cheat Sheet:

Regular Expression Flags

Examples with Explanation
Example 1:
Find the word “problem” in the string “The solution of every problem is another problem.”
var re = /problem/g;
The above RegExp is written as a literal between forward slashes. The word problem is placed directly between the slashes, and the g flag finds every occurrence of the word in the string. If you leave out the g flag, the expression matches only the first occurrence of the word problem. Note that if the string contains the word problematic, the regular expression still matches the sub-word problem inside problematic.
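A short sketch of the behaviour described in Example 1. The variable names and the second test sentence are chosen only for illustration.

```javascript
const sentence = 'The solution of every problem is another problem.';

// With the g flag, match() returns every occurrence.
const all = sentence.match(/problem/g);
console.log(all.length); // 2

// Without the g flag, match() stops at the first occurrence
// and reports where it was found.
const first = sentence.match(/problem/);
console.log(first.index); // 22

// The pattern also matches inside a longer word.
const insideWord = /problem/.test('This is problematic');
console.log(insideWord); // true

// Adding word boundaries restricts the match to the whole word.
const wholeWord = /\bproblem\b/.test('This is problematic');
console.log(wholeWord); // false
```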
Example 2:
Write a regex that will match with the following conditions:
- Must be of length 6.
- First character should not be a digit.
- Second character should not be a lowercase vowel.
- Third character should not be b, c, D or F.
- Fourth character should not be a whitespace character (\r, \n, \t, \f or <space> ).
- Fifth character should not be an uppercase vowel.
- Sixth character should not be a . or , symbol.
var re = /^[\D][^aeiou][^bcDF][^\s][^AEIOU][^\.,]$/;
- The ^ at the start of the RegExp anchors the match to the beginning of the string.
- \D matches any character except a digit.
- ^[\D] means that the first character must not be a digit.
- [^aeiou] means that the second character must not be any of the characters in the brackets, i.e. a lowercase vowel.
- [^bcDF] means that the third character must not be any of the characters in the brackets, i.e. b, c, D or F.
- [^\s] means that the fourth character must not be a whitespace character.
- [^AEIOU] means that the fifth character must not be any of the characters in the brackets, i.e. an uppercase vowel.
- [^\.,] means that the sixth character must not be any of the characters in the brackets, i.e. . (full stop) or , (comma).
- $ anchors the match to the end of the string, so the sixth character must also be the last one.
Sample Matching Output:
- Breath
- Friend
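The pattern from Example 2 can be checked against the sample words and a few counterexamples. The counterexamples are made up here to exercise individual conditions.

```javascript
const sixChars = /^[\D][^aeiou][^bcDF][^\s][^AEIOU][^\.,]$/;

const results = {
  Breath: sixChars.test('Breath'),     // sample from the article
  Friend: sixChars.test('Friend'),     // sample from the article
  digitFirst: sixChars.test('1reath'), // first character is a digit
  tooShort: sixChars.test('Bread'),    // only 5 characters
  endsInDot: sixChars.test('Breat.'),  // sixth character is a full stop
};

console.log(results);
// { Breath: true, Friend: true, digitFirst: false, tooShort: false, endsInDot: false }
```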
Example 3:
Write a regex that will match String using the following conditions:
● String should begin with 1 or 2 digits.
● After that, String should have 3 or more letters (both lowercase and uppercase).
● Then String should end with up to 3 . (full stop) symbol(s). You can end with 0 to 3 . symbol(s), inclusively.
var re = /^\d{1,2}[a-z]{3,}\.{0,3}$/i ;
● ^\d means that the string must begin with a digit.
● {1,2} means that the digit must occur either once or twice.
● [a-z] means that the digits must be followed by letters. On its own it matches only lowercase letters; the i flag at the end of the RegExp makes the match case-insensitive, so the letters can be in either case.
● {3,} means that there must be at least 3 letters. Since no maximum is given, any number of additional letters is allowed.
● \. matches a literal full stop.
● {0,3} means that the full stop may occur between 0 and 3 times, inclusive.
● $ means that the string must end here.
Sample Matching Output:
● 3foo...
● 45Jumpers.
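A quick check of the Example 3 pattern. The failing inputs are invented here to show which quantifier rejects them.

```javascript
const digitsLettersDots = /^\d{1,2}[a-z]{3,}\.{0,3}$/i;

const checks = [
  digitsLettersDots.test('3foo...'),    // 1 digit, 3 letters, 3 full stops
  digitsLettersDots.test('45Jumpers.'), // 2 digits, 7 letters, 1 full stop
  digitsLettersDots.test('123abc'),     // 3 digits: \d{1,2} allows at most 2
  digitsLettersDots.test('7ab'),        // 2 letters: {3,} needs at least 3
  digitsLettersDots.test('9abc....'),   // 4 full stops: {0,3} allows at most 3
];

console.log(checks); // [ true, true, false, false, false ]
```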
Example 4:
Write a regex which will match word starting with vowel (a,e,i,o, u, A, E, I , O or U).
● The matched word can be of any length.
● The matched word should consist of letters (lowercase and uppercase both) only.
● The matched word must start and end with a word boundary.
var re = /\b[aeiouAEIOU][a-zA-Z]*\b/;
● \b at the beginning and end of the RegExp means that the match must start and end on a word boundary, i.e. it covers a whole word and not part of one.
● [aeiouAEIOU] means that the word must start with a vowel, either uppercase or lowercase.
● [a-zA-Z] means that the leading vowel may be followed only by letters, either uppercase or lowercase.
● * means that those following letters may occur zero or more times, so the word is at least 1 character long (the vowel itself) and can be of any length.
Sample Matching Output:
● Acting.
● Unique.
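The Example 4 pattern can be tried on a whole sentence. In this sketch the g flag is added (an addition to the pattern in the article) so that match() returns every word that starts with a vowel; the sentence is made up.

```javascript
const vowelWord = /\b[aeiouAEIOU][a-zA-Z]*\b/g;

const found = 'Every cat is unique'.match(vowelWord);
console.log(found); // [ 'Every', 'is', 'unique' ]

// "cat" is skipped: the only word boundary is before "c",
// and "c" is not a vowel.
```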
RegExp Capturing Group:
Some patterns we search for occur more than once in a string. Capturing groups let a RegExp remember part of a match so that the same text can be matched again or reused afterwards.
A capturing group is written with parentheses, ( and ). The part of the RegExp that has to be remembered and matched again is placed inside the parentheses.
● (x) Capturing Parentheses: Matches x and remember it for later usage.
● (?:x) Non-Capturing Parentheses: Matches x and does not remember it.
● x(?=y) Positive-Lookahead: Matches x only if it is followed by y.
● x(?!y) Negative-Lookahead: Matches x only if it is not followed by y.
● (?<=x)y Positive-Lookbehind: Matches y only if it is preceded by x.
● (?<!x)y Negative-Lookbehind: Matches y only if it is not preceded by x.
To match the remembered text again later in the pattern, use a backslash (\) followed by a number, i.e. \n. Groups are numbered by the order of their opening parentheses from left to right: the first group is \1, the second is \2 and so on.
A remembered group can also be referenced as $n, where n is the group number. The difference is that \n is used inside the RegExp itself, while $n is used outside it, for example in the replacement string passed to replace().
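The group syntax listed above can be seen in a few small cases. All sample strings here are invented for illustration, and the lookbehind line assumes an engine with ES2018 support (current Node.js and browsers).

```javascript
// \1 inside the pattern: the same word twice in a row.
const doubled = 'it was was late'.match(/\b(\w+) \1\b/);
console.log(doubled[0]); // was was
console.log(doubled[1]); // was

// $1 and $2 outside the pattern: reuse the groups in a replacement.
const swapped = 'John Smith'.replace(/(\w+) (\w+)/, '$2, $1');
console.log(swapped); // Smith, John

// Positive lookahead: digits only when followed by "px".
const pixels = '100px 200em'.match(/\d+(?=px)/g);
console.log(pixels); // [ '100' ]

// Positive lookbehind: digits only when preceded by "$".
const prices = '$42 and 17'.match(/(?<=\$)\d+/g);
console.log(prices); // [ '42' ]
```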
Example 5:
Write a RegEx that matches under the following conditions:
● Must start with Mr., Mrs., Ms., Dr. or Er..
● The rest of the string must contain only one or more English alphabetic letters (upper and lowercase).
var re = /^(Mr|Mrs|Ms|Dr|Er)\.[a-zA-Z]+$/;
● ^ marks the beginning of the string.
● (Mr|Mrs|Ms|Dr|Er) means that the string must start with either Mr, Mrs, Ms, Dr or Er.
● \. means that the title must be followed by a . (full stop).
● [a-zA-Z] means that the full stop must be followed by an uppercase or lowercase letter.
● + means that the preceding part, i.e. [a-zA-Z], must match at least once and may match any number of times.
● $ marks the end of the string, so nothing but letters may follow the full stop.
Sample Matching Output:
● Mr.Eline
● Er.Bini
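A sketch of the Example 5 pattern in use, including what the capturing group remembers. The names are made up. Note that the pattern allows only letters after the full stop, so a space there prevents a match.

```javascript
const titled = /^(Mr|Mrs|Ms|Dr|Er)\.[a-zA-Z]+$/;

const hit = 'Mrs.Smith'.match(titled);
console.log(hit[0]); // Mrs.Smith
console.log(hit[1]); // Mrs  (the title remembered by the group)

// A space after the full stop is not a letter, so this fails.
const withSpace = titled.test('Mr. Eline');
console.log(withSpace); // false

// A title that is not in the group fails as well.
const otherTitle = titled.test('Prof.Xavier');
console.log(otherTitle); // false
```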
Example 6:
Write a regex which will match a string with the following condition(s):
● String consists of 8 digits.
● String may have a "-" separator such that the string gets divided in parts, with each part having exactly two digits (e.g. 12-34-56-78).
var re = /^\d{2}(-?)(\d{2}\1){2}\d{2}$/;
● ^ marks the beginning of the string.
● \d{2} matches two digits: \d is a digit and {2} repeats it exactly twice.
● (-?) matches an optional - after the first two digits. It is wrapped in () (round brackets) so that whatever it matched, a - or nothing at all, is remembered for later usage. It is the first capturing group.
● (\d{2}\1) matches two more digits followed by \1, i.e. the same separator that the first capturing group matched. As this part is also wrapped in round brackets it is the second capturing group, and {2} repeats it twice.
● \d{2} matches the final two digits.
● $ marks the end of the string.
Sample Matching Output:
● 12-32-53-23
● 50-15-86-10
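A sketch of how the backreference keeps the separator consistent. It assumes the separator group is written as (-?), so that \1 can also remember an empty separator; the mixed inputs are invented for illustration.

```javascript
const eightDigits = /^\d{2}(-?)(\d{2}\1){2}\d{2}$/;

const separated = eightDigits.test('12-34-56-78'); // "-" everywhere
const plain = eightDigits.test('12345678');        // no separator anywhere
const mixedA = eightDigits.test('12-3456-78');     // separator missing once
const mixedB = eightDigits.test('1234-56-78');     // separator missing at first

console.log(separated, plain, mixedA, mixedB); // true true false false
```

Because \1 repeats whatever the first group actually matched, the string is accepted only when the separator is used at every position or at none.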
Example 7:
Write a regex which will match a string with the following condition(s):
● String consists of 8 digits.
● String must have a "---", "-", "." or ":" separator such that the string gets divided in 4 parts, with each part having exactly two digits.
● String must have exactly one kind of separator.
● Separators must have integers on both sides.
var re = /^\d{2}(---|-|\.|:)\d{2}\1\d{2}\1\d{2}$/;
● ^ marks the beginning of the string.
● \d{2} matches exactly two digits.
● (---|-|\.|:) means that the separator must be one of the symbols listed in the brackets. As it is wrapped in round brackets, the separator that matched is remembered for later usage.
● \d{2}\1 means that the next two digits must be followed by the same separator that was remembered from its first occurrence.
● $ marks the end of the string.
Sample Matching Outputs:
• 12-34-22-76
• 76:23:95:76
• 65---78---25---97
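The Example 7 pattern written as a JavaScript literal, with a few inputs to show the effect of \1. The rejected inputs are made up, and the three-character separator is assumed to be three hyphens.

```javascript
const sameSeparator = /^\d{2}(---|-|\.|:)\d{2}\1\d{2}\1\d{2}$/;

const colons = sameSeparator.test('76:23:95:76');
const tripleDash = sameSeparator.test('65---78---25---97');
const mixed = sameSeparator.test('12-34:56.78'); // three different separators
const none = sameSeparator.test('12345678');     // separator is required here

console.log(colons, tripleDash, mixed, none); // true true false false
```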
Example 8
Write a regex which will match String with following condition(s):
● String consists of tic or tac.
● tic should not be immediate neighbor of itself.
● The first tic must occur only when tac has appeared at least twice before.
var re = /^tac(tac(tic)?)+$/i;
● ^tac means that the string must start with tac.
● (tac(tic)?)+ means that the first tac must be followed by another tac, which may or may not be followed by tic. The ? makes the tic optional, so each tac after the first is followed by one tic or by none, and two tics can never sit next to each other. The + means that the first capturing group can occur one or more times.
● $ marks the end of the string.
● The i flag makes the match case-insensitive. Flags are lowercase in JavaScript; an uppercase I is not a valid flag.
Sample Matching Outputs:
• tactactic
• tactactictactic
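A check of the Example 8 pattern, written here without any flag. The rejected strings are invented to break one rule each.

```javascript
const ticTac = /^tac(tac(tic)?)+$/;

const outcomes = [
  ticTac.test('tactactic'),       // tac, then tac + tic
  ticTac.test('tactactictactic'), // tac, then tac + tic twice
  ticTac.test('tactic'),          // tic after only one tac
  ticTac.test('tactactictic'),    // two tics next to each other
  ticTac.test('tictac'),          // starts with tic
];

console.log(outcomes); // [ true, true, false, false, false ]
```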
Example 9
Write a regex that can match all occurrences of o followed immediately by oo in String.
var re = /o(?=oo)/g;
● o(?=oo) is a positive lookahead. It matches an o only if that o is immediately followed by a pair of o characters; the pair itself is not part of the match.
● The g flag finds every such o in the string instead of only the first one.
Sample Matching Output:
● Goooooo! (every o that still has two more o characters after it)
● Ooops! (only if the i flag is added, because the leading O is uppercase)
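A sketch of the lookahead from Example 9, written as a JavaScript literal with the g flag. The first string is built with repeat() so that the number of o characters is explicit; the case-insensitive variant is an addition for comparison.

```javascript
const lookahead = /o(?=oo)/g;

// "G" followed by six "o" characters and "!".
const shout = 'G' + 'o'.repeat(6) + '!';

// Each of the first four o's still has two o's after it.
const hits = shout.match(lookahead);
console.log(hits.length); // 4

// The lookahead consumes nothing, so every match is a single "o".
console.log(hits[0]); // o

// Case matters: the uppercase O is not matched without the i flag.
const strict = 'Ooops!'.match(/o(?=oo)/g);
console.log(strict); // null

const relaxed = 'Ooops!'.match(/o(?=oo)/gi);
console.log(relaxed); // [ 'O' ]
```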
Example 10
Write a regex which can match all characters which are not immediately followed by that same character.
var re = /(.)(?!\1)/g;
● (.) matches any single character and remembers it.
● (.)(?!\1) adds a negative lookahead: the remembered character matches only if it is not immediately followed by the same character.
● The g flag finds every such character in the string. Note that . does not match line break characters.
Sample Matching Output:
● Chocolate
● Fantastic
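The Example 10 pattern applied to a string with doubled letters and to one of the sample words. The string aabbc is made up for illustration.

```javascript
const notRepeated = /(.)(?!\1)/g;

// In each run of equal characters only the last one is not
// followed by itself.
const lastOfRuns = 'aabbc'.match(notRepeated);
console.log(lastOfRuns); // [ 'a', 'b', 'c' ]

// A word without doubled letters matches at every character.
const everyChar = 'Chocolate'.match(notRepeated).join('');
console.log(everyChar); // Chocolate
```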
Methods on Regular Expressions
1. r.test(s): Takes a string as its parameter, tests the regular expression against it and returns a Boolean value, i.e. either true or false.
2. s.match(r): Takes a RegExp as its parameter and tests it against the string. Without the g flag it returns the first match together with its captured groups; with the g flag it returns an array of all full matches (group 0) only. It returns null if nothing matches.
3. r.exec(s): Takes a string as its parameter and returns an array with the match and its captured groups, or null if there is no match. With the g flag, each call continues from where the previous match ended, so it can be called in a loop until it returns null.
4. string.split(): Takes a delimiter, which can be a character, a sequence of characters or a RegExp, and splits the string wherever the delimiter occurs.
5. string.replace(re, s): Takes a regular expression (re) and a replacement string (s), and returns a new string in which the text matched by re is replaced with s. Without the g flag only the first match is replaced.
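The five methods can be compared on one input. This is a minimal sketch; the sample text and variable names are made up.

```javascript
const text = 'Released 2019-05, updated 2021-11';

// test(): true or false.
const hasDate = /\d{4}-\d{2}/.test(text);
console.log(hasDate); // true

// match() without g: first match plus its capture groups.
const first = text.match(/(\d{4})-(\d{2})/);
console.log(first[0], first[1], first[2]); // 2019-05 2019 05

// match() with g: every full match, no capture groups.
const all = text.match(/\d{4}-\d{2}/g);
console.log(all); // [ '2019-05', '2021-11' ]

// exec() with g: each call resumes after the previous match.
const dateRe = /(\d{4})-(\d{2})/g;
const years = [];
let m;
while ((m = dateRe.exec(text)) !== null) {
  years.push(m[1]);
}
console.log(years); // [ '2019', '2021' ]

// split() with a RegExp delimiter.
const parts = 'a, b;c'.split(/[,;]\s*/);
console.log(parts); // [ 'a', 'b', 'c' ]

// replace() with $n references to the groups.
const swapped = text.replace(/(\d{4})-(\d{2})/g, '$2/$1');
console.log(swapped); // Released 05/2019, updated 11/2021
```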
Example 11
Write a RegExp that wraps every occurrence of a given name in underscores.
var name = "harry";
var text = "Harry is a suspicious character.";
var regexp = new RegExp("\\b(" + name + ")\\b", "gi");
console.log(text.replace(regexp, "_$1_"));
// _Harry_ is a suspicious character.
Example 12
Write a Regex to find numbers in the string with their indexes.
var input = "A string with 3 numbers in it... 42 and 88.";
var number = /\b\d+\b/g;
var match;
while (match = number.exec(input)) {
  console.log("Found", match[0], "at", match.index);
}
// Found 3 at 14
// Found 42 at 33
// Found 88 at 40
Example 13
Split a string on a delimiter and assign the parts to variables.
var input = 'john smith~123 Street~Apt 4~New York~NY~12345';
var [name, street, unit, city, state, zip] = input.split('~');
console.log(name); // john smith
console.log(street); // 123 Street
console.log(unit); // Apt 4
console.log(city); // New York
console.log(state); // NY
console.log(zip); // 12345
Example 14:
Write a RegExp to split a string into an array of strings.
var str = 'john smith~123 Street~Apt 4~New York~NY~12345';
var matches = str.match(/[^~]+/g);
console.log(matches); // ["john smith", "123 Street", "Apt 4", "New York", "NY", "12345"]
Example 15:
Charlie has been given an assignment by his Professor to extract the link and the link text from the HTML.
var input = '<a href="http://www.quackit.com/html/examples/html_links_examples.cfm">More Link Examples…</a></div>';
// Find the href attribute together with its quoted value.
var re = /href=(\").+?(\")/g;
var link;
link = '' + input.match(re);
link = '' + link.replace('=', ':');
console.log(link); // href:"http://www.quackit.com/html/examples/html_links_examples.cfm"
// Match from the opening tag to the last closing tag, then strip the tags.
re = /\<[^\>]+\>([a-z].+)\<[^\>]+\>/ig;
var text;
text = '' + input.match(re);
text = '' + text.replace(/<[a-z/].*?>/g, '');
console.log(text); // More Link Examples…
Keep practicing and regular expressions will become easier for you.
Thanks for taking the time to check out my article!