Puzzling with Regular Expression😕

Biswasindhu Mandal
20 min read · Apr 14, 2022


Here we learn the "A B C" of regular expressions.

A regular expression (regex) is a sequence of characters that defines a search pattern for text.
Regular expressions are used in search engines and in the search-and-replace dialogs of text editors.

Example: /ab/g, /ab/, /ab/i

✍ Open your browser console and write the code there …
You can also test your regex at: regextester.com, regexr.com, regex101.com.

Python: pythex.org

Basic syntax with some examples:

The ^ symbol matches the position at the start of a string.
The $ symbol matches the position at the end of a string.
Both are called anchors.

Task: Test whether a string S starts with 'a'
Ans: /^a/g.test(S)
✔ Valid: abcd, a, a123a
✘ Invalid: bab, 1aaaa
Task: Test whether a string S ends with 'a'
Ans: /a$/g.test(S)
✔ Valid: bcdsa, a, a123a
✘ Invalid: bab, aaaa1
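
The two anchor tasks above can be tried out with a small sketch like the following. The helper names startsWithA and endsWithA are hypothetical, chosen only for this illustration.

```javascript
// Hypothetical helpers (illustrative names, not part of any library).
const startsWithA = (s) => /^a/.test(s); // ^ anchors the match to the start
const endsWithA = (s) => /a$/.test(s);   // $ anchors the match to the end

console.log(['abcd', 'a', 'a123a'].map((s) => startsWithA(s))); // [true, true, true]
console.log(['bab', '1aaaa'].map((s) => startsWithA(s)));       // [false, false]
console.log(['bcdsa', 'a', 'a123a'].map((s) => endsWithA(s)));  // [true, true, true]
console.log(['bab', 'aaaa1'].map((s) => endsWithA(s)));         // [false, false]
```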

The expression \d matches any digit [0-9].
The expression \D matches any character that is not a digit.

Task: Test whether a string S contains the pattern xxXxxXxxxx.
Here, x denotes a digit and X denotes a non-digit character.
Ans: /\d\d\D\d\d\D\d\d\d\d/g.test(S);
✔ Valid: 23A89H0987, 98p89R2560
✘ Invalid: a8D88d8977, &8D90w8999
🎯 A non-digit character can be written as /\D/g or /[^\d]/g

The expression \s matches any whitespace character, such as \r (carriage return), \n (newline), \t (tab), \f (form feed), or a space.
The expression \S matches any non-whitespace character: [a-zA-Z0-9] and special characters.

Special characters are:
@ # $ % ^ & * ` ( ) + _ = { } [ ] | - \ ~ / ‘ “ < > , . ; : ? !

Task: Test whether a string S has the pattern XXxXXxXX
Here, x denotes a whitespace character and X denotes a non-whitespace character.
Ans: /^\S\S\s\S\S\s\S\S$/g.test(S)
✔ Valid: AB cd eR, 11 B2 Ce
✘ Invalid: a BB CDe, 111 ab cde
🚩 The anchors ^ and $ are needed here: without them, '111 ab cde' would pass because it contains '11 ab cd'.
🎯 A whitespace character can be written as /\s/g or /[^\S]/g

The expression \w matches any word character: [a-zA-Z0-9_].
The expression \W matches any non-word character, i.e. any character other than [a-zA-Z0-9_].

Task: Test whether a string S contains the pattern xxxXxxxxxxxxxxXxxx
Here, x denotes a word character and X denotes a non-word character.
Ans: /\w\w\w\W\w\w\w\w\w\w\w\w\w\w\W\w\w\w/g.test(S)
🎯 A non-word character can be written as /\W/g or /[^\w]/g
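
A short sketch of how anchoring changes the result of these shape checks. The variable names (loose, strict, digitShape, wordShape) and the sample string 'abc.defghijklm.nop' are made up for this illustration.

```javascript
// Unanchored: the pattern may match anywhere inside the string.
const loose = /\S\S\s\S\S\s\S\S/;
// Anchored: the whole string must have exactly this shape.
const strict = /^\S\S\s\S\S\s\S\S$/;

console.log(loose.test('111 ab cde'));  // true, because of the substring '11 ab cd'
console.log(strict.test('111 ab cde')); // false
console.log(strict.test('AB cd eR'));   // true

// The same idea with \d/\D and \w/\W, using {n} to shorten the repeats.
const digitShape = /^\d\d\D\d\d\D\d{4}$/;
const wordShape = /^\w{3}\W\w{10}\W\w{3}$/;
console.log(digitShape.test('23A89H0987'));        // true
console.log(wordShape.test('abc.defghijklm.nop')); // true
```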

Regular Expression Grouping:
Match one of the characters: [<expression>]
Match none of the characters: [^<expression>]
Capturing group: (<expression>)
Either or: <expression>|<expression>

Task: Test whether a string S starts with a vowel (a, e, i, o, u)
Ans: /^(a|e|i|o|u|A|E|I|O|U)/g.test(S) or /^[aeiouAEIOU]/g.test(S)
✔ Valid: apple, onion
✘ Invalid: mango, cat
Task: Test whether a string S ends with a vowel
Ans: /(a|e|i|o|u|A|E|I|O|U)$/g.test(S) or /[aeiouAEIOU]$/g.test(S)
✔ Valid: apple, mango
✘ Invalid: onion, cat
Task: Test whether a string S starts with a consonant
Ans: /^[b-df-hj-np-tv-z]/gi.test(S) or /^(?![aeiou])[a-z]/gi.test(S)
✔ Valid: mango, cat
✘ Invalid: apple, onion
🚩 Inside [ ], the characters ( ) and | are literal, so [^(a|e|i|o|u)] does not group anything. Also, [^aeiouAEIOU] accepts digits and symbols, which are not consonants.
Task: Test whether a string S starts with a letter
Ans: /^[a-zA-Z]/g.test(S)
✔ Valid: mango, cat, uiiiuabcd
✘ Invalid: 123uiiiuabcd, !onion
Task: Test whether a string S starts with a non-letter character
Ans: /^[^a-zA-Z]/g.test(S)
✔ Valid: '123abcd', ' abcd', '! abcd'
✘ Invalid: 'abcd', 'onion'
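
The character-class tasks above can be checked with a sketch like this one. The consonant range [b-df-hj-np-tv-z] is one possible way to list the consonants, and the variable names are illustrative.

```javascript
const startsWithVowel = /^[aeiou]/i;
const startsWithConsonant = /^[b-df-hj-np-tv-z]/i; // every letter except a, e, i, o, u
const startsWithNonLetter = /^[^a-zA-Z]/;

console.log(startsWithVowel.test('Onion'));       // true
console.log(startsWithConsonant.test('mango'));   // true
console.log(startsWithConsonant.test('123abcd')); // false: a digit is not a consonant
console.log(startsWithNonLetter.test('! abcd'));  // true

// Inside a character class, ( ) and | are ordinary characters:
console.log(/^[^(aeiou)]/.test('(abc'));    // false, because '(' is listed in the class
console.log(/^[^(aeiou)]/.test('123abcd')); // true, although '1' is not a consonant
```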

Regular Expression Quantifiers:
0 or more times: <expression>*
1 or more times: <expression>+
An exact number of times: <expression>{<number>}
0 or 1 time: <expression>?

Task1: Test whether a string S has the pattern aa, aba, abba, abbba ...
Ans: /^ab*a$/g.test(S)
✔ Valid: aa, abbbbbbbbbbbbbbba
✘ Invalid: a, aca, abbbcbbba, abaaaa
Task2: Test whether a string S has the pattern aba, abba, abbba ...
Ans: /^ab+a$/g.test(S)
✔ Valid: aba, abbbbbbbbbbbbbbba
✘ Invalid: a, aa, aca, abbbaa
Task3: Test whether a string S has the pattern a, aa, aba, abba, abbba ...
Ans: /^a(b*a)?$/g.test(S)
✔ Valid: a, aa, abbba
✘ Invalid: b, abab, abcba, abbbbbaa
Task4: Test whether a string S has the pattern a, b, aa, bb, ab, ba, aaa, aba, aab, baa, bab, bba, abb, bbb, aaaa, aaab ...
Ans: /^(a+(ba)*b*|b+(ab)*a*)$/g.test(S)
✔ Example: 'a b aa bb ab ba aaa aba aab baa bab bba abb bbb aaaa aaab ...'.match(/\b(a+(ba)*b*|b+(ab)*a*)\b/g)
// 'a', 'b', 'aa', 'bb', 'ab', 'ba', 'aaa', 'aba', 'aab', 'baa', 'bab', 'bba', 'abb', 'bbb', 'aaaa', 'aaab'
🚩 This pattern covers the strings listed above but not every mix of a and b (for example, it rejects 'abba'). Task6 and Task7 handle the general case.
Task5: Test whether a string S contains at least two consecutive vowels
Ans: /(a|e|i|o|u|A|E|I|O|U){2}/g.test(S) or /^[a-zA-Z]*(a|e|i|o|u|A|E|I|O|U){2}[a-zA-Z]*$/g.test(S)
✔ Valid: 'You', 'About', 'School', 'Beautiful'
✘ Invalid: 'Cow', 'Go', 'No', 'xy'
Task6: Solve Task4 using the quantifier ?
Ans: /^(a+(b?a?)*|b+(a?b?)*)$/g.test(S)
✔ Example: 'a b aa bb ab ba aaa aba aab baa bab bba abb bbb aaaa aaab ...'.match(/\b(a+(b?a?)*|b+(a?b?)*)\b/g) // 'a', 'b', 'aa', 'bb', 'ab', 'ba', 'aaa', 'aba', 'aab', 'baa', 'bab', 'bba', 'abb', 'bbb', 'aaaa', 'aaab'
Task7: Solve Task4 in the shortest way.
Ans: /^(a|b)+$/g.test(S) or /^[ab]+$/g.test(S)
✔ Example: 'a b aa bb ab ba aaa aba aab baa bab bba abb bbb aaaa aaab ...'.match(/\b(a|b)+\b/g) // 'a', 'b', 'aa', 'bb', 'ab', 'ba', 'aaa', 'aba', 'aab', 'baa', 'bab', 'bba', 'abb', 'bbb', 'aaaa', 'aaab'
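
A small table-driven sketch comparing the quantifiers on the same inputs. The cases array is invented for this illustration; the last two rows show that the longer alternation from Task4 rejects 'abba', while the character class [ab]+ accepts it.

```javascript
// Each row: [pattern, input, expected result of pattern.test(input)]
const cases = [
  [/^ab*a$/, 'aa', true],      // * allows zero b's
  [/^ab*a$/, 'abbba', true],
  [/^ab+a$/, 'aa', false],     // + needs at least one b
  [/^ab+a$/, 'aba', true],
  [/^ab?a$/, 'abba', false],   // ? allows at most one b
  [/^ab{2}a$/, 'abba', true],  // {2} needs exactly two b's
  [/^(a+(ba)*b*|b+(ab)*a*)$/, 'abba', false],
  [/^[ab]+$/, 'abba', true],
];

for (const [re, input, expected] of cases) {
  const verdict = re.test(input) === expected ? 'as expected' : 'UNEXPECTED';
  console.log(String(re), JSON.stringify(input), verdict);
}
```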

The expression \b matches a word boundary: the position between a word character (\w) and a non-word character (\W), or between a word character and the start or end of the string.
The expression \B matches any position that is not a word boundary.
Like the caret (^ ← start of string) and the dollar ($ ← end of string), \b and \B are anchors: they match a position, not a character.

Task: Consider the following strings:
s1 = 'A paragraph is a series of sentences that are organized and coherent, and are all related to a single topic.'
s2 = 'Gold price for today: 10 grams of 24-carat priced at Rs 53450; silver sold at Rs 68100 per kilo.'
Q1) Find the rupee amounts in s2
Q2) Find the words in s1 that start with 'se'
Q3) Find the words in s2 that contain 'r'
Q4) Find the words that have a special character at the end
Ans:
A1) s2.match(/\bRs\s(\d+)\b/g) // 'Rs 53450', 'Rs 68100'
A2) s1.match(/\bse([a-zA-Z]+)\b/g) or s1.match(/\bse([a-zA-Z]+)/g)
// 'series', 'sentences'
A3) words that start with 'r' ➨ /\br([a-zA-Z]+)\b/g // no match in s2
words that end with 'r' ➨ /\b([a-zA-Z]+)r\b/g // 'for', 'silver', 'per'
words that contain 'r' but not in the first or last position ➨ /\b([a-zA-Z]+)r([a-zA-Z]+)\b/g // 'price', 'grams', 'carat', 'priced'
So, words that contain 'r' ➨ /\b[a-zA-Z]*r[a-zA-Z]*\b/g or /\b\w*r\w*\b/g // 'price', 'for', 'grams', 'carat', 'priced', 'silver', 'per'
A4) words that have a special character at the end ➨ /\w+[^\s\w]/g // in s2: 'today:', '24-', '53450;', 'kilo.'
- \w+ : one or more word characters
- [^\s\w] : one character that is neither a whitespace character nor a word character (a-zA-Z0-9_)
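
A brief sketch of the word-boundary answers, run against s2. The variables amounts and wordsWithR are illustrative names; the last line shows that \b matches positions rather than characters.

```javascript
const s2 = 'Gold price for today: 10 grams of 24-carat priced at Rs 53450; silver sold at Rs 68100 per kilo.';

// Capture group 1 holds the digits after "Rs ".
const amounts = Array.from(s2.matchAll(/\bRs\s(\d+)\b/g), (m) => Number(m[1]));
console.log(amounts); // [53450, 68100]

// Whole words that contain a lowercase 'r'.
const wordsWithR = s2.match(/\b\w*r\w*\b/g);
console.log(wordsWithR); // ['price', 'for', 'grams', 'carat', 'priced', 'silver', 'per']

// \b is zero-width: replacing it inserts text at each boundary.
console.log('a-b'.replace(/\b/g, '|')); // '|a|-|b|'
```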

The dot (.) matches any character except a newline.
Note: To match a literal dot in the test string, escape it with a backslash (\.)

In Java, use (\\.) instead of (\.)

Task: You have a test string S. Write a regular expression that matches only and exactly strings of the form abc.def.ghi.jkx, where each of a, b, c, d, e, f, g, h, i, j, k, x can be any single character except a newline.
Ans: /^.{3}\..{3}\..{3}\..{3}$/g
Explanation:
- ^.{3}\. : three characters of any kind, then a literal dot :: checks [abc.]
- .{3}\. : three more characters, then a literal dot :: checks [def.]
- .{3}\. : three more characters, then a literal dot :: checks [ghi.]
- .{3}$ : the last three characters, then the end of the string :: checks [jkx]
✔ Valid: "123.456.abc.def", "'!@.#$%.^&*.()_"
✘ Invalid: "1123.456.abc.def"

The character class [ ] matches exactly one of the characters placed inside the square brackets.

Character Type:
lowercase letter: [a-z]
word character: [0-9a-zA-Z_] or [\w]
whitespace character: [ \r\n\t\f] or [\s]
non word character: [\W] or [^a-zA-Z0-9_]
digits: [0-9] or [\d]
non digit: [\D] or [^0-9]
uppercase letter: [A-Z]
letters: [a-zA-Z] (either lowercase or uppercase)
non whitespace character: [\S] (letters, digits, and special characters)
vowel: [aeiouAEIOU]
consonant: [b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z]

Regex: Lookahead and Lookbehind :

The syntax is: X(?=Y), it means "look for X, but match only if followed by Y". That is the regular expression engine finds X and then checks if there’s Y immediately after it.
More complex tests are possible, e.g. X(?=Y)(?=Z) means:
1. Find X.
2. Check if Y is immediately after X (skip if isn’t).
3. Check if Z is also immediately after X (skip if isn’t).
4. If both tests passed, then the X is a match, otherwise continue searching.
Task: You have a test string S. Write a regex that matches all occurrences of `o` followed immediately by `oo` in S.
That is, match each `o` that comes just before `oo`.
Ans: /o(?=oo)/g
S = goooglooo! (2 matches), googoogoogooo (1 match), oooooogooo (5 matches), gooooo! (3 matches)
Task: You have a test string S1 = `subalbasu`. Write a regex that matches all occurrences of any letter followed immediately by `ba` in S1.
Ans: /[a-zA-Z](?=ba)/g (2 matches)
Task: Count how many times a letter comes immediately after `su`.
S2 = subalbasu (1 match)
S3 = subalbasua (2 matches)
Ans: /(?<=su)[a-zA-Z]/g
This uses a lookbehind: (?<=Y)X means "look for X, but match only if it is preceded by Y".
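
The match counts in these lookahead and lookbehind tasks can be reproduced with a sketch like this. countMatches is a hypothetical helper written for this example; lookbehind needs a reasonably modern JavaScript engine.

```javascript
// Hypothetical helper: number of matches of a global regex in a string.
const countMatches = (re, s) => (s.match(re) || []).length;

// Lookahead: an 'o' that is immediately followed by 'oo'.
console.log(countMatches(/o(?=oo)/g, 'goooglooo!'));    // 2
console.log(countMatches(/o(?=oo)/g, 'googoogoogooo')); // 1
console.log(countMatches(/o(?=oo)/g, 'oooooogooo'));    // 5
console.log(countMatches(/o(?=oo)/g, 'gooooo!'));       // 3

// Lookahead: a letter that is immediately followed by 'ba'.
console.log(countMatches(/[a-zA-Z](?=ba)/g, 'subalbasu')); // 2

// Lookbehind: a letter that is immediately preceded by 'su'.
console.log(countMatches(/(?<=su)[a-zA-Z]/g, 'subalbasu'));  // 1
console.log(countMatches(/(?<=su)[a-zA-Z]/g, 'subalbasua')); // 2
```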

Learn about Regular Expressions in JavaScript:

A regular expression is an object that describes a pattern of characters within a string.
There are two ways to create a regular expression object.
(i) Literal notation: the expression is enclosed between slashes (/), without quotation marks (' or ").
(ii) Constructor: the expression is passed to RegExp together with optional flags. The expression is either a string enclosed in quotation marks (' or ") or a regex literal enclosed in slashes (/).

Example:
(i) Literal notation pattern:
/<expression>/<flags>
(ii) Constructor pattern:
new RegExp('<expression>', '<flags>')
or
new RegExp(/<expression>/, "<flags>")
or
new RegExp("<expression>")
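
One practical difference between the two notations is escaping: inside a string, each backslash of the pattern must itself be escaped. The sketch below illustrates this; escapeRegExp is a hypothetical helper (a common idiom, not a built-in) for building a pattern from plain text.

```javascript
// The same pattern written both ways.
const literal = /^\d+\.\d+$/;
const built = new RegExp('^\\d+\\.\\d+$'); // note the doubled backslashes
console.log(literal.source === built.source); // true
console.log(built.test('3.14'));              // true

// Hypothetical helper: escape regex metacharacters in plain text.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// The constructor is useful when the pattern is only known at run time.
const price = new RegExp(escapeRegExp('$5.00'), 'g');
console.log('cost: $5.00, not $5x00'.match(price)); // ['$5.00']
```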

Regular Expression Properties:

1. Global:

Flag "g" indicates that the regular expression should be tested against all possible matches in a string.
Check: regx.global === true
Example:
const st = "A regular expression is an object that describes the pattern of characters within a string."
const regx = new RegExp('th');
st.match(regx); // matches only the first 'th', i.e. the one in 'that'
const regx_g = new RegExp('th', 'g');
st.match(regx_g); // matches every 'th': in 'that', 'the', and 'within'
- Generally used to replace every occurrence of a substring in a string
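
One side effect of the "g" flag is worth a short sketch: a global regex object remembers where its last match ended (lastIndex), so calling test() repeatedly on the same object can give surprising results. The variable name re is illustrative.

```javascript
const re = /th/g;

console.log(re.test('that')); // true
console.log(re.lastIndex);    // 2, the search will resume from here
console.log(re.test('that')); // false, nothing matches from index 2 onward
console.log(re.lastIndex);    // 0, reset after the failed search

// match() and replace() with "g" work on all occurrences:
console.log('that the within'.match(/th/g).length); // 3
console.log('a-b-c'.replace(/-/g, '+'));            // 'a+b+c'
```

When a stored regex is only used with test() for validation, leaving out the "g" flag avoids this state.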

2. Ignore Case:

Flag "i" indicates that matching should ignore case. By default, a regular expression is case sensitive.
Check: regx.ignoreCase === true
Example:

const st = "A Regular Expression is an object that describes the pattern of characters within a string."

const r1 = new RegExp('re')
const r2 = new RegExp('re', 'i')

st.match(r1) // ["re"] ➨ 'Expression'
st.match(r2) // ["Re"] ➨ 'Regular'
const reg1 = new RegExp('Re');
const reg2 = new RegExp('Re', 'i');

st.match(reg1) // ["Re"] ➨ 'Regular'
st.match(reg2) // ["Re"] ➨ 'Regular'
st.match(new RegExp('re', 'g')) // ["re"] ➨ 'Expression'
st.match(new RegExp('Re', 'g')) // ["Re"] ➨ 'Regular'
st.match(new RegExp('re', 'gi')) // ['Re', 're'] ➨ 'Regular' & 'Expression'
st.match(new RegExp('Re', 'gi')) // ['Re', 're'] ➨ 'Regular' & 'Expression'
- Use case
vowel validation:
regx = new RegExp('[aeiou]', 'gi') or /[aeiou]/gi

3. Multiline:

Flag "m" enables multiline mode: ^ and $ match at the start and end of every line of the input string, not only at the start and end of the whole string.
Check: regex.multiline === true
Example:

const st = "Do not feel safe. \nThe poet remembers. \nYou can kill me but another is born."
const regx0 = new RegExp('^[a-zA-Z]+', 'gi')
st.match(regx0); // ["Do"]
const regx = new RegExp('^[a-zA-Z]+', 'gmi')
st.match(regx); // ["Do", "The", "You"] ➨ the first word of every line
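
A compact sketch of the same effect on both anchors, using a made-up three-line string:

```javascript
const text = 'first line\nsecond line\nthird line';

// Without "m", ^ matches only at the very start of the string.
console.log(text.match(/^\w+/g));  // ['first']

// With "m", ^ also matches right after every newline.
console.log(text.match(/^\w+/gm)); // ['first', 'second', 'third']

// With "m", $ also matches right before every newline.
console.log(text.match(/\w+$/gm)); // ['line', 'line', 'line']
```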

Summary of Regular Expression Indicators


Regular Expression Methods:

How does Regular Expression Work?

On a computer, text is represented as a string of characters. A regular expression engine reads the string from left to right, and the pattern tells it what to match. It follows this chart:

Regular Expression Flowchart

Some Interesting Question Answer:

1. Contains vowels, and starts and ends with the same vowel.
Example:
✔ valid: aeioa, eioe, Abca
✘ invalid: abb, abcd, a89b

regx = /^[a].*[a]$|^[e].*[e]$|^[i].*[i]$|^[o].*[o]$|^[u].*[u]$/gi;
Explanation:
starts with a (^[a]), then anything except a newline (.*), and ends with a ([a]$)
or (|)
starts with e (^[e]), then anything except a newline (.*), and ends with e ([e]$) …
'g': match globally, 'i': ignore case.

2. Create a regex for a string that starts with Mr., Mrs., Ms., Dr., or Er.
and ends with at least one English alphabetic letter (i.e., [a-z] or [A-Z]).
✔ valid: Mr.X, Mrs.Y
✘ invalid: Dr#Joseph, Er .Abc

regex = /^(?:Mr\.|Mrs\.|Ms\.|Dr\.|Er\.).*[a-zA-Z]$/g;
Explanation:
starts with Mr., Mrs., Ms., Dr., or Er., i.e. ^(?:Mr\.|Mrs\.|Ms\.|Dr\.|Er\.)
then (.*) : anything except a newline
and ends with an English alphabetic letter, i.e. [a-zA-Z]$

3. Create a regex to check that a string starts with a letter that belongs to the set {b,c,d,f,g}
✔ valid: badc, fmnk, gaeiou
✘ invalid: abcd, 8uio, tggf

regx = /^[bcdfg]/g

4. Regex for a string of exactly 17 alphanumeric characters that must start with "AB", "DE", or "GH".

regx = /^(AB|DE|GH)[a-zA-Z0-9]{15}$/g
✔ Valid: AB163829F13246915, DET639601BA167860, GHF1973771A002957
✘ Invalid: XYZ63829F13246915, AAA639601BA167860, BBC1973771A002957

5. Decimal number validation:
✔ valid: 0.01, .234, 1234.9
✘ invalid: 345., ., 1234, abcd.123

regx = /^[0-9]*\.[0-9]+$/g or /^\d*\.\d+$/g
Explanation:
- `^` matches the start of the line
- `[0-9]*` or `\d*` : zero or more digits before the decimal point
- `\.` : the decimal point
- `[0-9]+` or `\d+` : one or more digits after the decimal point
- `$` matches the end of the line
🚩 To accept both integers and decimals (so that 1234 is also valid), use this regex: /^(\d*\.)?\d+$/g
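
A short sketch contrasting the two patterns on the sample inputs. The function names isDecimal and isNumber are illustrative.

```javascript
// Requires a decimal point followed by at least one digit.
const isDecimal = (s) => /^\d*\.\d+$/.test(s);
// Makes the "digits and point" prefix optional, so integers pass too.
const isNumber = (s) => /^(\d*\.)?\d+$/.test(s);

for (const s of ['0.01', '.234', '1234.9', '1234', '345.', '.', 'abcd.123']) {
  console.log(JSON.stringify(s), 'decimal:', isDecimal(s), 'number:', isNumber(s));
}
// '1234' is the only input where the two differ: decimal false, number true.
```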

6. Your test string, S should follow the conditions:
— S must be of length 6
— First character should not be a digit (1, 2, 3, 4, 5, 6, 7, 8, 9, or 0).
— Second character should not be a lowercase vowel (a, e, i, o, or u).
— Third character should not be b, c, D, or E.
— Fourth character should not be a whitespace character (\r, \n, \t, \f or the space character).
— Fifth character should not be an uppercase vowel (A, E, I, O, or U).
— Sixth character should not be a full-stop(.) or comma(,) symbol.

➧ ^ matches the line start.
➧ A caret, ^, inside the square brackets match a character from the string as long as that character is not found in the square bracket from the pattern.
➧ [^aeiou] will match any character that is not a, e, i, o or u.
➧ [^\d] will match any non-digit character, like \D.
➧ Again there are 6 groups of characters surrounded by square brackets,
➧ Thus ensuring the length of the string is 6, since we have the line start and line end at the ends of our pattern.

Regex_Pattern: /^[^\d][^aeiou][^bcDE][\S][^AEIOU][^\.\,]$/g
🚩 $ is used because the string contains exactly 6 characters

7. Your test string S should have the following requirements:
S must be of length 6
First character: 1, 2 or 3
Second character: 1, 2 or 0
Third character: x, s or 0
Fourth character: 3, 0, A or a
Fifth character: x, s or u
Sixth character: full stop(.) or comma(,)

➧ ^ matches the line start.
➧ Any character from inside the square brackets can be matched with one character from the string.
➧ [123] can match 1, 2 or 3 in the string.
➧ Each group of characters inside the square brackets matches with one character in the string,
➧ There are 6 groups of characters surrounded by square brackets, each group to match one character from the string,
➧ Thus ensuring the length of the string is 6, since we have the line start and line end at the ends of our pattern.
Regex_Pattern: /^[123][120][xs0][30Aa][xsu][\.\,]$/g
🚩 $ is used because the string contains exactly 6 characters

8. String should be matched with the following conditions:
— The test string’s length is greater than or equal to 5.
— The first character must be a lowercase English letter.
— The second character is a positive digit, cannot be zero.
— The third character must not be a lowercase English letter.
— The fourth character must not be an uppercase English letter.
— The fifth character must be an uppercase English letter.

➧ ^ matches the line start. We need the line start because we’re matching starting from the first character of the string.
➧ [a-z] will match any lowercase English letter from a to z both inclusive. [b-z] and [a-y] would match all lowercase English letters except for a and z respectively.
➧ [1-9] will match any digit except for zero.
➧ [^a-z] will match any character that is not a lowercase English letter.
➧ [^A-Z] will match any character that is not an uppercase English letter.
➧ [A-Z] will match any character that is an uppercase English letter.
➧ Note that for this pattern there is no line end, because only the first five characters of the string are relevant, the rest may be anything.
Regex_Pattern: /^[a-z][1-9][^a-z][^A-Z][A-Z]/g
🚩 $ is not used because the string may contain more than 5 characters

9. String S should be matched with the following conditions:
— S should begin with two or more digits
— after that, S should have zero or more lowercase letters.
— should end with zero or more uppercase letters

➧ S should begin with two or more digits : Regex: /^[\d][\d][\d]*/  or /^[\d]{2}[\d]*/ or /^[\d][\d]+/ 
➧ After that, S should have zero or more lowercase letters : Regex: /[a-z]*/
➧ should end with 0 or more uppercase letters : Regex: /[A-Z]*$/g
regex = /^[\d][\d][\d]*[a-z]*[A-Z]*$/g or /^[\d]{2}[\d]*[a-z]*[A-Z]*$/g or /^[\d][\d]+[a-z]*[A-Z]*$/g

10. Test string should consist of only lowercase and uppercase letters (no numbers or symbols).
Test string should end with s.

Since the test string contains only lowercase and uppercase letters: regex = /^[a-zA-Z]+/g or /^[a-z]+/gi or /^[A-Z]+/gi
The string must end with s : regex = /[s]$/g

So, regex = /^[a-zA-Z]+[s]$/g
But "s" alone is also a valid input, because it is a lowercase letter and ends with 's'.
Therefore, regex: /^[a-zA-Z]*[s]$/g

11. Testing String begins with one or more digits.
After that Testing string has one or more uppercase letters.
Testing string end with one or more lowercase letters.

➧ w+ : matches the character w one or more times
➧ ^[0-9]+ : starts with one or more digits
➧ [A-Z]+ : then one or more uppercase letters
➧ [a-z]+$ : ends with one or more lowercase letters
Regex: /^[0-9]+[A-Z]+[a-z]+$/g

12. Testing string contains the exact format: abc.def.ghi.jkx
— where a, b, c, d, e, f, g, h, i, j, k, x are any single character except the newline (i.e. in Regex: [^\n])

Regex: /^[^\n][^\n][^\n][.][^\n][^\n][^\n][.][^\n][^\n][^\n][.][^\n][^\n][^\n]$/g
or /^([^\n]{3}[.]){3}[^\n]{3}$/g
or /^[^\n]{3}([.][^\n]{3}){3}$/g

13. Testing string length equal to 45
First 40 characters should consist of letters (both lowercase & uppercase) or even digits(2,4,6,8,0)
Last 5 characters should consists of odd digits(1,3,5,7,9) or white space characters(\r, \n, \t, \f :: Regex \s)

➧ w{3} : It will match the character w exactly 3 times.
➧ First 40 characters should consist of letters or even digits : /^[a-zA-Z24680]{40}/g or /^[a-z24680]{40}/gi
➧ Last 5 characters should consists of odd digits or white space characters : /[13579\s]{5}$/g
Regex: /^[a-zA-Z24680]{40}[13579\s]{5}$/g
or /^[a-z24680]{40}[13579\s]{5}$/gi

14. A string contains 8 digits in pairs, separated by one of the symbols "---", "-", "." or ":", with the same separator used throughout, such that:
✔ Valid Form: 12-34-56-78 or 12:34:56:78 or 12---34---56---78 or 12.34.56.78
✘ Invalid Form: 1-234-56-78 or 12-45.78:10

Regex:
/^(\d{2}:){3}\d{2}$|^(\d{2}[.]){3}\d{2}$|^(\d{2}-){3}\d{2}$|^(\d{2}-{3}){3}\d{2}$/g
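
As an alternative sketch (not the pattern given above), a backreference can enforce "same separator throughout": the first separator is captured in group 1 and \1 requires the identical text again. The name sameSeparator is illustrative.

```javascript
// Group 1 captures the first separator; \1 must repeat it exactly.
const sameSeparator = /^\d{2}(---|[-.:])\d{2}\1\d{2}\1\d{2}$/;

console.log(sameSeparator.test('12-34-56-78'));       // true
console.log(sameSeparator.test('12:34:56:78'));       // true
console.log(sameSeparator.test('12---34---56---78')); // true
console.log(sameSeparator.test('12.34.56.78'));       // true
console.log(sameSeparator.test('1-234-56-78'));       // false, wrong grouping of digits
console.log(sameSeparator.test('12-45.78:10'));       // false, mixed separators
```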

15. You have a test string S.
Your task is to write a regex that will match S, with the following condition(s):
— S consists of `tic` or `tac`.
— `tic` should not be an immediate neighbor of itself.
— The first `tic` must occur only when `tac` has appeared at least twice before.
✔ Valid: `tactactic`, `tactactictactic`
✘ Invalid: `tactactictactictictac`, `tactictac`

➧ The first `tic` comes only after two `tac`; after that, every `tic` must follow at least one `tac`.
➧ So the string starts with `tac`, and the first `tic` can appear only after a second `tac` // Regx: /^tactac(tic)?/
➧ Every later `tic` again follows a `tac`, so the rest of the string repeats `tac` with an optional `tic` // Regx: /(tac(tic)?)*/
➧ Note the parentheses: `(tic)?` makes the whole word optional, while `tic?` would make only the letter `c` optional.

Regex:
/^tac(tac(tic)?)*$/
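
A quick sketch to check the tic/tac rule against the sample strings. It also shows why the placement of ? matters: in `tic?` only the letter c is optional, so a pattern written that way would accept a truncated 'tactacti'. The variable names are illustrative.

```javascript
// (tic)? makes the whole word optional.
const ticTac = /^tac(tac(tic)?)*$/;

console.log(ticTac.test('tactactic'));             // true
console.log(ticTac.test('tactactictactic'));       // true
console.log(ticTac.test('tactactictactictictac')); // false, 'tic' next to 'tic'
console.log(ticTac.test('tactictac'));             // false, 'tic' after only one 'tac'

// tic? makes only the 'c' optional:
const onlyCOptional = /^tac(tac(tic?))*$/;
console.log(onlyCOptional.test('tactacti')); // true, which is not intended
console.log(ticTac.test('tactacti'));        // false
```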

16. You have string in format: “jobCreatedByUserFirstName”, “jobCreatedByUserEmail”, “jobCreatedByPartnerName”
- You have to capitalize the first letter.
- You have to add a space before every capital letter, except the first letter.

➧ Change the first letter to uppercase:
str.replace(/^[a-z]/g, x => x.toUpperCase());
➧ Add a space before every capital letter:
str.replace(/[A-Z]/g, function(x){return ' '+x;});
or
str.replace(/[A-Z]/g, " $&");
Ans:
str.replace(/^[a-z]/g, x => x.toUpperCase()).replace(/[A-Z]/g, " $&").trim();
Example:
▲ Input: ‘jobCreatedByUserEmail’;
▼ Output: ‘Job Created By User Email’

17. Email Validation:

➧ Start with one or more of a-z, A-Z, 0-9, _, - or . : [a-zA-Z0-9_\-.]+
➧ Then @, followed by one or more of a-z, A-Z or _ : @[a-zA-Z_]+
➧ Then one or more groups of . followed by letters : (\.[a-zA-Z]+)+

Regex: /^[a-z0-9_\-.]+@[a-z_]+(\.[a-z]+)+$/gi

18. Remove all punctuation (. , ; : — _ ? ! ( ) [ ] { } / \) and replace them with space and also remove extra spaces.

const puntRegx =/[\.\,\;\:\-\_\?\!\"\/\'\(\)\{\}\[\]\\]/g;
st = st.replace(puntRegx, ' '); // remove all punctuation
st = st.replace(/\s\s+/g, ' ').trim(); // remove multiple space
Example:
▲ Input: " ABCDEF , GHIJKLMNO PQRSTUVWXYZ 01234567890 $#% abcdefghijklmnopqrstuvwxyz \\ .,;&*^@:-_?!/\"'_(){}[] "
▼ Output: "ABCDEF GHIJKLMNO PQRSTUVWXYZ 01234567890 $#% abcdefghijklmnopqrstuvwxyz &*^@"
🚩 Use \" for double quotation marks (") input into the input string

19. Capitalized specific substring within a string:
- In your message, place a circumflex (^) on both sides of the text (like ^text^) to capitalize that substring.
Example:
Input: ^t^he ^quick^ brown fo^x^ ju^mps ove^r the little lazy ^dog^
Output: The QUICK brown foX juMPS OVEr the little lazy DOG

str = "^t^he ^quick^ brown fo^x^ ju^mps ove^r the little lazy ^dog^"
str.replace(/[\^](\w*\s*)+[\^]/g, x => x.toUpperCase()).replace(/[\^]/g, "").trim();
▼ Output: 'The QUICK brown foX juMPS OVEr the little lazy DOG'

20. Example of Lookahead and Lookbehind:
- Given an input string, find a pattern that matches "a letter followed by a digit".
- Given an input string, find a pattern that matches "a letter preceded by a digit".
Example:
- "He1lo W0r1d" ➨ e, W, r (letter followed by a digit)
- "He1lo W0r1d" ➨ l, r, d (letter preceded by a digit)

letter is followed by digit:
regular expression: /[a-z](?=[\d])/gi
'He1lo W0r1d'.match(/[a-z](?=\d)/gi)
▼ Output: ['e', 'W', 'r']
letter is preceded by a digit:
regular expression: /(?<=[\d])[a-z]/gi
'He1lo W0r1d'.match(/(?<=\d)[a-z]/gi)
▼ Output: ['l', 'r', 'd']

21. IPv4 & IPv6 validation:

For IPv4 validation we need to check the format:
[\d]{1,3}.[\d]{1,3}.[\d]{1,3}.[\d]{1,3}
For IPv6 validation we need to check the format:
[\da-f]{1,4}: (7 times) then [\da-f]{1,4}
Therefore,
regex_IPv4 = /^([\d]{1,3}\.){3}[\d]{1,3}$/;
regex_IPv6 = /^([0-9a-f]{1,4}\:){7}[0-9a-f]{1,4}$/i;
🚩 No 'g' flag here: a global regex remembers lastIndex between test() calls, so reusing it on several inputs would give wrong results.

For IPv4, every part must also be less than or equal to (2^8-1) = 255, i.e. ip.match(/[\d]+/g).every(ele => Number(ele) < 256) === true.
function ipValidation(ip) {
  if (regex_IPv4.test(ip)
    && ip.match(/[\d]+/g).every(ele => Number(ele) < 256)) {
    return 'IPv4';
  } else if (regex_IPv6.test(ip)) {
    return 'IPv6';
  } else {
    return 'Not IPv4 Neither IPv6';
  }
}
▼ Test
22.212.113.3164 Not IPv4 Neither IPv6
1050:1000:1000:a000:5:600:300c:326b IPv6
1051:1000:4000:abcd:5:600:300c:326b IPv6
22.231.113.64 IPv4
22.231.113.164 IPv4
222.231.113.64 IPv4
Normal Text Not IPv4 Neither IPv6
999.212.113.31 Not IPv4 Neither IPv6

22. Extract HTML tags:
- Your input data is HTML code; find all the opening tags.

Find the tag name and any attributes between '<' and '>':
regular expression: /<(\w+)(|\s+[^>]*)>/gim
Example:
▲ Input:
htmltxt = `<!DOCTYPE html>
<html lang="">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width,initial-scale=1.0">
<link rel="icon" href="<%= BASE_URL %>favicon.ico">
<title><%= htmlWebpackPlugin.options.title %></title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.0/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-KyZXEAg3QhqLMpG8r+8fhAXLRk2vvoC2f3B09zVXn8CA5QIVfZOJ3BCsw2P0p/We" crossorigin="anonymous">
</head>
<body>
<noscript>
<strong>Test String</strong>
</noscript>
<div id="app"></div>
<!-- built files will be auto injected -->
</body>
</html>
`
htmltxt.match(/<(\w+)(|\s+[^>]*)>/gim)
▼ Output:
"<html lang=\"\">"
"<head>"
"<meta charset=\"utf-8\">"
"<meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\">"
"<meta name=\"viewport\" content=\"width=device-width,initial-scale=1.0\">"
"<link rel=\"icon\" href=\"<%= BASE_URL %>"
"<title>"
"<link href=\"https://cdn.jsdelivr.net/npm/bootstrap@5.1.0/dist/css/bootstrap.min.css\" rel=\"stylesheet\" integrity=\"sha384-KyZXEAg3QhqLMpG8r+8fhAXLRk2vvoC2f3B09zVXn8CA5QIVfZOJ3BCsw2P0p/We\" crossorigin=\"anonymous\">"
"<body>"
"<noscript>"
"<strong>"
"<div id=\"app\">"

23. The input string must not consist only of special characters, but special characters together with alphanumeric characters are allowed.
✔ Valid: `abc#1`, `#tag`
✘ Invalid: `##`, `@!%`

➧ (?=.*[a-zA-Z0-9]) : lookahead that requires at least one alphanumeric character somewhere in the string
➧ .+ : then accept any characters

Regex: /^(?=.*[a-zA-Z0-9]).+$/

24. Write a regular expression that matches an HTML `a` tag only when it has no `href` attribute.
✔ Valid: `<a class="testClass">data</a>`
✘ Invalid: `<a class="testClass" href="https://cdn.jsdelivr.net">data</a>`
🚩 stack-overflow question

➧ <a : Match <a
➧ (?![^<>]*href): Negative lookahead that makes sure that there is no href ahead after 0 or more of any char that are not < and >
➧ [^>]*: Match 0 or more of any char that are not >
➧ >: Match >
➧ .*?: Match 0 or more of any characters
➧ <\/a>: Match </a>
Regex: `/<a (?![^<>]*href)[^>]*>.*?<\/a>/gm`

25. Formatting contact number.
2124567890
212-456-7890
(212)456-7890
(212)-456-7890
212.456.7890
212 456 7890
+12124567890
+1 212 456 7890
+1 212.456.7890
+212-456-7890
1-212-456-7890
+1 (2124) (567-890)

st = '2124567890';
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '$1-$2-$3');
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '($1)$2-$3');
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '($1)-$2-$3');
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '$1.$2.$3');
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '$1 $2 $3');
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '+1$1$2$3');
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '+1 $1 $2 $3');
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '+1 $1.$2.$3');
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '+$1-$2-$3');
st.replace(/^(\d{3})(\d{3})(\d{4}).*/, '+1-$1-$2-$3');
st.replace(/^(\d{4})(\d{3})(\d{3}).*/, '+1 ($1) ($2-$3)');

26. Multiple Case change of a Paragraph
Given a paragraph, change it to each of the following cases:
- CHANGE TO UPPERCASE
- change to lowercase
- Change to sentence case
- Change To Capitalize Each Word
- cHANGE tO tOGGLE cASE
- changeToCamelCase

// UPPER CASE
st.replace(/[a-z]/g, x => x.toUpperCase())
// change all lower case letters to Upper case

// lower case
st.replace(/[A-Z]/g, x => x.toLowerCase())
// Change all upper case letters to lower case

// Sentence case
st.replace(/[A-Z]/g, x => x.toLowerCase())
.replace(/(^|(?<=\.\s))[a-z]/g, x => x.toUpperCase())
or
st.replace(/(?!(\.\s))[A-Z]/g, x => x.toLowerCase())
.replace(/(^|(?<=\.\s))[a-z]/g, x => x.toUpperCase())
// First change all uppercase letters to lowercase,
// then change the first lowercase letter of each sentence to uppercase.

// Capitalize Each Word
st.replace(/[A-Z]/g, x => x.toLowerCase())
.replace(/(^|(?<=(\.\s|\s)))[a-z]/g, x => x.toUpperCase())
// First change all uppercase letters to lowercase,
// then change the first letter of each word to uppercase.

// tOGGLE cASE
st.replace(/[a-z]/g, x => x.toUpperCase())
.replace(/(^|(?<=(\.\s|\s)))[A-Z]/g, x => x.toLowerCase())
// First change all lowercase letters to uppercase,
// then change the first letter of each word to lowercase.

// camelCase
st.replace(/[A-Z]/g, x => x.toLowerCase())
.replace(/(?<=(\.\s|\s))[a-z]/g, x => x.toUpperCase())
.replace(/\s/g, '');
// First change everything to lowercase,
// then change the first letter of each word, except the first word, to uppercase,
// then remove the spaces.

27. Uses of Lookahead and Lookbehind:
— Find repeated letters in the string
— Find non-repeated letters in the string

// Check repeated letters
regex_r1 = new RegExp('(.)(?=\\1)', 'gi');
regex_r2 = new RegExp('(.)\\1', 'gi'); // in a string, the backslash itself must be escaped
regex_r3 = /([a-z])\1{1,}/gi; // only letters

// Check non-repeated letters: /(.)(?!\1)/gi
regex_nr1 = new RegExp('(.)(?!\\1)', 'gi');
regex_nr2 = /([a-z])(?!\1)/gi; // only letters


Examples:
'gooohello'.match(/(.)(?!\1)/gi) // letters not followed by the same letter
// ['g', 'o', 'h', 'e', 'l', 'o']: (g)oo(o)(h)(e)l(l)(o)

'gooohello'.match(/(.)(?=\1)/gi) // letters followed by the same letter
// ['o', 'o', 'l']: g(o)(o)ohe(l)lo

'gooohellowwww'.match(/(.)(?=\1{2,})/gi) // letters followed by the same letter at least twice
// ['o', 'w', 'w']: g(o)oohello(w)(w)ww
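
The backreference \1 can also consume the repeats instead of only looking ahead at them. A small sketch, with illustrative variable names, that lists each run of repeated characters and then collapses the runs:

```javascript
// (.)\1+ matches a character followed by one or more copies of itself.
const runs = Array.from('gooohellowwww'.matchAll(/(.)\1+/g), (m) => m[0]);
console.log(runs); // ['ooo', 'll', 'wwww']

// Replacing each run with its captured character removes the repeats.
const collapsed = 'gooohello'.replace(/(.)\1+/g, '$1');
console.log(collapsed); // 'gohelo'
```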

28. Identify single-line/multiline comments in C, C++ and Java programs
HackerRank Problem:
https://www.hackerrank.com/challenges/ide-identifying-comments/problem?isFullScreen=true
Singleline Comments:
// this is a single line comment
x = 1; // a single line comment after code

Multiline Comments:
/* This is one way of writing comments */
/* This is a multiline
comment. These can often
be useful*/

/** Another multiline
comment
**/

// \S: non-whitespace characters
// \s: whitespace characters: [\r, \n, \t, \f, space]

// Single-line comments check
// Pattern: // Your Comments
Regex: /(\/){2,}([\S ]+)/gim

// Multiline comments check
// Pattern: /* Your Comments */ or /** Your Comments **/
Regex: /(\/)(\*)+([\S\s]+)(\*)+(\/)/gim

---------------------------------
Solution:
---------------------------------
function main(inputLines) { // inputLines: array holding the lines of the program text
  const regex_singleLineComments = new RegExp("(\/){2,}([\\S ]+)", 'gi');
  const regex_multiLineComments_start = new RegExp("(\/)(\\*)+([\\S\\s]+)", "gi");
  const regex_multiLineComments_end = new RegExp("([\\S\\s]+)(\\*)+(\/)", "gi");

  let isMultiLineComnts = false;

  for (let i = 0; i < inputLines.length; i++) {
    // Check whether the line starts a multiline comment (/* or /**)
    let arr = inputLines[i].match(regex_multiLineComments_start);
    if (arr && !isMultiLineComnts) {
      // Check whether the multiline comment also ends on the same line (/** ... **/)
      isMultiLineComnts = /(\*\/)$/gi.test(inputLines[i].trim()) ? false : true;
    } else if (!arr && isMultiLineComnts) {
      arr = inputLines[i].match(regex_multiLineComments_end);
      if (!arr) {
        // A line in the middle of a multiline comment
        arr = [inputLines[i]];
      } else {
        isMultiLineComnts = false;
      }
    } else {
      arr = inputLines[i].match(regex_singleLineComments);
    }

    // console.log(i, isMultiLineComnts, arr ? arr.toString() : '');
    if (arr) {
      console.log(arr.toString().trim());
    }
  }
}

29. Uses of negative Lookahead and Lookbehind:
- Find the substrings between a special character (#)
✔ test string: `Puzzl#ing wi#th Reg#ular Expre#ssion`

const st = 'Puzzl#ing wi#th Reg#ular Expre#ssion';

// regex: (?<!Y)X : X only if not preceded by Y ➡ Negative Lookbehind
regex1 = /(?<![^#])[^#]+/g;
st.match(regex1);
// ['Puzzl', 'ing wi', 'th Reg', 'ular Expre', 'ssion']

// alternative

// regex: X(?!Y) : X only if not followed by Y ➡ Negative Lookahead
regex2 = /[^#]+(?![^#])/g;
st.match(regex2);
// ['Puzzl', 'ing wi', 'th Reg', 'ular Expre', 'ssion']

If you have any questions about regex, post them in the comments. I'll try my best to answer.


Biswasindhu Mandal

I am a Software Engineer, living in Kolkata, India. I love to work as a Javascript, Node.js Full Stack developer and OTT Frontend Developer.