Concept : regular expressions (regexp)
Regular expressions: one of the most interesting, confusing, complex yet easy topics. They make comparisons easier by compressing loops and a bunch of comparisons into single-line statements. A regular expression may leave you wondering at first sight, but once you know what it means, it's all a piece of cake.
Regular expressions, or regexp for short, are a compact way of defining a pattern.
Whether you want to compare passed arguments or check the IDs of DOM elements, regular expressions provide the functionality of compare statements with the convenience of short code.
Although a statement like
may look terrifying, it simply describes a string that begins with a letter occurring any number of times, followed by the digit 5, followed by a back reference (more on it later) to the character that led to the comparison being true.
OK, forget the alien statement from above.
Some regular stuff about regular expressions. A regexp can be written in two ways:
- With the RegExp constructor, as new RegExp("…").
- Inside forward slash delimiters, as /…/.
The first method is used whenever the pattern to be matched is unknown at the time of coding. The pattern can then be passed in as an argument, for example the result of an expression or a condition.
The second method is used when we already know the pattern.
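To make the two methods concrete, here is a minimal sketch. The helper name containsWord and the sample strings are made up for illustration only.

```javascript
// Literal form: the pattern is known while writing the code.
const literal = /cat/;

// Constructor form: the pattern only arrives at run time, e.g. as an argument.
// `containsWord` is a hypothetical helper, not part of any library.
function containsWord(text, word) {
  const dynamic = new RegExp(word);
  return dynamic.test(text);
}

console.log(literal.test("concatenate"));        // true
console.log(containsWord("hot dog", "dog"));     // true
console.log(containsWord("concatenate", "dog")); // false
```

Keep in mind that the constructor treats its argument as a pattern, so characters like . or * in the passed string keep their special meaning.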
Regexp cheat sheet
1. /string/ or new RegExp("string")
Checks whether the given string occurs in the input or not.
2. /[xyz]/ or /[a-z]/
Matches any one of the listed characters, or any one character in the range given by the dash operator.
New symbols : 1. [ ] The brackets are called set operators and are used to define a set of characters. For example, if you want to check for the occurrence of vowels, the set will be [aeiou].
2. ‘-’ or the dash operator
Used to define a range of characters. Such expressions usually consist of a left (lower) bound and a right (upper) bound. The complete expression would look like
Ex: for letters
A-Z or a-z
Some extra stuff about ‘-’
- If you leave the upper bound empty, as in [0-], no range is formed: the dash is treated as a literal character, so the set matches either 0 or -.
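A quick sketch of sets and ranges in JavaScript, including what happens when the dash has no upper bound. The sample strings are arbitrary.

```javascript
const vowel = /[aeiou]/;   // any one character from the set
const lower = /[a-z]/;     // any one character in the range a..z
const zeroOrDash = /[0-]/; // no upper bound: matches "0" or a literal "-"

console.log(vowel.test("sky"));      // false, no vowel anywhere
console.log(lower.test("ABc"));      // true, "c" is in the range
console.log(zeroOrDash.test("7"));   // false, 7 is not covered
console.log(zeroOrDash.test("a-b")); // true, the dash itself matches
```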
The xor lookalikes
The ^ symbol is used in two ways.
If it is the first character inside a set, as in case 4, it matches every character except the ones in the set. Otherwise, when placed before a simple string or range, it checks that the string begins with the given characters.
Matches when the string begins with xyz.
Matches any character other than the ones provided in the brackets.
What will be the result for the given expression?
Ans: You guessed it right: the regular expression matches when the string begins with any letter. Why so? The parentheses saved the day.
Matches when the string ends in the given characters.
Matches when the string consists of only the given characters.
$ is used to check that the string ends in the given characters.
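The anchor and negation cases can be tried out as below. The characters xyz are just sample characters, and the variable names are made up for this sketch.

```javascript
const startsWithXyz = /^xyz/;  // ^ outside a set: string must begin with xyz
const notXyz = /[^xyz]/;       // ^ inside a set: any character except x, y, z
const endsWithXyz = /xyz$/;    // $: string must end with xyz
const exactlyXyz = /^xyz$/;    // both anchors: the whole string is xyz

console.log(startsWithXyz.test("xyzabc")); // true
console.log(startsWithXyz.test("abcxyz")); // false
console.log(notXyz.test("xyz"));           // false, only x, y, z present
console.log(notXyz.test("xyza"));          // true, "a" is outside the set
console.log(endsWithXyz.test("abcxyz"));   // true
console.log(exactlyXyz.test("xyzxyz"));    // false, extra characters
```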
7. /\\/
The backslash serves as the escape character, so here the string is searched for a literal backslash.
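A small sketch of escaping, assuming a Windows-style path as the sample input. Note that the constructor form needs the backslash escaped twice: once for the string and once for the pattern.

```javascript
const hasBackslash = /\\/;                 // literal form: one escaped backslash
const sameFromString = new RegExp("\\\\"); // string form: escaped twice

const path = "C:\\temp"; // the actual text is C:\temp

console.log(hasBackslash.test(path));      // true
console.log(sameFromString.test(path));    // true
console.log(hasBackslash.test("C:/temp")); // false
```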
i matches the given string in a case-insensitive manner.
g searches for occurrences of the given string globally, i.e., all occurrences rather than just the first one.
m enables multi-line matching: ^ and $ then match at the start and end of every line rather than only at the start and end of the whole string.
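Here is how the three flags behave on a small three-line sample string (the string itself is made up):

```javascript
const text = "Cat\ncat\nCAT";

// No flags: case sensitive, first occurrence only.
const first = text.match(/cat/);

// i + g: ignore case and collect every occurrence.
const all = text.match(/cat/gi);

// m: ^ and $ now work per line, so each line is matched on its own.
const perLine = text.match(/^cat$/gim);

// Without m the anchors refer to the whole string, so nothing matches.
const wholeOnly = text.match(/^cat$/gi);

console.log(first[0]);  // "cat"
console.log(all);       // [ 'Cat', 'cat', 'CAT' ]
console.log(perLine);   // [ 'Cat', 'cat', 'CAT' ]
console.log(wholeOnly); // null
```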
Multiple occurrences and substrings
? When placed after a character or a group, it makes the preceding character or group optional, thus checking for a single occurrence or no occurrence of it.
For the above example, the expression checks for both abcd and d.
‘+’ checks for one or more occurrences of the preceding character or group.
In the above example it checks for abc, abcabc, abcabcabc, etc.
‘*’ checks for zero or more occurrences of the preceding character or group.
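The three operators side by side, using the abc group from the text. The ^ and $ anchors are added here only so that each test looks at the whole string.

```javascript
const optional = /^(abc)?d$/;  // abc may appear once or not at all before d
const oneOrMore = /^(abc)+$/;  // abc, abcabc, abcabcabc, ...
const zeroOrMore = /^(abc)*$/; // same as above, plus the empty string

console.log(optional.test("abcd"));      // true
console.log(optional.test("d"));         // true
console.log(optional.test("abd"));       // false
console.log(oneOrMore.test("abcabc"));   // true
console.log(oneOrMore.test(""));         // false
console.log(zeroOrMore.test(""));        // true
```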
We can also define the range of occurrences within which a pattern must exist to be matched. Suppose a text is passed through a filter that counts the vowels in it, and the text is accepted only if the number of vowels is in the range [10–30].
The expression would be
So basically, to define a range for the number of occurrences, put it in braces after the pattern, as {min,max}.
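A sketch of the braces in action. The helper vowelCountPattern is hypothetical and is only one possible way to write the vowel filter: it assumes the vowels may be scattered through the text, so it allows non-vowels between them. A plain [aeiou]{10,30} would instead look for 10 to 30 vowels in a row.

```javascript
// Simple case: between 2 and 4 repetitions of "a", nothing else.
const twoToFour = /^a{2,4}$/;

// Hypothetical helper: accepts text whose total vowel count is in [min, max].
// Each repetition is one vowel followed by any run of non-vowels.
function vowelCountPattern(min, max) {
  return new RegExp(`^[^aeiou]*(?:[aeiou][^aeiou]*){${min},${max}}$`, "i");
}

const twoOrThreeVowels = vowelCountPattern(2, 3);

console.log(twoToFour.test("aaa"));                // true
console.log(twoToFour.test("aaaaa"));              // false
console.log(twoOrThreeVowels.test("sky"));         // false, 0 vowels
console.log(twoOrThreeVowels.test("banana"));      // true, 3 vowels
console.log(twoOrThreeVowels.test("education"));   // false, 5 vowels
```

For the filter described in the text you would call vowelCountPattern(10, 30).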
Some escaped characters
Referring back with back references.
The very first expression of this text
might be looking somewhat less alien and much more familiar now, but the last part might still confuse you. Well, don't be confused: a backslash followed by a number is a back reference.
Back references are very interesting and useful for a lot of tasks.
Suppose you're checking for a palindrome string. The basic condition for a string to be a palindrome is that the last character must be the same as the first.
If the condition fails, the string is not a palindrome.
This check can be done without compare statements by using back references in a regexp.
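Now, you can define a regular expression that checks the starting element of a string as /^([a-z]|[A-Z]|[0-9])/. The parentheses keep the ^ applied to all three alternatives, and they also capture the character that matched.
Oh yeah, I forgot to mention the or operator |. It checks for either the left or the right element.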
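One thing worth trying with the or operator is where the ^ ends up. The or operator binds more loosely than everything else, so without parentheses the ^ belongs to the first alternative only. A small sketch with made-up sample strings:

```javascript
const loose = /^[a-z]|[A-Z]|[0-9]/;     // ^ applies to [a-z] only
const grouped = /^([a-z]|[A-Z]|[0-9])/; // ^ applies to the whole group
const merged = /^[a-zA-Z0-9]/;          // same check as one set, no capture

console.log(loose.test("#A"));   // true, "A" is found in the middle
console.log(grouped.test("#A")); // false, the string starts with "#"
console.log(grouped.test("A#")); // true
console.log(merged.test("9x"));  // true
```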
Now comes the back reference part. Suppose the first condition holds, i.e., the start of the string matched. You can't tell which character made the regexp match, so you can't use it to compare against the last character. That is where the back reference comes in: put a backslash followed by a number that says which captured group, counted by its opening parenthesis, you're referring to.
For example, to refer to the first captured element that made the regexp true, write \1.
So to match the first and last character, the expression is
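One way such an expression could look is sketched below. This is an assumption about the intended pattern, built from the character sets used above, and the name sameEnds is made up. It only checks the first and last character, so it is a necessary test for a palindrome, not a complete one.

```javascript
// ( ... ) captures the first character, .* skips the middle,
// and \1 demands that same character again right before the end.
const sameEnds = /^([a-zA-Z0-9]).*\1$/;

console.log(sameEnds.test("level")); // true
console.log(sameEnds.test("hello")); // false
console.log(sameEnds.test("abca"));  // true, although not a palindrome

// The captured character is available as group 1 of the match.
console.log("level".match(sameEnds)[1]); // "l"
```

Note that this sketch needs at least two characters, so a one-letter string does not match.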
Back references can be confusing, so do some research on the topic yourselves.
Thanks again to