Intro to Regex in Python

Yu-Ting Lee
Published in Analytics Vidhya
Jun 26, 2021

Created: Yu-Ting Lee, Quert
Tags: python, regex

  • re.match: matches the pattern only at the beginning of the string and returns a match object, or None if there is no match.
  • re.match(pattern, string, flags=re.M)
    The re.M (re.MULTILINE) flag makes ^ and $ match at the start and end of every line. re.match itself still only matches at the beginning of the string.
  • re.findall: finds all non-overlapping matches of the pattern in a string and returns them as a list.
  • re.search: scans the string and returns a match object for the first match, or None.
  • re.sub(r'old', 'new', string): replaces every match of the pattern in the string.
  • re.split: splits a string at specific characters or at matches of a regex.
  • re.compile(): builds a reusable pattern object, which can then be used as pattern.findall(texts).
  • Quantifiers
    {n}: exactly n times
    {n,m}: at least n times and at most m times (no space after the comma)
  • The match objects returned by re.search() and re.match() have start() and end() methods that give the indices of the match.
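
To make the list above concrete, here is a minimal sketch that runs each function on one short string. The sample text and variable names are made up for illustration.

```python
import re

text = "cat bat rat"

# re.match only succeeds if the pattern matches at the start of the string.
first_word = re.match(r"\w+", text).group()
no_match = re.match(r"bat", text)  # None, because "bat" is not at index 0

# re.search scans the whole string and returns the first match.
found = re.search(r"bat", text)
span = (found.start(), found.end())

# re.findall returns every non-overlapping match as a list of strings.
all_words = re.findall(r"\w+", text)

# re.sub replaces matches; re.split cuts the string at matches.
replaced = re.sub(r"[bc]at", "hat", text)
pieces = re.split(r"\s+", text)

# re.compile builds a reusable pattern; {3} means exactly three times.
pattern = re.compile(r"\w{3}")
compiled_hits = pattern.findall(text)

print(first_word, no_match, span, all_words, replaced, pieces, compiled_hits)
```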

  • \d: digit
  • \D: non-digit
  • \w: word character (letter, digit, or underscore)
  • \W: non-word character
  • \b: word boundary
  • \s: whitespace, including space, \t, \n, \r, \f and \v
  • \S: non-whitespace
  • +: one or more times
  • *: zero or more times
  • ?: zero or one time
  • .: any single character except a newline
  • ^: start of the string
  • $: end of the string
  • .+: one or more of any character
  • |: "or"
  • \/: matches a literal "/" (in Python an unescaped / works as well)
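
A few of these symbols in action, as a rough sketch. The sample string is invented for the example.

```python
import re

sample = "Order 66 shipped on 2021-06-26 to a/b"

# \d+ grabs runs of digits, \S+ grabs runs of non-whitespace.
digit_runs = re.findall(r"\d+", sample)
tokens = re.findall(r"\S+", sample)

# \b keeps "on" from matching inside a longer word such as "onto".
whole_on = re.findall(r"\bon\b", "on and on, onto")

# A plain "/" needs no escaping in Python.
slash_pair = re.findall(r"\w/\w", sample)

# | picks either alternative.
pets = re.findall(r"cat|dog", "cat, dog, cow")

print(digit_runs, tokens, whole_on, slash_pair, pets)
```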

Regex

Code example for stripping hashtags from a text file
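
The original snippet is not available here, so what follows is only a sketch of one way to do it. It assumes a hashtag is "#" followed by word characters, and it uses a short list of strings in place of the lines read from a file. The helper name strip_hashtags is hypothetical.

```python
import re

# Assumption: a hashtag is "#" followed by one or more word characters.
HASHTAG = re.compile(r"#\w+")

def strip_hashtags(line):
    """Remove hashtags, then collapse the whitespace left behind."""
    without_tags = HASHTAG.sub("", line)
    return re.sub(r"\s+", " ", without_tags).strip()

# Stand-in for the lines of a text file (e.g. from open(path).readlines()).
lines = ["Learning #python today", "#regex is fun #coding"]
cleaned = [strip_hashtags(line) for line in lines]

print(cleaned)
```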

Usage code for .
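
A small sketch of the dot, using a made-up string. By default the dot does not match a newline; the re.DOTALL flag changes that.

```python
import re

dot_text = "cat cot c\nt cut"

# "." matches any single character except a newline.
dot_hits = re.findall(r"c.t", dot_text)

# With re.DOTALL the dot also matches the newline.
dot_hits_all = re.findall(r"c.t", dot_text, flags=re.DOTALL)

# ".+" takes one or more of any character.
rest = re.search(r"a.+", "banana").group()

print(dot_hits, dot_hits_all, rest)
```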

Usage code for ^ and $
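
A short sketch of the two anchors, with and without re.M. The two-line string and the helper is_all_digits are invented for the example.

```python
import re

two_lines = "first line\nsecond line"

# Without re.M, ^ and $ only anchor to the start and end of the string.
start_default = re.findall(r"^\w+", two_lines)
end_default = re.findall(r"\w+$", two_lines)

# With re.M they anchor to the start and end of every line.
start_multiline = re.findall(r"^\w+", two_lines, flags=re.M)
end_multiline = re.findall(r"\w+$", two_lines, flags=re.M)

def is_all_digits(s):
    """Use ^ and $ together so the pattern must cover the whole string."""
    return re.search(r"^\d+$", s) is not None

print(start_default, end_default, start_multiline, end_multiline)
```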

OR operator |, character class [], and negation ^

[^...] negates the character class: it matches any single character that is not listed inside the brackets.
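
A quick sketch comparing the three, using made-up strings.

```python
import re

colors = "gray grey groy"

# A character class [ae] accepts exactly one of the listed characters.
with_class = re.findall(r"gr[ae]y", colors)

# The | operator chooses between whole alternatives.
with_or = re.findall(r"gray|grey", colors)

# Inside brackets, a leading ^ negates: anything but a vowel or whitespace.
consonants = re.findall(r"[^aeiou\s]", "hi you")

# A range inside brackets: any character from a to c.
a_to_c = re.findall(r"[a-c]", "abcdef")

print(with_class, with_or, consonants, a_to_c)
```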

Group Characters for further processing

Use () to group part of a pattern and capture what it matches.

Non-capturing group: (?:...), for example (?:a|b)
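
The difference shows up clearly with re.findall, which returns the captured group when there is one. A minimal sketch with an invented string:

```python
import re

grouped_text = "ababab cd ab"

# With a capturing group, findall returns what the group captured
# (for a repeated group, its last repetition).
captured = re.findall(r"(ab)+", grouped_text)

# With a non-capturing group, findall returns the whole match.
whole = re.findall(r"(?:ab)+", grouped_text)

print(captured, whole)
```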

Numbered Group
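
Groups are numbered from 1 by the position of their opening parenthesis, and group 0 is the whole match. A small sketch, assuming a date-like string as input:

```python
import re

date_match = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Published 2021-06-26")

whole_date = date_match.group(0)   # the entire match
year = date_match.group(1)         # first pair of parentheses
month_day = date_match.group(2, 3) # several groups at once, as a tuple
all_groups = date_match.groups()   # every group, as a tuple

print(whole_date, year, month_day, all_groups)
```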

Named Group

(?P<name>regex)
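
A minimal sketch of named groups. The group names year and month are chosen for the example.

```python
import re

named = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Published 2021-06-26")

# Named groups can be read by name, and they still have a number.
year_by_name = named.group("year")
year_by_number = named.group(1)

# groupdict() returns all named groups as a dictionary.
as_dict = named.groupdict()

print(year_by_name, year_by_number, as_dict)
```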

Select group number in regex
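
Assuming this heading refers to backreferences, where \1, \2 and so on point back to a numbered group, a sketch could look like this. The sample sentences are invented.

```python
import re

doubled = "this is is a test"

# \1 inside the pattern must repeat whatever group 1 captured.
repeated_word = re.search(r"\b(\w+) \1\b", doubled).group()

# \1 in the replacement string inserts what group 1 captured.
deduplicated = re.sub(r"\b(\w+) \1\b", r"\1", doubled)

# Groups can also be reordered in the replacement.
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@example")

print(repeated_word, deduplicated, swapped)
```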

Look behind & Look Ahead

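Positive lookahead (?=...) matches the first part of the expression only if it is followed by the lookahead expression. The lookahead text itself is not part of the match.

Positive lookbehind (?<=...) matches only where the match is preceded by the specified pattern. The preceding text itself is not part of the match.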
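
A short sketch of both, on a made-up price list. Note that neither the "$" nor the space ends up in the results.

```python
import re

prices = "apple $5, pear $12, plum 7"

# Lookbehind: digits that are preceded by a dollar sign.
dollar_amounts = re.findall(r"(?<=\$)\d+", prices)

# Lookahead: words that are followed by a space and a dollar sign.
priced_items = re.findall(r"\w+(?= \$)", prices)

print(dollar_amounts, priced_items)
```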

Type Conversion for f-string

  • !s: calls str() on the value
  • !r: calls repr() on the value, giving a printable representation (i.e. with quotes for strings)
  • !a: calls ascii() on the value, like repr() but with non-ASCII characters escaped
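
A minimal sketch of the three conversions on one string with a non-ASCII character. The variable name is arbitrary.

```python
# "caf\u00e9" is "café"; written with an escape so the source stays ASCII.
name = "caf\u00e9"

as_str = f"{name!s}"    # same as str(name)
as_repr = f"{name!r}"   # same as repr(name), quotes included
as_ascii = f"{name!a}"  # same as ascii(name), non-ASCII escaped

print(as_str, as_repr, as_ascii)
```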

Format specifiers

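  • :e scientific notation
  • :d integer in decimal (a precision such as :.2d is not allowed)
  • :.Nf fixed-point float with N digits after the decimal point, e.g. :.2f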
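
A few format specifiers inside f-strings, as a quick sketch with arbitrary numbers.

```python
value = 1234.5678
count = 42

scientific = f"{value:e}"          # default precision of 6 digits
scientific_short = f"{value:.2e}"  # 2 digits after the decimal point
fixed = f"{value:.2f}"             # fixed-point, rounded to 2 digits
integer = f"{count:d}"             # plain decimal integer
padded = f"{count:5d}"             # right-aligned in a field of width 5

print(scientific, scientific_short, fixed, integer, padded)
```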

Greedy and non-greedy matching
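
Quantifiers such as + and * are greedy: they take as much as they can. Adding ? after them makes them non-greedy, so they take as little as possible. A small sketch on an invented HTML-like string:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the string.
greedy = re.findall(r"<.+>", html)

# Non-greedy: .+? stops at the first ">" it can.
non_greedy = re.findall(r"<.+?>", html)

print(greedy)
print(non_greedy)
```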

Other

Find substring
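
The original snippet is not available here, so this is just one possible sketch: locating a substring with re.search and re.finditer, next to the plain string methods that do the same job without regex. The sentence is invented.

```python
import re

sentence = "regex makes text search easier"

# re.search gives the position of the first occurrence.
hit = re.search(r"text", sentence)
position = (hit.start(), hit.end())

# re.finditer yields a match object for every occurrence.
e_positions = [m.start() for m in re.finditer(r"e", "regex")]

# For a fixed substring, plain string methods are often enough.
contains = "text" in sentence
index = sentence.find("text")

print(position, e_positions, contains, index)
```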
