Any character
$REGEXP = .
Result: ALL

Any character in the []
$REGEXP = [v5]
Result: vince1234, 5566

Any character not in the [] (note that ^ has a different meaning inside [])
$REGEXP = [^v]ince
Result: Vince1234, Vince
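The three patterns above can be checked with a short Python sketch. The sample lines are an assumption inferred from the results quoted in this post, and `matching_lines` is a hypothetical helper written for illustration, not a library function.

```python
import re

# Assumed sample lines, inferred from the results quoted in the text.
LINES = ["vince1234", "Vince1234", "Vince", "5566", "Gi!"]

def matching_lines(pattern, lines=LINES):
    """Return every line containing at least one match (hypothetical helper)."""
    return [line for line in lines if re.search(pattern, line)]

any_char = matching_lines(r".")           # every non-empty line matches
in_set = matching_lines(r"[v5]")          # lines containing 'v' or '5'
not_in_set = matching_lines(r"[^v]ince")  # 'ince' preceded by any character except 'v'

print(any_char, in_set, not_in_set, sep="\n")
```

`re.search` is used rather than `re.match` because it looks for the pattern anywhere in the line, which is how grep behaves.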
Anchors Position (Head/End) ^$
Start with V
$REGEXP = ^V
Result: Vince1234, Vince

End with e
$REGEXP = e$
Result: Vince

i followed by a word boundary, i.e. at the end of a word (punctuation marks do not count as part of the word)
$REGEXP = i\b
Result: Gi!

i not followed by a word boundary, i.e. not at the end of a word
$REGEXP = i\B
Result: vince1234, Vince1234, Vince
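A minimal Python sketch of the anchor examples, assuming the same sample lines as the results listed above (the data set and the `matching_lines` helper are illustrative assumptions):

```python
import re

# Assumed sample lines, inferred from the results quoted in the text.
LINES = ["vince1234", "Vince1234", "Vince", "5566", "Gi!"]

def matching_lines(pattern, lines=LINES):
    """Return every line containing at least one match (hypothetical helper)."""
    return [line for line in lines if re.search(pattern, line)]

starts_with_v = matching_lines(r"^V")     # 'V' at the start of the line
ends_with_e = matching_lines(r"e$")       # 'e' at the end of the line
i_at_boundary = matching_lines(r"i\b")    # 'i' followed by a word boundary
i_inside_word = matching_lines(r"i\B")    # 'i' followed by another word character

print(starts_with_v, ends_with_e, i_at_boundary, i_inside_word, sep="\n")
```

The patterns are written as raw strings (`r"..."`) so that Python does not interpret `\b` as a backspace character before the regex engine sees it.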
Count []{} * +
Character in the [] repeats 2 times
$REGEXP = [5]{2}
Result: 5566
Note: {1,2} means repeat 1 to 2 times

Repeat 0~n times
$REGEXP = [5]*
Result: ALL

Repeat 1~n times
$REGEXP = [5]+
Result: 5566
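The counting operators can be tried out in the same way. This is a sketch under the assumption that the sample lines are the ones implied by the results above; `matching_lines` is a hypothetical helper.

```python
import re

# Assumed sample lines, inferred from the results quoted in the text.
LINES = ["vince1234", "Vince1234", "Vince", "5566", "Gi!"]

def matching_lines(pattern, lines=LINES):
    """Return every line containing at least one match (hypothetical helper)."""
    return [line for line in lines if re.search(pattern, line)]

exactly_two = matching_lines(r"[5]{2}")   # two consecutive '5' characters
zero_or_more = matching_lines(r"[5]*")    # also matches the empty string, so every line
one_or_more = matching_lines(r"[5]+")     # at least one '5'

# {1,2} takes two characters when it can and one otherwise.
one_to_two = re.findall(r"5{1,2}", "5556")

print(exactly_two, zero_or_more, one_or_more, one_to_two, sep="\n")
```

`[5]*` selects every line because zero repetitions is a valid match, which is why `*` on its own is rarely useful as a filter.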
Numbers or letters
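\d = [0-9]
\D = [^0-9]
\w = [a-zA-Z0-9_]
\W = [^a-zA-Z0-9_]
\s = [ \t\n]
\S = [^ \t\n]

Any character that isn't a number
$REGEXP = \D
Result: ALL except 5566

String without number
$REGEXP = ^\D+$ or ^[^0-9]+$
Result: Vince, Gi!

Number only
$REGEXP = ^\d+$ or ^[0-9]+$
Result: 5566
Note: ^[^a-zA-Z]+$ gives the same result on this data, but it only excludes letters, so it would also accept a line made of punctuation.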
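A short Python sketch of the shorthand classes, again assuming the sample lines implied by the results (the data and the `matching_lines` helper are assumptions for illustration). It also shows why a "no letters" pattern is looser than a "digits only" pattern.

```python
import re

# Assumed sample lines, inferred from the results quoted in the text.
LINES = ["vince1234", "Vince1234", "Vince", "5566", "Gi!"]

def matching_lines(pattern, lines=LINES):
    """Return every line containing at least one match (hypothetical helper)."""
    return [line for line in lines if re.search(pattern, line)]

has_non_digit = matching_lines(r"\D")     # at least one non-digit character
no_digits = matching_lines(r"^\D+$")      # the whole line is non-digits
digits_only = matching_lines(r"^\d+$")    # the whole line is digits

# A "no letters" pattern accepts punctuation, a "digits only" pattern does not.
no_letters_accepts_bang = re.search(r"^[^a-zA-Z]+$", "!!") is not None
digits_accepts_bang = re.search(r"^\d+$", "!!") is not None

print(has_non_digit, no_digits, digits_only, sep="\n")
print(no_letters_accepts_bang, digits_accepts_bang)
```

In Python 3, `\d` on a `str` pattern also matches non-ASCII digits unless `re.ASCII` is passed. That makes no difference for this ASCII-only data.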
Match text
with grep
grep -E REGEXP -r ./
grep -E REGEXP filename
with perl
perl -ne 'print if m/REGEXP/' < filename
-n: assume "while (<>) { ... }" loop around program
-e: one line of program
with vim
/REGEXP
with python
re.findall: Return all non-overlapping matches of pattern in string
>>> import re
>>> re.findall(r'^1.*4$', '1454654564')
['1454654564']
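`re.findall` returns the matched substrings, whereas the grep and perl one-liners print whole matching lines. One way to get the line-filtering behaviour in Python is sketched below; the function name `grep_lines` and the sample list are hypothetical.

```python
import re

def grep_lines(pattern, lines):
    """Keep the lines in which the pattern matches somewhere (hypothetical helper)."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Illustrative input, not taken from the original post.
sample = ["1454654564", "2454654564", "14"]

whole_lines = grep_lines(r"^1.*4$", sample)

# "Non-overlapping" means each character is consumed by at most one match.
pairs = re.findall(r"\d{2}", "1454654564")

print(whole_lines)
print(pairs)
```

Compiling the pattern once with `re.compile` is optional, but it keeps the loop tidy when the same pattern is applied to many lines.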
Replace
with sed
sed -E 's/before/after/'
with vim
:1,$s/before/after/gic
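For comparison, a minimal Python sketch of the same substitutions using `re.sub`. The sed command without a `g` flag replaces only the first match on each line. In the vim command, `g` replaces every match on a line, `i` ignores case, and `c` asks for confirmation. The sample string is made up for illustration, and the interactive `c` flag has no counterpart in `re.sub`.

```python
import re

# Illustrative input, not taken from the original post.
line = "before, Before, and before again"

# Like sed 's/before/after/': only the first match on the line is replaced.
first_only = re.sub(r"before", "after", line, count=1)

# Like the vim flags g and i: every match, ignoring case.
all_ignore_case = re.sub(r"before", "after", line, flags=re.IGNORECASE)

print(first_only)
print(all_ignore_case)
```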
Group & Extract Substring
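(): captures a substring; the captured substrings are read back from the groups of the match
.: any character
+: the preceding character must appear one or more times
?: on its own, makes the preceding character optional: it matches whether that character appears once or not at all
+?: a ? placed after + or * switches the search from the default greedy mode to non-greedy mode

Greedy means taking the longest of all possible matches; non-greedy means taking the shortest.

Target string: aaa123bbbaaa456bbb

aaa(.+)bbb
Match 1
Full match: aaa123bbbaaa456bbb
Group 1: 123bbbaaa456

aaa(.+?)bbb
Match 1
Full match: aaa123bbb
Group 1: 123
Match 2
Full match: aaa456bbb
Group 1: 456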
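The greedy and non-greedy results can be reproduced with Python's `re.finditer`, as a minimal sketch; the variable names are illustrative only.

```python
import re

target = "aaa123bbbaaa456bbb"

# Greedy: .+ grabs as much as possible, so there is a single long match.
greedy = [(m.group(0), m.group(1)) for m in re.finditer(r"aaa(.+)bbb", target)]

# Non-greedy: .+? stops at the first 'bbb', so there are two short matches.
lazy = [(m.group(0), m.group(1)) for m in re.finditer(r"aaa(.+?)bbb", target)]

print(greedy)
print(lazy)
```

`m.group(0)` is the full match and `m.group(1)` is the text captured by the first pair of parentheses.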