grep notes
- 1. Regular expressions and wildcards
- 2. grep
- 2.1 Basic usage
- 2.2 Advanced usage
- 2.3 Using grep with regular expressions
- 2.4 Enhanced grep
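1. Regular expressions and wildcards
First, let's review regular expressions and wildcards, since this helps with the grep material that follows:
Basic regular expressions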
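RE character | Meaning and example |
---|---|
^word | Meaning: the search string (word) is at the start of the line. Example: find the lines that begin with # and show line numbers. grep -n '^#' regular_express.txt |
word$ | Meaning: the search string (word) is at the end of the line. Example: print the lines that end with ! and show line numbers. grep -n '!$' regular_express.txt |
. | Meaning: exactly one arbitrary character. Example: the matched string can be (eve) (eae) (eee) (e e) but not (ee); there must be exactly one character between the two e's, and a space counts as a character. grep -n 'e.e' regular_express.txt |
\ | Meaning: escape character, removes the special meaning of a special symbol. Example: find the lines that contain a single quote '. grep -n \' regular_express.txt |
* | Meaning: zero or more repetitions of the preceding RE character. Example: find strings such as (es) (ess) (esss); because * allows zero repetitions, es also matches. Since * repeats the preceding RE character, it must directly follow one; "any run of characters" is therefore written .* grep -n 'ess*' regular_express.txt |
[list] | Meaning: a character set listing the characters to match. Example: find the lines that contain (gl) or (gd). Note that [] stands for exactly one character: a[afl]y matches aay, afy, or aly, because [afl] means a or f or l. grep -n 'g[ld]' regular_express.txt |
[n1-n2] | Meaning: a character set given as a range. Example: find the lines that contain any uppercase letter. Inside [] the minus sign - is special: it stands for all consecutive characters between the two endpoints, so any digit is [0-9] and all uppercase letters are [A-Z]. Whether characters are consecutive depends on the character encoding, so the locale must be set correctly (in bash, check the LANG and LANGUAGE variables). grep -n '[A-Z]' regular_express.txt |
[^list] | Meaning: a character set listing the characters or ranges to exclude. Example: the matched string can be (oog) (ood) but not (oot); a ^ inside [] means "inverted selection". To exclude uppercase letters, write [^A-Z]. Note, however, that grep -n [^A-Z] regular_express.txt lists every line of the file. Why? [^A-Z] means "one character that is not uppercase", and every line contains such a character; the first line, "Open Source", already has the lowercase letters p, e, n, o, and so on. grep -n 'oo[^t]' regular_express.txt |
\{n,m\} | Meaning: n to m consecutive repetitions of the preceding RE character. \{n\} means exactly n repetitions, and \{n,\} means n or more. Example: strings with 2 to 3 o's between g and g, i.e. (goog) (gooog). grep -n 'go\{2,3\}g' regular_express.txt |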
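POSIX-compatible character classes
Class | Description |
---|---|
[[:alpha:]] | Any letter, upper or lower case: [a-z], [A-Z] |
[[:alnum:]] | Any letter or digit: [0-9], [a-z], [A-Z] |
[[:upper:]] | Uppercase letters: [A-Z] |
[[:lower:]] | Lowercase letters: [a-z] |
[[:digit:]] | Digits: [0-9] |
[[:blank:]] | Space or tab |
[[:cntrl:]] | Control characters, including CR, LF, Tab, Del, and so on |
[[:graph:]] | Any visible character, i.e. everything except whitespace (space and Tab) |
[[:print:]] | Any printable character |
[[:punct:]] | Punctuation characters |
[[:space:]] | Any whitespace character: space, tab, NL, FF, VT, CR |
[[:xdigit:]] | Hexadecimal digits: 0-9, A-F, a-f |
In the table above, [:alnum:], [:alpha:], [:upper:], [:lower:], and [:digit:] each name a set of characters. Note that in a match each class stands for exactly one character.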
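The note above can be illustrated with a minimal sketch, assuming a POSIX-compatible grep. The three sample lines are made up for illustration. A class name such as [:digit:] is only meaningful inside a bracket expression, which is why the usable form is [[:digit:]], and it consumes a single character each time it appears.

```shell
#!/bin/sh
# Sample input invented for illustration: three short lines.
sample=$(printf 'abc\nA1\n42\n')

# [[:digit:]] matches one digit, so every line containing a digit is selected.
with_digit=$(printf '%s\n' "$sample" | grep '[[:digit:]]')

# One class matches one character: two in a row are needed for two digits.
two_digits=$(printf '%s\n' "$sample" | grep '[[:digit:]][[:digit:]]')

# Several classes can share one bracket expression.
upper_or_digit_start=$(printf '%s\n' "$sample" | grep -c '^[[:upper:][:digit:]]')

printf 'lines with a digit:\n%s\n' "$with_digit"
printf 'lines with two consecutive digits: %s\n' "$two_digits"
printf 'lines starting with an uppercase letter or digit: %s\n' "$upper_or_digit_start"
```

Unlike ranges such as [a-z], the named classes do not depend on how the locale orders characters, which is one reason to prefer them in scripts.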
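Extended regular expressions
RE character | Meaning and example |
---|---|
+ | Meaning: one or more repetitions of the preceding RE character. Example: find strings such as (god) (good) (goood). o+ means "one or more o", so the command below lists lines 1, 9, and 13. egrep -n 'go+d' regular_express.txt |
? | Meaning: zero or one occurrence of the preceding RE character. Example: find the two strings (gd) and (god). o? means "nothing, or one o", so the command lists lines 13 and 14. Notice that the combined results of these two cases ('go+d' and 'go?d') equal the result of 'go*d'; think about why. egrep -n 'go?d' regular_express.txt |
| | Meaning: find any of several strings, joined with "or". Example: find gd or good; note that it is an "or", so lines 1, 9, and 14 are all printed. What if dog should be found as well? egrep -n 'gd|good' regular_express.txt egrep -n 'gd|good|dog' regular_express.txt |
() | Meaning: group part of a pattern. Example: find (glad) or (good). Because g and d are shared, la and oo can be placed inside ( ) and separated with |. egrep -n 'g(la|oo)d' regular_express.txt |
()+ | Meaning: repeat a whole group. Example: print "AxyzxyzxyzxyzC" with echo, then search it as follows. echo 'AxyzxyzxyzxyzC' | egrep 'A(xyz)+C' This looks for a string that starts with A, ends with C, and has one or more "xyz" in between. |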
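Linux wildcards
Symbol | Meaning |
---|---|
* | Zero or more arbitrary characters |
? | Exactly one arbitrary character |
[ ] | Exactly one character from those listed inside the brackets (not an arbitrary character). For example, [abcd] means one character that is a, b, c, or d |
[ - ] | A minus sign inside the brackets means "all characters within that range of the encoding order". For example, [0-9] means every digit from 0 to 9, because the digits are encoded consecutively |
[^ ] | If the first character inside the brackets is a caret (^), the selection is inverted. For example, [^abc] means exactly one character that is anything other than a, b, or c |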
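Linux special symbols
Symbol | Meaning |
---|---|
# | Comment marker: most often used in scripts; everything after it is treated as a remark and is not executed |
\ | Escape character: turns a special character or wildcard back into an ordinary character |
| | Pipe: separates two piped commands |
; | Command separator: separates commands that run one after another (not the same as a pipe) |
~ | The user's home directory |
$ | Variable prefix: placed before a variable name to get its value |
& | Job control: runs the command in the background |
! | Logical NOT |
/ | Directory symbol: the path separator |
>, >> | Output redirection: overwrite and append, respectively |
<, << | Input redirection |
' ' | Single quotes: no variable substitution ($ is plain text) |
" " | Double quotes: variable substitution is performed ($ keeps its function) |
` ` | The command between the two backticks is executed first; $( ) can be used instead |
( ) | Encloses a subshell |
{ } | Encloses a command block |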
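Putting the two side by side, some symbols appear in both regular expressions and wildcards but mean different things:
Symbol | Meaning in a regular expression | Meaning as a wildcard |
---|---|---|
* | Zero or more repetitions of the preceding RE character | Zero or more arbitrary characters |
? | Zero or one occurrence of the preceding RE character | Exactly one arbitrary character |
[root@node-249 test]# touch {101..110}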
[root@node-249 test]# ls
101 102 103 104 105 106 107 108 109 110
[root@node-249 test]# ls|grep '10*'
101
102
103
104
105
106
107
108
109
110
[root@node-249 test]# ls 10*
101 102 103 104 105 106 107 108 109
[root@node-249 test]# ls |egrep '10?'
101
102
103
104
105
106
107
108
109
110
[root@node-249 test]# ls 10?
101 102 103 104 105 106 107 108 109
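The transcript above can be condensed into a small script. This is a sketch under stated assumptions: the temporary directory and the four file names are invented for the demo, and a POSIX shell with a standard grep is assumed. It shows that the shell expands 10* and 10? against file names, while grep reads the same characters as a regular expression and matches them anywhere in the line.

```shell
#!/bin/sh
# Work in a throwaway directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 1 101 105 110

# Glob: 10* means "names that start with 10"; the shell does the expansion.
glob_star=$(echo 10*)

# Regex: 10* means "a 1 followed by zero or more 0s", found anywhere in the name.
regex_star=$(ls | grep -c '10*')

# Glob: 10? means "10 plus exactly one more character".
glob_qmark=$(echo 10?)

# To get the glob's meaning from grep, anchor the pattern and use '.'.
regex_like_glob=$(ls | grep '^10.$')

echo "glob 10*: $glob_star"
echo "regex 10* matches $regex_star names"
echo "glob 10?: $glob_qmark"
printf 'regex ^10.$ matches:\n%s\n' "$regex_like_glob"

cd / && rm -rf "$demo_dir"
```

Quoting the pattern passed to grep matters for the same reason: an unquoted 10* could be expanded by the shell before grep ever sees it.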
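2. grep
grep (short for Globally search a Regular Expression and Print) is a powerful text search tool. It searches text for a given pattern (including regular expressions) and prints the matching lines by default. The Unix grep family consists of grep, egrep, and fgrep. On Windows the comparable command is FINDSTR.
egrep and fgrep differ only slightly from grep. egrep supports the extended set of regular expression metacharacters. fgrep (fixed grep, or fast grep) treats the pattern as a fixed string: regular expression metacharacters stand for themselves and have no special meaning. Linux uses the GNU version of grep, which is more capable and selects the matching mode with the -G (basic, the default), -E (same as egrep), and -F (same as fgrep) command-line options.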
In addition, two variant programs egrep and fgrep are available. egrep is the same as grep -E. fgrep is the same as grep -F. Direct invocation as either egrep or fgrep is deprecated, but is provided to allow historical applications that rely on them to run unmodified.
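Calling egrep and fgrep directly is therefore not recommended.
By default grep supports only basic regular expressions; grep -E (or egrep) enables extended regular expressions.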
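As a rough illustration of the three modes, the sketch below runs the same pattern through plain grep, grep -E, and grep -F. The sample lines are made up, and the behaviour described assumes GNU grep, where an unescaped + is an ordinary character in a basic regular expression.

```shell
#!/bin/sh
# Sample lines invented for illustration.
sample=$(printf 'good\ngd\ngo+d\n')

# Basic RE (default): '+' is an ordinary character, so only the literal text matches.
bre=$(printf '%s\n' "$sample" | grep 'go+d')

# Extended RE (-E): '+' means "one or more of the preceding character".
ere=$(printf '%s\n' "$sample" | grep -E 'go+d')

# Fixed string (-F): nothing is special, the pattern is taken literally.
fixed=$(printf '%s\n' "$sample" | grep -F 'go+d')

echo "grep    'go+d' -> $bre"
echo "grep -E 'go+d' -> $ere"
echo "grep -F 'go+d' -> $fixed"
```

The -F form is a convenient choice when the search text comes from user input or contains characters such as . or *, since nothing needs escaping.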
For more on basic and extended regular expressions, see:
https://blog.csdn.net/u010230019/article/details/132075257
https://blog.csdn.net/u010230019/article/details/132097203
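2.1 Basic usage
grep [-acinv] [--color=auto] 'search string' filename
# Options and arguments:
-a : search a binary file as if it were a text file
-c : count the lines that contain 'search string'
-i : ignore case, so upper and lower case are treated as the same
-n : also print line numbers
-v : invert the match, i.e. print the lines that do NOT contain 'search string'
--color=auto : highlight the matched text in color
Example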
[root@node-249 test]# cat txt
100
101
105
110
111
115
120
121
125
Dog
dog
[root@node-249 test]# grep -c '11' txt
3
[root@node-249 test]# vim txt
[root@node-249 test]# grep -i 'dog' txt
Dog
dog
[root@node-249 test]# grep 'dog' txt
dog
[root@node-249 test]# grep -ni 'dog' txt
10:Dog
11:dog
[root@node-249 test]# grep -v 'dog' txt
100
101
105
110
111
115
120
121
125
Dog
[root@node-249 test]# grep --color 'dog' txt
dog
[root@node-249 test]# grep --color '11' txt
110
111
115
[root@node-249 test]# grep --color '10' txt
100
101
105
110
[root@node-249 test]# alias
alias cp='cp -i'
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias grep='grep --color=auto'
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'
alias ls='ls --color=auto'
alias mv='mv -i'
alias rm='rm -i'
alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'
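The options above can also be exercised in a script. The following is a minimal sketch that rebuilds a file like the txt used in the transcript in a temporary location (the variable names are hypothetical) and captures the result of each option.

```shell
#!/bin/sh
# Recreate a small file like the txt shown above, in a temporary location.
f=$(mktemp)
printf '%s\n' 100 101 105 110 111 115 120 121 125 Dog dog > "$f"

count_11=$(grep -c '11' "$f")       # -c counts matching lines
any_case=$(grep -in 'dog' "$f")     # -i ignores case, -n prefixes line numbers
no_digits=$(grep -v '[0-9]' "$f")   # -v keeps the lines that do NOT match

# -c counts lines, not occurrences: this single line contains "11" twice.
one_line=$(printf '11 11\n' | grep -c '11')

echo "lines containing 11: $count_11"
printf 'dog in any case:\n%s\n' "$any_case"
printf 'lines without digits:\n%s\n' "$no_digits"
echo "count for a line with two matches: $one_line"

rm -f "$f"
```

The color highlighting seen in the transcript comes from the grep alias listed above; inside a script aliases are normally not expanded, so --color would have to be passed explicitly.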
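2.2 Advanced usage
[dmtsai@study ~]$ grep [-A] [-B] [--color=auto] 'search string' filename
# Options and arguments:
-A : takes a number (after); besides the matching line, the n lines that follow it are printed as well
-B : takes a number (before); besides the matching line, the n lines that precede it are printed as well
-n : show the line number of each match
-i : ignore case
--color=auto : highlight the matched text in color
-l : print only the names of the files that contain a match
Example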
[root@node-249 test]# grep -A1 -B1 '110' txt
105
110
111
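Here is a small sketch of the context options and of -l, which is listed above but not shown in the transcript. The directory and file names are invented for the demo. The -C option is not part of the list above; it is included as an assumption that GNU grep is in use, where -C n prints n lines on both sides of a match.

```shell
#!/bin/sh
# Temporary directory with made-up file names and contents.
d=$(mktemp -d)
printf '%s\n' 100 101 105 110 111 115 120 > "$d/numbers.txt"
printf '%s\n' alpha beta > "$d/words.txt"

after=$(grep -A2 '110' "$d/numbers.txt")    # the match plus 2 lines after it
before=$(grep -B1 '110' "$d/numbers.txt")   # the match plus 1 line before it
around=$(grep -C1 '110' "$d/numbers.txt")   # 1 line on each side (GNU grep)

# -l prints only the names of the files that contain a match.
files=$(cd "$d" && grep -l '110' numbers.txt words.txt)

printf -- '-A2:\n%s\n' "$after"
printf -- '-B1:\n%s\n' "$before"
printf -- '-C1:\n%s\n' "$around"
echo "files containing 110: $files"

rm -rf "$d"
```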
2.3 Using grep with regular expressions
Here we create a file with more content:
[root@node-249 test]# cat txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However, this dress is about $ 3183 dollars.^M
GNU is free air not free beer.^M
Her hair is very beauty.^M
I can't finish the test.^M
Oh! The soup taste good.^M
motorcycle is cheap than car.
This window is clear.
the symbol '*' is represented as start.
Oh! My god!
The gd software is a library for drafting programs.^M
You are the best is mean you are the no. 1.
The world <Happy> is the same with "glad".
I like dog.
google is the best tools for search keyword.
goooooogle yes!
go! go! Let's go.
# I am VBird
- Use brackets [] to search for a character set
# [ae] matches exactly one character, either a or e
[root@node-249 test]# grep -n 't[ae]ste*' txt
8:I can't finish the test.^M
9:Oh! The soup taste good.^M
# Match oo that is not preceded by g or o
[root@node-249 test]# grep -n '[^go]oo' txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
# Match oo that is not preceded by a lowercase letter
[root@node-249 test]# grep -n '[^a-z]oo' txt
3:Football game is not use feet only.
[root@node-249 test]# grep -n '[^[:lower:]]oo' txt
3:Football game is not use feet only.
# Match digits
[root@node-249 test]# grep -n '[0-9]' txt
5:However, this dress is about $ 3183 dollars.^M
15:You are the best is mean you are the no. 1.
# Match digits with a POSIX class (quote the pattern so the shell does not expand it)
[root@node-249 test]# grep -n '[[:digit:]]' txt
5:However, this dress is about $ 3183 dollars.^M
15:You are the best is mean you are the no. 1.
- Line-start and line-end anchors ^ $
# Lines that start with the
[root@node-249 test]# grep -n '^the' txt
12:the symbol '*' is represented as start.
# Lines that start with an uppercase letter
[root@node-249 test]# grep -n '^[A-Z]' txt
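# Lines that contain at least one character that is not an uppercase letter (this is not the same as "lines without uppercase letters")
[root@node-249 test]# grep -n '[^A-Z]' txt
# Lines that start with a lowercase letter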
[root@node-249 test]# grep -n '^[[:lower:]]' txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
# Lines that do not start with a lowercase letter
[root@node-249 test]# grep -n '^[^[:lower:]]' txt
1:"Open Source" is a good mechanism to develop programs.
3:Football game is not use feet only.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
7:Her hair is very beauty.^M
8:I can't finish the test.^M
9:Oh! The soup taste good.^M
11:This window is clear.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
21:# I am VBird
# Lines that do not start with a letter
[root@node-249 test]# grep -n '^[^a-zA-Z]' txt
1:"Open Source" is a good mechanism to develop programs.
21:# I am VBird
[root@node-249 test]# grep -n '^[^[:alpha:]]' txt
1:"Open Source" is a good mechanism to develop programs.
21:# I am VBird
# Lines that end with .
[root@node-249 test]# grep -n '\.$' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
...
# Match empty and non-empty lines
[root@node-249 test]# echo '' >> txt
[root@node-249 test]# grep -n '^$' txt
22:
[root@node-249 test]# grep -nv '^$' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
...
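Several lines of the sample file end in ^M, which is how a carriage return (a DOS-style line ending) is displayed. The sketch below, which assumes GNU grep on Linux and uses a made-up three-line file, shows how that character interacts with the $ anchor: the carriage return sits between the final dot and the end of the line, so '\.$' does not match until it is removed.

```shell
#!/bin/sh
# Three lines: a Unix line ending, a DOS line ending (CR before LF), and an empty line.
f=$(mktemp)
printf 'unix line.\ndos line.\r\n\n' > "$f"

# Only the Unix-style line ends with a dot directly before the end of line.
dot_end_raw=$(grep -c '\.$' "$f")

# After deleting the carriage returns, both text lines match.
dot_end_clean=$(tr -d '\r' < "$f" | grep -c '\.$')

# '^$' matches a line with nothing between start and end.
empty_lines=$(grep -n '^$' "$f")

echo "lines ending in a dot (raw file): $dot_end_raw"
echo "lines ending in a dot (CR removed): $dot_end_clean"
echo "empty lines: $empty_lines"

rm -f "$f"
```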
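- Any single character . and the repetition character *
. (dot): exactly one arbitrary character.
* (asterisk): repeat the preceding character zero or more times; it always combines with the character before it.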
[root@node-249 test]# grep -n 'g..d' txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.^M
16:The world <Happy> is the same with "glad".
# Strings with at least two o's
[root@node-249 test]# grep -n 'ooo*' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.^M
18:google is the best tools for search keyword.
19:goooooogle yes!
# g, then at least one o, then g
[root@node-249 test]# grep -n 'goo*g' txt
18:google is the best tools for search keyword.
19:goooooogle yes!
# g, then any number of arbitrary characters, then g
[root@node-249 test]# grep -n 'g.*g' txt
1:"Open Source" is a good mechanism to develop programs.
14:The gd software is a library for drafting programs.^M
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
# At least two consecutive digits
[root@node-249 test]# grep -n '[0-9][0-9][0-9]*' txt
5:However, this dress is about $ 3183 dollars.^M
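- Limiting the repetition count with {}
An interval restricts how many times the preceding character repeats.
Note: in a basic regular expression { and } are ordinary characters, so the interval must be written with backslashes as \{ and \}. With grep -E (extended regular expressions) the braces are written without backslashes.
# Two consecutive o's; lines with longer runs also match because the run contains oo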
[root@node-249 test]# grep -n 'o\{2\}' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.^M
18:google is the best tools for search keyword.
19:goooooogle yes!
# Bounded repetition counts
[root@node-249 test]# grep -n 'go\{2,5\}g' txt
18:google is the best tools for search keyword.
[root@node-249 test]# grep -n 'go\{2,\}g' txt
18:google is the best tools for search keyword.
19:goooooogle yes!
[root@node-249 test]# grep -n 'go\{,5\}g' txt
18:google is the best tools for search keyword.
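The interval syntax can be compared across the two regular expression flavours with a short sketch. The sample words are invented, and the literal-brace behaviour in the last command assumes GNU grep.

```shell
#!/bin/sh
# Sample words invented for illustration: 1, 2, 3, and 6 o's between the g's.
sample=$(printf 'gog\ngoog\ngooog\ngoooooog\n')

# Basic RE: the interval needs backslashes.
bre=$(printf '%s\n' "$sample" | grep 'go\{2,3\}g')

# Extended RE: the same interval without backslashes.
ere=$(printf '%s\n' "$sample" | grep -E 'go{2,3}g')

# Basic RE without backslashes: the braces are literal text (GNU grep behaviour),
# so the pattern only matches a line that really contains "go{2,3}g".
literal=$(printf '%s\n' 'go{2,3}g' | grep -c 'go{2,3}g')

printf 'BRE go\\{2,3\\}g:\n%s\n' "$bre"
printf 'ERE go{2,3}g:\n%s\n' "$ere"
echo "literal-brace matches: $literal"
```

The surrounding g's matter here: without them, o\{2,3\} would also match inside the run of six o's, which is why 'o\{2\}' above lists the goooooogle line as well.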
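2.4 Enhanced grep
The enhanced grep mentioned here is egrep. Since egrep is not recommended, we use grep -E instead.
egrep is an enhanced grep that searches with extended regular expressions: the metacharacters +, ?, |, and ( ) work without backslashes, which gives more ways to express repetition and alternatives.
# One or more repetitions of the string inside ()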
[root@node-249 test]# echo 'AxyzxyzxyzxyzC' | grep -E 'A(xyz)+C'
AxyzxyzxyzxyzC
[root@node-249 test]# echo 'AxyzxyzxyzxyzC' | grep -E 'A(x)+C'
# Zero or one o (every line matches, because zero occurrences are allowed)
[root@node-249 test]# grep -En 'o?' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
7:Her hair is very beauty.^M
8:I can't finish the test.^M
9:Oh! The soup taste good.^M
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am VBird
22:
# Zero or more o (again every line matches)
[root@node-249 test]# grep -En 'o*' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
7:Her hair is very beauty.^M
8:I can't finish the test.^M
9:Oh! The soup taste good.^M
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am VBird
22:
# One or more o
[root@node-249 test]# grep -En 'o+' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
9:Oh! The soup taste good.^M
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
# Match good or glad
[root@node-249 test]# grep -En 'g(oo|la)d' txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.^M
16:The world <Happy> is the same with "glad".
# Only o is allowed between g and d, at least once
[root@node-249 test]# grep -En 'g(o)+d' txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.^M
13:Oh! My god!
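The transcripts above show 'o?' and 'o*' listing every line, including the empty one. A minimal sketch of why, using made-up sample lines: both quantifiers accept zero occurrences, so on their own they match the empty string, which every line contains. They become selective only when something around them has to match too.

```shell
#!/bin/sh
# Sample lines invented for illustration; "sky" contains no o at all.
sample=$(printf 'good\ngd\nsky\n')

star=$(printf '%s\n' "$sample" | grep -Ec 'o*')    # zero or more: every line
qmark=$(printf '%s\n' "$sample" | grep -Ec 'o?')   # zero or one: every line
plus=$(printf '%s\n' "$sample" | grep -Ec 'o+')    # one or more: only lines with an o

# With context around the quantifier, the pattern is selective again.
anchored=$(printf '%s\n' "$sample" | grep -E 'go*d')

echo "o* matches $star of 3 lines"
echo "o? matches $qmark of 3 lines"
echo "o+ matches $plus of 3 lines"
printf 'go*d matches:\n%s\n' "$anchored"
```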
A commonly used command
# Remove blank lines and comment lines; handy when reading configuration files
[root@node-249 test]# grep -Ev '^$|^#' txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However, this dress is about $ 3183 dollars.^M
...
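A possible variation on the command above, sketched against a made-up configuration file (the settings and names are hypothetical). The original pattern only removes comments that start in the first column and lines that are completely empty; allowing leading whitespace with [[:space:]]* also catches indented comments and whitespace-only lines.

```shell
#!/bin/sh
# Hypothetical configuration file, written to a temporary location.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# main settings
port = 8080

  # indented comment
host = example.org
EOF

# The pattern from the text: drops empty lines and lines starting with #.
simple=$(grep -Ev '^$|^#' "$conf")

# Variation: optional leading whitespace before the # or the end of line.
strict=$(grep -Ev '^[[:space:]]*(#|$)' "$conf")

printf 'simple filter:\n%s\n' "$simple"
printf 'strict filter:\n%s\n' "$strict"

rm -f "$conf"
```

Whether the stricter form is appropriate depends on the file format, since some formats treat a # after leading whitespace as data rather than as a comment.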