Targeted Class, Cohort 2: Hands-on with the Shell Trio (grep, sed, awk)

Environment

Server: ssh <last eight digits of your phone number>@shell.testing-studio.com
Practice file: cp /tmp/1206.log ~/1206.log

Shell introduction and Q&A

netstat -tnp | grep ":22" | awk '{print $5}'  | awk -F: '{print $1}' | sort | uniq -c |sort -nr  | wc -l
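
The one-liner above reports how many distinct remote addresses hold TCP connections involving port 22. Since netstat output differs per machine, here is a minimal sketch that runs the same text-processing stages on a few made-up rows laid out like `netstat -tnp` output. The addresses and process names are illustrative assumptions, not data from the course server.

```shell
# Made-up rows in the column layout of `netstat -tnp` (assumed sample data).
sample='tcp 0 0 10.0.0.5:22 203.0.113.7:51234 ESTABLISHED 1234/sshd
tcp 0 0 10.0.0.5:22 203.0.113.7:51240 ESTABLISHED 1240/sshd
tcp 0 0 10.0.0.5:22 198.51.100.9:40022 ESTABLISHED 1300/sshd
tcp 0 0 10.0.0.5:443 192.0.2.44:53311 ESTABLISHED 900/nginx'

# Field 5 is the remote address:port; splitting on ":" keeps only the IP.
# sort | uniq -c counts connections per IP, sort -nr puts the busiest first.
per_ip=$(printf '%s\n' "$sample" | grep ":22" | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr)
echo "$per_ip"

# Adding wc -l, as in the original command, gives the number of distinct IPs.
distinct=$(printf '%s\n' "$per_ip" | wc -l | tr -d ' ')
echo "distinct remote addresses: $distinct"
```

Dropping the final `wc -l` is often more useful in practice, because the per-address counts show which client opens the most connections.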

Understand the log format

[00534760@izuf60jasqavbxb9efockpz ~]$ head -3 /tmp/1206.log
111.98.254.9 - - [05/Dec/2018:07:00:00 +0000] "GET /cable HTTP/1.1" 101 400 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36" 28.881 28.881 .
216.244.66.241 - - [05/Dec/2018:07:00:00 +0000] "GET /topics/7326/replies/65494/edit HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)" 0.001 0.001 .
216.244.66.241 - - [05/Dec/2018:07:00:01 +0000] "GET /topics/7326/replies/65516/reply_suggest HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)" 0.001 0.001 .
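
awk splits each line on whitespace, so it helps to know which field number holds which value in this format. A small sketch using the second sample line above, copied into a variable so it runs without the log file: field 1 is the client IP, field 7 the request path, and field 9 the status code.

```shell
# One sample line from the log, stored in a variable for a self-contained demo.
line='216.244.66.241 - - [05/Dec/2018:07:00:00 +0000] "GET /topics/7326/replies/65494/edit HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)" 0.001 0.001 .'

# Print the three fields used most often in the exercises.
fields=$(printf '%s\n' "$line" | awk '{print "ip=" $1, "path=" $7, "status=" $9}')
echo "$fields"
```

The same field numbers apply to ~/1206.log as long as its lines follow the layout shown in the head -3 output.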

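Meaning of common HTTP status codes

  • 200: OK, the request succeeded
  • 30x: redirection (for example 301 Moved Permanently, 302 Found)
  • 400: Bad Request; 404: Not Found
  • 500: Internal Server Error; 503: Service Unavailable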

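Regular expressions

Basic regular expressions (BRE)

  • ^ start of line, $ end of line
  • [a-z] [0-9] character ranges
  • * zero or more of the preceding item
  • . any single character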
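
The BRE operators above can be tried with plain grep, which uses basic regular expressions by default. A minimal sketch on a handful of illustrative request paths (assumed sample input, not taken from the log):

```shell
# Illustrative request paths (assumed sample input).
paths='/cable
/topics/7326
/topics/abc
/uploads/photo.png'

# ^ and $ anchor the match to the whole line, [0-9] is a range,
# and * repeats the preceding item zero or more times.
numeric_topics=$(printf '%s\n' "$paths" | grep -c '^/topics/[0-9][0-9]*$')

# . matches any single character, so c.ble matches cable.
any_char=$(printf '%s\n' "$paths" | grep -c '^/c.ble$')

echo "numeric topic paths: $numeric_topics, c.ble matches: $any_char"
```

Writing `[0-9][0-9]*` is the BRE way to say "one or more digits", since `+` is not a BRE operator.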

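Differences between basic (BRE) and extended (ERE) regular expressions

  • ? zero or one of the preceding item (POSIX ERE has no non-greedy matching)
  • + one or more of the preceding item
  • () grouping (written \( \) in BRE)
  • {} repetition count (written \{ \} in BRE)
  • | alternation: matches any one of several expressions

To use extended expressions, pass -E to grep and sed. awk already uses extended regular expressions by default and needs no flag.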
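
A short sketch of the ERE operators in action, on illustrative paths (assumed sample input). It also runs a `+` pattern without -E to show that plain grep then treats `+` as a literal character, and shows that POSIX awk patterns are extended regular expressions with no flag.

```shell
# Illustrative request paths (assumed sample input).
paths='/topics/7326
/topics/7326/replies/65494/edit
/cable
/topic/12'

# ? = zero or one, + = one or more, () groups, {m,n} bounds repetition, | alternates.
ere=$(printf '%s\n' "$paths" | grep -cE '^/topics?/[0-9]+(/replies/[0-9]{1,10}/(edit|reply_suggest))?$')

# Without -E, + is a literal plus sign, so no line matches.
bre=$(printf '%s\n' "$paths" | grep -c '/topics/[0-9]+$' || true)

# awk regular expressions are extended by default.
awk_count=$(printf '%s\n' "$paths" | awk '/^\/topics\/[0-9]+$/ {n++} END {print n+0}')

echo "grep -E: $ere, grep without -E: $bre, awk: $awk_count"
```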

Exercise 1

Find the 404 errors in the log and count how many there are. Paste your command in a reply, using this format:

Homework

shell

Result: 33333

Exercise 2

Find the 404 and 500 errors in the log. This exercise tests rigor.
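
One way to read the rigor hint: a bare text search for 404 or 500 also matches lines where those digits appear in the path or the byte count. The sketch below compares a naive grep with a check on the status field only. It assumes the status code is the ninth whitespace-separated field, as in the sample lines earlier; the three log lines are made up for illustration and this is not the course's reference answer.

```shell
# Made-up log lines: the second has status 200 but contains 404 and 500 elsewhere.
log='1.1.1.1 - - [05/Dec/2018:07:00:00 +0000] "GET /missing HTTP/1.1" 404 5 "-" "curl" 0.001 0.001 .
2.2.2.2 - - [05/Dec/2018:07:00:01 +0000] "GET /topics/404 HTTP/1.1" 200 500 "-" "curl" 0.001 0.001 .
3.3.3.3 - - [05/Dec/2018:07:00:02 +0000] "GET /boom HTTP/1.1" 500 5 "-" "curl" 0.001 0.001 .'

# Naive: matches the digits anywhere on the line, so it over-counts.
naive=$(printf '%s\n' "$log" | grep -cE '404|500')

# Strict: compare only the status field.
strict=$(printf '%s\n' "$log" | awk '$9 == 404 || $9 == 500' | wc -l | tr -d ' ')

echo "naive: $naive, strict: $strict"
```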

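Exercise 3

Find the IPs with the most requests (statistical analysis). Paste your command and the top 3 results into a reply; you can edit your earlier reply and append to it.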
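
A common shape for this kind of ranking is extract, sort, count, sort by count. The sketch below applies it to a few shortened, made-up log lines (assumed data) where the client IP is the first field; against the real file the same pipeline would read ~/1206.log instead.

```shell
# Shortened, made-up log lines; only the first field (client IP) matters here.
log='216.244.66.241 - - "GET /a"
111.98.254.9 - - "GET /cable"
216.244.66.241 - - "GET /b"
203.0.113.5 - - "GET /c"
216.244.66.241 - - "GET /d"
111.98.254.9 - - "GET /cable"'

# uniq -c only merges adjacent duplicates, hence the sort before it.
top=$(printf '%s\n' "$log" | awk '{print $1}' | sort | uniq -c | sort -nr | head -3)
echo "$top"

# The busiest IP is the second column of the first row.
top_ip=$(printf '%s\n' "$top" | awk 'NR == 1 {print $2}')
echo "busiest IP: $top_ip"
```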

Exercise 4

Count the requests to /topics/xxxxx and the requests to /topics/nnnn/replies separately.
Expected result:
topics nnnn
topics/replies dddd

Example

216.244.66.241 - - [05/Dec/2018:07:00:03 +0000] "GET /topics/735?locale=zh-TW HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)" 0.001 0.001 .
216.244.66.241 - - [05/Dec/2018:07:00:02 +0000] "GET /topics/7351/replies/65837/reply_suggest HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)" 0.001 0.001 .
216.244.66.241 - - [05/Dec/2018:07:00:04 +0000] "GET /topics/7386/replies/66058/reply_suggest HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)" 0.001 0.001 .
216.244.66.241 - - [05/Dec/2018:07:00:05 +0000] "GET /topics/7392?locale=en HTTP/1.1" 301 5 "-" "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)" 0.001 0.001 .

topics 2
replies 2

grep topics  1206.log |head -10 | grep -E "topics/[0-9]{1,10}[?]{1}" | sed -E 's#/[0-9]{1,10}#:int:#g' | awk '{print $7}' | sort | uniq -c
      1 /topics:int:?locale=en
      2 /topics:int:?locale=zh-TW
grep -E " /topics/[0-9]{1,}" 1206.log \
| awk '{print $7}' \
| sed 's#?.*##g' \
| sed 's#/topics/[0-9]*$#/topics#' \
| sed 's#/topics/[0-9]*/replies/[0-9]*/.*#/topics/replies#' \
| sort | uniq -c | sort -nr | head -2

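Homework

Find the most-visited page paths, using sed to normalize the paths before counting. Keep only the canonical path: strip varying numbers, query parameters, and any other noisy varying fields.

/cable
/topics/7386/replies/66058/reply_suggest becomes /topics/replies/reply_suggest
/uploads/photo/2017/9eebf333-2729-467f-ac18-5a350608d865.png!large becomes /uploads/photo/.png
/uploads/photo/2017/b34abf68-eb6d-4c41-8440-85673bad6d1e.png!large becomes /uploads/photo/.png
/ddddxx becomes /user  # stretch goal, not required

Report the top 10 most-visited paths.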
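
As a starting point, here is one possible normalization, sketched on the sample paths from this handout rather than on the full log. The four sed rules are assumptions about what counts as noise (query strings, a "!size" suffix, purely numeric path segments, UUID-style file names); the stretch case for user pages is left out, and this is not the reference answer.

```shell
# Sample paths taken from the examples in this handout.
paths='/cable
/topics/7386/replies/66058/reply_suggest
/topics/7392?locale=en
/uploads/photo/2017/9eebf333-2729-467f-ac18-5a350608d865.png!large
/uploads/photo/2017/b34abf68-eb6d-4c41-8440-85673bad6d1e.png!large'

normalized=$(printf '%s\n' "$paths" | sed -E \
  -e 's#\?.*##' \
  -e 's#!.*$##' \
  -e 's#/[0-9]+(/|$)#\1#g' \
  -e 's#/[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}\.#/.#')
# Rule 1 drops the query string, rule 2 drops a "!large"-style suffix,
# rule 3 drops path segments made only of digits,
# rule 4 drops a UUID-style file name but keeps the extension.
echo "$normalized"

# Count identical normalized paths and list the most frequent first.
top=$(printf '%s\n' "$normalized" | sort | uniq -c | sort -nr | head -10)
echo "$top"
```

On the real log, the input would come from `awk '{print $7}' ~/1206.log` instead of the sample variable. Rule 3 anchors on the following slash or end of line so that a segment such as /9eebf333-... is not clipped just because it starts with a digit.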
