Linux 的 execsnoop:查看短期进程

每次我无法立即解释 CPU 使用情况时,我都想知道是否应该归咎于短期进程。在 Linux 系统上,我可以使用atop进行调试,它使用进程记帐来捕获这些转瞬即逝的进程。但我希望我有我的 execsnoop 工具,它可以创建每个进程的实时日志以供以后研究。
我只是将它移植到Linux。这是一个示例,其中“man ls”在另一个窗口中运行:

./execsnoop

跟踪 exec()s。Ctrl-C 结束。
PID PPID ARGS
20139 20135 mawk -W 交互式 -vo=1 -v opt_name=0 -v name= […]
20140 20138 猫-v trace_pipe
20171 16743 人LS
20178 20171 preconv -e UTF-8
20181 20171 寻呼机-s
20180 20171 nroff -manoc -rLL=173n -rLT=173n -Tutf8
20179 20171 吨
20184 20183 语言环境charmap
20185 20180 groff -mtty-char -Tutf8 -mandoc -rLL=173n -rLT=173n
20186 20185 troff -mtty-char -mandoc -rLL=173n -rLT=173n -Tutf8
20187 20185 格罗蒂
伟大的!前两行,显示 mawk 和 cat,来自 execsnoop 初始化。剩下的展示了 man 命令的神秘工作。
以下是 Linux 构建的一部分:

./execsnoop

跟踪 exec()s。Ctrl-C 结束。
PID PPID ARGS
25753 25588 /bin/sh -c echo ‘python’ | grep ^/ -q && 回显 y
25755 25753 grep ^/ -q
25756 25588 sh -c 命令 -v ‘python’ | awk ‘NR==1 {t=$0} NR>1 {t=t “m822df3020w6a44id34bt574ctac44eb9f4n” $0} END {printf t}’
25758 25756 awk NR==1 {t=$0} NR>1 {t=t “m822df3020w6a44id34bt574ctac44eb9f4n” $0} END {printf t}
25759 25588 /bin/sh -c echo ‘python’ | grep ^/ -q && 回显 y
25761 25759 grep ^/ -q
25762 25588 sh -c 命令 -v ‘python’ | awk ‘NR==1 {t=$0} NR>1 {t=t “m822df3020w6a44id34bt574ctac44eb9f4n” $0} END {printf t}’
25764 25762 awk NR==1 {t=$0} NR>1 {t=t “m822df3020w6a44id34bt574ctac44eb9f4n” $0} END {printf t}
25765 25588 /bin/sh -c echo ‘/usr/bin/python-config’ | grep ^/ -q && 回显 y
25767 25765 grep ^/ -q
25768 25588 sh -c test -f ‘/usr/bin/python-config’ -a -x ‘/usr/bin/python-config’ && echo y
25769 25588 /bin/sh -c echo ‘/usr/bin/python-config’ | grep ^/ -q && 回显 y
25771 25769 grep ^/ -q
25772 25588 sh -c test -f ‘/usr/bin/python-config’ -a -x ‘/usr/bin/python-config’ && echo y
25773 25588 /bin/sh -c ‘/usr/bin/python-config’ --ldflags 2>/dev/null
25774 25773 /usr/bin/python-config --ldflags
25775 25588 /bin/sh -c ‘/usr/bin/python-config’ --cflags 2>/dev/null
25776 25775 /usr/bin/python-config --cflags
25777 25588 /bin/sh -c touch PERF-FEATURES; cat PERF-特点
25778 25777 touch PERF-FEATURES
25779 25777 猫 PERF-FEATURES
25780 25588 printf …%30s: [ \033[32mon\033[m] 矮
25781 25588 printf …%30s: [ \033[31mOFF\033[m] 矮
25782 25588 printf …%30s: [ \033[32mon\033[m] glibc
25783 25588 printf …%30s: [ \033[31mOFF\033[m] glibc
25784 25588 printf …%30s: [ \033[32mon\033[m] gtk2
[…]
Wow. Application startups can also run a surprising number of processes. execsnoop can help you identify areas for performance improvement: excessive sh/grep/sed/awk invocations, that can often be rewritten to use more advanced shell or awk features.
You can also use execsnoop to catch unexpected behavior. Running it with -t for timestamps:

./execsnoop -t

Tracing exec()s. Ctrl-C to end.
TIMEs PID PPID ARGS
[…]
3799932.757407 29403 29390 hostname
3799932.763324 29405 29404 cat /sys/class/net/eth0/address
3799932.768432 29407 29406 grep -l [1]EC2SYNC=no([[:space:]#]|$) /etc/sysconfig/network-scripts/ifcfg-eth0
3799932.778556 29412 29411 curl -s -f http://169.254.169.254/latest/meta-data/network/interfaces/macs/0a:8b:c3:b8:9d:21/local-ipv4s
3799932.796555 29415 29413 grep inet .
secondary eth0
3799932.798570 29416 29413 awk {print $2}
3799932.801021 29414 29413 /sbin/ip addr list dev eth0 secondary
3799932.802559 29417 29413 cut -d/ -f1
3800159.574170 29419 29418 /bin/sh -c /usr/lib64/sa/sa1 1 1
Why is my system running curl on that address? Hm.
Options
execsnoop options are summarized by the USAGE message (there’s also a man page and examples file):

./execsnoop -h

USAGE: execsnoop [-hrt] [-a argc] [-d secs] [name]
-d seconds # trace duration, and use buffers
-a argc # max args to show (default 8)
-r # include re-execs
-t # include time (seconds)
-h # this usage message
name # process name to match (REs allowed)
eg,
execsnoop # watch exec()s live (unbuffered)
execsnoop -d 1 # trace 1 sec (buffered)
execsnoop grep # trace process names containing grep
execsnoop ‘log$’ # filenames ending in “log”
execsnoop traces events as they happen, unless the -d option is used, which uses in-kernel buffering.
What, Why, and How
This is another ftrace-based hack for my perf-tools collection. These are designed to work with fewest dependencies (including no kernel debuginfo, if possible), and on older Linux kernel versions, particularly my Linux 3.2 systems. I expect to rewrite them when new tracing features are added to Linux in the future.
This turned out to be difficult, and a number of times it I thought it might be impossible in this environment.
My first attempt, execsnoop-proc, traced sched:sched_process_exec with process arguments from /proc/PID/cmdline. This worked on many systems, but not all, as I don’t think /proc could always be read quickly enough for some processes. The sched:sched_process_exec tracepoint was missing on some systems as well.
The version I’m now using dynamically traces either stub_execve() or do_execve(), and walks the %si register as an array of strings using an unrolled loop. This is an enormous hack that I can hardly believe works, but it does work on all the systems I need it to.
Here’s the essence of what I’m doing, using the perf(1) command (perf_events):

perf probe --add ‘do_execve +0(+0(%si)):string +0(+8(%si)):string +0(+16(%si)):string +0(+24(%si)):string’

perf record --no-buffering -e probe:do_execve -a -o - | PAGER=“cat -v” stdbuf -oL perf script -i -

:10007 10007 [000] 557516.214765: probe:do_execve: (ffffffff811cccb0) arg1=“ls” arg2=“–color=auto” arg3=“-l” arg4=“^"
:10008 10008 [000] 557516.219168: 探测:do_execve: (ffffffff811cccb0) arg1=“sleep” arg2=“1” arg3=”^X" arg4=“”
#性能探测 --del do_execve
在这种情况下,我只包含了三个参数,但您可以看到展开循环是如何工作的。第一个单行适用于特定的内核版本和平台,可能由于多种原因不适合您,如果不先对其进行调整以匹配您所拥有的。
如果您想在此级别进行分析,您可能会发现我的kprobe工具更易于使用。将上述内容与以下内容进行比较:

./kprobe ‘p:do_execve +0(+0(%si)):string +0(+8(%si)):string +0(+16(%si)):string +0(+24( %si)):string’

跟踪 kprobe do_execve。Ctrl-C 结束。
kprobe-12484 [000] d… 7149163.889695: do_execve: (do_execve+0x0/0x20) arg1=“cat” arg2=“trace_pipe” arg3=(fault) arg4=“”
bash-12499 [001] d… 7149164.593106: do_execve: (do_execve+0x0/0x20) arg1=“ls” arg2=“–color=auto” arg3=“-l” arg4=(错误)
bash-12500 [001] d… 7149164.597399: do_execve: (do_execve+0x0/0x20) arg1=“sleep” arg2=“1” arg3=(fault) arg4=“”
^C
结束追踪…
kprobe(8) 很好。它会自动添加和删除 kprobe,并具有显示堆栈 (-s) 和列标题 (-H) 的选项。
如果我使用 SystemTap,我会这样做:

stap -ve ‘probe process.begin { printf(“%6d %6d %s\n”, pid(), ppid(), cmdline_str()); }’

(当我运行它时,由于某种原因,它首先列出所有当前进程,然后是新进程。我不知道这是一个特性还是一个错误。)
Linux 的未来应该支持像这样单行的跟踪器。同时,我可以使用 execsnoop 的破解版本来解决旧系统上的性能问题。
结论
短暂的进程可能会导致性能问题,并且在 top(1) 等间隔采样工具中是不可见的。在这篇文章中,我描述了我流行的 execsnoop 工具的 Linux 端口,多年来我一直使用它来识别和研究短期进程,并解决它们可能导致的性能问题。
这也是旧 Linux 内核(如opensnoop和iosnoop)使用现有 ftrace 和 kprobes 跟踪框架的另一个概念证明。还有其他方法可以做到这一点:您可能有内核调试信息和 SystemTap 可用,或者启用了审计。这个实现对于我没有安装内核调试信息的基本 Linux 3.2 云实例来说很方便。
警告适用:execsnoop 和这些单行程序在 Linux 上使用动态跟踪,过去曾出现过内核恐慌错误,所以要知道你在做什么,首先测试,使用风险自负。

转载翻译自: https://www.brendangregg.com


  1. [:space:] ↩︎