Handling Abnormal Processes on Linux
1. Zombie processes
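Note: a zombie process is one that has already exited but whose exit status has not yet been collected (reaped) by its parent. It uses no memory or CPU, only an entry in the process table. Because it is already dead, it cannot be killed directly, not even with kill -9. It disappears once the parent calls wait(), or once the parent exits and init (PID 1) adopts and reaps it. Several ways to find and clear zombie processes are listed below.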
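Method 1. Use the following command to list the processes currently in state Z (zombie) and obtain their process IDs. Matching on a leading Z catches both "Z" and "Z+", and "aux" covers all users rather than only the current one.
ps aux | awk '$8 ~ /^Z/'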
Once the process ID is confirmed, look up its parent process ID:
ps -o ppid= -p <child_id>
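The two lookups above can be combined into one pass. The following is a minimal sketch, not a definitive tool: list_zombies is a hypothetical helper name, and it assumes its input has the columns STAT, PID, PPID in that order with the header suppressed.

```shell
# list_zombies (hypothetical helper): read "STAT PID PPID" lines on stdin
# and print "PID PPID" for every process whose state starts with Z.
list_zombies() {
    awk '$1 ~ /^Z/ { print $2, $3 }'
}

# On a live system, feed it from ps. An empty header ("stat=") on every
# column suppresses the header line:
#   ps -A -o stat= -o pid= -o ppid= | list_zombies
```

Each output line pairs a zombie with the parent that has not reaped it, which is the process that actually needs attention.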
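Method 2. Use the following command to capture each zombie's PID and PPID, then kill the parent. Signaling the zombie itself has no effect; killing the parent lets init adopt and reap the zombie.
ps -A -o stat,pid,ppid | grep -e '^[zZ]'
kill -9 <parent_pid>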
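Method 3. Use the following command to look for zombie processes, which ps marks as <defunct>. The bracket in the pattern keeps grep from matching its own command line.
ps -ef | grep '[d]efunct'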
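Appendix. List zombie processes, or clear them in bulk. The second command prints column 2 (PPID), so it sends SIGKILL to every parent of a zombie; review the output of the first command before running it.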
ps -A -o stat,ppid,pid,cmd | grep -e '^[Zz]'
ps -A -o stat,ppid,pid,cmd | grep -e '^[Zz]' | awk '{print $2}' | xargs kill -9
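Because the bulk command kills parent processes, it may be worth inspecting the target list first. The sketch below is one possible way to do that, under stated assumptions: zombie_parents is a hypothetical helper name, the input is in the stat,ppid,pid,cmd column order used above, and PID 1 is skipped because killing init is never the intent.

```shell
# zombie_parents (hypothetical helper): read `ps -A -o stat,ppid,pid,cmd`
# output on stdin and print each distinct parent PID of a zombie once,
# in numeric order, skipping PID 1.
zombie_parents() {
    awk '$1 ~ /^[Zz]/ && $2 != 1 { print $2 }' | sort -un
}

# Review the targets first:
#   ps -A -o stat,ppid,pid,cmd | zombie_parents
# Then, if the list looks right (-r is a GNU xargs option that skips
# the command when the input is empty):
#   ps -A -o stat,ppid,pid,cmd | zombie_parents | xargs -r kill -9
```

Deduplicating the PPIDs avoids signaling the same parent once per zombie child.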
2. Why a program started in the background ends up stopped, and how to fix it
strace -e trace=none -p PID
strace: Process 2028 attached
--- stopped by SIGTTIN ---
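The trace shows that the process was stopped by SIGTTIN, the signal the kernel sends when a background process tries to read from its controlling terminal. Give it a "black hole" for input by redirecting stdin from /dev/null: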
program </dev/null &
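The redirection above can be wrapped in a small launcher. This is a sketch only: start_detached and the DETACHED_PID variable are hypothetical names introduced for illustration.

```shell
# start_detached (hypothetical helper): run a command in the background
# with stdin connected to /dev/null, so any read returns end-of-file
# immediately instead of touching the terminal.
# The background PID is stored in DETACHED_PID.
start_detached() {
    "$@" </dev/null &
    DETACHED_PID=$!
}

# Example: cat would normally wait on the terminal; here it sees EOF and exits.
start_detached cat
wait "$DETACHED_PID"
echo "cat exited with status $?"
```

The PID is returned through a variable rather than printed, so the helper can be called without command substitution, which would otherwise wait for the background process to close its output.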
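Tracing tools such as strace can also be used to analyze the cause. To find stopped processes, filter the ps output on the STAT column (column 3 of "ps ax", column 7 of "ps j"):
ps ax | awk 'NR==1 || $3 ~ /^T/'
ps -e j | awk 'NR==1 || $7 ~ /^T/'
A stopped process has a STAT value of T.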
Commonly seen process state codes, as given in the ps(1) manual page:
D uninterruptible sleep (usually IO)
I Idle kernel thread
R running or runnable (on run queue)
S interruptible sleep (waiting for an event to complete)
T stopped by job control signal
t stopped by debugger during the tracing
W paging (not valid since the 2.6.xx kernel)
X dead (should never be seen)
Z defunct ("zombie") process, terminated but not reaped by its parent
For BSD formats and when the stat keyword is used, additional characters may be displayed:
N low-priority (nice to other users)
L has pages locked into memory (for real-time and custom IO)
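For scripts that need to interpret a STAT value, the table above can be turned into a lookup on the first character. This is a minimal sketch; describe_state is a hypothetical helper name, the descriptions are shortened from the list above, and modifier characters after the first one (such as N or L) are ignored.

```shell
# describe_state (hypothetical helper): map the first character of a
# ps STAT value to a short description taken from the list above.
describe_state() {
    case "$1" in
        D*) echo "uninterruptible sleep" ;;
        I*) echo "idle kernel thread" ;;
        R*) echo "running or runnable" ;;
        S*) echo "interruptible sleep" ;;
        T*) echo "stopped by job control signal" ;;
        t*) echo "stopped by debugger during tracing" ;;
        W*) echo "paging" ;;
        X*) echo "dead" ;;
        Z*) echo "zombie" ;;
        *)  echo "unknown" ;;
    esac
}

# Example on a live system:
#   describe_state "$(ps -o stat= -p $$)"
```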