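This tutorial is based on Ubuntu 20.10 with gcc 10.2.0. If an example program fails to compile or run, your system or toolchain versions probably differ from mine; please consult the relevant documentation yourself.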

0 Overview

First, the description of this signal:

Signal	Value	Description
SIGCHLD	17	Child status has changed (POSIX). Signal sent to parent process whenever one of its child processes terminates or stops. See the YoLinux.com Fork, exec, wait, waitpid tutorial

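Reference: All signals in c/c++

In other words, when any one of a parent process's children terminates or stops, the parent's signal handler is triggered (provided one has been installed).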

1 A parent process and one child

Let's look at an example first.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <signal.h>

pid_t pid = 0;

void sigchld_handler(int sig){
    // printf is not async-signal-safe; it is used here only for demonstration
    printf("father call the SIGCHLD signal handler. num = %d\n",sig);
}

int main(){
    signal(SIGCHLD,sigchld_handler);
    pid = fork();
    if(pid == -1){
        exit(1);
    } else if(pid == 0){ // child code
        printf("child process is running, pid = %d.\n",getpid());
    } else { // father code
        pause();
        printf("father process runs again.\n");
    }
    return 0;
}

The output is:

child process is running, pid = 352252.
father call the SIGCHLD signal handler. num = 17
father process runs again.

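1.1 Program analysis

Let's analyze this program first.

At the start, signal() is called to arrange that when the child terminates, the parent's function sigchld_handler is triggered (similar to "enabling an interrupt").
The parent calls fork() to create a child process.
The child runs the child code and the parent runs the father code (the two run concurrently).
If the child finishes before the parent does, the child's termination causes SIGCHLD to be sent to the parent; the parent then interrupts what it is doing, runs sigchld_handler, and resumes the code that follows once the handler returns.
If the parent has already finished by the time the signal is sent, sigchld_handler never runs, because the parent no longer exists.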

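One point deserves special attention: a signal is essentially a software interrupt. A parent that allows itself to be interrupted by SIGCHLD runs normally until the signal arrives, just as with a hardware interrupt.

Note, however, that this is a software interrupt at a higher level of abstraction. It is not the same thing as the low-level software interrupts, and the two should not be confused.

Also, although it looks as if the child sends the signal to the parent, it is actually the Linux kernel that delivers it: the OS notices the child's change of state first and then signals the parent. You can ignore this detail for now. Almost everything goes through the OS anyway, so there is no need to dwell on it.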

1.2 Program skeleton

Here is the skeleton of a program with one parent and one child.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <signal.h>

pid_t pid = 0;

void sigchld_handler(int sig){
    // handler code goes here
}

int main(){
    signal(SIGCHLD,sigchld_handler);
    pid = fork();
    if(pid == -1){
        exit(1);
    } else if(pid == 0){ // child code
        // child code goes here
    } else { // father code
        // parent code goes here
    }
    return 0;
}

Now let's go through the parts one by one.

First, the signal handler:

void sigchld_handler(int sig){
    // handler code goes here
}

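The function name can be changed freely, and so can the parameter name, but nothing else: the single parameter must be of type int and the return type must be void. The body is up to you, although strictly speaking only async-signal-safe functions should be called from it.

If you want to use variables defined in main inside the handler, the only option is to make them global.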

Reference: Providing/passing argument to signal handler
You can't have data of your own passed to the signal handler as parameters. Instead you'll have to store your parameters in global variables. (And be really, really careful if you ever need to change those data after installing the signal handler).

But why? I have not found out yet.
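The global-variable approach can be sketched as follows. This is only an illustration under a few assumptions: the names last_signal, record_signal and run_child_and_get_signal are made up for this example, sigaction() is used in place of signal() so that the SA_RESTART flag can be set explicitly, and the handler only stores a value instead of calling printf.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Global shared between the main code and the handler.
   volatile sig_atomic_t is the type that can be written safely
   from inside a signal handler. */
static volatile sig_atomic_t last_signal = 0;

static void record_signal(int sig)
{
    last_signal = sig;              /* only store a value, no printf here */
}

/* Hypothetical helper: forks one child, waits for it, and returns the
   signal number recorded by the handler (or -1 on error). */
int run_child_and_get_signal(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = record_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;       /* restart waitpid() if it is interrupted */
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    last_signal = 0;
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(0);                   /* child: terminate immediately */

    if (waitpid(pid, NULL, 0) != pid)   /* parent: wait for the child */
        return -1;
    return last_signal;             /* value written by the handler */
}
```

Calling run_child_and_get_signal() from main and printing the result should show the value of SIGCHLD on Linux, since the handler runs before waitpid() returns to the caller.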

Next, the parent's code.

...
else { // father code
    // parent code goes here
}
...

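As mentioned earlier, if the parent has already finished when the child terminates, the handler is never called. So we need something that guarantees the parent is still running when the child ends. Keep in mind that this is purely for teaching purposes, so that you can see the parent invoke the handler; in real programs there is usually no need to wait like this.

So what can we add? In principle anything will do, as long as it makes the parent run a little slower.

For example:

else { // father code
    sleep(10);
    // parent code goes here
}

Below I only list what can replace sleep(10).

1. pause();
2. wait(NULL);
3. a for loop with 99999999 iterations
4. ...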

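All of these work, but pause() alone is usually enough: it suspends the program, and when the signal arrives the parent runs the handler, pause() returns, and the remaining code executes.

The for loop is not recommended: after the handler returns, the loop still has to run to completion. A small iteration count is useless because the loop finishes very quickly, and a busy loop also burns CPU.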

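Using one of

wait(NULL);
pause();

is the best choice: the parent is suspended until the child finishes, then the handler runs, and then the rest of the code runs. There is no extra waiting time (in contrast to the for loop).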

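sleep(n); does not add extra waiting time either: once a signal handler interrupts it, sleep() returns with the number of seconds left instead of going back to sleep. The problem is that the right number of seconds cannot be known, since nobody knows how long the child will run. So do not use it for this purpose.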
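The early return of sleep() can be observed with a small sketch like the one below. It assumes a child that exits roughly 0.2 seconds after the fork; the names on_sigchld and seconds_left_after_interrupt are made up for this example, and sigaction() is used instead of signal() to install the handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: its only job is to interrupt sleep(). */
static void on_sigchld(int sig) { (void)sig; }

/* Hypothetical helper: returns what sleep(5) reports when a child exits
   about 0.2 s into the sleep, or -1 on error. */
int seconds_left_after_interrupt(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        struct timespec delay = { 0, 200 * 1000 * 1000 };   /* 0.2 s */
        nanosleep(&delay, NULL);
        _exit(0);
    }

    unsigned int left = sleep(5);   /* returns early once the handler has run */
    waitpid(pid, NULL, 0);          /* reap the child */
    return (int)left;
}
```

A nonzero return value means the parent did not sleep for the full 5 seconds. If the child happened to exit before the parent reached sleep(), the sleep would run to completion and 0 would be returned, which is exactly why a fixed sleep is a poor way to wait.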

The safest choice is `wait` or `waitpid`:

wait(...);
waitpid(...);

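This is because pause() waits unconditionally: if the child has already finished and the handler has already run before pause() is reached, the parent waits indefinitely.

wait is different: it can tell whether the child has finished. If the child is still running, it waits until the child finishes and the handler runs; if the child finished earlier, it does not wait and execution continues immediately. (For waitpid, pass 0 as the options argument to get blocking behavior.)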
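To illustrate that waitpid() does not hang when the child is already gone, here is a minimal sketch. It assumes a child that exits at once with the arbitrary code 7 and a parent that deliberately dawdles for 0.2 seconds before waiting; the name wait_for_finished_child is made up for this example, and no custom handler is installed.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns the child's exit code (7 in this sketch),
   or -1 on error. */
int wait_for_finished_child(void)
{
    if (signal(SIGCHLD, SIG_DFL) == SIG_ERR)   /* no custom handler here */
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(7);                              /* child: finish immediately */

    /* Give the child ample time to finish before the parent starts waiting. */
    struct timespec delay = { 0, 200 * 1000 * 1000 };   /* 0.2 s */
    nanosleep(&delay, NULL);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)       /* returns at once: child is done */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With pause() in place of waitpid(), the same sequence would leave the parent waiting for a signal that has already come and gone.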

Reference: Wait System Call in C

That said, pause is sometimes quite handy; it depends on the situation, so use your own judgment.

Use the pause function to wait for a signal to arrive.
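If you do want to wait for the signal itself rather than for the child, one common way to avoid the "signal arrived before pause()" problem is to block SIGCHLD first and then wait with sigsuspend(). Note that sigprocmask() and sigsuspend() are not covered in this tutorial; the sketch below swaps them in for pause(), and the names child_done, mark_child_done and wait_for_sigchld_without_race are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t child_done = 0;

static void mark_child_done(int sig) { (void)sig; child_done = 1; }

/* Hypothetical helper: returns 1 once the handler has run, -1 on error. */
int wait_for_sigchld_without_race(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = mark_child_done;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    /* Block SIGCHLD so it cannot be delivered before we start waiting. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;

    child_done = 0;
    pid_t pid = fork();
    if (pid == -1) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(0);       /* child exits at once: the worst case for pause() */

    /* sigsuspend() unblocks SIGCHLD and sleeps in one atomic step, so a
       signal that is already pending is delivered instead of being missed. */
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);
    while (!child_done)
        sigsuspend(&wait_mask);

    waitpid(pid, NULL, 0);                  /* reap the child */
    sigprocmask(SIG_SETMASK, &old, NULL);   /* restore the original mask */
    return child_done;
}
```

The loop around sigsuspend() matters: it re-checks the flag, so an unrelated signal that wakes the parent does not end the wait too early.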

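1.3 The default signal handler

For SIGCHLD, the default disposition SIG_DFL is to ignore the signal. The default disposition is simply the handling that the system has installed in advance; you can use the default or install your own handler, as we did above.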
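A small sketch can make "ignored by default" visible: with SIG_DFL in effect, a child's exit should not interrupt a sleeping parent. The helper name default_disposition_does_not_interrupt and the 0.3 second delay are arbitrary choices for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a 0.3 s nanosleep() ran to completion
   even though a child exited in the meantime, 0 if it was interrupted,
   and -1 on error. */
int default_disposition_does_not_interrupt(void)
{
    if (signal(SIGCHLD, SIG_DFL) == SIG_ERR)   /* make sure the default is in effect */
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(0);                              /* child: terminate immediately */

    struct timespec delay = { 0, 300 * 1000 * 1000 };   /* 0.3 s */
    int completed = (nanosleep(&delay, NULL) == 0);

    /* Ignoring the signal does not reap the child; waitpid() still does. */
    if (waitpid(pid, NULL, 0) != pid)
        return -1;
    return completed;
}
```

Compare this with a custom handler: once one is installed, the same child exit would cut the sleep short.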

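References
[1] IBM documentation: signal: Install signal handler
[2] signal(7) — Linux manual page

I won't go into the details here, since there is a lot of material; see the references above.

Also, as the output shows, the handler's parameter sig has the value 17, which is the value of SIGCHLD.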

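2 A parent process and two children

The parent can be interrupted by this signal from any of its children, so how do we recognize and distinguish the termination of several children? Let's look at an example program.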

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <signal.h>
#include <sys/wait.h>

pid_t fd1 = 0;
pid_t fd2 = 0;
int stat = 0;
int wd1 = 0;
int wd2 = 0;

void sigchld_handler(int sig){
    // show that this function has been called
    printf("father call the SIGCHLD signal handler.\n");
    wd1 = waitpid(fd1,&stat,0);
    wd2 = waitpid(fd2,&stat,0);
    if(wd1 == fd1){ // child1 finish
        printf("child1 process has finished.\n");
    }
    if(wd2 == fd2){ // child2 finish
        printf("child2 process has finished.\n");
    }
}

int main(){
    signal(SIGCHLD,sigchld_handler);
    fd1 = fork();
    if(fd1 == -1){
        exit(1);
    } else if(fd1 == 0){ // child1 code
        printf("child process is running, pid = %d.\n",getpid());
    } else {
        fd2 = fork();
        switch(fd2){
        case -1:
            exit(1);
            break;
        case 0: // child2 code
            printf("child process is running, pid = %d.\n",getpid());
            break;
        default: // father code
            pause();
            printf("father process runs again.\n");
            break;
        }
    }
    return 0;
}

The example program can produce two different outputs:

child process is running, pid = 375332.        
child process is running, pid = 375333.      
father call the SIGCHLD signal handler.    
child1 process has finished.         
child2 process has finished.          
father call the SIGCHLD signal handler.                                    
father process runs again.

or

child process is running, pid = 375332.        
child process is running, pid = 375333.      
father call the SIGCHLD signal handler.    
child1 process has finished.         
child2 process has finished.                                            
father process runs again.

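In the second case, one "father call the SIGCHLD signal handler." line is missing.

The example uses a crude approach, but it turns out to be the most practical one: it is guaranteed to catch the termination of both children.

pause first suspends the parent. When either of the two children terminates, the parent's handler is triggered; the handler then blocks until both children have terminated before it carries on with its processing. It pays no attention to which child's termination triggered it in the first place.

As a result, both if statements that follow are guaranteed to execute.

Keep in mind that every child termination generates a signal for the parent, so two signals are generated in total. The first one is always handled. The second one is not guaranteed to produce a separate handler call: if it is generated before the handler has started running for the first, the two are merged into one.

That is where the two different outputs come from.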

Reference: Signals Close Together Merge into One
If multiple signals of the same type are delivered to your process before your signal handler has a chance to be invoked at all, the handler may only be invoked once, as if only a single signal had arrived. In effect, the signals merge into one. This situation can arise when the signal is blocked, or in a multiprocessing environment where the system is busy running some other processes while the signals are delivered. This means, for example, that you cannot reliably use a signal handler to count signals. The only distinction you can reliably make is whether at least one signal has arrived since a given time in the past.

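According to the GNU documentation, when several signals of the same type are sent to a process before its handler is invoked, they may be merged, so the handler may run only once. See the reference for details.

As for why the if blocks do not execute on the second call: once waitpid(..., ..., 0) has returned for a child, that child has been reaped, so a second call for the same child returns -1 and the if conditions are false.

Finally, whether the parent should wait for its children before continuing depends on your actual requirements; here it is done purely for teaching purposes.

Also, since signals are interrupts by nature and interrupts arrive at unpredictable times, it is a little odd for the parent to deliberately wait for one. Still, if you need to, it is perfectly fine to do so.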
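The behavior of a second waitpid() on the same child can be checked in isolation with a sketch like this one. The helper name second_wait_fails is made up for this example, and no custom handler is installed so that nothing else reaps the child.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the first waitpid() reaps the child
   and the second one fails with ECHILD, 0 otherwise, -1 on error. */
int second_wait_fails(void)
{
    if (signal(SIGCHLD, SIG_DFL) == SIG_ERR)   /* no custom handler here */
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(0);                              /* child: terminate immediately */

    pid_t first = waitpid(pid, NULL, 0);       /* reaps the child */
    errno = 0;
    pid_t second = waitpid(pid, NULL, 0);      /* nothing left to wait for */

    return first == pid && second == -1 && errno == ECHILD;
}
```

The errno value ECHILD is how waitpid() reports that the given process is not, or is no longer, a child that can be waited for.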

[Figures: diagrams of the program above for Case 1 and Case 2]

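Is the handler always called twice?

Not necessarily!
If the parent is still running, it is called twice, unless the two signals are merged, in which case it is called once
If the parent has already finished, it may be called once or not at all
So it is called at most twice

However, a signal is generated every time a child terminates, one per child, so two in total. The parent handles at most two of them.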

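Reference: how a father process know which child process send the signal SIGCHLD

You cannot reliably use a signal handler to count signals.
There is no guarantee that the parent sees each of several same-type signals that arrive before the handler is invoked; this follows from how signals work. So the safe assumption is that the parent received one signal and called the handler once, and the handler should do all the necessary work in that one call.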
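One widely used way to "do all the work in one call" is to loop over waitpid(-1, ..., WNOHANG) inside the handler until no finished child is left. The sketch below is only an illustration: NUM_CHILDREN, reaped, handler_calls, reap_all and reap_children_with_loop are made-up names, and sigaction(), sigprocmask() and sigsuspend() are used in place of signal() and pause() so that the parent's wait is free of races.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_CHILDREN 3      /* arbitrary count for this sketch */

static volatile sig_atomic_t reaped = 0;         /* children collected so far */
static volatile sig_atomic_t handler_calls = 0;  /* may end up below NUM_CHILDREN */

static void reap_all(int sig)
{
    (void)sig;
    int saved_errno = errno;        /* waitpid() may overwrite errno */
    handler_calls++;
    /* One signal may stand for several exits, so keep reaping until no
       finished child is left. WNOHANG keeps the loop from blocking. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Hypothetical helper: returns the number of children reaped, -1 on error. */
int reap_children_with_loop(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = reap_all;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    /* Keep SIGCHLD blocked except while the parent is in sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;

    reaped = 0;
    handler_calls = 0;
    for (int i = 0; i < NUM_CHILDREN; i++) {
        pid_t pid = fork();
        if (pid == -1) {
            sigprocmask(SIG_SETMASK, &old, NULL);
            return -1;
        }
        if (pid == 0)
            _exit(0);               /* child: terminate immediately */
    }

    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);
    while (reaped < NUM_CHILDREN)
        sigsuspend(&wait_mask);     /* handler runs here */

    sigprocmask(SIG_SETMASK, &old, NULL);
    return reaped;
}
```

The value returned by waitpid(-1, ...) is the pid of the child that was reaped, so the same loop is also the place to tell the children apart. Printing handler_calls afterwards may show a number smaller than NUM_CHILDREN, which is the merging effect described above.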

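What exactly are the scope and lifetime of a signal() call? What happens if it is called several times? (In my tests, calling it once at the start seemed to be enough; calling it again inside the handler gave the same result, so repeated calls appear to be allowed.) I have not looked up the relevant documentation yet.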
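As a partial answer, whether a handler installed with signal() stays installed after it has run once has historically differed between systems, which is why the signal(2) manual page recommends sigaction() instead. With sigaction() and no SA_RESETHAND flag, the handler remains installed until it is replaced. The sketch below checks this for two children run one after the other; the names calls, count_call, run_one_child and count_handler_calls_for_two_children are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t calls = 0;

static void count_call(int sig) { (void)sig; calls++; }

/* Forks a child that exits at once and waits for it; 0 on success. */
static int run_one_child(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(0);
    return waitpid(pid, NULL, 0) == pid ? 0 : -1;
}

/* Hypothetical helper: installs the handler once, runs two children in
   sequence, and returns how many times the handler was called. */
int count_handler_calls_for_two_children(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_call;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;       /* no SA_RESETHAND: handler stays installed */
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    calls = 0;
    if (run_one_child() == -1 || run_one_child() == -1)
        return -1;
    return calls;
}
```

Because the second child is started only after the first has been waited for, the two signals cannot merge here, so on Linux the count should come out as 2 with a single installation.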