Linux Multithreading Control: In-Depth Understanding and Application (A 10,000-Word Deep Dive!)

🎬 Author's homepage (慕斯): 修仙—别有洞天


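How do you create a thread?

pthread_self()

How do you terminate a thread?

Terminating a thread with return nullptr

Terminating a thread with pthread_exit()

Cancelling a thread with pthread_cancel() (read the section on waiting first, then come back here)

Waiting for threads

pthread_join()

pthread_detach()

A small extension

Thread IDs in detail

Background on the pthread library

clone()

System call considerations

How the pthread library manages threads

Why does the LWP mentioned earlier differ from the value returned by pthread_self()?

What is thread-local storage?

__thread

Thread-local storage examples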

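How do you create a thread?

On Linux, threads are created with the POSIX thread library (pthread). The function that creates a thread is pthread_create(). It is declared in the <pthread.h> header as follows:

#include <pthread.h>

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg);

The parameters of pthread_create() mean the following:

pthread_t *thread: a pointer to a pthread_t in which the ID of the newly created thread is stored. In practice you declare a pthread_t variable and pass its address.
const pthread_attr_t *attr: a pointer to a pthread_attr_t that sets the thread's attributes. If it is NULL, the thread is created with the default attributes.
void *(*start_routine)(void *): a function pointer to the function the new thread will execute, usually called the thread function. It must take one void * parameter and return a void *.
void *arg: the argument passed to the thread function. It can be a pointer to any type.

pthread_create() returns 0 on success and an error number on failure (the error is the return value itself, not errno). If the thread is created successfully, it starts executing at the function given by start_routine.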
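To make the parameter list and the return-value convention concrete, here is a minimal sketch. DoubleIt and DoubleOnThread are hypothetical names invented for this illustration; only pthread_create(), pthread_join() and std::strerror() are real library calls.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstring>
#include <iostream>

// Hypothetical thread function: doubles the int that arg points to.
void *DoubleIt(void *arg)
{
    int *value = static_cast<int *>(arg);
    *value *= 2;
    return nullptr;
}

// Hypothetical helper: runs DoubleIt on a new thread and waits for it.
// Returns the doubled value, or -1 if the thread could not be created.
int DoubleOnThread(int input)
{
    int value = input;
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, DoubleIt, &value);
    if (rc != 0)
    {
        // The error number is the return value itself.
        std::cerr << "pthread_create failed: " << std::strerror(rc) << std::endl;
        return -1;
    }
    // value lives on this stack frame, so wait before the frame goes away.
    rc = pthread_join(tid, nullptr);
    assert(rc == 0);
    return value;
}
```

Calling DoubleOnThread(21) should hand back 42. The point of the design is that arg is just a pointer: whatever it points to must stay alive for as long as the new thread uses it.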

Example:

#include <iostream>
#include <string>
#include <functional>
#include <vector>
#include <cstdio>
#include <cstdint>
#include <time.h>
#include <unistd.h>
#include <pthread.h>

// typedef std::function<void()> func_t;
using func_t = std::function<void()>;

const int threadnum = 5;

class ThreadData
{
public:
    ThreadData(const std::string &name, const uint64_t &ctime, func_t f)
        : threadname(name), createtime(ctime), func(f)
    {
    }

public:
    std::string threadname;
    uint64_t createtime;
    func_t func;
};

void Print()
{
    std::cout << "I am part of the big task the thread is running" << std::endl;
}

// New thread
void *ThreadRountine(void *args)
{
    int a = 10;
    ThreadData *td = static_cast<ThreadData *>(args);
    while (true)
    {
        std::cout << "new thread"
                  << " thread name: " << td->threadname << " create time: " << td->createtime << std::endl;
        td->func();
        if (td->threadname == "thread-4")
        {
            std::cout << td->threadname << " triggered a fault!!!!!" << std::endl;
            // a /= 0; // deliberately cause a fault
        }
        sleep(1);
    }
}
// How do we pass arguments to a thread, and how do we create several threads? -- done
// Two things to study: 1. thread robustness 2. what a thread id looks like

// Getting the return value
// Main thread
int main()
{
    std::vector<pthread_t> pthreads;
    for (size_t i = 0; i < threadnum; i++)
    {
        char threadname[64];
        snprintf(threadname, sizeof(threadname), "%s-%zu", "thread", i); // %zu matches size_t

        pthread_t tid;
        ThreadData *td = new ThreadData(threadname, (uint64_t)time(nullptr), Print);
        pthread_create(&tid, nullptr, ThreadRountine, td);
        pthreads.push_back(tid);
        sleep(1);
    }
    std::cout << "thread id: ";
    for (const auto &tid : pthreads)
    {
        std::cout << tid << ",";
    }
    std::cout << std::endl;

    while (true)
    {
        std::cout << "main thread" << std::endl;
        sleep(3);
    }
}

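Walkthrough: let's read the code above in order. We define a ThreadData class that stores a thread's name, its creation time, and a callable. Print() is the function we hand to each ThreadData object. ThreadRountine is the function passed as the void *(*start_routine)(void *) argument of pthread_create(), that is, the function the thread executes. Note the commented-out line that deliberately divides by zero: it is there to verify that multithreading on Linux reduces robustness, because a fault in any one thread brings down the whole process, so all the other threads exit with it. In main we first new a ThreadData object and then pass it to the new thread through the arg parameter of pthread_create(). From then on, the new threads and the main thread each run their own code.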
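The robustness claim can be checked without crashing your own program by running the experiment in a forked child and inspecting how the child died. This is a sketch under stated assumptions: FaultyThread and SignalThatKilledProcess are hypothetical names, and raise(SIGFPE) stands in for the divide-by-zero in the listing above, since integer division by zero does not trap on every architecture.

```cpp
#include <pthread.h>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// Hypothetical thread function that faults.
// raise(SIGFPE) stands in for the "a /= 0" line above.
void *FaultyThread(void *)
{
    raise(SIGFPE);
    return nullptr;
}

// Hypothetical helper: runs the experiment in a child process.
// Returns the signal that killed the child, or 0 if it exited normally.
int SignalThatKilledProcess()
{
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0)
    {
        // Child process: a main thread plus one faulty thread.
        pthread_t tid;
        pthread_create(&tid, nullptr, FaultyThread, nullptr);
        sleep(5);  // this thread did nothing wrong, yet it never wakes up
        _exit(0);  // only reached if the process survived the fault
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

If the whole process really goes down with the faulty thread, SignalThatKilledProcess() reports SIGFPE even though the child's main thread was only sleeping.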

`pthread_self()`

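pthread_self() returns the ID of the calling thread. Its use looks like this:

#include <pthread.h>

// Get the current thread's ID
pthread_t current_thread_id = pthread_self();

In this snippet, pthread_self() returns the ID of the calling thread, which is stored in current_thread_id.

pthread_self() is typically used in multithreaded programs that need the current thread's ID for some specific purpose, for example to make different threads behave differently, or to pass the ID to another function or store it in a data structure.

Note: pthread_self() only returns the calling thread's own ID. The ID of another thread has to come from somewhere else, usually the pthread_t that pthread_create() filled in, passed along as an argument or stored in a data structure. To check whether two IDs refer to the same thread, compare them with pthread_equal().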
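A small sketch of pthread_self() and pthread_equal() working together. The IdCheck struct and both function names are hypothetical, introduced only for this illustration: the creating thread hands its own ID to the new thread, which compares it with its own.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical bundle passed to the new thread.
struct IdCheck
{
    pthread_t creator;     // ID of the thread that called pthread_create()
    bool same_as_creator;  // filled in by the new thread
};

// Hypothetical thread function: compares its own ID with the creator's.
void *CompareWithCreator(void *arg)
{
    IdCheck *check = static_cast<IdCheck *>(arg);
    // pthread_t is an opaque type, so compare with pthread_equal(), not ==.
    check->same_as_creator = pthread_equal(check->creator, pthread_self()) != 0;
    return nullptr;
}

// Hypothetical helper: returns what the new thread concluded.
bool NewThreadHasSameIdAsCreator()
{
    IdCheck check{pthread_self(), true};
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, CompareWithCreator, &check);
    assert(rc == 0);
    rc = pthread_join(tid, nullptr);
    assert(rc == 0);
    return check.same_as_creator;
}
```

NewThreadHasSameIdAsCreator() is expected to return false, since two live threads in one process never share an ID, while comparing pthread_self() with itself is always a match.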

Now that we know this function, let's print its value and compare it with the LWP column shown by ps -aL:

#include <iostream>
#include <string>
#include <pthread.h>
#include <unistd.h>

void *ThreadRountine(void *args)
{
    usleep(1000);
    int a = 10;
    std::string name = static_cast<const char *>(args);
    while (true)
    {
        std::cout << "i am a new thread, my name:" << name << " my id:" << pthread_self() << std::endl;
        sleep(1);
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, ThreadRountine, (void *)("thread -1"));
    while (true)
    {
        std::cout << "i am main thread,"
                  << " my id:" << pthread_self() << std::endl;
        sleep(1);
    }
    return 0;
}

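The value printed by the program (left half of the screenshot) is a very large number that looks nothing like the LWP reported by ps -aL (right half):

What if we convert that large number to hexadecimal? It looks a lot like an address. Is there a connection? Yes: in Linux's native thread library, a thread id is in essence an address!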

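How do you terminate a thread?

Terminating a thread with return nullptr

As shown below, the thread function simply returns nullptr. One point needs special attention: can we call exit() to end a thread? No! exit() terminates the process: if any thread calls it, the whole process exits.

// ToHex() is the helper defined in Example 2 of the pthread_join() section
void *ThreadRountine(void *args)
{
    usleep(1000);
    int a = 10;
    std::string name = static_cast<const char *>(args);
    while (a--)
    {
        std::cout << "i am a new thread, my name:" << name << " my id:" << ToHex(pthread_self()) << std::endl;
        sleep(1);
    }
    return nullptr;
}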

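Terminating a thread with pthread_exit()

pthread_exit() is a POSIX thread library function that terminates the calling thread.

Its use looks like this:

#include <pthread.h>

// Terminate the calling thread
pthread_exit(nullptr);

When pthread_exit() is called, the calling thread terminates immediately. The call never returns to its caller, and the other threads of the process keep running. The argument is a pointer to the thread's return value, which another thread can obtain through pthread_join(). If there is no return value to pass, pass nullptr.

Note: pthread_exit() does not release resources that belong to the whole process, such as heap memory the thread allocated or file descriptors it opened. The program has to release those itself. The thread's own stack and bookkeeping are reclaimed only once the thread has been joined (or right away if it is detached), so the pointer passed to pthread_exit() must not point into the exiting thread's stack.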
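The following sketch shows pthread_exit() ending a thread from inside a nested call and handing a value to the joiner. FinishWith, Worker and RunWorkerAndGetCode are hypothetical names for this illustration. The result is heap-allocated on purpose so that it outlives the exiting thread's stack.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical helper called from inside the thread function.
// pthread_exit() ends the thread right here, not just this function.
void FinishWith(int code)
{
    // Heap-allocated so the value survives after the thread's stack is gone.
    pthread_exit(new int(code));
}

// Hypothetical thread function.
void *Worker(void *)
{
    FinishWith(7);
    return nullptr; // never reached
}

// Hypothetical helper: runs Worker and returns the code it exited with.
int RunWorkerAndGetCode()
{
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, Worker, nullptr);
    assert(rc == 0);

    void *ret = nullptr;
    rc = pthread_join(tid, &ret); // ret receives the pointer given to pthread_exit()
    assert(rc == 0);

    int *code = static_cast<int *>(ret);
    int value = *code;
    delete code; // the joiner owns the result and must free it
    return value;
}
```

RunWorkerAndGetCode() should return 7. Ownership of the allocation moves from the exiting thread to whoever joins it, which is why the delete sits in the joining function.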

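Cancelling a thread with pthread_cancel() (read the section on waiting first, then come back here)

pthread_cancel() cancels the execution of a thread. Its prototype is:

#include <pthread.h>
int pthread_cancel(pthread_t thread);

Parameter:

thread: the ID of the thread to cancel.

Return value:

0 on success, an error number on failure.

pthread_cancel() sends a cancellation request to a thread. With the default settings the target does not stop instantly: it terminates when it next reaches a cancellation point such as sleep(). Cancelling a thread does not reclaim its resources (its stack, its thread descriptor and so on) either, so a joinable thread still has to be joined with pthread_join() after it has been cancelled.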

Here is an example that uses pthread_cancel():

#include <iostream>
#include <pthread.h>
#include <unistd.h>

void* thread_function(void* arg) {
    int i;
    for (i = 0; i < 10; i++) {
        std::cout << "Thread is running..." << std::endl;
        sleep(1);
    }
    return NULL;
}

int main() {
    pthread_t thread;
    int result = pthread_create(&thread, NULL, thread_function, NULL);
    if (result != 0) {
        std::cerr << "Error creating thread!" << std::endl;
        return 1;
    }

    sleep(3); // let the thread run for a while

    result = pthread_cancel(thread); // cancel the thread
    if (result != 0) {
        std::cerr << "Error cancelling thread!" << std::endl;
        return 1;
    }

    // Wait for the thread to finish and reclaim its resources
    void* retval;
    result = pthread_join(thread, &retval);
    if (result != 0) {
        std::cerr << "Error joining thread!" << std::endl;
        return 1;
    }

    std::cout << "Thread has been cancelled and joined successfully." << std::endl;
    return 0;
}

In this example we create a new thread, wait 3 seconds in the main thread, and then call pthread_cancel() to cancel it. After that we call pthread_join() to wait for the thread to finish and reclaim its resources.

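Note: a detached thread can still be cancelled, but it cannot be joined. Depending on how the thread terminated, the termination status obtained through pthread_join differs. In summary:

If the thread terminated by returning from its thread function, the location pointed to by value_ptr holds the thread function's return value.
If the thread was terminated abnormally by another thread calling pthread_cancel, the location pointed to by value_ptr holds the constant PTHREAD_CANCELED.
If the thread terminated itself by calling pthread_exit, the location pointed to by value_ptr holds the argument that was passed to pthread_exit.
If you are not interested in the thread's termination status, pass NULL as the value_ptr argument.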
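The three cases in the summary can be observed side by side. This is a sketch with hypothetical names (ByReturn, ByExit, UntilCancelled, JoinResult); it relies on sleep() being a cancellation point, which is what lets the cancelled thread actually stop.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <cassert>

int g_returned = 1; // its address is handed back by return
int g_exited = 2;   // its address is handed back by pthread_exit()

// Case 1: terminate by returning from the thread function.
void *ByReturn(void *) { return &g_returned; }

// Case 3: terminate by calling pthread_exit().
void *ByExit(void *) { pthread_exit(&g_exited); }

// Case 2: run until some other thread cancels us.
void *UntilCancelled(void *)
{
    while (true)
        sleep(1); // sleep() is a cancellation point
    return nullptr;
}

// Hypothetical helper: starts fn, optionally cancels it, and returns
// whatever pthread_join() stored through its second argument.
void *JoinResult(void *(*fn)(void *), bool cancel)
{
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, fn, nullptr);
    assert(rc == 0);
    if (cancel)
        pthread_cancel(tid);
    void *ret = nullptr;
    rc = pthread_join(tid, &ret);
    assert(rc == 0);
    return ret;
}
```

JoinResult(ByReturn, false) should yield &g_returned, JoinResult(ByExit, false) should yield &g_exited, and JoinResult(UntilCancelled, true) should compare equal to PTHREAD_CANCELED.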

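Waiting for threads

We know that when a process exits, its PCB is not released right away: the process stays a zombie until something waits for it. Do threads need to be waited for as well?

Yes, they do! If a thread exits and is never waited for, the result is a problem similar to a zombie process: its resources are never reclaimed. We wait for a thread with pthread_join()!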

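pthread_join()

pthread_join() waits for a thread to finish and reclaims its resources. Its prototype is:

#include <pthread.h>
int pthread_join(pthread_t thread, void **retval);

Parameters:

thread: the ID of the thread to wait for.
retval: a pointer to a pointer that receives the return value of the awaited thread. Pass nullptr if you do not care about it. (Why void **? The thread function passed to pthread_create returns a void *, so a void ** is needed to hand that value back to the caller. retval is an output parameter.)

Return value:

0 on success, an error number on failure.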

Example 1:

#include <iostream>
#include <pthread.h>
#include <unistd.h>

void* print_hello(void* arg) {
    std::cout << "Hello from thread!" << std::endl;
    sleep(2);
    return NULL;
}

int main() {
    pthread_t thread;
    int ret;

    // Create the thread (it takes no argument, so pass nullptr)
    ret = pthread_create(&thread, NULL, print_hello, nullptr);
    if (ret != 0) {
        std::cerr << "Error creating thread!" << std::endl;
        return 1;
    }

    // Wait for the thread to finish
    ret = pthread_join(thread, nullptr);
    if (ret != 0) {
        std::cerr << "Error joining thread!" << std::endl;
        return 1;
    }

    std::cout << "Thread joined successfully!" << std::endl;
    return 0;
}

In this example we create a new thread that runs print_hello, then call pthread_join() to wait for it. Once the thread has finished, pthread_join() returns 0 to indicate success.

Example 2:

#include <iostream>
#include <string>
#include <cstdio>
#include <pthread.h>
#include <unistd.h>

std::string ToHex(pthread_t tid)
{
    char id[64];
    snprintf(id, sizeof(id), "0x%lx", (unsigned long)tid);
    return id;
}

class ThreadReturn
{
public:
    ThreadReturn(pthread_t id, const std::string &info, int code)
        : id_(id), info_(info), code_(code)
    {
    }

public:
    pthread_t id_;
    std::string info_;
    int code_;
};

void *threadRoutine(void *arg)
{
    int cnt = 5;
    while (cnt--)
    {
        std::cout << (const char *)arg << " is running..." << std::endl;
        sleep(1);
    }
    ThreadReturn *ret = new ThreadReturn(pthread_self(), "thread quit normal", 10);
    return ret;
}

int main()
{
    // new
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)("thread -1"));

    void *ret = nullptr;
    pthread_join(tid, &ret);
    ThreadReturn *r = static_cast<ThreadReturn *>(ret);
    std::cout << "main thread get new thread info:" << r->code_ << ", " << ToHex(r->id_) << ", " << r->info_ << std::endl;
    delete r;

    // main
    while (true)
    {
        std::cout << "i am main thread,"
                  << " my id:" << ToHex(pthread_self()) << std::endl;
        sleep(1);
    }
    return 0;
}

This example returns a ThreadReturn * value. The output looks like this:

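pthread_detach()

Threads are joinable by default: an exited thread has to be waited for, and the main thread blocks while it waits for the new thread. A thread can instead be put into the detached state. Nobody has to block waiting for a detached thread, and its resources are reclaimed automatically as soon as it exits.

pthread_detach() puts a thread into the detached state so that its resources are reclaimed automatically when it terminates. Its prototype is:

#include <pthread.h>
int pthread_detach(pthread_t thread);

Parameter:

thread: the ID of the thread to detach.

Return value:

0 on success, an error number on failure.

Note: a thread can detach itself, or another thread such as the main thread can detach it.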
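As an illustration of a thread detaching itself, here is a minimal sketch. The names g_finished, SelfDetaching and RunSelfDetachingThread are hypothetical. Because a detached thread cannot be joined, the sketch uses an atomic flag (an assumption of this example, not something the text prescribes) to learn that the thread has run.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <atomic>
#include <cassert>

std::atomic<bool> g_finished{false}; // set by the thread when its work is done

// Hypothetical thread function that detaches itself.
void *SelfDetaching(void *)
{
    int rc = pthread_detach(pthread_self());
    assert(rc == 0);
    g_finished.store(true);
    return nullptr; // resources are reclaimed automatically on exit
}

// Hypothetical helper: starts the thread and returns once it has done its work.
bool RunSelfDetachingThread()
{
    g_finished.store(false);
    pthread_t tid;
    if (pthread_create(&tid, nullptr, SelfDetaching, nullptr) != 0)
        return false;
    // No pthread_join() here: the thread is detached, so poll the flag instead.
    while (!g_finished.load())
        usleep(1000);
    return true;
}
```

The polling loop is only there to keep the sketch short. Real code would more likely use a condition variable or another notification mechanism.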

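Example:

#include <iostream>
#include <pthread.h>

void* thread_function(void* arg) {
    // code executed by the thread
    return nullptr;
}

int main() {
    pthread_t thread;
    int result = pthread_create(&thread, nullptr, thread_function, nullptr);
    if (result != 0) {
        std::cerr << "Error creating thread!" << std::endl;
        return 1;
    }

    // Detach the thread
    result = pthread_detach(thread);
    if (result != 0) {
        std::cerr << "Error detaching thread!" << std::endl;
        return 1;
    }

    // Other work of the main thread.
    // Returning from main() ends the whole process, detached threads included.
    return 0;
}

In this example we create a new thread and immediately detach it. When the thread finishes, its resources are reclaimed automatically and the main thread does not have to wait for it explicitly. Keep in mind that returning from main() ends the whole process, so in real code the main thread must stay alive long enough for the detached thread to do its work.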
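An alternative that this article does not cover, sketched here as an assumption-labelled aside: instead of detaching after creation, a thread can be created already detached by passing an attribute object as the attr argument of pthread_create(). pthread_attr_init(), pthread_attr_setdetachstate() and pthread_attr_destroy() are standard POSIX calls; g_task_done, Task and RunBornDetached are hypothetical names.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <atomic>
#include <cassert>

std::atomic<bool> g_task_done{false}; // set by the thread when it has run

// Hypothetical thread function.
void *Task(void *)
{
    g_task_done.store(true);
    return nullptr;
}

// Hypothetical helper: starts Task on a thread that is detached from birth
// and returns once Task has run.
bool RunBornDetached()
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    assert(rc == 0);
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    assert(rc == 0);

    g_task_done.store(false);
    pthread_t tid;
    rc = pthread_create(&tid, &attr, Task, nullptr);
    pthread_attr_destroy(&attr); // the attribute object is no longer needed
    if (rc != 0)
        return false;

    // The thread cannot be joined, so poll the flag instead.
    while (!g_task_done.load())
        usleep(1000);
    return true;
}
```

Creating the thread detached closes the small window between pthread_create() and pthread_detach() in which the thread is still joinable.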

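A small extension

We know that a process has an exit code when it exits. Does a thread have one too? A thread does have a return value, which pthread_join() hands back, but there is nothing that reports abnormal termination. None is needed: if a thread terminates abnormally, the whole process terminates with it, so nobody would be left to read such a code!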

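Thread IDs in detail

Background on the pthread library

The thread-control interfaces described so far are not provided directly by the operating system. They come from the native thread library, pthread, which ships with the system. The diagram below illustrates this:

The pthread library is what manages threads. Following the principle of "describe first, then organize", the library defines a struct to describe each thread and uses a data structure to organize those objects. Each object holds information about the kernel's "lightweight process" as well as the user-level thread information.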

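clone()

clone() is a Linux system call that creates a new thread of execution or a new process. It is a generalization of fork() and offers much more flexibility. In practice, both process creation and thread creation are built on it. The details:

1. Function prototype:

#define _GNU_SOURCE
#include <sched.h>
int clone(int (*fn)(void *), void *child_stack, int flags, void *arg, ... /* pid_t *ptid, struct user_desc *tls, pid_t *ctid */);

2. Parameters:

- fn: the function the child process (or thread) runs.
- child_stack: a pointer to the stack allocated for the child.
- flags: a set of flag bits that controls which resources the new task shares with its creator, along with other behavior.
- arg: the argument passed to fn.
- ...: optional extra arguments: pid_t *ptid, struct user_desc *tls, pid_t *ctid.

3. The flags parameter (two examples):

- CLONE_PARENT: the parent of the new child is the caller's parent, so the new process and the process that created it become "siblings".
- CLONE_FS: the child shares file system information with the parent, including the root directory, the current directory and the umask.

4. Difference from fork(): calling fork() is equivalent to calling clone() with flags set to just SIGCHLD, the signal delivered to the parent when the child terminates. (Sharing the signal handler table is a separate flag, CLONE_SIGHAND.) fork() is the standard Unix call for duplicating a process, while Linux implements three calls: fork(), vfork() and clone(). vfork() creates a child that runs in the parent's address space and suspends the parent until the child calls exec or _exit. A lightweight process, that is, a thread, is a task that shares resources with its creator, and it is created with clone().

5. Use cases: clone() is typically used to implement multithreading, because it gives fine-grained control over which resources are shared and which are private. That matters a great deal in multithreaded programming, since it allows highly customized thread behavior.

6. glibc wrapper: since glibc 2.3.3, the fork() wrapper that glibc provides as part of the NPTL (Native POSIX Threads Library) implementation calls clone() instead of the kernel's fork() system call. pthread_create, the function that creates threads, also uses clone internally.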
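To see resource sharing controlled by flags, the sketch below calls the glibc clone() wrapper with CLONE_VM so that the child runs in the caller's address space, plus SIGCHLD so that the parent can reap it with waitpid(). It is a minimal illustration, not how pthread_create() itself is written (which passes many more flags). g_shared, ChildFn and WriteThroughClone are hypothetical names, and the 1 MiB stack size is an arbitrary choice.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // clone() is a GNU/Linux extension
#endif
#include <sched.h>
#include <signal.h>
#include <sys/wait.h>
#include <cassert>
#include <cstdlib>

int g_shared = 0; // lives in the address space that both tasks share

// Runs in the child created by clone().
int ChildFn(void *arg)
{
    g_shared = *static_cast<int *>(arg);
    return 0; // becomes the child's exit status
}

// Hypothetical helper: lets a clone()d child write value into g_shared,
// then returns what the parent sees. Returns -1 if clone() fails.
int WriteThroughClone(int value)
{
    const size_t kStackSize = 1024 * 1024; // arbitrary size for this sketch
    char *stack = static_cast<char *>(std::malloc(kStackSize));
    assert(stack != nullptr);

    g_shared = 0;
    // The stack grows downwards on common architectures, so pass its top.
    // CLONE_VM: share the address space. SIGCHLD: signal sent to the parent
    // when the child terminates, which makes a plain waitpid() work.
    pid_t pid = clone(ChildFn, stack + kStackSize, CLONE_VM | SIGCHLD, &value);
    if (pid == -1)
    {
        std::free(stack);
        return -1;
    }
    int status = 0;
    waitpid(pid, &status, 0);
    std::free(stack);
    return g_shared;
}
```

With CLONE_VM the parent should observe the child's write, so WriteThroughClone(42) is expected to return 42. Dropping CLONE_VM would give the child its own copy of memory, as with fork(), and the parent would still see 0.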

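System call considerations

As discussed earlier, threads share many resources, but each thread also needs some properties of its own. The two most important are: 1. its context; 2. its stack.

The stack of each new thread is maintained by the pthread library, while the default stack in the address space is used by the main thread. When we studied static and dynamic libraries on Linux, we saw that a dynamic library is loaded into the shared region between the stack and the heap, and the pthread library is no exception. The stacks that the library hands out are likewise allocated in the virtual address space of the process the threads belong to.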

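How the pthread library manages threads

The picture below should be familiar by now. The pthread library stored on disk and the executable that uses it are both loaded into physical memory and then mapped into the address space through the page table. A dynamic library, also called a shared library, only needs to be loaded into physical memory once. After that every process that uses it shares the same copy. The same library code therefore manages the threads of every process in the system that uses it, while each process's thread bookkeeping lives in that process's own address space.

Here is a more detailed picture of pthread. The mmap region is the shared region. Inside the dynamic library, struct pthread can be understood as the "describe first" part, or as the "TCB". The thread stack entry can be understood as a pointer to the address of the corresponding stack. The struct pthread, the thread-local storage and the thread stack together form one block, and every thread has such a block. Managing these blocks one after another is the "then organize" part. All of these attributes are maintained by the library.

The struct pthread above stores the thread's exit information, and pthread_join() reads exactly that information. How does it find the right struct? Through the pthread_t tid: a pthread_t is the address of the thread's attribute block inside the library! The other pthread interfaces use the same address to maintain this data.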

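Why does the LWP mentioned earlier differ from the value returned by pthread_self()?

pthread_create produces a thread ID and stores it at the address its first parameter points to. This thread ID is not the same thing as the thread ID (LWP) discussed earlier.

The LWP belongs to the domain of process scheduling. Because a thread is a lightweight process and is the smallest unit the operating system's scheduler works with, the kernel needs a number that uniquely identifies it.

The first parameter of pthread_create points to a location in virtual memory that receives the ID of the newly created thread. That ID is itself a memory address and belongs to the domain of the NPTL thread library. All later operations of the thread library use this ID to operate on the thread.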
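Both kinds of ID can be collected from inside one thread to see that they are unrelated numbers. In this sketch KernelTid, Ids, CollectIds and IdsOfNewThread are hypothetical names. The kernel ID is obtained with syscall(SYS_gettid), the same system call used in the thread-local storage example later in this article.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>

// Hypothetical helper: the kernel's ID for the calling thread,
// i.e. the LWP column of ps -aL.
pid_t KernelTid()
{
    return static_cast<pid_t>(syscall(SYS_gettid));
}

// Hypothetical pair of IDs describing one thread.
struct Ids
{
    pid_t lwp;        // scheduler-level ID, assigned by the kernel
    pthread_t handle; // library-level ID, as returned by pthread_self()
};

// Hypothetical thread function: records both of its own IDs.
void *CollectIds(void *arg)
{
    Ids *ids = static_cast<Ids *>(arg);
    ids->lwp = KernelTid();
    ids->handle = pthread_self();
    return nullptr;
}

// Hypothetical helper: returns the two IDs of a freshly created thread.
Ids IdsOfNewThread()
{
    Ids ids{};
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, CollectIds, &ids);
    assert(rc == 0);
    rc = pthread_join(tid, nullptr);
    assert(rc == 0);
    return ids;
}
```

The lwp of a new thread is a small integer that differs from getpid(), because only the main thread's LWP equals the process ID, while handle is the large address-like value seen earlier.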

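What is thread-local storage?

__thread

__thread is the keyword GCC provides for thread-local storage (TLS).

__thread declares a thread-local variable, which means every thread gets its own independent instance of the variable. The __thread variables of different threads do not interfere with each other: each keeps its own value. This is very useful in multithreaded code that needs to keep separate state per thread.

The use cases and restrictions of __thread include:

Use case: variables that are global in nature and whose value changes, but that each thread should own privately, so they do not need the protection a shared global would need.
Efficiency: access to a __thread variable is about as fast as access to a global variable, which makes it attractive for performance.
Type restriction: it can only be applied to POD (Plain Old Data) types, that is, simple types without user-defined constructors, copy operations, assignment or destructors. __thread cannot run constructors or destructors automatically, so it cannot be applied to variables of such class types.
Scope: it can be applied to global variables and to static variables inside functions, but not to ordinary local variables or to non-static class members.
Initialization: a __thread variable can only be initialized with a compile-time constant.

In short, __thread offers a convenient and efficient way to give every thread its own copy of a variable, which avoids the race conditions and synchronization problems that come with shared data.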
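A terminating sketch of the per-thread-copy behavior described above. t_counter, Report, CountPrivately and RunCounter are hypothetical names for this illustration: each new thread increments its own t_counter and reports the final value and the address it saw.

```cpp
#include <pthread.h>
#include <cassert>

__thread int t_counter = 0; // every thread gets its own copy, starting at 0

// Hypothetical report filled in by the worker thread.
struct Report
{
    int increments;     // input: how many times to increment
    int final_value;    // output: the thread's own t_counter afterwards
    const int *address; // output: where the thread's copy lives
};

// Hypothetical thread function: touches only its own copy of t_counter.
void *CountPrivately(void *arg)
{
    Report *report = static_cast<Report *>(arg);
    for (int i = 0; i < report->increments; ++i)
        ++t_counter;
    report->final_value = t_counter;
    report->address = &t_counter;
    return nullptr;
}

// Hypothetical helper: runs CountPrivately on a new thread and returns its report.
Report RunCounter(int increments)
{
    Report report{increments, 0, nullptr};
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, CountPrivately, &report);
    assert(rc == 0);
    rc = pthread_join(tid, nullptr);
    assert(rc == 0);
    return report;
}
```

Running RunCounter(3) and then RunCounter(5) should report 3 and 5 rather than 3 and 8, because the second thread starts from a fresh copy, and the calling thread's own t_counter stays at 0 throughout.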

Thread-local storage examples

Below is an ordinary global variable used from multiple threads. It also confirms that threads share the process's resources!

#include <iostream>
#include <string>
#include <cstdlib>
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <pthread.h>

using namespace std;

int g_val = 100; // a global variable, shared by all threads by nature

void *threadRoutine(void *args)
{
    std::string name = static_cast<const char *>(args);
    sleep(1);
    while (true)
    {
        sleep(1);
        std::cout << name << ", g_val: " << g_val << " ,&g_val: " << &g_val << "\n"
                  << std::endl;
        g_val++;
    }
    return nullptr;
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread1");
    while (true)
    {
        sleep(1);
        std::cout << "main thread, g_val: " << g_val << " ,&g_val: " << &g_val << "\n"
                  << std::endl;
    }
    pthread_join(tid, nullptr);
    return 0;
}

Below is the same example using __thread. The formerly shared global becomes a thread-local variable: each thread sees its own value at its own address! The example also shows how to obtain the LWP by invoking the SYS_gettid system call to get the current thread's TID.

#include <iostream>
#include <string>
#include <cstdlib>
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <pthread.h>

using namespace std;

// int g_val = 100; // a global variable, shared by all threads by nature
__thread int g_val = 100; // thread-local storage! What is it good for? What are the pitfalls?
__thread pid_t lwp = 0;
// __thread std::string threadname;

// Named get_lwp() because glibc 2.30 and later already declare gettid() in <unistd.h>
pid_t get_lwp()
{
    return syscall(SYS_gettid);
}

void *threadRoutine(void *args)
{
    std::string name = static_cast<const char *>(args);
    lwp = get_lwp(); // invoke the SYS_gettid system call to get the current thread's TID
    while (true)
    {
        sleep(1);
        std::cout << name << ", g_val: " << g_val << " ,&g_val: " << &g_val << "\n"
                  << std::endl;
        std::cout << "new thread: " << lwp << std::endl;
        g_val++;
    }
    return nullptr;
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void *)"thread1");
    lwp = get_lwp(); // invoke the SYS_gettid system call to get the current thread's TID
    std::cout << "main thread: " << lwp << std::endl;
    while (true)
    {
        sleep(1);
        std::cout << "main thread, g_val: " << g_val << " ,&g_val: " << &g_val << "\n"
                  << std::endl;
    }
    pthread_join(tid, nullptr);
}