Environment: Linux version 5.4.0-1084-aws (buildd@lcy02-amd64-044) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #91~18.04.1-Ubuntu SMP Sun Aug 14 01:24:43 UTC 2022
JDK: 1.8.0_241
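How does CPU allocation work?
The library reads /proc/cpuinfo, if your system has one or you supply one, and uses it to determine the CPU layout. If none is available, it assumes that every CPU is on a single socket.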
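To make the layout idea concrete, here is a minimal sketch that maps each logical CPU to its socket from cpuinfo-style text. It is not the library's parser: the class and method names are made up for illustration, and it assumes the x86 field names "processor" and "physical id".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Java-Thread-Affinity.
class CpuInfoLayout {
    /** Maps each logical CPU ("processor") to its socket ("physical id"). */
    static Map<Integer, Integer> socketByCpu(List<String> lines) {
        Map<Integer, Integer> result = new TreeMap<>();
        int cpu = -1;
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank separator lines between CPU entries
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (key.equals("processor")) {
                cpu = Integer.parseInt(value);
                result.put(cpu, 0); // assume socket 0 unless a "physical id" follows
            } else if (key.equals("physical id") && cpu >= 0) {
                result.put(cpu, Integer.parseInt(value));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(socketByCpu(Files.readAllLines(Paths.get("/proc/cpuinfo"))));
    }
}
```

Defaulting to socket 0 when the field is missing mirrors the "everything on one socket" assumption described above.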
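The library finds isolated CPUs by looking at the CPUs the process does not run on by default. For example, if you have 16 CPUs but 8 of them are not available for general use (as determined by the affinity of the process at startup), it starts assigning threads to those 8 CPUs.
Note: if more than one process uses the library, you must specify which CPUs each process may use; otherwise both processes will be assigned the same CPUs. To control which CPUs a process can use, add -Daffinity.reserved={cpu-mask-in-hex} to its command line.
Note: CPU 0 is reserved for the operating system, which has to run somewhere.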
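The rule above can be modelled with a few BitSet operations. This is a simplified sketch of the idea, not the library's actual code; the class and method names are hypothetical.

```java
import java.util.BitSet;

// Hypothetical model of "isolated CPUs = CPUs outside the startup affinity".
class ReservableCpus {
    static BitSet reservable(int cpuCount, BitSet baseAffinity) {
        BitSet result = new BitSet(cpuCount);
        result.set(0, cpuCount);     // start with every CPU
        result.andNot(baseAffinity); // drop the CPUs available for general use
        result.clear(0);             // CPU 0 stays with the operating system
        return result;
    }

    public static void main(String[] args) {
        BitSet base = new BitSet();
        base.set(0, 8); // the process may run on CPUs 0-7 at startup
        System.out.println(reservable(16, base)); // {8, 9, 10, 11, 12, 13, 14, 15}
    }
}
```

With 16 CPUs and a startup affinity of 0-7, the remaining 8 CPUs are the ones a thread would be pinned to.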
Server configuration
Configuring isolcpus
Edit /etc/default/grub and add isolcpus to one of the following two settings:
- GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=1,3" (isolates CPUs 1 and 3; CPU numbering starts at 0)
- GRUB_CMDLINE_LINUX="isolcpus=1,3"
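Applying the isolcpus configuration
Regenerate /boot/grub/grub.cfg using either of the two methods below.
Method 1
sudo update-grub
- This is the method given by the comment at the top of /etc/default/grub: # If you change this file, run 'update-grub' afterwards to update
- If the update-grub command is not found, install it: sudo apt-get update; sudo apt-get install --reinstall grub
- Check the timestamp of /boot/grub/grub.cfg to confirm that it was regenerated.
Method 2
sudo grub-mkconfig -o /boot/grub/grub.cfg
Then reboot the system.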
Verifying that the configuration took effect
Method 1: check the boot parameters
Check whether /proc/cmdline contains the isolcpus parameter.
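The same check can be done from Java. The sketch below extracts the CPU list from a kernel command line; the class name is made up, and it assumes the value is a comma-separated list of CPU numbers or ranges (non-numeric tokens are skipped).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.BitSet;

// Hypothetical helper for reading isolcpus= out of /proc/cmdline.
class IsolCpusParser {
    /** Returns the CPUs named by isolcpus=, or an empty set if the parameter is absent. */
    static BitSet parse(String cmdline) {
        BitSet cpus = new BitSet();
        for (String arg : cmdline.trim().split("\\s+")) {
            if (!arg.startsWith("isolcpus=")) {
                continue;
            }
            for (String token : arg.substring("isolcpus=".length()).split(",")) {
                if (!token.matches("\\d+(-\\d+)?")) {
                    continue; // not a CPU number or range
                }
                String[] range = token.split("-");
                int first = Integer.parseInt(range[0]);
                int last = Integer.parseInt(range[range.length - 1]);
                cpus.set(first, last + 1); // set() takes an exclusive upper bound
            }
        }
        return cpus;
    }

    public static void main(String[] args) throws IOException {
        String cmdline = new String(Files.readAllBytes(Paths.get("/proc/cmdline")));
        System.out.println("isolated CPUs: " + parse(cmdline));
    }
}
```

For the grub setting shown earlier (isolcpus=1,3) this yields the set {1, 3}.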
Method 2: check a process's CPU affinity
For example, check the CPU affinity of process 1. The server here has 32 cores and no CPUs are isolated yet.
taskset -cp 1
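Output:
pid 1's current affinity list: 0-31
Check the affinity of the current shell:
taskset -cp $$
With -c the result is printed as a CPU list; without it (taskset -p $$) it is a hexadecimal mask, for example ffffffff. That value means no CPU is isolated; in binary it is 32 ones: 11111111111111111111111111111111
If CPU 5 is isolated, the hexadecimal mask is ffffffdf, which in binary is 11111111111111111111111111011111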
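Reading hex masks by eye is error-prone, so a small converter helps. In the sketch below (class and method names are illustrative only), bit n of the mask stands for CPU n, so isolating CPU 5 clears the bit with value 0x20.

```java
import java.math.BigInteger;
import java.util.BitSet;

// Hypothetical converter between taskset-style hex masks and CPU sets.
class AffinityMaskHex {
    static BitSet fromHex(String hex) {
        BigInteger value = new BigInteger(hex, 16);
        BitSet cpus = new BitSet();
        for (int i = 0; i < value.bitLength(); i++) {
            if (value.testBit(i)) {
                cpus.set(i);
            }
        }
        return cpus;
    }

    static String toHex(BitSet cpus) {
        BigInteger value = BigInteger.ZERO;
        for (int cpu = cpus.nextSetBit(0); cpu >= 0; cpu = cpus.nextSetBit(cpu + 1)) {
            value = value.setBit(cpu);
        }
        return value.toString(16);
    }

    public static void main(String[] args) {
        BitSet cpus = fromHex("ffffffff"); // 32 CPUs, none isolated
        cpus.clear(5);                     // isolate CPU 5
        System.out.println(toHex(cpus));   // ffffffdf
    }
}
```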
List the CPU each process is currently assigned to (the psr column):
ps -eo 'pid,cmd,psr'
List the CPU each thread is currently assigned to:
ps -eLo 'pid,lwp,psr,cmd'
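irqbalance configuration
Goal: keep IRQs off the isolated CPUs. This reduces the user/kernel context switches caused by interrupts and further improves performance.
What is an IRQ?
Wikipedia's definition:
In a computer, an interrupt request (or IRQ) is a hardware signal sent to the processor that temporarily stops a running program and allows a special program, an interrupt handler, to run instead.
A network card generally interacts with the operating system in two ways:
IRQ (Interrupt Request): when the NIC receives data from the network, it raises an interrupt to the CPU, and the CPU immediately stops what it is doing to handle the interrupt.
DMA (Direct Memory Access): lets the hardware place data into a designated memory region without CPU involvement, so the CPU can process it when convenient.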
Configuration
The official documentation says little:
In /etc/sysconfig/irqbalance or /etc/defaults/irqbalance or /etc/default/irqbalance you can set:
IRQBALANCE_BANNED_CPUS=CC
In the case this is not working, turn off irqbalance instead using the following command:
## Turn off IRQ balancing so that IRQs are handled on CPU 0 only. Reference: https://serverfault.com/questions/380935/how-to-ban-hardware-interrupts-with-irqbalance-banned-cpus-on-ubuntu
$ sudo chkconfig irqbalance off
chkconfig is a Red Hat tool. On a systemd-based Ubuntu such as the 18.04 system used here, the equivalent is: sudo systemctl disable --now irqbalance
Configuration example from Server Fault:
This is a hex mask without the leading '0x', on systems with large numbers of processors each group of eight hex digits is separated by a comma ','. i.e. export IRQBALANCE_BANNED_CPUS=fc0 would prevent irqbalance from assigning irqs to the 7th-12th cpus (cpu6-cpu11) or export IRQBALANCE_BANNED_CPUS=ff000000,00000001 would prevent irqbalance from assigning irqs to the 1st (cpu0) and 57th-64th cpus (cpu56-cpu63).
So, you’d create /etc/default/irqbalance with the contents
ENABLED="1"
ONESHOT="0"
IRQBALANCE_BANNED_CPUS="3f"
To see what is going on, try
$ sudo service irqbalance stop
Stopping SMP IRQ Balancer: irqbalance.
$ source /etc/default/irqbalance
$ sudo irqbalance --debug
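The mask format quoted above (no leading 0x, groups of eight hex digits separated by commas) can be generated rather than written by hand. This is a sketch under that description; the class name is hypothetical, and padding the leading group to eight digits is an assumption taken from the ff000000,00000001 example.

```java
import java.math.BigInteger;
import java.util.BitSet;

// Hypothetical formatter for IRQBALANCE_BANNED_CPUS values.
class IrqBannedMask {
    static String format(BitSet cpus) {
        BigInteger value = BigInteger.ZERO;
        for (int cpu = cpus.nextSetBit(0); cpu >= 0; cpu = cpus.nextSetBit(cpu + 1)) {
            value = value.setBit(cpu);
        }
        String hex = value.toString(16);
        if (hex.length() <= 8) {
            return hex; // a single group needs no padding or comma
        }
        StringBuilder sb = new StringBuilder(hex);
        while (sb.length() % 8 != 0) {
            sb.insert(0, '0'); // pad the leading group to eight digits
        }
        for (int i = sb.length() - 8; i > 0; i -= 8) {
            sb.insert(i, ','); // insert separators from right to left
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        BitSet cpus = new BitSet();
        cpus.set(6, 12); // CPUs 6-11
        System.out.println(format(cpus)); // fc0
    }
}
```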
Inspecting interrupts
cat /proc/interrupts
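To see at a glance whether the isolated CPUs still receive interrupts, the per-CPU columns can be summed. The sketch below is a rough illustration with a made-up class name. It assumes the first line is the CPU header and that each following row has a label followed by one counter per CPU; summary rows with a single counter, such as ERR, are counted under the first column.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that totals /proc/interrupts per CPU column.
class InterruptTotals {
    static long[] totalsPerCpu(List<String> lines) {
        int cpuCount = lines.get(0).trim().split("\\s+").length; // header: CPU0 CPU1 ...
        long[] totals = new long[cpuCount];
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.trim().split("\\s+");
            // fields[0] is the row label ("24:", "NMI:"); counters follow it
            for (int cpu = 0; cpu < cpuCount && cpu + 1 < fields.length; cpu++) {
                if (!fields[cpu + 1].matches("\\d+")) {
                    break; // reached the descriptive part of the row
                }
                totals[cpu] += Long.parseLong(fields[cpu + 1]);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/interrupts"));
        System.out.println(Arrays.toString(totalsPerCpu(lines)));
    }
}
```

Running it twice a few seconds apart and comparing the totals shows which CPUs are still servicing interrupts.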
Java development
pom
<dependency>
    <groupId>net.openhft</groupId>
    <artifactId>affinity</artifactId>
    <version>3.23.3</version>
</dependency>
Test cases
Single thread
By default the lock is bound to any available CPU.
import lombok.SneakyThrows;
import lombok.extern.slf4j.Slf4j;
import net.openhft.affinity.Affinity;
import net.openhft.affinity.AffinityLock;

import java.util.BitSet;

@Slf4j
public class ThreadAffinityTest {
    private static volatile boolean flag = true;

    @SneakyThrows
    public static void main(String[] args) {
        test();
    }

    @SneakyThrows
    private static void test() {
        Thread t = new Thread(() -> {
            // Bind this thread to a free CPU; the lock is released when the block exits
            try (final AffinityLock al = AffinityLock.acquireLock()) {
                while (flag) {
                    // Affinity mask of the current thread
                    BitSet bitSet = Affinity.getAffinity();
                    // ID of the current thread
                    long threadId = Affinity.getThreadId();
                    // ID of the CPU the thread is running on
                    int cpuId = Affinity.getCpu();
                }
            }
        });
        t.start();
        Thread.sleep(10_000);
        flag = false;
        Thread.sleep(1_000);
        log.info("t : {}", t.getState());
    }
}
Log output
[2024-03-30 15:53:29.823][INFO ][Thread-0 ] net.openhft.affinity.AffinityLock - Assigning cpu 7 to Thread[Thread-0,5,main] on thread id 978944
[2024-03-30 15:53:39.339][INFO ][Thread-0 ] net.openhft.affinity.LockInventory - Releasing cpu 7 from Thread[Thread-0,5,main]
[2024-03-30 15:53:40.344][INFO ][main ] ...ThreadAffinityTest - t : TERMINATED
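Thread pool
Here the Disruptor's handler threads are created by an AffinityThreadFactory, so each one acquires an affinity lock when it starts.
Disruptor<BaseEventWrapper> disruptor = new Disruptor<>(
        BaseEventWrapper::new,
        8192,
        new AffinityThreadFactory("MY-EVENT-HANDLER-EXECUTOR"),
        ProducerType.MULTI,
        new BusySpinWaitStrategy());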
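Reserving CPUs for a process
This can be used to reserve CPUs for a process, or, if several programs each need reserved CPUs, to make sure they do not try to use the same ones.
## Either way, you can set the bitmask of reserved CPUs with the following option: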
-Daffinity.reserved=cc
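The mask cc is binary 11001100, i.e. CPUs 2, 3, 6 and 7. When several processes each get their own mask, it is worth checking that the masks do not share a bit. A minimal sketch follows; the class name and the second mask, 33, are made up for illustration.

```java
import java.math.BigInteger;

// Hypothetical sanity check for -Daffinity.reserved values of different processes.
class ReservedMaskCheck {
    /** True when the two hex masks have at least one CPU in common. */
    static boolean overlaps(String hexA, String hexB) {
        BigInteger common = new BigInteger(hexA, 16).and(new BigInteger(hexB, 16));
        return common.signum() != 0;
    }

    public static void main(String[] args) {
        System.out.println(overlaps("cc", "33")); // false: CPUs 2,3,6,7 vs 0,1,4,5
        System.out.println(overlaps("cc", "0c")); // true: both contain CPUs 2 and 3
    }
}
```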
Acquiring a whole core
Note: this leaves the core's other hyper-thread idle.
You can reserve a whole core. If you have hyper-threading enabled, this will use one CPU and leave its twin CPU unused.
try (AffinityLock al = AffinityLock.acquireCore()) {
    // do some work while locked to a CPU.
}
Controlling the CPU layout
In this example the library prefers a free CPU on the same socket as the first thread; failing that, it picks any free CPU.
try (final AffinityLock al = AffinityLock.acquireLock()) {
    System.out.println("Main locked");
    Thread t = new Thread(new Runnable() {
        @Override
        public void run() {
            try (AffinityLock al2 = al.acquireLock(AffinityStrategies.SAME_SOCKET, AffinityStrategies.ANY)) {
                System.out.println("Thread-0 locked");
            }
        }
    });
    t.start();
}
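Pinning to a specific CPU through the API
Call setAffinity. With the 3.x release declared in the pom, the mask is a java.util.BitSet rather than a long:
BitSet currentAffinity = Affinity.getAffinity();
BitSet cpu5 = new BitSet();
cpu5.set(5);
Affinity.setAffinity(cpu5); // lock to CPU 5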
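Older examples for this library express the mask as a long such as 1L << 5, while Affinity.getAffinity() in the single-thread test above returns a java.util.BitSet. The sketch below converts a long mask into a BitSet using only the JDK; the class name is made up, and passing the result to Affinity.setAffinity is left out so that the example runs without the library.

```java
import java.util.BitSet;

// Hypothetical bridge from long-style masks to BitSet masks.
class LongMaskToBitSet {
    static BitSet toBitSet(long mask) {
        return BitSet.valueOf(new long[] {mask}); // bit n of the long becomes CPU n
    }

    public static void main(String[] args) {
        System.out.println(toBitSet(1L << 5)); // {5}
        System.out.println(toBitSet(0xccL));   // {2, 3, 6, 7}
    }
}
```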
References
https://github.com/OpenHFT/Java-Thread-Affinity?tab=readme-ov-file
https://github.com/peter-lawrey/Java-Thread-Affinity/wiki/Getting-started
https://serverfault.com/questions/380935/how-to-ban-hardware-interrupts-with-irqbalance-banned-cpus-on-ubuntu