场景
在系统上线后,经常会遇到运维的同学跑过来说:“这次发版后,cpu线程使用率到一场,到100%了”。这时候不要慌,可以使用堆转储来分析到底是哪个线程引起的。
查找元凶
[root@java_mofei_01 test]# top
Mem: 16333644k total, 9472968k used, 6860676k free, 165616k buffers
Swap: 0k total, 0k used, 0k free, 6665292k cachedPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17850 root 20 0 7588m 112m 11m S 100.7 0.7 47:53.80 java 1552 root 20 0 121m 13m 8524 S 0.7 0.1 14:37.75 AliYunDun 3581 root 20 0 9750m 2.0g 13m S 0.7 12.9 298:30.20 java 1 root 20 0 19360 1612 1308 S 0.0 0.0 0:00.81 init 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd 3 root RT 0 0 0 0 S 0.0 0.0 0:00.14 migration/0
发现pid=17850的进程线程使用率100%,查询进程中哪个线程导致的问题
[root@java_mofei_01 test]# top -H -p 17850
top - 17:43:15 up 5 days, 7:31, 1 user, load average: 0.99, 0.97, 0.91
Tasks: 32 total, 1 running, 31 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.7%us, 8.9%sy, 0.0%ni, 87.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16333644k total, 9592504k used, 6741140k free, 165700k buffers
Swap: 0k total, 0k used, 0k free, 6781620k cachedPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17880 root 20 0 7588m 112m 11m R 99.9 0.7 50:47.43 java
17856 root 20 0 7588m 112m 11m S 0.3 0.7 0:02.08 java
17850 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17851 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.23 java
17852 root 20 0 7588m 112m 11m S 0.0 0.7 0:02.09 java
17853 root 20 0 7588m 112m 11m S 0.0 0.7 0:02.12 java
17854 root 20 0 7588m 112m 11m S 0.0 0.7 0:02.07 java
17855 root 20 0 7588m 112m 11m S 0.0 0.7 0:02.06 java
17857 root 20 0 7588m 112m 11m S 0.0 0.7 0:02.07 java
17858 root 20 0 7588m 112m 11m S 0.0 0.7 0:02.08 java
17859 root 20 0 7588m 112m 11m S 0.0 0.7 0:02.04 java
17860 root 20 0 7588m 112m 11m S 0.0 0.7 0:02.05 java
17861 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17862 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17863 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17864 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.10 java
17865 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.12 java
17866 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.09 java
17867 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.12 java
17868 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17869 root 20 0 7588m 112m 11m S 0.0 0.7 0:01.04 java
17870 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17871 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17872 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17873 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17874 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17875 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17876 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17877 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17878 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17879 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
17946 root 20 0 7588m 112m 11m S 0.0 0.7 0:00.00 java
发现17880的线程导致cpu飚高。
通过jstack查看线程信息
[root@java_mofei_01 test]# printf "%x\n" 17880
45d8
[root@java_mofei_01 test]# jstack 17850|grep 45d8 -A 30
"pool-1-thread-11" #20 prio=5 os_prio=0 tid=0x00007fc860352800 nid=0x45d8 runnable [0x00007fc8417d2000]java.lang.Thread.State: RUNNABLEat java.io.FileOutputStream.writeBytes(Native Method)at java.io.FileOutputStream.write(FileOutputStream.java:326)at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)- locked <0x00000006c6c2e708> (a java.io.BufferedOutputStream)at java.io.PrintStream.write(PrintStream.java:482)- locked <0x00000006c6c10178> (a java.io.PrintStream)at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)- locked <0x00000006c6c26620> (a java.io.OutputStreamWriter)at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)at java.io.PrintStream.write(PrintStream.java:527)- eliminated <0x00000006c6c10178> (a java.io.PrintStream)at java.io.PrintStream.print(PrintStream.java:597)at java.io.PrintStream.println(PrintStream.java:736)- locked <0x00000006c6c10178> (a java.io.PrintStream)at com.demo.guava.HardTask.call(HardTask.java:18)at com.demo.guava.HardTask.call(HardTask.java:9)at java.util.concurrent.FutureTask.run(FutureTask.java:266)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)at java.lang.Thread.run(Thread.java:745)"pool-1-thread-10" #19 prio=5 os_prio=0 tid=0x00007fc860345000 nid=0x45d7 waiting on condition [0x00007fc8418d3000]java.lang.Thread.State: WAITING (parking)at sun.misc.Unsafe.park(Native Method)- parking to wait for <0x00000006c6c14178> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
- 首先把pid 17880转成16进制
- 查询堆信息
- 发现HardTask(源码在末尾)第18行可能有问题
源码
public class HardTask implements Callable<String> {@Overridepublic String call() throws Exception {while(true){int a=100;int b=100;int rs=a+b/100-b;System.out.println(rs);}}
}public class SimpleTask implements Callable<String> {@Overridepublic String call() throws Exception {Thread.sleep(300);//什么都不做return null;}
}public class CpuTest {public static void main(String[] args){ExecutorService executorService=Executors.newFixedThreadPool(11);for (int i = 0; i < 10; i++) {executorService.submit(new SimpleTask());}executorService.submit(new HardTask());}
}
源码很简单,就是创建了2种线程HardTask和SimpleTask。然后通过CpuTest进行启动。
转载于:https://blog.51cto.com/4436396/2118787