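Hi everyone, Kaoya (烤鸭) here:
I've been digging into full-link tracing lately: cat, skywalking, zipkin and the like.
It turns out skywalking is built on bytebuddy, so I wanted to try writing a demo myself.
The demo is on git if you want to try it. The code runs inside IDEA; other setups you will have to work out yourself (running from cmd or on Linux, for example, may throw NoClassDefFoundError).
Demo address (only HTTP tracing is implemented; add interception for dubbo or other RPC styles yourself if you need it):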
https://gitee.com/fireduck_admin/link-trace-demo
Environment:
JDK 8
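1. Design goals
Monitor how long each interface (method) takes and how the calls in a chain relate to each other (over HTTP requests). In contrast to the AOP approach, zipkin and cat are interception-based.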
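2. bytebuddy
There isn't much material on bytebuddy online, but the API is fairly simple and a quick read is enough to get going, so I won't introduce it here. See the official site for details.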
https://bytebuddy.net/
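3. Interface timing: pseudocode walkthrough
To test a call chain we need an agent project and a demo project (used to forward requests).
The agent project sets up the interception and holds the actual interception logic. The example here intercepts classes carrying Spring's @Service annotation. This code lives in the agent project.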
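import java.lang.instrument.Instrumentation;
import net.bytebuddy.agent.builder.AgentBuilder;
import net.bytebuddy.description.method.MethodDescription;
import net.bytebuddy.description.type.TypeDescription;
import net.bytebuddy.dynamic.DynamicType;
import net.bytebuddy.implementation.MethodDelegation;
import net.bytebuddy.matcher.ElementMatchers;
import net.bytebuddy.utility.JavaModule;

public static void premain(String agentArgs, Instrumentation inst) {
    System.out.println("==============Client=============== premain =============start============");
    // Four-argument transform() as in the original demo; newer Byte Buddy
    // releases add a ProtectionDomain parameter to this method.
    AgentBuilder.Transformer transformerService = new AgentBuilder.Transformer() {
        @Override
        public DynamicType.Builder<?> transform(DynamicType.Builder<?> builder,
                                                TypeDescription typeDescription,
                                                ClassLoader classLoader,
                                                JavaModule module) {
            return builder
                    .method(ElementMatchers.<MethodDescription>any()) // intercept every method
                    .intercept(MethodDelegation.to(MyServiceAdvice.class)); // delegate to the advice class
        }
    };
    // Intercept classes annotated with @Service.
    // The builder has to start from new AgentBuilder.Default(); it cannot refer to itself.
    AgentBuilder agentBuilder = new AgentBuilder.Default()
            .type(ElementMatchers.isAnnotatedWith(
                    ElementMatchers.named("org.springframework.stereotype.Service"))) // classes to intercept
            .transform(transformerService);
    // Install on the Instrumentation instance
    agentBuilder.installOn(inst);
    System.out.println("================Client============ premain ================finish===========");
}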
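When starting the demo project, add the agent as a VM option in the IDEA run configuration. If you're not sure where to set it, see the screenshot.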
-javaagent:\xx\xx\target\link-trace-demo-agent-1.0-SNAPSHOT.jar
The result of the interception is shown in the figure. This gives us timing statistics for each interface (method) call.
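The post does not show what the delegate class does, so here is a minimal sketch of the timing logic such a delegate might contain. It uses only the JDK so it can be run on its own: `TimingAdviceSketch`, `intercept` and the `methodName` parameter are made-up names, not the demo's actual `MyServiceAdvice`. In a real Byte Buddy delegate the parameters would be bound through annotations like `@Origin` and `@SuperCall` rather than passed by hand.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for a delegate such as MyServiceAdvice; stdlib only.
final class TimingAdviceSketch {

    // In a real Byte Buddy delegate the parameters would be bound with
    // annotations such as @Origin (the intercepted method) and
    // @SuperCall (a Callable that runs the original method body).
    static Object intercept(String methodName, Callable<?> superCall) {
        long start = System.nanoTime();
        try {
            return superCall.call(); // invoke the original method
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Simplification for this sketch: wrap checked exceptions.
            throw new IllegalStateException(e);
        } finally {
            // Runs on both success and failure, so failed calls are timed too.
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(methodName + " took " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) {
        Object result = intercept("DemoService.hello", () -> "hello");
        System.out.println(result);
    }
}
```

Putting the measurement in `finally` is a deliberate choice: a method that throws still costs time, and a tracer that drops those calls would understate latency.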
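4. Full-link tracing: pseudocode walkthrough
Ever since Google published the Dapper paper in 2010, tracing systems have largely followed its approach. Mine is just a stripped-down version.
Intercepting the controller annotation in the agent works much like the service case above, so I won't paste the code.
We need a span object here. Information about the current request (mainly who called you) is recorded in the span, and the span is pushed onto a call stack held in a ThreadLocal. That binds the current request to its thread, which makes it easy to follow a call as it moves around inside one service, for example from controller to service.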
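As a rough illustration of the span-plus-ThreadLocal idea, here is a small JDK-only sketch. The class and field names (`DemoSpan`, `SpanStack`, `traceId`, `id`, `pid`) are assumptions made for the example and are not taken from the demo repository.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.UUID;

// Hypothetical span model; names are illustrative, not the demo's.
final class DemoSpan {
    final String traceId; // shared by every span of one request
    final String id;      // this span's own id
    final String pid;     // parent span id, null for the first span on this thread
    final String name;    // e.g. the intercepted method

    DemoSpan(String traceId, String id, String pid, String name) {
        this.traceId = traceId;
        this.id = id;
        this.pid = pid;
        this.name = name;
    }
}

// Per-thread call stack of spans.
final class SpanStack {
    private static final ThreadLocal<Deque<DemoSpan>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Called when an intercepted method starts.
    static DemoSpan enter(String name) {
        Deque<DemoSpan> stack = STACK.get();
        DemoSpan parent = stack.peek();
        String traceId = parent == null ? UUID.randomUUID().toString() : parent.traceId;
        String pid = parent == null ? null : parent.id;
        DemoSpan span = new DemoSpan(traceId, UUID.randomUUID().toString(), pid, name);
        stack.push(span);
        return span;
    }

    // Called when the intercepted method returns or throws.
    static DemoSpan exit() {
        Deque<DemoSpan> stack = STACK.get();
        DemoSpan finished = stack.pop();
        if (stack.isEmpty()) {
            STACK.remove(); // avoid leaking state on pooled threads
        }
        return finished;
    }

    // The span currently on top of this thread's stack, or null.
    static DemoSpan current() {
        return STACK.get().peek();
    }
}
```

With this in place, a controller interceptor would call `SpanStack.enter(...)` before the method and `SpanStack.exit()` after it. A service method intercepted on the same thread then picks up the controller's span as its parent automatically.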
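In the screenshot only the web layer is intercepted. Since there is no upstream information, a new span is created with seq 1. See the figure.
Next, simulate an intercepted web method calling another web method, as shown in the figure. pid (parent id) is the id of the previous hop, which is how the complete information for the whole call chain (input and output parameters, timings, methods and so on) can be pieced together.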
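One way to carry that information between services is to put it in HTTP headers. The sketch below models headers as a plain map so it runs without a web server. The header names and the rule that seq grows by one per hop are assumptions for illustration; the post only states that a request with no upstream information gets a new span with seq 1.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical trace context passed between services via HTTP headers.
final class TraceHeaders {
    // Header names are made up for this sketch.
    static final String TRACE_ID = "X-Trace-Id";
    static final String PARENT_ID = "X-Parent-Span-Id";
    static final String SEQ = "X-Trace-Seq";

    final String traceId;
    final String spanId;
    final String pid; // id of the upstream span, null when there is none
    final int seq;    // position in the chain, 1 for the first hop

    private TraceHeaders(String traceId, String spanId, String pid, int seq) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.pid = pid;
        this.seq = seq;
    }

    // Called when a request arrives at an intercepted controller.
    static TraceHeaders fromIncoming(Map<String, String> headers) {
        String spanId = UUID.randomUUID().toString();
        String traceId = headers.get(TRACE_ID);
        if (traceId == null) {
            // No upstream information: start a new trace with seq 1.
            return new TraceHeaders(UUID.randomUUID().toString(), spanId, null, 1);
        }
        int upstreamSeq = Integer.parseInt(headers.getOrDefault(SEQ, "1"));
        // Assumption: each hop increments seq by one.
        return new TraceHeaders(traceId, spanId, headers.get(PARENT_ID), upstreamSeq + 1);
    }

    // Called before this service makes an outgoing HTTP call.
    Map<String, String> toOutgoing() {
        Map<String, String> headers = new HashMap<>();
        headers.put(TRACE_ID, traceId);
        headers.put(PARENT_ID, spanId); // the callee's pid is our span id
        headers.put(SEQ, String.valueOf(seq));
        return headers;
    }
}
```

Collecting every span that shares one trace id and following the pid links rebuilds the chain in order.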
For longer chains, such as web-service-web or even more services, try it yourself. The idea is the same.
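5. AOP vs. agent
This is purely a comparison of the AOP approach and bytebuddy. Under the hood, AOP uses JDK dynamic proxies when the target has an interface and cglib (built on asm) when it doesn't. bytebuddy is also built on asm.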
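For comparison with the agent approach, here is a small sketch of the JDK-proxy path that interface-based AOP relies on, using `java.lang.reflect.Proxy` directly rather than Spring. `GreetingService` and `JdkProxyTimingSketch` are made-up names for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Hypothetical service interface used only for this example.
interface GreetingService {
    String greet(String name);
}

final class JdkProxyTimingSketch {

    // Wraps target in a JDK dynamic proxy that times every call.
    @SuppressWarnings("unchecked")
    static <T> T timed(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                // Every call goes through reflection on the real target.
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the target's own exception
            } finally {
                long micros = (System.nanoTime() - start) / 1_000;
                System.out.println(method.getName() + " took " + micros + " us");
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        GreetingService real = name -> "hello " + name;
        GreetingService proxied = timed(real, GreetingService.class);
        System.out.println(proxied.greet("duck"));
    }
}
```

The proxy is a separate object wrapped around the target, and each call takes an extra reflective hop. An agent instead rewrites the target class itself at load time, which is the difference the measurements in this section are probing.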
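First, the official comparison chart.
As for whether it is actually faster, let me try. Comment out the tracing code first and simply call the interface. (The test methods are in the test package.)
First, single calls:
Calling only the controller, bytebuddy is clearly faster, with almost no overhead.
Calling web + service, the times are about the same.
Next, the two cases separately, averaged over 500 calls:
Controller only:
AOP: 5.7 ms, most of it from the first call.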
[329, 12, 9, 6, 7, 5, 6, 5, 6, 7, 9, 8, 6, 5, 8, 8, 9, 22, 7, 11, 7, 5, 7, 6, 6, 6, 5, 6, 7, 10, 9, 9, 6, 8, 8, 13, 20, 10, 38, 7, 9, 8, 7, 10, 12, 9, 7, 7, 8, 9, 11, 11, 6, 7, 8, 5, 5, 5, 4, 5, 4, 3, 3, 4, 4, 4, 6, 4, 4, 4, 5, 6, 6, 7, 7, 6, 7, 5, 5, 8, 6, 7, 4, 7, 7, 6, 5, 5, 4, 4, 3, 4, 4, 4, 4, 5, 9, 4, 5, 5, 7, 4, 5, 11, 7, 7, 6, 9, 7, 22, 8, 14, 8, 4, 3, 4, 3, 3, 3, 4, 3, 4, 4, 5, 4, 5, 5, 11, 4, 4, 4, 4, 4, 7, 5, 8, 8, 7, 6, 6, 6, 7, 7, 4, 5, 3, 4, 4, 3, 4, 3, 4, 3, 4, 4, 5, 4, 4, 4, 4, 6, 10, 4, 6, 8, 7, 11, 12, 5, 8, 7, 6, 5, 4, 4, 5, 4, 5, 4, 4, 4, 4, 4, 4, 4, 4, 8, 4, 5, 8, 6, 12, 4, 8, 5, 8, 6, 7, 6, 9, 3, 4, 6, 4, 3, 4, 3, 3, 3, 3, 3, 3, 4, 5, 5, 4, 3, 3, 3, 4, 4, 5, 4, 6, 6, 5, 4, 7, 4, 9, 4, 5, 6, 5, 8, 5, 6, 4, 4, 4, 3, 3, 4, 4, 2, 3, 3, 3, 2, 3, 3, 3, 4, 3, 4, 5, 4, 3, 3, 3, 4, 4, 5, 4, 5, 4, 7, 4, 6, 5, 5, 6, 3, 3, 4, 4, 5, 5, 5, 4, 4, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 2, 2, 3, 3, 3, 3, 2, 3, 6, 4, 4, 4, 4, 4, 7, 3, 4, 6, 3, 5, 7, 5, 6, 4, 6, 7, 5, 4, 7, 6, 3, 3, 3, 4, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 4, 4, 3, 4, 3, 4, 5, 5, 4, 10, 12, 7, 7, 5, 10, 4, 6, 6, 5, 4, 6, 6, 5, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 4, 13, 6, 11, 4, 6, 4, 4, 6, 4, 4, 5, 4, 4, 6, 4, 3, 4, 3, 4, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 3, 2, 2, 2, 3, 2, 3, 4, 3, 4, 5, 4, 4, 6, 5, 4, 6, 4, 4, 6, 5, 4, 7, 3, 4, 4, 5, 3, 4, 4, 3, 3, 12, 5, 3, 3, 3, 4, 3, 2, 2, 2, 3, 2, 3, 3, 4, 3, 4, 4, 2, 3, 5, 4, 3, 7, 4, 4, 5, 3, 12, 21, 15, 16, 7, 2, 3, 2, 3, 2, 3, 3, 3, 3, 3, 4, 2, 2, 3, 3, 2, 2, 2, 2, 2]
avg = OptionalDouble[5.706]
Agent: 5.4 ms, most of it from the first call.
[251, 9, 11, 12, 9, 9, 9, 9, 8, 11, 13, 6, 5, 5, 5, 6, 5, 5, 4, 6, 6, 6, 4, 10, 7, 6, 8, 7, 8, 6, 7, 7, 6, 6, 7, 6, 8, 5, 6, 5, 4, 4, 5, 5, 4, 4, 4, 4, 3, 4, 3, 5, 4, 3, 4, 4, 5, 5, 6, 5, 7, 6, 11, 10, 7, 9, 6, 5, 7, 5, 8, 6, 10, 7, 7, 7, 5, 5, 5, 5, 5, 5, 7, 5, 9, 19, 10, 7, 4, 3, 4, 4, 4, 3, 3, 4, 4, 6, 5, 4, 3, 5, 6, 4, 4, 5, 9, 10, 5, 3, 4, 5, 7, 4, 5, 10, 5, 8, 5, 11, 6, 6, 10, 4, 4, 5, 5, 7, 5, 4, 4, 7, 4, 4, 4, 5, 6, 4, 4, 5, 6, 6, 4, 6, 4, 4, 5, 8, 4, 5, 5, 4, 4, 5, 4, 4, 4, 7, 7, 10, 4, 4, 4, 3, 4, 3, 3, 4, 5, 4, 5, 8, 4, 5, 6, 7, 4, 4, 6, 5, 4, 7, 4, 4, 5, 5, 8, 5, 4, 6, 4, 9, 4, 5, 3, 3, 3, 4, 4, 3, 3, 4, 4, 4, 6, 5, 12, 6, 5, 5, 6, 8, 7, 14, 8, 5, 6, 6, 5, 5, 10, 5, 6, 5, 4, 4, 3, 4, 4, 4, 4, 5, 4, 6, 7, 4, 8, 5, 6, 4, 5, 7, 4, 6, 5, 5, 6, 3, 4, 3, 3, 4, 4, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 4, 4, 4, 5, 3, 5, 7, 3, 4, 5, 4, 4, 7, 4, 5, 6, 5, 8, 4, 4, 5, 4, 3, 3, 3, 3, 4, 2, 3, 6, 4, 3, 3, 3, 3, 4, 4, 3, 3, 4, 4, 5, 8, 5, 6, 5, 10, 3, 4, 14, 5, 7, 3, 7, 3, 3, 8, 3, 4, 5, 3, 4, 4, 3, 3, 4, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 4, 6, 6, 6, 7, 5, 7, 5, 5, 4, 4, 4, 3, 5, 5, 6, 4, 6, 3, 2, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 5, 4, 3, 3, 4, 4, 5, 8, 4, 6, 3, 5, 5, 4, 6, 8, 5, 3, 3, 4, 6, 7, 5, 8, 4, 4, 4, 5, 6, 5, 4, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 4, 7, 3, 7, 4, 6, 6, 3, 5, 7, 4, 5, 3, 5, 5, 4, 4, 5, 3, 3, 2, 5, 3, 26, 17, 8, 20, 9, 10, 3, 3, 4, 6, 3, 5, 6, 3, 5, 5, 4, 4, 5, 4, 4, 5, 3, 3, 3, 3, 2, 3, 2, 2, 2, 2, 2, 3, 2, 2, 2, 3, 3, 2, 2, 2, 2, 3, 2, 2, 2, 3, 3, 3, 2]
avg = OptionalDouble[5.438]
Conclusion: with only the cglib proxy involved (controller alone), the times are about the same, with bytebuddy slightly faster.
Controller + Service:
AOP: 6.2 ms, most of it from the first call.
[328, 14, 8, 12, 11, 9, 8, 7, 8, 8, 13, 6, 6, 6, 7, 7, 6, 6, 8, 5, 7, 8, 11, 11, 9, 8, 9, 5, 12, 8, 8, 5, 5, 8, 6, 9, 6, 8, 6, 4, 4, 5, 5, 4, 5, 7, 7, 11, 6, 9, 5, 8, 10, 5, 10, 6, 6, 6, 7, 6, 5, 4, 4, 4, 4, 3, 5, 4, 5, 5, 5, 5, 6, 7, 9, 5, 8, 6, 7, 7, 6, 4, 5, 7, 5, 8, 5, 5, 5, 4, 5, 4, 5, 4, 6, 9, 8, 9, 6, 6, 5, 4, 7, 9, 4, 4, 10, 8, 25, 30, 29, 4, 5, 6, 6, 7, 4, 6, 5, 5, 6, 6, 9, 6, 10, 12, 8, 13, 8, 6, 6, 4, 5, 4, 3, 5, 3, 3, 3, 4, 3, 3, 4, 3, 3, 3, 3, 4, 4, 4, 6, 5, 5, 6, 4, 6, 4, 4, 6, 5, 4, 5, 6, 6, 5, 5, 5, 4, 4, 4, 6, 12, 7, 5, 6, 5, 4, 5, 4, 6, 9, 8, 10, 4, 3, 6, 4, 4, 4, 4, 11, 8, 5, 4, 14, 6, 7, 5, 6, 6, 5, 5, 9, 4, 6, 8, 5, 5, 5, 6, 3, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 2, 2, 3, 4, 4, 6, 14, 8, 6, 6, 4, 4, 9, 4, 6, 5, 6, 4, 4, 5, 6, 4, 4, 3, 4, 4, 4, 3, 5, 4, 3, 3, 5, 15, 11, 11, 6, 9, 5, 6, 5, 5, 7, 5, 8, 6, 7, 5, 4, 5, 5, 7, 4, 4, 4, 5, 5, 4, 4, 5, 9, 4, 6, 5, 4, 7, 5, 7, 6, 9, 7, 5, 7, 5, 5, 5, 5, 6, 6, 5, 6, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 3, 5, 3, 5, 5, 8, 5, 5, 6, 8, 5, 5, 4, 4, 6, 4, 4, 5, 4, 4, 4, 3, 4, 3, 4, 4, 3, 3, 3, 3, 2, 3, 3, 3, 4, 4, 3, 7, 4, 4, 8, 4, 7, 4, 3, 8, 4, 4, 6, 5, 3, 7, 3, 4, 5, 3, 3, 2, 3, 2, 3, 3, 3, 2, 3, 3, 3, 4, 7, 4, 6, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 5, 4, 4, 4, 6, 4, 7, 4, 5, 6, 4, 4, 6, 5, 7, 4, 3, 6, 3, 3, 3, 3, 3, 2, 3, 2, 3, 3, 3, 3, 3, 3, 4, 3, 3, 2, 3, 4, 4, 4, 4, 4, 5, 4, 5, 3, 5, 4, 7, 4, 31, 5, 4, 5, 6, 5, 6, 5, 4, 5, 8, 4, 4, 4, 4, 4, 7, 4, 4, 4, 4, 4, 5, 4, 3, 8, 4, 32, 13, 37, 28, 5, 4, 3, 3, 3, 3, 2, 2, 3, 3, 3, 5, 6, 3, 3, 3, 4, 5, 3, 3, 3, 4]
avg = OptionalDouble[6.186]
Agent: 5.9 ms, most of it from the first call.
[306, 16, 7, 9, 10, 11, 9, 7, 8, 11, 18, 11, 8, 9, 11, 9, 6, 6, 6, 11, 7, 8, 10, 9, 15, 9, 7, 12, 22, 8, 6, 6, 5, 5, 5, 5, 4, 4, 5, 5, 5, 4, 5, 4, 8, 7, 11, 15, 17, 8, 9, 5, 6, 8, 7, 6, 9, 5, 4, 5, 5, 4, 4, 5, 4, 5, 4, 5, 4, 4, 6, 6, 10, 6, 6, 8, 5, 8, 6, 10, 6, 9, 7, 7, 6, 8, 6, 5, 6, 5, 5, 6, 7, 7, 6, 8, 6, 10, 8, 9, 6, 4, 5, 8, 4, 5, 6, 5, 33, 19, 19, 6, 5, 4, 6, 5, 6, 5, 7, 6, 6, 7, 7, 5, 6, 6, 5, 8, 5, 4, 4, 5, 4, 3, 3, 4, 3, 4, 3, 3, 3, 3, 3, 3, 4, 3, 5, 4, 4, 6, 6, 5, 7, 4, 5, 6, 4, 6, 4, 7, 6, 3, 6, 3, 3, 3, 4, 3, 4, 3, 5, 3, 4, 5, 4, 4, 3, 3, 4, 3, 8, 4, 11, 5, 6, 6, 9, 4, 5, 7, 5, 16, 6, 4, 6, 4, 4, 5, 3, 4, 2, 3, 3, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 6, 5, 6, 4, 5, 9, 5, 8, 4, 4, 5, 4, 8, 7, 8, 5, 5, 4, 5, 5, 7, 4, 6, 4, 4, 6, 4, 4, 4, 5, 4, 4, 5, 8, 18, 12, 5, 7, 5, 5, 5, 4, 4, 7, 5, 3, 3, 4, 3, 4, 3, 3, 3, 4, 3, 4, 3, 4, 3, 9, 4, 4, 4, 3, 3, 3, 5, 5, 4, 4, 5, 4, 5, 6, 4, 3, 4, 6, 4, 5, 7, 3, 5, 4, 4, 6, 4, 3, 4, 3, 3, 3, 3, 3, 2, 2, 3, 2, 3, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 7, 4, 5, 4, 4, 4, 4, 5, 4, 6, 4, 5, 7, 5, 6, 5, 3, 4, 4, 4, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 2, 3, 2, 2, 3, 3, 3, 2, 3, 4, 4, 3, 4, 2, 6, 4, 4, 5, 3, 3, 5, 4, 5, 7, 7, 5, 12, 4, 5, 4, 3, 4, 5, 5, 3, 4, 4, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 4, 4, 4, 6, 5, 6, 5, 5, 6, 4, 5, 6, 4, 7, 5, 5, 7, 6, 6, 4, 4, 5, 4, 5, 3, 3, 5, 4, 4, 4, 3, 4, 4, 4, 3, 3, 3, 4, 4, 3, 4, 4, 5, 4, 4, 4, 5, 4, 6, 4, 4, 6, 4, 4, 7, 4, 4, 6, 2, 2, 5, 3, 4, 4, 3, 3, 18, 7, 9, 10, 21, 28, 20, 12, 5, 4, 4, 3, 3, 5, 2, 3, 3, 2, 3, 2, 2, 2, 3, 3, 3]
avg = OptionalDouble[5.896]
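Conclusion: with the cglib proxy plus the JDK proxy involved (controller + service), the times are about the same, with bytebuddy slightly faster.
The numbers bear this out. Keep in mind that I only tested one simple interface; with a longer call chain I would expect the gap to be more noticeable.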
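Since the first call dominates every run above, it can help to look at the average both with and without the warm-up calls. The sketch below shows one way to compute that with the JDK stream API, which also prints in the same `OptionalDouble[...]` form as the results above. `LatencyStats` and the `warmup` parameter are names invented here, and this is not necessarily how the demo's test package computes its averages.

```java
import java.util.Arrays;
import java.util.OptionalDouble;

// Hypothetical helper for summarising per-call latencies in milliseconds.
final class LatencyStats {

    // Average over every sample, first call included.
    static OptionalDouble average(long[] millis) {
        return Arrays.stream(millis).average();
    }

    // Average after dropping the first `warmup` samples.
    static OptionalDouble averageAfterWarmup(long[] millis, int warmup) {
        return Arrays.stream(millis).skip(warmup).average();
    }

    public static void main(String[] args) {
        // First five controller-only AOP timings from the list above.
        long[] sample = {329, 12, 9, 6, 7};
        System.out.println("avg = " + average(sample));               // avg = OptionalDouble[72.6]
        System.out.println("avg = " + averageAfterWarmup(sample, 1)); // avg = OptionalDouble[8.5]
    }
}
```

On a sample this small the first call moves the mean enormously. Over 500 calls the effect is diluted but still present, so the steady-state figure is the fairer basis for comparing the two approaches.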
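6. Conclusion
There are already plenty of tracing frameworks out there, so just pick one that suits you. If your business has a lot of custom requirements, building your own is a reasonable option too, and as you've seen, writing one isn't that complicated. If you go with an open-source framework, I recommend skywalking: the community and ecosystem are in good shape, it is easy to extend, and there are plenty of articles and docs online, so I won't say more about it here.
A few recommended articles on javaagent: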
https://www.jianshu.com/p/5c62b71fd882
https://www.jianshu.com/p/b72f66da679f
https://www.jianshu.com/p/7b2072513819