测试istio熔断管理。
采用httpbin镜像和fortio镜像,其中httpbin作为服务端,fortio是请求端。这两个的配置yaml文件都在istio的samples/httpbin目录下,fortio的配置文件在samples-client目录下。
[root@master httpbin]# ls
gateway-api httpbin-detinationrule.yaml httpbin-gateway.yaml httpbin-nodeport.yaml httpbin-vault.yaml httpbin.yaml README.md sample-client
启动httpbin和fortio的pod,fortio为用户提供了一个UI界面,但是fortio的服务默认是一个ClusterIP服务,因此需要修改,通过kubectl edit svc fortio来完成。获得nodePort端口后,在浏览器打开,获得以下页面:
其中URL为要访问的k8s服务名称,QPS是每秒的请求数量,Duration是持续时间,Thread是并发数,下面还可以勾选是http请求还是grpc请求。
设置好之后点击开始按钮,就可以得到请求的响应时间分布。
测试配置
- QPS: 每秒发出的请求数为 100(实际上达到了 99 个请求)。
- Connections: 使用了 1 个并发连接。
- Duration: 测试持续了 10秒。
- Jitter: 未使用 Jitter(请求发送间隔不随机)。
- Errors: 没有发生错误。
延迟统计
- 平均 (Average): 平均响应时间为 4.555ms。
- 百分位数 (Percentiles):
- 50th percentile (p50): 50% 的请求响应时间小于或等于 4.63ms。
- 75th percentile (p75): 75% 的请求响应时间小于或等于 5.34ms。
- 90th percentile (p90): 90% 的请求响应时间小于或等于 5.91ms。
- 99th percentile (p99): 99% 的请求响应时间小于或等于 7.4ms。
- 99.9th percentile (p99.9): 99.9% 的请求响应时间小于或等于 10.154ms。
- 最大 (Max): 最大响应时间为 10.804ms。
- 最小 (Min): 最小响应时间为 2.149ms。
测试熔断
设置并发请求数量为2,由于destinationRule的设置,只能有一个并发请求:
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:name: httpbin-destinationrule
spec:host: httpbintrafficPolicy:connectionPool:http:http1MaxPendingRequests: 1maxRequestsPerConnection: 1 tcp:maxConnections: 1outlierDetection:consecutiveGatewayErrors: 1interval: 1sbaseEjectionTime: 3mmaxEjectionPercent: 100
触发熔断机制:
可以看到503的返回代码代表错误,由于并发量为2,所以当两个请求同时到达时,有可能一个请求会被拒绝。
[root@master httpbin]# kubectl exec -it fortio-deploy-7c89478c84-wg9x6 -c fortio -- /usr/bin/fortio load -c 2 -qps 0 -n 20 -loglevel warning http://httpbin:8000/get
06:05:19 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 0 queries per second, 16->16 procs, for 20 calls: http://httpbin:8000/get
Starting at max qps with 2 thread(s) [gomax 16] for exactly 20 calls (10 per thread + 0)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
06:05:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
Ended after 32.680055ms : 20 calls. qps=611.99
Aggregated Function Time : count 20 avg 0.0026904031 +/- 0.003126 min 0.000292639 max 0.008801973 sum 0.053808062
# range, mid point, percentile, count
>= 0.000292639 <= 0.001 , 0.000646319 , 55.00, 11
> 0.001 <= 0.002 , 0.0015 , 65.00, 2
> 0.003 <= 0.004 , 0.0035 , 70.00, 1
> 0.004 <= 0.005 , 0.0045 , 75.00, 1
> 0.006 <= 0.007 , 0.0065 , 80.00, 1
> 0.007 <= 0.008 , 0.0075 , 90.00, 2
> 0.008 <= 0.00880197 , 0.00840099 , 100.00, 2
# target 50% 0.000929264
# target 75% 0.005
# target 90% 0.008
# target 99% 0.00872178
# target 99.9% 0.00879395
Sockets used: 14 (for perfect keepalive, would be 2)
Jitter: false
Code 200 : 7 (35.0 %)
Code 503 : 13 (65.0 %)
Response Header Sizes : count 20 avg 80.5 +/- 109.7 min 0 max 230 sum 1610
Response Body/Total Sizes : count 20 avg 385.9 +/- 197.5 min 241 max 655 sum 7718
All done 20 calls (plus 0 warmup) 2.690 ms avg, 612.0 qps
测试超时
在生产环境中经常会碰到由于调用方等待下游的响应过长,堆积大量的请求阻塞了自身服务,造成雪崩的情况,通过通过超时处理来避免由于无限期等待造成的故障,进而增强服务的可用性,Istio 使用虚拟服务来优雅实现超时处理。
首先部署两个服务,一个nginx一个tomcat,并编写相应的virtualservice。
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:name: nginx-vs
spec:hosts:- nginx-svchttp:- route:- destination: host: nginx-svctimeout: 2s
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:name: tomcat-vs
spec:hosts:- tomcat-svchttp:- fault: delay:percentage:value: 100fixedDelay: 10sroute:- destination:host: tomcat-svc
修改nginx的pod,从而使得他作为tomcat服务的反向代理:
[root@master ~]# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-746868558-zpsns 2/2 Running 0 5h48m
tomcat-6df5fcfcc7-2zhqc 2/2 Running 0 5h46m
[root@master ~]# kubectl exec -it nginx-746868558-zpsns -- sh
/ # vi /etc/nginx/conf.d/default.conf
/ # nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
/ # exit
向 /etc/nginx/conf.d/default.conf文件中添加如下命令:
proxy_pass http://tomcat-svc:8080;
proxy_http_version 1.1;
编辑完后,再执行如下语句验证配置和让配置生效:
/ # nginx -t
/ # nginx -s reload
反向代理
正向代理指的是当客户端向服务器请求的时候,客户端可以通过代理服务器,此时后端的服务不知道具体哪个客户发起的请求,而只与代理服务器交互。正向代理部署在客户端,而反向代理的服务部署在服务器端,客户只和反向代理服务器交互,而不管反向代理将请求发送给哪个服务。
在上面的操作中,将nginx作为一个反向代理,将到达nginx的请求转发到tomcat-svc。
超时
在virtualservice的设置中,nginx的规则是两秒超时返回,而tomcat设置了延时,10秒才会响应,因此会触发超时警报。运行如下代码,可以看到每隔2s就返回一条超时警告。
[root@master timeout]# kubectl run busybox --image=busybox:1.28 --restart=Never --rm -it -- shIf you don't see a command prompt, try pressing enter.
E1005 21:00:31.642947 43296 websocket.go:296] Unknown stream id 1, discarding message/ # time wget -q -O - http://nginx-svc
wget: server returned error: HTTP/1.1 504 Gateway Timeout
Command exited with non-zero status 1
real 0m 2.01s
user 0m 0.00s
sys 0m 0.00s/ # while true; do wget -q -O - http://nginx-svc; done
wget: server returned error: HTTP/1.1 504 Gateway Timeout
wget: server returned error: HTTP/1.1 504 Gateway Timeout
wget: server returned error: HTTP/1.1 504 Gateway Timeout
wget: server returned error: HTTP/1.1 504 Gateway Timeout
测试重试
针对以上部署的两个服务,重写virtualservice,从而配置新的反向代理规则。
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:name: nginx-vs
spec:hosts:- nginx-svchttp:- route:- destination: host: nginx-svcretries:attempts: 2perTryTimeout: 2s
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:name: tomcat-vs
spec:hosts:- tomcat-svchttp:- fault:abort:percentage:value: 100httpStatus: 503route:- destination:host: tomcat-svc
以上规则设置了nginx服务重试三次,tomcat服务abort所有的请求并返回503状态码。
接下来打开一个busybox,并在pod中请求nginx服务:
[root@master ~]# kubectl run busybox --image=busybox:1.28 --restart=Never --rm -it -- sh
If you don't see a command prompt, try pressing enter./ # wget -q -O - http://nginx-svc
wget: server returned error: HTTP/1.1 503 Service Unavailable
可以看到返回了请求失败的状态码503,同时在另一个终端查看nginx的日志,发现8条请求,这是因为nginx请求tomcat的时候被拒绝,记录一次,而本身nginx的istio-proxy也会记录一条。所以其实是发起了四次请求,包括第一次请求失败后又发起的三次重试。
[root@master timeout]# kubectl logs -f nginx-746868558-zpsns -c istio-proxy
##############################
[2024-10-05T13:42:00.432Z] "GET / HTTP/1.1" 503 FI fault_filter_abort - "-" 0 18 0 - "-" "Wget" "e429b140-737f-92ea-9248-93bfc0891399" "tomcat-svc:8080" "-" outbound|8080||tomcat-svc.default.svc.cluster.local - 10.101.63.23:8080 10.244.104.61:54528 - -
[2024-10-05T13:42:00.430Z] "GET / HTTP/1.1" 503 - via_upstream - "-" 0 18 1 0 "-" "Wget" "e429b140-737f-92ea-9248-93bfc0891399" "nginx-svc" "10.244.104.61:80" inbound|80|| 127.0.0.6:58994 10.244.104.61:80 10.244.104.1:44584 invalid:outbound_.80_._.nginx-svc.default.svc.cluster.local default
[2024-10-05T13:42:00.454Z] "GET / HTTP/1.1" 503 FI fault_filter_abort - "-" 0 18 0 - "-" "Wget" "e429b140-737f-92ea-9248-93bfc0891399" "tomcat-svc:8080" "-" outbound|8080||tomcat-svc.default.svc.cluster.local - 10.101.63.23:8080 10.244.104.61:54532 - -
[2024-10-05T13:42:00.454Z] "GET / HTTP/1.1" 503 - via_upstream - "-" 0 18 0 0 "-" "Wget" "e429b140-737f-92ea-9248-93bfc0891399" "nginx-svc" "10.244.104.61:80" inbound|80|| 127.0.0.6:58994 10.244.104.61:80 10.244.104.1:45216 invalid:outbound_.80_._.nginx-svc.default.svc.cluster.local default
[2024-10-05T13:42:00.471Z] "GET / HTTP/1.1" 503 FI fault_filter_abort - "-" 0 18 0 - "-" "Wget" "e429b140-737f-92ea-9248-93bfc0891399" "tomcat-svc:8080" "-" outbound|8080||tomcat-svc.default.svc.cluster.local - 10.101.63.23:8080 10.244.104.61:54536 - -
[2024-10-05T13:42:00.471Z] "GET / HTTP/1.1" 503 - via_upstream - "-" 0 18 0 0 "-" "Wget" "e429b140-737f-92ea-9248-93bfc0891399" "nginx-svc" "10.244.104.61:80" inbound|80|| 127.0.0.6:58994 10.244.104.61:80 10.244.104.1:45220 invalid:outbound_.80_._.nginx-svc.default.svc.cluster.local default
[2024-10-05T13:42:00.492Z] "GET / HTTP/1.1" 503 FI fault_filter_abort - "-" 0 18 0 - "-" "Wget" "e429b140-737f-92ea-9248-93bfc0891399" "tomcat-svc:8080" "-" outbound|8080||tomcat-svc.default.svc.cluster.local - 10.101.63.23:8080 10.244.104.61:54540 - -
[2024-10-05T13:42:00.492Z] "GET / HTTP/1.1" 503 - via_upstream - "-" 0 18 0 0 "-" "Wget" "e429b140-737f-92ea-9248-93bfc0891399" "nginx-svc" "10.244.104.61:80" inbound|80|| 127.0.0.6:33849 10.244.104.61:80 10.244.104.1:45224 invalid:outbound_.80_._.nginx-svc.default.svc.cluster.local default
[2024-10-05T13:47:57.692Z] "GET / HTTP/1.1" 503 FI fault_filter_abort - "-" 0 18 0 - "-" "Wget" "23bfca55-6e47-9835-8175-accdf2a6a664" "tomcat-svc:8080" "-" outbound|8080||tomcat-svc.default.svc.cluster.local - 10.101.63.23:8080 10.244.104.61:55482 - -
[2024-10-05T13:47:57.691Z] "GET / HTTP/1.1" 503 - via_upstream - "-" 0 18 1 0 "-" "Wget" "23bfca55-6e47-9835-8175-accdf2a6a664" "nginx-svc" "10.244.104.61:80" inbound|80|| 127.0.0.6:49218 10.244.104.61:80 10.244.104.3:58492 invalid:outbound_.80_._.nginx-svc.default.svc.cluster.local default
[2024-10-05T13:47:57.705Z] "GET / HTTP/1.1" 503 FI fault_filter_abort - "-" 0 18 0 - "-" "Wget" "23bfca55-6e47-9835-8175-accdf2a6a664" "tomcat-svc:8080" "-" outbound|8080||tomcat-svc.default.svc.cluster.local - 10.101.63.23:8080 10.244.104.61:55486 - -
[2024-10-05T13:47:57.704Z] "GET / HTTP/1.1" 503 - via_upstream - "-" 0 18 0 0 "-" "Wget" "23bfca55-6e47-9835-8175-accdf2a6a664" "nginx-svc" "10.244.104.61:80" inbound|80|| 127.0.0.6:49218 10.244.104.61:80 10.244.104.3:58496 invalid:outbound_.80_._.nginx-svc.default.svc.cluster.local default
[2024-10-05T13:47:57.732Z] "GET / HTTP/1.1" 503 FI fault_filter_abort - "-" 0 18 0 - "-" "Wget" "23bfca55-6e47-9835-8175-accdf2a6a664" "tomcat-svc:8080" "-" outbound|8080||tomcat-svc.default.svc.cluster.local - 10.101.63.23:8080 10.244.104.61:55490 - -
[2024-10-05T13:47:57.732Z] "GET / HTTP/1.1" 503 - via_upstream - "-" 0 18 1 0 "-" "Wget" "23bfca55-6e47-9835-8175-accdf2a6a664" "nginx-svc" "10.244.104.61:80" inbound|80|| 127.0.0.6:49218 10.244.104.61:80 10.244.104.3:58500 invalid:outbound_.80_._.nginx-svc.default.svc.cluster.local default
#############