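Service Circuit Breaking in Practice: Hystrix

Overview

In a microservice system, services call one another and form call chains. If one service in a chain fails, or the network becomes congested, while many requests still depend on that service, the failure spreads to its callers and can take down large parts of the system. This is the service "avalanche" effect.

The root cause of an avalanche is the strong dependency between services. To guard against it, apply service isolation, circuit breaking with fallback (degradation), and rate limiting.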

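Service isolation: when one service fails, the failure does not spread to other modules or affect the system as a whole.

Circuit breaking: when a downstream service responds slowly or fails under heavy load, the upstream service temporarily stops calling it and returns a fallback (degraded) response directly, which protects the overall system.

Rate limiting: protects the system by limiting what goes in and out of it, for example by capping the request rate and rejecting, delaying, or degrading the requests that exceed the cap.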
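To make the rate limiting idea concrete, here is a minimal fixed-window limiter in plain Java. It is only a sketch of the concept, not part of Hystrix; the class name FixedWindowRateLimiter and the injectable clock are assumptions made for illustration.

```java
import java.util.function.LongSupplier;

/** Hypothetical fixed-window limiter: at most `limit` requests per window. */
class FixedWindowRateLimiter {
    private final int limit;
    private final long windowMillis;
    private final LongSupplier clock; // injectable so the sketch can be tested without sleeping
    private long windowStart;
    private int count;

    FixedWindowRateLimiter(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    /** Returns true if the request may proceed, false if it should be rejected, delayed, or degraded. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // start a new window
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

A caller would check tryAcquire() before handling a request and return a degraded response when it yields false.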

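The circuit breaker component covered in this article is Hystrix.

The steps here build on the project from the earlier microservice load balancing walkthrough.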

Environment

JDK 1.8

Maven 3.6.3

MySQL 8

Spring Cloud 2021.0.8

Spring Boot 2.7.12

IntelliJ IDEA 2022

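Steps

Using Hystrix circuit breaking in the controller class

1. Add the dependency

Add the dependency to order-service:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
    <version>2.2.10.RELEASE</version>
</dependency>

The Spring Cloud 2021.0.8 release train no longer includes netflix-hystrix, so the dependency has to declare its version explicitly.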

2. Enable Hystrix

Add the @EnableHystrix annotation to the order-service startup class:

@EnableHystrix
public class OrderApplication {

3. Fallback handling

Modify the OrderController class.

(1) Add a fallback method to OrderController

/**
 * Fallback method.
 * It must have the same return type and the same parameters as the protected method.
 */
public Product orderFallBack(Long id) {
    Product product = new Product();
    product.setProductName("Fallback method triggered");
    return product;
}

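(2) Configure @HystrixCommand on the method that needs protection

/**
 * Configure circuit breaker protection with an annotation.
 * fallbackMethod: the fallback method to use once the circuit breaks
 */
@HystrixCommand(fallbackMethod = "orderFallBack")
@RequestMapping(value = "/buy/{id}", method = RequestMethod.GET)
public Product findById(@PathVariable Long id){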
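Conceptually, a fallback method wraps the protected call so that a failure is answered by a method with the same parameters and return type. The plain-Java sketch below illustrates only that idea and is not how Hystrix is implemented; FallbackSketch and callWithFallback are hypothetical names.

```java
import java.util.function.Function;

/** Hypothetical helper: run the primary call and, on a runtime failure, run the fallback with the same argument. */
class FallbackSketch {
    static <A, R> R callWithFallback(A arg, Function<A, R> primary, Function<A, R> fallback) {
        try {
            return primary.apply(arg);
        } catch (RuntimeException e) {
            // The fallback receives the same argument and returns the same type,
            // mirroring the signature rule for fallback methods.
            return fallback.apply(arg);
        }
    }

    public static void main(String[] args) {
        String result = FallbackSketch.<Long, String>callWithFallback(1L,
                id -> { throw new IllegalStateException("product service unavailable"); },
                id -> "fallback for product " + id);
        System.out.println(result);
    }
}
```

This sketch reacts only to exceptions; Hystrix also falls back on timeouts and when the circuit is open.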

4. Simulate network latency

Add a 2-second sleep to the service method in product-service:

try {
    Thread.sleep(2000L); // simulate network latency
} catch (InterruptedException e) {
    throw new RuntimeException(e);
}

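5. Hystrix configuration

Add the following to application.yml in order-service:

hystrix:
  command:
    default:
      execution:
        isolation:
          strategy: SEMAPHORE # semaphore isolation
          # strategy: THREAD # thread-pool isolation
          thread:
            timeoutInMilliseconds: 2000 # execution timeout (default 1000 ms); if no response arrives in time, the fallback is triggered automatically
      circuitBreaker:
        requestVolumeThreshold: 5 # minimum number of requests before the circuit can trip (default 20 per 10 s)
        sleepWindowInMilliseconds: 10000 # how long the circuit stays open before a trial request is allowed (default 5000 ms)
        errorThresholdPercentage: 50 # minimum failure percentage that trips the circuit (default 50%)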
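The two circuitBreaker thresholds work together: the error rate is only evaluated once enough requests have been seen in the statistics window. A rough sketch of that decision, assuming a simplified model of the rule (TripDecision and shouldOpen are hypothetical names, not Hystrix API):

```java
/** Hypothetical helper mirroring the requestVolumeThreshold and errorThresholdPercentage settings. */
class TripDecision {
    /**
     * @param totalRequests            requests seen in the current statistics window
     * @param failedRequests           how many of them failed or timed out
     * @param requestVolumeThreshold   minimum requests before the error rate is evaluated
     * @param errorThresholdPercentage error percentage at or above which the circuit opens
     */
    static boolean shouldOpen(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        if (totalRequests < requestVolumeThreshold) {
            return false; // too little traffic to judge
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }
}
```

With the values configured here (5 and 50), four failures out of four requests do not open the circuit, while three failures out of five do.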

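6. Test

Start the eureka, product, and order services.

Open in a browser:

http://localhost:9002/order/buy/1

The product service method sleeps for 2 s to simulate network latency, and the database query adds more time on top, so the total request time exceeds 2 s. Because the Hystrix timeout is set to 2 s, the fallback method is triggered.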
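The timeout behaviour can be reproduced in plain Java: run the call on a worker thread, wait at most the configured time, and return a fallback if the result is late. This is a sketch under that assumption, not Hystrix's own mechanism; TimeoutFallbackSketch and callWithTimeout are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

class TimeoutFallbackSketch {
    /** Runs `call` on a worker thread; returns `fallback` if it fails or exceeds the timeout. */
    static <T> T callWithTimeout(Supplier<T> call, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Callable<T> task = call::get;
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return fallback; // too slow or failed: degrade
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } finally {
            future.cancel(true);    // interrupt the call if it is still running
            executor.shutdownNow();
        }
    }
}
```

A call that sleeps for 2000 ms under a shorter timeout returns the fallback value, which matches what the browser test shows.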

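Increase the timeout, for example to 6 seconds (timeoutInMilliseconds: 6000).

Restart order-service and open the URL again; the data is now returned normally.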

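7. A unified fallback method

Writing a separate fallback for every method becomes tedious as the number of methods grows. A single fallback method can be specified for the whole class instead.

Modify the OrderController class and add a unified fallback method:

/**
 * Unified fallback method.
 * Note that it takes no parameters.
 */
public Product defaultFallBack() {
    Product product = new Product();
    product.setProductName("Unified fallback method triggered");
    return product;
}

Add the @DefaultProperties annotation to the OrderController class: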

@DefaultProperties(defaultFallback = "defaultFallBack")
public class OrderController {

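Change the @HystrixCommand annotation on the findById method from
@HystrixCommand(fallbackMethod = "orderFallBack")
to
@HystrixCommand

In application.yml of order-service, set the timeout back to 2000 ms.

Restart the order service.

Test

The unified fallback method is triggered, which shows that it is in effect.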

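Circuit breaking with Feign and Hystrix

1. Copy the service

Copy order-service to create order-service-feign_hystrix. (Note: copying directly inside IDEA causes problems; copy the folder in the file explorer instead.)

In the pom.xml of order-service-feign_hystrix, change the artifactId to:

<artifactId>order-service-feign_hystrix</artifactId>

Add a module to the parent project's pom.xml:

<module>order-service-feign_hystrix</module>

The following steps are performed in the order-service-feign_hystrix service.

2. Enable Hystrix in Feign and configure it

In application.yml of order-service-feign_hystrix, enable Hystrix for Feign: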

# Enable Hystrix circuit breaking in Feign
feign:
  circuitbreaker:
    enabled: true

Change the port and the service name:

server:
  port: 9003
spring:
  application:
    name: service-order-feign_hustrix

Hystrix settings:

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 2000 # execution timeout (default 1000 ms); if no response arrives in time, the fallback is triggered automatically
      circuitBreaker:
        enabled: true
        requestVolumeThreshold: 5
        errorThresholdPercentage: 10
        sleepWindowInMilliseconds: 10000

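3. Implement the fallback logic in an implementation class of the interface

Write a class that implements the interface and provides the fallback method: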

package org.example.order.feign;

import org.example.order.entity.Product;
import org.springframework.stereotype.Component;

@Component
public class ProductFeginClientCallBack implements ProductFeignClient {

    /**
     * Fallback method.
     */
    @Override
    public Product findById(Long id) {
        Product product = new Product();
        product.setProductName("Feign fallback method triggered");
        return product;
    }
}

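4. Declare the fallback class in the interface annotation

Modify the ProductFeignClient interface.

In @FeignClient, specify the class that contains the fallback methods: fallback = ProductFeginClientCallBack.class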

@FeignClient(name = "service-product", fallback = ProductFeginClientCallBack.class)
public interface ProductFeignClient {

    /**
     * The microservice endpoint to call.
     * @return the product with the given id
     */
    @RequestMapping(value = "/product/{id}", method = RequestMethod.GET)
    Product findById(@PathVariable("id") Long id);
}
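The fallback class works because it implements the same interface as the client, so a failed call can be redirected to the same method on the fallback object. The sketch below shows that delegation with a JDK dynamic proxy. It is an illustration only, not Feign's or Hystrix's actual code; ProductClientSketch and FallbackProxySketch are hypothetical names, and it handles exceptions only, not timeouts.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

/** Hypothetical stand-in for a Feign client interface. */
interface ProductClientSketch {
    String findById(Long id);
}

class FallbackProxySketch {
    /** Wraps `target` so that a call which throws is answered by the same method on `fallback`. */
    static <T> T withFallback(Class<T> type, T target, T fallback) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                // The real call failed: delegate to the fallback implementation.
                return method.invoke(fallback, args);
            }
        };
        Object instance = Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
        return type.cast(instance);
    }
}
```

Because both objects share one interface, the caller keeps using the client type and never needs to know whether the real service or the fallback answered.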

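5. Comment out or remove the earlier Hystrix code

Comment out or remove the @EnableHystrix annotation on the startup class:

//@EnableHystrix
public class OrderApplication {

Comment out or remove the Hystrix-related code added to OrderController earlier.

6. Start and test

Start the eureka, product, and order (9003) services.

Open in a browser:

http://localhost:9003/order/buy/1

The response time exceeds the timeout, so the Feign fallback method is triggered.

In the order service's application.yml, set the timeout to 6000 ms.

Restart the order (9003) service.

Open the same URL in the browser again.

The response now arrives within the timeout, so the fallback is not triggered and the normal data is returned.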

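Summary of circuit breaking with Feign:

1. Enable circuit breaking in the configuration and set the Hystrix timeout parameters.

2. Add an implementation class of the ProductFeignClient interface and implement the fallback logic in it.

3. Declare the fallback class in the annotation on the ProductFeignClient interface.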

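Hystrix monitoring

How can you see the state of the circuit breaker? Hystrix monitoring makes this possible.

The following steps are performed in order-service.

1. Add the dependencies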

<!-- Monitoring -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
    <version>2.2.10.RELEASE</version>
</dependency>

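2. Enable the Hystrix Dashboard

Add the @EnableHystrixDashboard annotation to the startup class:

@EnableHystrixDashboard
public class OrderApplication {

3. Start the services

Start the eureka, product, and order services.

4. Initial test

Open in a browser:

http://localhost:9002/hystrix

http://localhost:9002/actuator

The data returned by /actuator, once formatted, looks like this: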

{
  "_links": {
    "self": {
      "href": "http://localhost:9002/actuator",
      "templated": false
    },
    "health": {
      "href": "http://localhost:9002/actuator/health",
      "templated": false
    },
    "health-path": {
      "href": "http://localhost:9002/actuator/health/{*path}",
      "templated": true
    }
  }
}

6. Modify the configuration

Modify application.yml of the order service.

Expose the actuator endpoints:

# Expose the actuator endpoints
management:
  endpoints:
    web:
      exposure:
        include: "*"

Note: set the isolation strategy to thread-pool isolation.

hystrix:
  command:
    default:
      execution:
        isolation:
          # strategy: SEMAPHORE # semaphore isolation
          strategy: THREAD # thread-pool isolation
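Thread-pool isolation means calls to a dependency run on their own small pool, so a slow dependency can exhaust only that pool and not the caller's threads. The following sketch shows the idea with a queue-less pool that rejects work when every thread is busy. It is a simplified illustration, not Hystrix's implementation; ThreadPoolBulkheadSketch is a hypothetical name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical bulkhead: calls to one dependency run on their own small, queue-less thread pool. */
class ThreadPoolBulkheadSketch {
    private final ThreadPoolExecutor pool;

    ThreadPoolBulkheadSketch(int size) {
        // SynchronousQueue holds no tasks, so a call is rejected as soon as all threads are busy.
        this.pool = new ThreadPoolExecutor(size, size, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
    }

    /** Runs `call` on the dedicated pool, or completes immediately with `fallback` when the pool is full. */
    <T> CompletableFuture<T> submit(Supplier<T> call, T fallback) {
        try {
            return CompletableFuture.supplyAsync(call, pool);
        } catch (RejectedExecutionException e) {
            return CompletableFuture.completedFuture(fallback);
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Semaphore isolation, by contrast, runs the call on the caller's own thread and only caps how many calls may be in flight at once.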

Configure the allowed hosts:

hystrix:
  dashboard:
    proxy-stream-allow-list: localhost

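7. Test

Restart the order service.

Open in a browser:

http://localhost:9002/actuator/hystrix.stream

Normally you will see a continuous stream of ping: lines.

Open in a browser:

http://localhost:9002/hystrix

In the input box on the Hystrix Dashboard page, enter

http://localhost:9002/actuator/hystrix.stream

Click Monitor Stream.

At this point both Circuit and Thread Pools show "Loading" and there is no data yet.

Call the service on port 9002 once:

http://localhost:9002/order/buy/1

Check the Hystrix dashboard again; it now shows data.

Call it several more times:

http://localhost:9002/order/buy/1

The monitoring page now shows a line chart that reflects how the circuit breaker is being triggered.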

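What the indicators mean

The circuit breaker has three states: Closed, Open, and Half Open.

Closed:

All requests pass through to the remote service normally.

Open:

All requests go straight to the fallback method.

Half Open:

After the circuit has stayed Open for the sleep window (5 s by default), it enters the Half Open state and lets one request through to the remote microservice. If that call succeeds, the circuit returns to Closed; if it fails, the circuit stays Open for another sleep window.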
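The three states can be written down as a small state machine. The sketch below is a simplification that only shows the transitions: it opens after a number of consecutive failures instead of using an error percentage over a time window, and CircuitBreakerSketch with its injectable clock is a hypothetical name, not Hystrix code.

```java
import java.util.function.LongSupplier;

/** Minimal three-state breaker; a sketch of the state transitions only. */
class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures that open the circuit (simplification)
    private final long sleepWindowMillis; // how long to stay OPEN before a trial request
    private final LongSupplier clock;     // injectable clock so the sketch is testable
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAt;

    CircuitBreakerSketch(int failureThreshold, long sleepWindowMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    /** Returns true if a request may be sent to the remote service. */
    synchronized boolean allowRequest() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN; // let exactly one trial request through
            return true;
        }
        return state == State.CLOSED;
    }

    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    synchronized void recordFailure() {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong(); // restart the sleep window
        }
    }

    synchronized State state() {
        return state;
    }
}
```

A caller checks allowRequest() before each remote call, goes to the fallback when it returns false, and reports the outcome with recordSuccess() or recordFailure().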

Done! Enjoy it!