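Table of contents
- SpringCloud Hystrix: circuit breaking and service degradation to prevent service avalanches
- Background
- Adding the dependency
- Adding the Hystrix annotation to the startup class
- Configuring circuit breaking on an endpoint
- General configuration
- Timeout cut-off
- Error-rate circuit breaking
- Request-count circuit breaking and rate limiting
- Global configuration
- Configurable items
- HystrixCommand.Setter parameters
- Command Properties
- Service degradation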
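SpringCloud Hystrix: circuit breaking and service degradation to prevent service avalanches
Hystrix is the Latin name for the porcupine, an animal covered in quills that serve as a protective mechanism. Hystrix is also a component developed by Netflix.
What is Hystrix?
In a distributed environment, some of the many service dependencies will inevitably fail at some point. Hystrix is a library that helps you control the interactions between these distributed services by adding latency tolerance and fault tolerance logic. It stops cascading failures by isolating the points of access between services, and it provides fallback options so that one error does not spread, which improves the overall resilience of the system. Alongside Ribbon, it appears in almost every microservice and piece of infrastructure built with Spring Cloud.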
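Hystrix is designed to:
Protect against and control latency and failures from dependencies accessed (typically over the network) through third-party client libraries.
Stop the avalanche effect in a complex distributed system.
Fail fast and recover quickly.
Fall back and degrade as gracefully as possible.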
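Background
The project has a slow endpoint that several channels depend on. When a sudden traffic spike arrives, the endpoint slows down or even times out. Third parties then retry and keep calling our endpoint, which in turn calls other endpoints, so a spike can trigger an avalanche. The accumulated retry traffic also knocks over instances that have only just been started by a rolling restart (a probe restarts an instance after two consecutive missed pings, starting a new instance and reclaiming the old one). We therefore need circuit breaking and service degradation, so that the system rejects traffic it cannot absorb with an immediate error instead of letting the service block.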
Adding the dependency
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
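Adding the Hystrix annotation to the startup class
Either @EnableHystrix or @EnableCircuitBreaker works on the startup class. @EnableHystrix is recommended, because @EnableCircuitBreaker is deprecated in later versions. @EnableHystrix is itself annotated with @EnableCircuitBreaker, as its definition shows: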
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@EnableCircuitBreaker
public @interface EnableHystrix {}
@SpringCloudApplication
@EnableHystrix
public class ManageApplication{}
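Configuring circuit breaking on an endpoint
A tool such as JMeter can be used to test the circuit breaking behaviour:
https://jmeter.apache.org/download_jmeter.cgi
General configuration
Adjust the values to your own requirements. The values below were chosen for testing only.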
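@HystrixCommand(commandProperties = {
        // Isolation strategy. Defaults to THREAD; the other option is SEMAPHORE.
        @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_STRATEGY, value = "THREAD"),
        // Maximum concurrent requests per instance for the vivo interface; excess requests are rejected outright.
        // 150 is the intended limit, this test uses 1. Only takes effect under SEMAPHORE isolation.
        @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_SEMAPHORE_MAX_CONCURRENT_REQUESTS, value = "1"),
        // Execution timeout in milliseconds (100 ms here for testing)
        @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_THREAD_TIMEOUT_IN_MILLISECONDS, value = "100"),
        // Error percentage: once the minimum request count is reached, the breaker opens when the error rate reaches this value
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ERROR_THRESHOLD_PERCENTAGE, value = "10"),
        // Minimum request count: the breaker only starts evaluating whether to open after this many requests
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_REQUEST_VOLUME_THRESHOLD, value = "3"),
        // Sleep window, default 5 seconds: how long after opening the breaker becomes half-open
        // (one trial request is let through; if it fails the breaker opens again, if it succeeds the breaker closes)
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_SLEEP_WINDOW_IN_MILLISECONDS, value = "5000")
})
EXECUTION_ISOLATION_STRATEGY is the isolation strategy. It defaults to THREAD (thread pool); the other option is SEMAPHORE. Under SEMAPHORE the command runs on the caller's thread, so the timeout setting cannot cut the call short.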
Timeout cut-off
@RequestMapping("testTimeout")
@HystrixCommand(commandProperties = {
        // Execution timeout in milliseconds (100 ms here)
        @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_THREAD_TIMEOUT_IN_MILLISECONDS, value = "100")
})
public String testTimeout() throws InterruptedException {
    // Sleeps far longer than the 100 ms timeout, so the call is cut off
    Thread.sleep(5000);
    return "test";
}
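As a rough illustration of what a timeout under THREAD isolation amounts to, the following plain-JDK sketch runs a call on a worker thread and gives up after a deadline. It does not use Hystrix. The names TimeoutSketch and callWithTimeout are hypothetical, and Hystrix's real implementation differs (it uses its own thread pools and a timer).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Hystrix: a plain-JDK approximation of a thread-isolated timeout.
class TimeoutSketch {

    /** Runs the task on a separate thread and fails fast if it does not finish within timeoutMillis. */
    static String callWithTimeout(Callable<String> task, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<String> future = worker.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the worker, similar in spirit to interruptOnTimeout = true
            future.cancel(true);
            throw new IllegalStateException("timed out after " + timeoutMillis + " ms", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("caller interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("call failed", e.getCause());
        } finally {
            worker.shutdownNow();
        }
    }
}
```

With a 100 ms limit, a task that sleeps for 5 seconds fails fast with an exception, while a task that returns immediately yields its result.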
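Error-rate circuit breaking
The circuit breaker opens once the request error rate reaches the configured percentage.
@RequestMapping("test")
@HystrixCommand(groupKey = "vivoApi", commandProperties = {
        // Error percentage: once the minimum request count is reached, the breaker opens when the error rate reaches this value
        @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ERROR_THRESHOLD_PERCENTAGE, value = "10")
})
public String test() throws Exception {
    Thread.sleep(50);
    return "test";
}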
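The interplay of the minimum request count, the error percentage, and the sleep window can be sketched as a small state machine. This is a deliberately simplified, hypothetical class (SimpleCircuitBreaker is not a Hystrix type): it counts all requests since the last reset instead of using a rolling statistics window, and it takes an injectable clock so that the behaviour can be checked without waiting.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified circuit breaker; not Hystrix's implementation.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int requestVolumeThreshold;   // minimum requests before the error rate is evaluated
    private final int errorThresholdPercentage; // error rate (percent) at which the breaker opens
    private final long sleepWindowMillis;       // how long the breaker stays open before a trial request
    private final LongSupplier clock;           // current time in milliseconds

    private State state = State.CLOSED;
    private int total;
    private int failures;
    private long openedAt;

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long sleepWindowMillis, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    synchronized State state() {
        return state;
    }

    /** Returns true if the caller may execute the protected call. */
    synchronized boolean allowRequest() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN; // let exactly one trial request through
            return true;
        }
        return state == State.CLOSED;
    }

    synchronized void recordSuccess() {
        if (state == State.HALF_OPEN) {
            reset(); // trial succeeded: close the breaker
            return;
        }
        total++;
    }

    synchronized void recordFailure() {
        if (state == State.HALF_OPEN) {
            trip(); // trial failed: open again and restart the sleep window
            return;
        }
        total++;
        failures++;
        if (state == State.CLOSED
                && total >= requestVolumeThreshold
                && failures * 100 / total >= errorThresholdPercentage) {
            trip();
        }
    }

    private void trip() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
    }

    private void reset() {
        state = State.CLOSED;
        total = 0;
        failures = 0;
    }
}
```

With a volume threshold of 3 and an error threshold of 10%, two successes followed by one failure open the breaker; after the sleep window a single trial request is allowed, and a successful trial closes the breaker again.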
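Request-count circuit breaking and rate limiting
The thread pool that handles the calls has one core thread and a task queue of length 10. Requests beyond that are rejected immediately, while requests that are already executing or waiting in the queue are unaffected and complete normally. Note that Hystrix also applies queueSizeRejectionThreshold (default 5), so unless that value is raised as well, rejection starts before the queue actually holds 10 tasks.
@HystrixCommand(groupKey = "vivoApi", threadPoolProperties = {
        // One core thread
        @HystrixProperty(name = HystrixPropertiesManager.CORE_SIZE, value = "1"),
        // Task queue capacity
        @HystrixProperty(name = HystrixPropertiesManager.MAX_QUEUE_SIZE, value = "10")
})
public String test() throws Exception {
    Thread.sleep(50);
    return "test";
}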
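The rejection behaviour of a pool with one thread and a bounded queue can be reproduced with the JDK alone. The sketch below is an approximation under stated assumptions: BoundedPoolSketch and countRejected are hypothetical names, the pool is a plain ThreadPoolExecutor rather than a Hystrix thread pool, and Hystrix's extra queueSizeRejectionThreshold check is not modelled.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: a JDK pool with 1 thread and a queue of 10, mirroring CORE_SIZE=1 and MAX_QUEUE_SIZE=10.
class BoundedPoolSketch {

    /** Submits the given number of blocking tasks and returns how many were rejected. */
    static int countRejected(int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(10),
                new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                // Each task blocks until released, so the single worker stays busy
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // worker busy and queue full: the request is turned away
            }
        }
        release.countDown(); // running and queued tasks now finish normally
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }
}
```

Submitting 15 blocking tasks leaves 1 running and 10 queued, so 4 are rejected; the 11 accepted tasks still complete once they are released.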
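Global configuration
Global configuration does not automatically apply circuit breaking, timeouts, and so on to every endpoint. Instead, any setting that an annotation does not declare is read from the global defaults.
For example, with the global defaults below in place, an endpoint only needs a bare @HystrixCommand.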
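import com.netflix.hystrix.HystrixCommandProperties;
import com.netflix.hystrix.HystrixThreadPoolProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @author humorchen
 * date: 2024/4/7
 * description: global circuit breaker configuration
 **/
@Configuration
public class GlobalHystrixPropertiesConfiguration {

    /**
     * Circuit breaker (command) configuration
     *
     * @return the default command properties
     */
    @Bean
    public HystrixCommandProperties.Setter commandPropertiesConfig() {
        return HystrixCommandProperties.Setter()
                // Execution timeout: 3000 ms, enabled, interrupting the thread on timeout
                .withExecutionTimeoutInMilliseconds(3000)
                .withExecutionTimeoutEnabled(true)
                .withExecutionIsolationThreadInterruptOnTimeout(true)
                // Error percentage at which the circuit breaker opens
                .withCircuitBreakerErrorThresholdPercentage(10)
                // Minimum number of requests before the circuit breaker can open
                .withCircuitBreakerRequestVolumeThreshold(10);
    }

    /**
     * Thread pool configuration
     *
     * @return the default thread pool properties
     */
    @Bean
    public HystrixThreadPoolProperties.Setter threadPoolConfig() {
        return HystrixThreadPoolProperties.Setter()
                .withCoreSize(1)
                .withMaxQueueSize(10);
    }
}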
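Configurable items
Official documentation of the configurable items: https://github.com/Netflix/Hystrix/wiki/Configuration
Chinese translation of the configuration: Hystrix configuration documentation (Chinese)
Key configuration classes: HystrixCommandProperties.java, HystrixPropertiesManager.java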
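HystrixCommand.Setter parameters
HystrixCommandGroupKey: identifies a group of services, usually at interface granularity.
HystrixCommandKey: identifies a single method, usually at method granularity.
HystrixThreadPoolKey: all methods under the same HystrixThreadPoolKey share one thread pool.
HystrixCommandProperties: the basic configuration.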
Command Properties
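hystrix.command.default.execution.isolation.strategy Isolation strategy. Default THREAD; options are THREAD and SEMAPHORE. THREAD isolates calls in a thread pool and generally suits synchronous requests. SEMAPHORE uses a semaphore and suits asynchronous requests.
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds Command execution timeout. Default 1000 ms.
hystrix.command.default.execution.timeout.enabled Whether the execution timeout is enabled. Default true.
hystrix.command.default.execution.isolation.thread.interruptOnTimeout Whether to interrupt the execution when a timeout occurs. Default true.
hystrix.command.default.execution.isolation.semaphore.maxConcurrentRequests Maximum number of concurrent requests. Default 10. Only effective with ExecutionIsolationStrategy.SEMAPHORE. Once the maximum is reached, further requests are rejected. In principle the semaphore size is chosen the same way as the thread pool size, but with a semaphore each unit of work should be small and fast (millisecond level); otherwise use THREAD.
hystrix.command.default.execution.isolation.thread.interruptOnCancel Whether to interrupt the execution thread when the command is cancelled. Default false.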
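The effect of maxConcurrentRequests under semaphore isolation can be approximated with java.util.concurrent.Semaphore. In this sketch the call runs on the caller's thread and is rejected immediately when no permit is free. SemaphoreIsolationSketch and its execute method are hypothetical names, not Hystrix APIs.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper, not a Hystrix type: semaphore-style isolation on the calling thread.
class SemaphoreIsolationSketch {
    private final Semaphore permits;

    SemaphoreIsolationSketch(int maxConcurrentRequests) {
        this.permits = new Semaphore(maxConcurrentRequests);
    }

    /** Runs the call if a permit is free; otherwise returns the rejection result without waiting. */
    <T> T execute(Supplier<T> call, Supplier<T> onRejected) {
        if (!permits.tryAcquire()) {
            return onRejected.get(); // limit reached: shed the request immediately
        }
        try {
            return call.get();
        } finally {
            permits.release(); // always hand the permit back
        }
    }
}
```

No extra thread is involved, which is why this style only suits small, fast units of work: a slow call holds its permit and the caller's thread for its full duration.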
public abstract class HystrixCommandProperties {
    private static final Logger logger = LoggerFactory.getLogger(HystrixCommandProperties.class);

    /* defaults */
    /* package */ static final Integer default_metricsRollingStatisticalWindow = 10000; // default => statisticalWindow: 10000 = 10 seconds (and default of 10 buckets so each bucket is 1 second)
    private static final Integer default_metricsRollingStatisticalWindowBuckets = 10; // default => statisticalWindowBuckets: 10 = 10 buckets in a 10 second window so each bucket is 1 second
    private static final Integer default_circuitBreakerRequestVolumeThreshold = 20; // default => statisticalWindowVolumeThreshold: 20 requests in 10 seconds must occur before statistics matter
    private static final Integer default_circuitBreakerSleepWindowInMilliseconds = 5000; // default => sleepWindow: 5000 = 5 seconds that we will sleep before trying again after tripping the circuit
    private static final Integer default_circuitBreakerErrorThresholdPercentage = 50; // default => errorThresholdPercentage = 50 = if 50%+ of requests in 10 seconds are failures or latent then we will trip the circuit
    private static final Boolean default_circuitBreakerForceOpen = false; // default => forceCircuitOpen = false (we want to allow traffic)
    /* package */ static final Boolean default_circuitBreakerForceClosed = false; // default => ignoreErrors = false
    private static final Integer default_executionTimeoutInMilliseconds = 1000; // default => executionTimeoutInMilliseconds: 1000 = 1 second
    private static final Boolean default_executionTimeoutEnabled = true;
    private static final ExecutionIsolationStrategy default_executionIsolationStrategy = ExecutionIsolationStrategy.THREAD;
    private static final Boolean default_executionIsolationThreadInterruptOnTimeout = true;
    private static final Boolean default_executionIsolationThreadInterruptOnFutureCancel = false;
    private static final Boolean default_metricsRollingPercentileEnabled = true;
    private static final Boolean default_requestCacheEnabled = true;
    private static final Integer default_fallbackIsolationSemaphoreMaxConcurrentRequests = 10;
    private static final Boolean default_fallbackEnabled = true;
    private static final Integer default_executionIsolationSemaphoreMaxConcurrentRequests = 10;
    private static final Boolean default_requestLogEnabled = true;
    private static final Boolean default_circuitBreakerEnabled = true;
    private static final Integer default_metricsRollingPercentileWindow = 60000; // default to 1 minute for RollingPercentile
    private static final Integer default_metricsRollingPercentileWindowBuckets = 6; // default to 6 buckets (10 seconds each in 60 second window)
    private static final Integer default_metricsRollingPercentileBucketSize = 100; // default to 100 values max per bucket
    private static final Integer default_metricsHealthSnapshotIntervalInMilliseconds = 500; // default to 500ms as max frequency between allowing snapshots of health (error percentage etc)

    @SuppressWarnings("unused")
    private final HystrixCommandKey key;
    private final HystrixProperty<Integer> circuitBreakerRequestVolumeThreshold; // number of requests that must be made within a statisticalWindow before open/close decisions are made using stats
    private final HystrixProperty<Integer> circuitBreakerSleepWindowInMilliseconds; // milliseconds after tripping circuit before allowing retry
    private final HystrixProperty<Boolean> circuitBreakerEnabled; // Whether circuit breaker should be enabled.
    private final HystrixProperty<Integer> circuitBreakerErrorThresholdPercentage; // % of 'marks' that must be failed to trip the circuit
    private final HystrixProperty<Boolean> circuitBreakerForceOpen; // a property to allow forcing the circuit open (stopping all requests)
    private final HystrixProperty<Boolean> circuitBreakerForceClosed; // a property to allow ignoring errors and therefore never trip 'open' (ie. allow all traffic through)
    private final HystrixProperty<ExecutionIsolationStrategy> executionIsolationStrategy; // Whether a command should be executed in a separate thread or not.
    private final HystrixProperty<Integer> executionTimeoutInMilliseconds; // Timeout value in milliseconds for a command
    private final HystrixProperty<Boolean> executionTimeoutEnabled; // Whether timeout should be triggered
    private final HystrixProperty<String> executionIsolationThreadPoolKeyOverride; // What thread-pool this command should run in (if running on a separate thread).
    private final HystrixProperty<Integer> executionIsolationSemaphoreMaxConcurrentRequests; // Number of permits for execution semaphore
    private final HystrixProperty<Integer> fallbackIsolationSemaphoreMaxConcurrentRequests; // Number of permits for fallback semaphore
    private final HystrixProperty<Boolean> fallbackEnabled; // Whether fallback should be attempted.
    private final HystrixProperty<Boolean> executionIsolationThreadInterruptOnTimeout; // Whether an underlying Future/Thread (when runInSeparateThread == true) should be interrupted after a timeout
    private final HystrixProperty<Boolean> executionIsolationThreadInterruptOnFutureCancel; // Whether canceling an underlying Future/Thread (when runInSeparateThread == true) should interrupt the execution thread
    private final HystrixProperty<Integer> metricsRollingStatisticalWindowInMilliseconds; // milliseconds back that will be tracked
    private final HystrixProperty<Integer> metricsRollingStatisticalWindowBuckets; // number of buckets in the statisticalWindow
    private final HystrixProperty<Boolean> metricsRollingPercentileEnabled; // Whether monitoring should be enabled (SLA and Tracers).
    private final HystrixProperty<Integer> metricsRollingPercentileWindowInMilliseconds; // number of milliseconds that will be tracked in RollingPercentile
    private final HystrixProperty<Integer> metricsRollingPercentileWindowBuckets; // number of buckets percentileWindow will be divided into
    private final HystrixProperty<Integer> metricsRollingPercentileBucketSize; // how many values will be stored in each percentileWindowBucket
    private final HystrixProperty<Integer> metricsHealthSnapshotIntervalInMilliseconds; // time between health snapshots
    private final HystrixProperty<Boolean> requestLogEnabled; // whether command request logging is enabled.
    private final HystrixProperty<Boolean> requestCacheEnabled; // Whether request caching is enabled.

    /**
     * Isolation strategy to use when executing a {@link HystrixCommand}.
     * <p>
     * <ul>
     * <li>THREAD: Execute the {@link HystrixCommand#run()} method on a separate thread and restrict concurrent executions using the thread-pool size.</li>
     * <li>SEMAPHORE: Execute the {@link HystrixCommand#run()} method on the calling thread and restrict concurrent executions using the semaphore permit count.</li>
     * </ul>
     */
    public static enum ExecutionIsolationStrategy {
        THREAD, SEMAPHORE
    }
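Service degradation
Put simply, set the fallbackMethod parameter on the @HystrixCommand annotation to the name of the method to call when degrading. When the endpoint is unavailable, times out, or its error rate exceeds the threshold, the method named by fallbackMethod handles the request automatically; this is called service degradation. The fallback method can return an error directly, or provide a cached, high-performance implementation, and so on. If the fallback method fails as well, the error is returned to the caller.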
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import com.netflix.hystrix.contrib.javanica.conf.HystrixPropertiesManager;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("hystrix")
public class HystrixController {

    @Autowired
    private HystrixService hystrixService;

    // Circuit breaking with fallback
    @GetMapping("{num}")
    @HystrixCommand(fallbackMethod = "circuitBreakerFallback", commandProperties = {
            // Whether the circuit breaker is enabled
            @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ENABLED, value = "true"),
            // Minimum number of requests within the statistics window
            @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_REQUEST_VOLUME_THRESHOLD, value = "20"),
            // The breaker opens when the failure rate within the statistics window reaches 50%
            @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ERROR_THRESHOLD_PERCENTAGE, value = "50"),
            // Sleep window
            @HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_SLEEP_WINDOW_IN_MILLISECONDS, value = "5000"),
            // Statistics window
            @HystrixProperty(name = HystrixPropertiesManager.METRICS_ROLLING_STATS_TIME_IN_MILLISECONDS, value = "10000")
    })
    public String testCircuitBreaker(@PathVariable Integer num, @RequestParam String name) {
        if (num % 2 == 0) {
            return "Request succeeded";
        } else {
            throw new RuntimeException("");
        }
    }

    // The fallback's parameter count, parameter types, and return type must match the original method;
    // it may additionally declare a Throwable as its last parameter.
    public String circuitBreakerFallback(Integer num, String name) {
        return "Request failed, please try again later";
    }

    // Timeout with fallback
    @GetMapping
    @HystrixCommand(fallbackMethod = "timeoutFallback", commandProperties = {
            // Whether the execution timeout is enabled
            @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_TIMEOUT_ENABLED, value = "true"),
            // Request timeout in milliseconds (default 1000); lower it below the sleep time to see the fallback
            @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_THREAD_TIMEOUT_IN_MILLISECONDS, value = "10000"),
            // Whether to interrupt the thread when the request times out, default true
            @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_THREAD_INTERRUPT_ON_TIMEOUT, value = "true")
    })
    public String testTimeout(@RequestParam String name) throws InterruptedException {
        Thread.sleep(200);
        return "success";
    }

    public String timeoutFallback(String name) {
        return "Request timed out, please try again later";
    }

    // Resource isolation (thread pool) triggering the fallback
    @GetMapping("isolation/threadpool")
    @HystrixCommand(fallbackMethod = "isolationFallback",
            commandProperties = {
                    @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_STRATEGY, value = "THREAD")
            },
            threadPoolProperties = {
                    @HystrixProperty(name = HystrixPropertiesManager.CORE_SIZE, value = "10"),
                    @HystrixProperty(name = HystrixPropertiesManager.MAX_QUEUE_SIZE, value = "-1"),
                    @HystrixProperty(name = HystrixPropertiesManager.QUEUE_SIZE_REJECTION_THRESHOLD, value = "2"),
                    @HystrixProperty(name = HystrixPropertiesManager.KEEP_ALIVE_TIME_MINUTES, value = "1")
            })
    public String testThreadPoolIsolation(@RequestParam String name) throws InterruptedException {
        Thread.sleep(200);
        return "success";
    }

    // Resource isolation (semaphore)
    @GetMapping("isolation/semaphore")
    @HystrixCommand(fallbackMethod = "isolationFallback", commandProperties = {
            @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_STRATEGY, value = "SEMAPHORE"),
            @HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_SEMAPHORE_MAX_CONCURRENT_REQUESTS, value = "2")
    })
    public String testSemaphoreIsolation(@RequestParam String name) throws InterruptedException {
        Thread.sleep(200);
        return "success";
    }

    // Shared by both isolation endpoints; the original declared this method twice, which does not compile
    public String isolationFallback(String name) {
        return "Rejected by resource isolation, please try again later";
    }
}
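Stripped of the Hystrix machinery, the fallback idea is simply "try the primary implementation, and answer with the degraded one if it fails". The following plain-JDK sketch shows only that core; FallbackSketch and withFallback are hypothetical names, and unlike Hystrix it reacts to exceptions only, not to timeouts, rejections, or an open circuit.

```java
import java.util.function.Function;

// Hypothetical helper, not a Hystrix API: wraps a function with a fallback for runtime failures.
class FallbackSketch {

    /** Returns a function that delegates to primary and answers with fallback when primary throws. */
    static <A, R> Function<A, R> withFallback(Function<A, R> primary, Function<A, R> fallback) {
        return arg -> {
            try {
                return primary.apply(arg);
            } catch (RuntimeException e) {
                // Degrade instead of propagating the error to the caller
                return fallback.apply(arg);
            }
        };
    }
}
```

The fallback receives the same argument as the primary call, which mirrors the requirement that a fallbackMethod take the same parameters as the method it protects.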