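Preface
For a quick setup, jump straight to the Consolidated configuration section at the end.
K8s + Spring Boot zero-downtime releases: health checks + rolling updates + graceful shutdown + autoscaling + Prometheus monitoring + configuration separation (image reuse)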
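Configuration
Health checks
Health check types: readiness probe and liveness probe. Probe mechanisms: exec (run a command inside the container), tcpSocket (check that a port accepts connections), httpGet (call an HTTP endpoint).
Application side
Project dependency: pom.xml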
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
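Define the management port, base path and exposed endpoints: application.yaml
management:
  server:
    port: 50000            # dedicated management port
  endpoint:                # enable the health probes
    health:
      probes:
        enabled: true
  endpoints:
    web:
      base-path: /actuator # context path for the endpoints
      exposure:
        include: health    # endpoints to expose
This exposes two endpoints, /actuator/health/readiness and /actuator/health/liveness, reachable at: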
http://127.0.0.1:50000/actuator/health/readiness
http://127.0.0.1:50000/actuator/health/liveness
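Operations side
K8s deployment template: deployment.yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: {APP_NAME}
          image: {IMAGE_URL}
          imagePullPolicy: Always
          ports:
            - containerPort: {APP_PORT}
            - name: management-port
              containerPort: 50000     # application management port
          readinessProbe:              # readiness probe
            httpGet:
              path: /actuator/health/readiness
              port: management-port
            initialDelaySeconds: 30    # delay before the first probe
            periodSeconds: 10          # interval between probes
            timeoutSeconds: 1          # timeout of a single probe
            successThreshold: 1        # consecutive successes to count as healthy
            failureThreshold: 6        # consecutive failures to count as unhealthy
          livenessProbe:               # liveness probe
            httpGet:
              path: /actuator/health/liveness
              port: management-port
            initialDelaySeconds: 30    # delay before the first probe
            periodSeconds: 10          # interval between probes
            timeoutSeconds: 1          # timeout of a single probe
            successThreshold: 1        # consecutive successes to count as healthy
            failureThreshold: 6        # consecutive failures to count as unhealthy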
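The probe values above imply a rough startup deadline: if the application never becomes healthy, the liveness probe restarts the container once failureThreshold consecutive probes have failed. The sketch below estimates that deadline under a simplified model (first probe after initialDelaySeconds, then one probe every periodSeconds; kubelet jitter and timeoutSeconds are ignored). The class and method names are hypothetical, chosen only for illustration.

```java
/** Simplified timing model for a Kubernetes probe; an estimate, not kubelet's exact behaviour. */
final class ProbeTiming {

    /**
     * Approximate seconds from container start until the probe has failed
     * failureThreshold times in a row, assuming every probe fails.
     */
    static int secondsUntilFailure(int initialDelaySeconds, int periodSeconds, int failureThreshold) {
        // The first probe runs after the initial delay; each further failure needs one more period.
        return initialDelaySeconds + (failureThreshold - 1) * periodSeconds;
    }

    public static void main(String[] args) {
        // Values taken from the deployment template above.
        System.out.println(secondsUntilFailure(30, 10, 6)); // prints 80
    }
}
```

Under these assumptions an application that needs longer than roughly 80 seconds to start would be restarted before it is ready, so initialDelaySeconds or failureThreshold should be sized against the slowest expected startup.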
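Rolling updates
Rolling update is a Deployment strategy in Kubernetes. For a zero-downtime release it must be combined with health checks, so that traffic only reaches new Pods once they report ready.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
  labels:
    app: {APP_NAME}
spec:
  selector:
    matchLabels:
      app: {APP_NAME}
  replicas: {REPLICAS}       # number of Pod replicas
  strategy:
    type: RollingUpdate      # rolling update strategy
    rollingUpdate:
      maxSurge: 1            # how many Pods may exist above the desired replica count during the update
      maxUnavailable: 1      # how many Pods may be unavailable during the update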
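To make the effect of maxSurge and maxUnavailable concrete, the following sketch computes the bounds Kubernetes keeps during a rollout when both are given as absolute numbers (percentages are not handled here). The replica count of 3 is an assumed example value, and the class name is hypothetical.

```java
/** Bounds on Pod counts during a rolling update, for absolute maxSurge / maxUnavailable values. */
final class RolloutBounds {

    /** Fewest Pods that stay available while the rollout is in progress. */
    static int minAvailable(int replicas, int maxUnavailable) {
        return Math.max(0, replicas - maxUnavailable);
    }

    /** Most Pods (old plus new) that may exist at the same time. */
    static int maxTotal(int replicas, int maxSurge) {
        return replicas + maxSurge;
    }

    public static void main(String[] args) {
        int replicas = 3; // assumed example value for {REPLICAS}
        System.out.println("min available: " + minAvailable(replicas, 1));
        System.out.println("max total:     " + maxTotal(replicas, 1));
    }
}
```

With maxUnavailable set to 0, as in the consolidated configuration at the end of this article, the number of available Pods never drops below the desired replica count; the rollout then relies entirely on the surge Pod becoming ready first.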
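Graceful shutdown
Before relying on rolling updates in K8s, make sure the application itself shuts down gracefully; otherwise a rolling update still disrupts the business. Graceful shutdown lets the application finish its running threads and release connections and other resources before the service stops.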
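The idea behind graceful shutdown can be sketched with plain JDK classes: stop accepting new work, wait a bounded time for in-flight work, then force the stop. This is only an illustration of the principle, not how Spring Boot implements it internally; the class, the fake "requests" and the timings are all assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrates "stop accepting, drain, then force" with a thread pool standing in for a web server. */
final class GracefulStopSketch {

    /** Returns true if all in-flight tasks finished within the timeout. */
    static boolean stopGracefully(ExecutorService workers, long timeoutMillis) {
        workers.shutdown(); // reject new tasks, let running ones continue
        try {
            if (workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        workers.shutdownNow(); // timeout hit (or interrupted): interrupt what is left
        return false;
    }

    /** Starts `requests` fake requests taking workMillis each, stops the pool, returns how many completed. */
    static int demo(int requests, long workMillis, long timeoutMillis) {
        ExecutorService workers = Executors.newFixedThreadPool(requests);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            workers.submit(() -> {
                try {
                    Thread.sleep(workMillis);    // simulated request handling
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // request aborted by forced stop
                }
            });
        }
        stopGracefully(workers, timeoutMillis);
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed with generous timeout: " + demo(2, 100, 2000));
        System.out.println("completed with short timeout:    " + demo(1, 2000, 50));
    }
}
```

The timeout plays the same role as timeout-per-shutdown-phase in the Spring Boot configuration: requests that finish within it complete normally, and anything still running afterwards is cut off.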
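Application side
Project dependency: pom.xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
Define the ports, base path and exposed endpoints: application.yaml
spring:
  application:
    name: <xxx>
  profiles:
    active: @profileActive@            # Maven resource-filtering placeholder
  lifecycle:
    timeout-per-shutdown-phase: 30s    # shutdown timeout: after 30s the application stops regardless
server:
  port: 8080
  shutdown: graceful                   # default is IMMEDIATE (stop at once); GRACEFUL enables graceful shutdown
management:
  server:
    port: 50000                        # dedicated management port
  endpoint:                            # enable the shutdown endpoint and the health probes
    shutdown:
      enabled: true
    health:
      probes:
        enabled: true
  endpoints:
    web:
      base-path: /actuator             # context path for the endpoints
      exposure:
        include: health,shutdown       # endpoints to expose
This exposes the /actuator/shutdown endpoint, which is called as follows: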
curl -X POST 127.0.0.1:50000/actuator/shutdown
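Operations side
Make sure the Dockerfile template installs curl; otherwise the curl command is not available inside the container.
FROM openjdk:8-jdk-alpine
# Build arguments
ARG JAR_FILE
ARG WORK_PATH="/app"
ARG EXPOSE_PORT=8080
# Environment variables
ENV JAVA_OPTS="" \
    JAR_FILE=${JAR_FILE}
# Set the time zone
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' >/etc/timezone
# Switch to a mirror repository and install curl
RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories \
    && apk add --no-cache curl
# Copy the jar from the Maven target directory into the working directory of the image
COPY target/$JAR_FILE $WORK_PATH/
# Set the working directory
WORKDIR $WORK_PATH
# Port exposed to the outside
EXPOSE $EXPOSE_PORT
# Container entry point; exec makes the JVM the main process so it receives termination signals
ENTRYPOINT exec java $JAVA_OPTS -jar $JAR_FILE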
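K8s deployment template: deployment.yaml
Note: in testing, Java projects can omit the preStop hook, because Spring Boot's graceful shutdown is already triggered by the SIGTERM that Kubernetes sends when it terminates the Pod.
If you do use the hook, the image must contain curl, and the application management port (50000) must not be exposed to the public network.
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: {APP_NAME}
          image: {IMAGE_URL}
          imagePullPolicy: Always
          ports:
            - containerPort: {APP_PORT}
            - containerPort: 50000
          lifecycle:
            preStop:           # hook executed before the container is stopped
              exec:
                command: ["curl", "-XPOST", "127.0.0.1:50000/actuator/shutdown"]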
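Autoscaling
Set resource requests and limits on the Pod first, then create the HPA. CPU utilization targets are measured relative to the container's CPU request, so the HPA cannot work without it.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
  labels:
    app: {APP_NAME}
spec:
  template:
    spec:
      containers:
        - name: {APP_NAME}
          image: {IMAGE_URL}
          imagePullPolicy: Always
          resources:             # container resource management
            limits:              # upper bound on resource usage
              cpu: 0.5
              memory: 1Gi
            requests:            # minimum resources reserved, used for scheduling
              cpu: 0.15
              memory: 300Mi
---
kind: HorizontalPodAutoscaler    # autoscaling controller
apiVersion: autoscaling/v2beta2  # on Kubernetes 1.23+ use autoscaling/v2 (v2beta2 was removed in 1.26)
metadata:
  name: {APP_NAME}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {APP_NAME}
  minReplicas: {REPLICAS}        # scaling range
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu                # resource metric to scale on
        target:
          type: Utilization
          averageUtilization: 50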
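The core of the HPA scaling decision is the documented formula desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured range. The sketch below applies it to the 50% CPU target and the maximum of 6 replicas from the manifest above; the minimum of 2 is an assumed value for {REPLICAS}. It deliberately ignores the controller's tolerance band and stabilization windows, and the class name is hypothetical.

```java
/** Simplified HPA replica calculation; ignores tolerance and stabilization behaviour. */
final class HpaMath {

    /**
     * @param currentReplicas    replicas currently running
     * @param currentUtilization average CPU utilization in percent of the CPU request
     * @param targetUtilization  averageUtilization from the HPA spec, in percent
     */
    static int desiredReplicas(int currentReplicas, double currentUtilization,
                               double targetUtilization, int minReplicas, int maxReplicas) {
        int raw = (int) Math.ceil(currentReplicas * currentUtilization / targetUtilization);
        // Clamp to the range given by minReplicas / maxReplicas.
        return Math.max(minReplicas, Math.min(maxReplicas, raw));
    }

    public static void main(String[] args) {
        int min = 2; // assumed example value for {REPLICAS}
        int max = 6; // maxReplicas from the manifest
        System.out.println(desiredReplicas(2, 90, 50, min, max));  // load above target: scale out
        System.out.println(desiredReplicas(4, 100, 50, min, max)); // capped by maxReplicas
        System.out.println(desiredReplicas(3, 10, 50, min, max));  // low load: floor at minReplicas
    }
}
```

Because utilization is relative to the request, the 0.15 CPU request in the Deployment means a 50% target corresponds to an average of about 75 millicores per Pod.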
Prometheus integration
Application side
Project dependencies: pom.xml
<!-- Spring Boot monitoring support -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
Define the management port, base path and exposed endpoints: application.yaml
management:
  server:
    port: 50000                  # dedicated management port
  metrics:
    tags:
      application: ${spring.application.name}
  endpoints:
    web:
      base-path: /actuator       # context path for the endpoints
      exposure:
        include: metrics,prometheus   # endpoints to expose
This exposes the /actuator/metrics and /actuator/prometheus endpoints, reachable at:
http://127.0.0.1:50000/actuator/metrics
http://127.0.0.1:50000/actuator/prometheus
Operations side: deployment.yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        prometheus.io/port: "50000"
        prometheus.io/path: /actuator/prometheus   # assigned in the pipeline
        prometheus.io/scrape: "true"               # Pod-based service discovery
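Configuration separation
Approach: mount the external configuration file through a ConfigMap and select the active profile at runtime.
Benefits: configuration is kept out of the image, which avoids leaking sensitive information, and the same image can be reused across environments, which speeds up delivery.
Create the ConfigMap from a file
Generate the YAML with a client-side dry run. The ConfigMap name must match the name referenced by the Deployment volume; add -n with the target namespace if the application does not run in the default one.
kubectl create cm <APP_NAME> --from-file=application-test.yaml --dry-run=client -o yaml > configmap.yaml
Apply or update it
kubectl apply -f configmap.yaml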
Mount the ConfigMap and select the active profile
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
  labels:
    app: {APP_NAME}
spec:
  template:
    spec:
      containers:
        - name: {APP_NAME}
          image: {IMAGE_URL}
          imagePullPolicy: Always
          env:
            - name: SPRING_PROFILES_ACTIVE   # active profile
              value: test
          volumeMounts:                      # mount the ConfigMap
            - name: conf
              mountPath: "/app/config"       # config directory under the Dockerfile working directory (/app)
              readOnly: true
      volumes:
        - name: conf
          configMap:
            name: {APP_NAME}
Consolidated configuration
Application side
Project dependencies: pom.xml
<!-- Spring Boot monitoring support -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
Define the ports, base path and exposed endpoints: application.yaml
spring:
  application:
    name: project-sample
  profiles:
    active: @profileActive@            # Maven resource-filtering placeholder
  lifecycle:
    timeout-per-shutdown-phase: 30s    # shutdown timeout: after 30s the application stops regardless
server:
  port: 8080
  shutdown: graceful                   # default is IMMEDIATE (stop at once); GRACEFUL enables graceful shutdown
management:
  server:
    port: 50000                        # dedicated management port
  metrics:
    tags:
      application: ${spring.application.name}
  endpoint:                            # enable the shutdown endpoint and the health probes
    shutdown:
      enabled: true
    health:
      probes:
        enabled: true
  endpoints:
    web:
      base-path: /actuator             # context path for the endpoints
      exposure:
        include: health,shutdown,metrics,prometheus   # endpoints to expose
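Operations side
Make sure the Dockerfile template installs curl; otherwise the curl command is not available inside the container.
FROM openjdk:8-jdk-alpine
# Build arguments
ARG JAR_FILE
ARG WORK_PATH="/app"
ARG EXPOSE_PORT=8080
# Environment variables
ENV JAVA_OPTS="" \
    JAR_FILE=${JAR_FILE}
# Set the time zone
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' >/etc/timezone
# Switch to a mirror repository and install curl
RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories \
    && apk add --no-cache curl
# Copy the jar from the Maven target directory into the working directory of the image
COPY target/$JAR_FILE $WORK_PATH/
# Set the working directory
WORKDIR $WORK_PATH
# Port exposed to the outside
EXPOSE $EXPOSE_PORT
# Container entry point; exec makes the JVM the main process so it receives termination signals
ENTRYPOINT exec java $JAVA_OPTS -jar $JAR_FILE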
K8s deployment template: deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
  labels:
    app: {APP_NAME}
spec:
  selector:
    matchLabels:
      app: {APP_NAME}
  replicas: {REPLICAS}                 # number of Pod replicas
  strategy:
    type: RollingUpdate                # rolling update strategy
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      name: {APP_NAME}
      labels:
        app: {APP_NAME}
      annotations:
        timestamp: {TIMESTAMP}
        prometheus.io/port: "50000"    # cannot be assigned dynamically
        prometheus.io/path: /actuator/prometheus
        prometheus.io/scrape: "true"   # Pod-based service discovery
    spec:
      affinity:                        # scheduling policy: spread Pods across hosts / availability zones
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - {APP_NAME}
                topologyKey: "kubernetes.io/hostname"   # for multiple zones use "topology.kubernetes.io/zone"
      terminationGracePeriodSeconds: 30   # grace period for graceful termination
      containers:
        - name: {APP_NAME}
          image: {IMAGE_URL}
          imagePullPolicy: Always
          ports:
            - containerPort: {APP_PORT}
            - name: management-port
              containerPort: 50000     # application management port
          readinessProbe:              # readiness probe
            httpGet:
              path: /actuator/health/readiness
              port: management-port
            initialDelaySeconds: 30    # delay before the first probe
            periodSeconds: 10          # interval between probes
            timeoutSeconds: 1          # timeout of a single probe
            successThreshold: 1        # consecutive successes to count as healthy
            failureThreshold: 9        # consecutive failures to count as unhealthy
          livenessProbe:               # liveness probe
            httpGet:
              path: /actuator/health/liveness
              port: management-port
            initialDelaySeconds: 30    # delay before the first probe
            periodSeconds: 10          # interval between probes
            timeoutSeconds: 1          # timeout of a single probe
            successThreshold: 1        # consecutive successes to count as healthy
            failureThreshold: 6        # consecutive failures to count as unhealthy
          resources:                   # container resource management
            limits:                    # upper bound on resource usage
              cpu: 0.5
              memory: 1Gi
            requests:                  # minimum resources reserved, used for scheduling
              cpu: 0.1
              memory: 200Mi
          env:
            - name: TZ
              value: Asia/Shanghai
---
kind: HorizontalPodAutoscaler          # autoscaling controller
apiVersion: autoscaling/v2beta2        # on Kubernetes 1.23+ use autoscaling/v2 (v2beta2 was removed in 1.26)
metadata:
  name: {APP_NAME}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {APP_NAME}
  minReplicas: {REPLICAS}              # scaling range
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu                      # resource metric to scale on
        target:
          type: Utilization
          averageUtilization: 50