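Contents
- Preface
- 1. What JobFailMonitorHelper does:
- 2. JobFailMonitorHelper source code:
- 2.1 start() initialization
- 2.1.1 Retrying failed jobs:
- 2.1.2 Sending failure alarms:
- 2.1.2.1 The JobAlarmer alarm class:
- 2.1.2.2 alarm: sending the alarm:
- 2.2 toStop(): stopping the thread and releasing resources:
- Summary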
Preface
This article walks through what JobFailMonitorHelper in xxl-job-admin does, based on its source code.
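1. What JobFailMonitorHelper does:
JobFailMonitorHelper is a helper class in xxl-job-admin that monitors failed job executions and handles them. Its main responsibilities are:
- Detecting failed executions: a background thread polls the xxl_job_log table every 10 seconds for logs whose trigger or execution did not succeed (for example an execution that timed out or threw an exception) and whose alarm has not been processed yet.
- Handling failed executions: for every failed log it applies the configured strategy. It triggers the job again if retries remain, and it sends an alarm notification.
- Recording alarm results: it writes the outcome for each failed log to the alarm_status field (no alarm needed, alarm succeeded, alarm failed), which gives administrators a basis for monitoring and analysis. The source code shown in this article does not itself generate reports.
- Improving robustness: by detecting and handling failed executions promptly, it makes the system more robust and reliable and helps jobs run correctly and on time.
In short, JobFailMonitorHelper is the component of xxl-job-admin that watches for failed executions and reacts to them, so that administrators learn about job failures quickly and the system stays stable.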
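The failure condition that the monitor polls for can be sketched as a plain predicate. This is only an illustration that mirrors the WHERE clause of the query quoted in section 2.1.1; the class FailLogPredicate and its method names are hypothetical and are not part of xxl-job.

```java
// Hypothetical helper, not part of xxl-job: mirrors the WHERE clause
// !((trigger_code in (0, 200) and handle_code = 0) OR (handle_code = 200)) AND alarm_status = 0
final class FailLogPredicate {
    // 200 is the success code used in the quoted SQL
    static final int SUCCESS_CODE = 200;

    private FailLogPredicate() {}

    /** True when a log row counts as failed according to the quoted condition. */
    static boolean isFailed(int triggerCode, int handleCode) {
        // Trigger not failed and no handle result yet: the execution is still pending
        boolean pending = (triggerCode == 0 || triggerCode == SUCCESS_CODE) && handleCode == 0;
        // The executor reported success
        boolean handledOk = handleCode == SUCCESS_CODE;
        return !(pending || handledOk);
    }

    /** True when the monitor would pick the row up: failed and alarm_status still 0. */
    static boolean shouldProcess(int triggerCode, int handleCode, int alarmStatus) {
        return isFailed(triggerCode, handleCode) && alarmStatus == 0;
    }
}
```

Under this reading, a log with trigger_code 500 is treated as failed immediately, while a log with trigger_code 200 and handle_code 0 is left alone because the execution has not reported a result yet.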
2. JobFailMonitorHelper source code:
2.1 start() initialization
2.1.1 Retrying failed jobs:
// Logger
private static Logger logger = LoggerFactory.getLogger(JobFailMonitorHelper.class);

// Singleton instance of JobFailMonitorHelper
private static JobFailMonitorHelper instance = new JobFailMonitorHelper();
public static JobFailMonitorHelper getInstance(){
    return instance;
}

// ---------------------- monitor ----------------------

// Monitor thread
private Thread monitorThread;
// Stop flag checked by the monitor loop
private volatile boolean toStop = false;

public void start(){
    monitorThread = new Thread(new Runnable() {

        @Override
        public void run() {

            // monitor
            while (!toStop) {
                try {
                    // Fetch up to 1000 ids of failed logs that are not yet processed (alarm_status = 0), oldest first
                    /**
                     * SELECT id FROM `xxl_job_log`
                     * WHERE !(
                     *     (trigger_code in (0, 200) and handle_code = 0)
                     *     OR
                     *     (handle_code = 200)
                     * )
                     * AND `alarm_status` = 0
                     * ORDER BY id ASC
                     * LIMIT #{pagesize}
                     */
                    List<Long> failLogIds = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().findFailJobLogIds(1000);
                    if (failLogIds!=null && !failLogIds.isEmpty()) {
                        // Iterate over the failed log ids
                        for (long failLogId: failLogIds) {

                            // lock log: claim the log with an optimistic lock (alarm_status 0 -> -1)
                            /**
                             * UPDATE xxl_job_log
                             * SET `alarm_status` = #{newAlarmStatus}
                             * WHERE `id`= #{logId} AND `alarm_status` = #{oldAlarmStatus}
                             */
                            int lockRet = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().updateAlarmStatus(failLogId, 0, -1);
                            if (lockRet < 1) {
                                // Lock not acquired; continue with the next log
                                continue;
                            }
                            // Load the execution log
                            /**
                             * SELECT <include refid="Base_Column_List" />
                             * FROM xxl_job_log AS t
                             * WHERE t.id = #{id}
                             */
                            XxlJobLog log = XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().load(failLogId);
                            // Load the job definition
                            /**
                             * SELECT <include refid="Base_Column_List" />
                             * FROM xxl_job_info AS t
                             * WHERE t.id = #{id}
                             */
                            XxlJobInfo info = XxlJobAdminConfig.getAdminConfig().getXxlJobInfoDao().loadById(log.getJobId());

                            // 1、fail retry monitor: check the remaining retry count
                            if (log.getExecutorFailRetryCount() > 0) {
                                // Retries remain: trigger the job again, passing the retry count reduced by 1
                                /**
                                 * For details of how a trigger is executed, see the article:
                                 * "Principles: xxl-job scheduled tasks, server side (admin) startup, JobTriggerPoolHelper initialization (3)"
                                 * Link: https://blog.csdn.net/l123lgx/article/details/136349951
                                 */
                                JobTriggerPoolHelper.trigger(log.getJobId(), TriggerTypeEnum.RETRY, (log.getExecutorFailRetryCount()-1), log.getExecutorShardingParam(), log.getExecutorParam(), null);
                                String retryMsg = "<br><br><span style=\"color:#F39C12;\" > >>>>>>>>>>>"+ I18nUtil.getString("jobconf_trigger_type_retry") +"<<<<<<<<<<< </span><br>";
                                // Append the retry marker to the trigger message
                                log.setTriggerMsg(log.getTriggerMsg() + retryMsg);
                                // Persist the updated log
                                XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().updateTriggerInfo(log);
                            }

                            // 2、fail alarm monitor: send the failure alarm
                            // Alarm status: 0 = default, -1 = locked, 1 = no alarm needed, 2 = alarm succeeded, 3 = alarm failed
                            int newAlarmStatus = 0;
                            if (info != null) {
                                // Run the alarm logic
                                boolean alarmResult = XxlJobAdminConfig.getAdminConfig().getJobAlarmer().alarm(info, log);
                                // Record the alarm result
                                newAlarmStatus = alarmResult?2:3;
                            } else {
                                // No alarm needed
                                newAlarmStatus = 1;
                            }

                            // Write the alarm result back and release the lock (alarm_status -1 -> newAlarmStatus)
                            /**
                             * UPDATE xxl_job_log
                             * SET `alarm_status` = #{newAlarmStatus}
                             * WHERE `id`= #{logId} AND `alarm_status` = #{oldAlarmStatus}
                             */
                            XxlJobAdminConfig.getAdminConfig().getXxlJobLogDao().updateAlarmStatus(failLogId, -1, newAlarmStatus);
                        }
                    }

                } catch (Exception e) {
                    if (!toStop) {
                        logger.error(">>>>>>>>>>> xxl-job, job fail monitor thread error:{}", e);
                    }
                }

                try {
                    TimeUnit.SECONDS.sleep(10);
                } catch (Exception e) {
                    if (!toStop) {
                        logger.error(e.getMessage(), e);
                    }
                }
            }

            logger.info(">>>>>>>>>>> xxl-job, job fail monitor thread stop");
        }
    });
    // Mark monitorThread as a daemon thread, name it, and start it
    monitorThread.setDaemon(true);
    monitorThread.setName("xxl-job, admin JobFailMonitorHelper");
    monitorThread.start();
}
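The optimistic lock on alarm_status can be illustrated with an in-memory stand-in. In the sketch below, a ConcurrentHashMap plays the role of the xxl_job_log table and ConcurrentMap.replace(key, oldValue, newValue) plays the role of the conditional UPDATE. The class AlarmStatusStore is hypothetical and exists only for this illustration; the real code goes through XxlJobLogDao.updateAlarmStatus.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for the alarm_status column of xxl_job_log.
final class AlarmStatusStore {
    static final int DEFAULT = 0;       // not processed yet
    static final int LOCKED = -1;       // claimed by one monitor thread
    static final int NO_ALARM = 1;      // no alarm needed
    static final int ALARM_OK = 2;      // alarm succeeded
    static final int ALARM_FAILED = 3;  // alarm failed

    private final ConcurrentMap<Long, Integer> statusByLogId = new ConcurrentHashMap<>();

    /** Registers a log row with the default status. */
    void insertLog(long logId) {
        statusByLogId.put(logId, DEFAULT);
    }

    /**
     * Compare-and-set, like "UPDATE ... WHERE id = ? AND alarm_status = ?".
     * Returns the number of affected rows: 1 if the status matched and was changed, else 0.
     */
    int updateAlarmStatus(long logId, int oldStatus, int newStatus) {
        return statusByLogId.replace(logId, oldStatus, newStatus) ? 1 : 0;
    }

    /** Current status of a registered log; throws NullPointerException for an unknown id. */
    int statusOf(long logId) {
        return statusByLogId.get(logId);
    }
}
```

If two admin instances call updateAlarmStatus(id, 0, -1) for the same log, only one of them gets 1 back; the other gets 0 and skips the log, which is why each failed log is retried and alarmed by a single instance.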
2.1.2 Sending failure alarms:
XxlJobAdminConfig.getAdminConfig().getJobAlarmer().alarm(info, log) iterates over all beans that implement the JobAlarm interface and calls doAlarm on each of them.
2.1.2.1 The JobAlarmer alarm class:
1) Initialization of JobAlarmer:
// @Component lets Spring detect this class, build the JobAlarmer bean, and register it as a singleton
@Component
public class JobAlarmer implements ApplicationContextAware, InitializingBean {
    private static Logger logger = LoggerFactory.getLogger(JobAlarmer.class);

    private ApplicationContext applicationContext;
    private List<JobAlarm> jobAlarmList;

    // ApplicationContextAware: Spring calls setApplicationContext to hand over the application context
    @Override
    public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        this.applicationContext = applicationContext;
    }

    // InitializingBean: once the bean's properties are set, Spring calls afterPropertiesSet to finish initialization
    @Override
    public void afterPropertiesSet() throws Exception {
        // Look up every bean of type JobAlarm in the application context
        Map<String, JobAlarm> serviceBeanMap = applicationContext.getBeansOfType(JobAlarm.class);
        if (serviceBeanMap != null && serviceBeanMap.size() > 0) {
            // Store the beans in jobAlarmList
            jobAlarmList = new ArrayList<JobAlarm>(serviceBeanMap.values());
        }
    }
}
2) The JobAlarm alarm interface:
public interface JobAlarm {

    /**
     * job alarm
     *
     * @param info
     * @param jobLog
     * @return
     */
    public boolean doAlarm(XxlJobInfo info, XxlJobLog jobLog);

}
3) EmailJobAlarm, the implementation of JobAlarm:
At present the only alarm implementation on the server side is EmailJobAlarm, which sends alarms by email. To add another channel, follow the pattern of EmailJobAlarm: implement JobAlarm, override doAlarm, and register the class as a Spring bean (for example with @Component) so that JobAlarmer picks it up.
/**
 * job alarm by email
 *
 * @author xuxueli 2020-01-19
 */
@Component
public class EmailJobAlarm implements JobAlarm {
    private static Logger logger = LoggerFactory.getLogger(EmailJobAlarm.class);

    /**
     * fail alarm: runs the alarm logic
     *
     * @param jobLog
     */
    @Override
    public boolean doAlarm(XxlJobInfo info, XxlJobLog jobLog){
        boolean alarmResult = true;

        // send monitor email, only when an alarm email address is configured
        if (info!=null && info.getAlarmEmail()!=null && info.getAlarmEmail().trim().length()>0) {

            // alarmContent
            String alarmContent = "Alarm Job LogId=" + jobLog.getId();
            if (jobLog.getTriggerCode() != ReturnT.SUCCESS_CODE) {
                alarmContent += "<br>TriggerMsg=<br>" + jobLog.getTriggerMsg();
            }
            if (jobLog.getHandleCode()>0 && jobLog.getHandleCode() != ReturnT.SUCCESS_CODE) {
                alarmContent += "<br>HandleCode=" + jobLog.getHandleMsg();
            }

            // email info
            XxlJobGroup group = XxlJobAdminConfig.getAdminConfig().getXxlJobGroupDao().load(Integer.valueOf(info.getJobGroup()));
            String personal = I18nUtil.getString("admin_name_full");
            String title = I18nUtil.getString("jobconf_monitor");
            String content = MessageFormat.format(loadEmailJobAlarmTemplate(),
                    group!=null?group.getTitle():"null",
                    info.getId(),
                    info.getJobDesc(),
                    alarmContent);

            Set<String> emailSet = new HashSet<String>(Arrays.asList(info.getAlarmEmail().split(",")));
            // Send the message to each address
            for (String email: emailSet) {

                // make mail
                try {
                    MimeMessage mimeMessage = XxlJobAdminConfig.getAdminConfig().getMailSender().createMimeMessage();

                    MimeMessageHelper helper = new MimeMessageHelper(mimeMessage, true);
                    helper.setFrom(XxlJobAdminConfig.getAdminConfig().getEmailFrom(), personal);
                    helper.setTo(email);
                    helper.setSubject(title);
                    helper.setText(content, true);

                    XxlJobAdminConfig.getAdminConfig().getMailSender().send(mimeMessage);
                } catch (Exception e) {
                    logger.error(">>>>>>>>>>> xxl-job, job fail alarm email send error, JobLogId:{}", jobLog.getId(), e);

                    alarmResult = false;
                }
            }
        }

        return alarmResult;
    }

    /**
     * load email job alarm template
     *
     * @return
     */
    private static final String loadEmailJobAlarmTemplate(){
        String mailBodyTemplate = "<h5>" + I18nUtil.getString("jobconf_monitor_detail") + ":</span>"
                +"<table border=\"1\" cellpadding=\"3\" style=\"border-collapse:collapse; width:80%;\" >\n" +
                " <thead style=\"font-weight: bold;color: #ffffff;background-color: #ff8c00;\" >" +
                " <tr>\n" +
                " <td width=\"20%\" >"+ I18nUtil.getString("jobinfo_field_jobgroup") +"</td>\n" +
                " <td width=\"10%\" >"+ I18nUtil.getString("jobinfo_field_id") +"</td>\n" +
                " <td width=\"20%\" >"+ I18nUtil.getString("jobinfo_field_jobdesc") +"</td>\n" +
                " <td width=\"10%\" >"+ I18nUtil.getString("jobconf_monitor_alarm_title") +"</td>\n" +
                " <td width=\"40%\" >"+ I18nUtil.getString("jobconf_monitor_alarm_content") +"</td>\n" +
                " </tr>\n" +
                " </thead>\n" +
                " <tbody>\n" +
                " <tr>\n" +
                " <td>{0}</td>\n" +
                " <td>{1}</td>\n" +
                " <td>{2}</td>\n" +
                " <td>"+ I18nUtil.getString("jobconf_monitor_alarm_type") +"</td>\n" +
                " <td>{3}</td>\n" +
                " </tr>\n" +
                " </tbody>\n" +
                "</table>";

        return mailBodyTemplate;
    }

}
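An additional alarm channel could be sketched as follows. To keep the example compilable on its own, it uses hypothetical stand-in types (JobAlarmSketch, JobInfoStub, JobLogStub) in place of JobAlarm, XxlJobInfo and XxlJobLog, and it collects messages in memory instead of contacting any real service. It is an illustration of the extension pattern, not code from xxl-job.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the JobAlarm interface (hypothetical, for this sketch only)
interface JobAlarmSketch {
    boolean doAlarm(JobInfoStub info, JobLogStub jobLog);
}

// Stand-in for XxlJobInfo with only the fields this sketch needs
final class JobInfoStub {
    final int id;
    final String jobDesc;

    JobInfoStub(int id, String jobDesc) {
        this.id = id;
        this.jobDesc = jobDesc;
    }
}

// Stand-in for XxlJobLog with only the fields this sketch needs
final class JobLogStub {
    final long id;
    final int triggerCode;
    final String triggerMsg;

    JobLogStub(long id, int triggerCode, String triggerMsg) {
        this.id = id;
        this.triggerCode = triggerCode;
        this.triggerMsg = triggerMsg;
    }
}

// Hypothetical alarm channel that records messages in memory instead of sending email
final class InMemoryJobAlarm implements JobAlarmSketch {
    final List<String> sent = new ArrayList<>();

    @Override
    public boolean doAlarm(JobInfoStub info, JobLogStub jobLog) {
        if (info == null || jobLog == null) {
            // Design choice of this sketch: report failure when there is nothing to alarm on
            return false;
        }
        // Build the content the same way EmailJobAlarm starts its message
        String content = "Alarm Job LogId=" + jobLog.id;
        if (jobLog.triggerCode != 200) {
            content += ", TriggerMsg=" + jobLog.triggerMsg;
        }
        sent.add("job " + info.id + " (" + info.jobDesc + "): " + content);
        return true;
    }
}
```

In the admin project itself, such a class would implement the real JobAlarm interface and be annotated with @Component, so that JobAlarmer finds it through getBeansOfType(JobAlarm.class).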
2.1.2.2 alarm: sending the alarm:
alarm() iterates over all beans that implement the JobAlarm interface and calls doAlarm on each. It returns true only if every implementation returns true; if no implementation is registered, it returns false.
public boolean alarm(XxlJobInfo info, XxlJobLog jobLog) {
    // jobAlarmList holds every bean that implements the JobAlarm interface
    boolean result = false;
    if (jobAlarmList!=null && jobAlarmList.size()>0) {
        result = true; // success means all-success
        for (JobAlarm alarm: jobAlarmList) {
            boolean resultItem = false;
            try {
                resultItem = alarm.doAlarm(info, jobLog);
            } catch (Exception e) {
                logger.error(e.getMessage(), e);
            }
            if (!resultItem) {
                result = false;
            }
        }
    }
    // Return the overall alarm result
    return result;
}
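The "success means all-success" rule can be isolated in a small self-contained sketch. Here BooleanSupplier stands in for a JobAlarm whose doAlarm has already been bound to its arguments; the class AlarmAggregationSketch and the method alarmAll are hypothetical names chosen for this illustration.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the aggregation rule used by alarm(): every channel is
// called, and the overall result is true only if all of them succeed.
final class AlarmAggregationSketch {
    private AlarmAggregationSketch() {}

    static boolean alarmAll(List<BooleanSupplier> channels) {
        boolean result = false; // no channel registered means "not alarmed"
        if (channels != null && !channels.isEmpty()) {
            result = true;
            for (BooleanSupplier channel : channels) {
                boolean itemResult = false;
                try {
                    itemResult = channel.getAsBoolean();
                } catch (Exception e) {
                    // An exception in one channel counts as a failure of that channel only
                }
                if (!itemResult) {
                    result = false; // keep going so the remaining channels still run
                }
            }
        }
        return result;
    }
}
```

A failing or throwing channel does not stop the loop, so the remaining channels still get a chance to deliver the alarm; the failure only shows up in the final result, which the monitor stores as alarm_status 3.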
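2.2 toStop(): stopping the thread and releasing resources:
public void toStop(){
    // Set the stop flag so the monitor loop exits
    toStop = true;
    // interrupt and wait: wake the thread if it is sleeping, then wait for it to finish
    monitorThread.interrupt();
    try {
        monitorThread.join();
    } catch (InterruptedException e) {
        logger.error(e.getMessage(), e);
    }
}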
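The stop sequence (set a volatile flag, interrupt, join) can be tried out in isolation with the sketch below. StoppableMonitorSketch is a hypothetical class that keeps only the thread handling of JobFailMonitorHelper and replaces the actual monitoring work with a counter.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the start/stop pattern: volatile flag + interrupt + join
final class StoppableMonitorSketch {
    private volatile boolean toStop = false;
    private final AtomicInteger iterations = new AtomicInteger();
    private Thread worker;

    void start() {
        worker = new Thread(() -> {
            while (!toStop) {
                iterations.incrementAndGet(); // placeholder for one monitoring pass
                try {
                    TimeUnit.SECONDS.sleep(10);
                } catch (InterruptedException e) {
                    // Woken up by stop(); the loop condition re-checks the flag
                }
            }
        });
        worker.setDaemon(true);
        worker.setName("monitor-sketch");
        worker.start();
    }

    void stop() {
        toStop = true;      // make the loop exit on its next check
        worker.interrupt(); // cut the 10-second sleep short
        try {
            worker.join();  // wait until the worker has really finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
    }

    boolean isAlive() {
        return worker != null && worker.isAlive();
    }

    int iterations() {
        return iterations.get();
    }
}
```

Without the interrupt, stop() could block for up to the full sleep interval before the loop notices the flag; the flag has to be volatile so that the write made by the stopping thread is visible to the worker.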
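Summary
This article described how JobFailMonitorHelper works: its monitor thread scans for failed logs every 10 seconds, triggers jobs again while retries remain, dispatches alarms through JobAlarmer to every JobAlarm implementation, and shuts down cleanly through toStop().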