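Abstract: If an app crashes on launch twice in a row and fails to start, users will often uninstall it. This article describes a self-repair technique for this kind of crash.
Original article: http://click.aliyun.com/m/41487/
Author: Alibaba Cloud, Mobile Cloud, Front-End Team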
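Preface
If an app crashes on launch twice in a row and fails to start, users will often uninstall it.
Continuous launch crashes are arguably the most severe kind of crash. They are often related to database operations, for example: the database is corrupted, or the server returns malformed data that is written to the database and later causes an out-of-bounds array access or an unrecognized-selector error when the app reads it back.
So, besides hot-fixing, can the app "self-repair" this problem?
In the article 《iOS 启动连续闪退保护方案》 (a protection scheme for continuous launch crashes on iOS), the WeRead (微信读书) team explains an approach to self-repairing continuous launch crashes and open-sourced an implementation on GitHub, GYBootingProtection. The approach is sound and lightweight.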
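How it works
The WeRead team's article already explains the principle in detail, so it is not repeated here; the original post illustrates the implementation with a flowchart.
One part of the implementation can be improved, which can reduce the false-positive rate by more than 50%: listening for the user manually swiping the app away, an event that can be detected in certain scenarios. This article also offers suggestions on the API design, and finally presents the optimized implementation.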
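Optimization: reducing the false-positive rate by more than 50%
There are two cases in which the user actively kills the app:
The user swipes the app away while it is in the foreground
The user swipes the app away while it is in the background
The first case is the more common one. It can be caught by observing UIApplicationWillTerminateNotification, and the count is rolled back when the notification fires. The second case cannot be observed. Even so, handling the first case alone is enough to reduce the false-positive rate by more than 50%.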
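The rollback described above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: DemoLaunchCounter and DemoWillTerminateNotification are hypothetical names, the counter lives in memory rather than in NSUserDefaults, and the notification name is a stand-in so the sketch builds with Foundation alone (ARC assumed). In a real app the observer would be registered for UIKit's UIApplicationWillTerminateNotification.

```objc
#import <Foundation/Foundation.h>

// Stand-in for UIKit's UIApplicationWillTerminateNotification (assumption:
// used here only so the sketch compiles without UIKit).
static NSString *const DemoWillTerminateNotification = @"DemoWillTerminateNotification";

// Hypothetical in-memory counter; the article persists the count instead.
@interface DemoLaunchCounter : NSObject
@property (nonatomic, assign) NSUInteger crashCount;
@property (nonatomic, assign) BOOL checking; // YES while inside the launch window
- (void)markLaunch;
@end

@implementation DemoLaunchCounter

- (instancetype)init {
    if ((self = [super init])) {
        [[NSNotificationCenter defaultCenter] addObserver:self
                                                 selector:@selector(appWillTerminate:)
                                                     name:DemoWillTerminateNotification
                                                   object:nil];
    }
    return self;
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

// Every launch is counted pessimistically as a potential crash.
- (void)markLaunch {
    self.crashCount += 1;
    self.checking = YES;
}

// A user-initiated kill is not a crash, so undo this launch's increment.
- (void)appWillTerminate:(NSNotification *)note {
    if (self.checking && self.crashCount > 0) {
        self.crashCount -= 1;
    }
    self.checking = NO;
}

@end
```

The decrement is guarded by the checking flag so that a termination after the launch window has already passed does not lower a count that was reset earlier.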
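Suggestions for improving the original API design
1. Expose the mechanism's state through the API as an enum
The current state of the mechanism, such as NeedFix or isFixing, should be exposed through the API as an enum. For example:
The app launched normally
Checking whether the app will crash within the given time window. Note: in the checking state, the "continuous launch crash count" is less than or equal to the upper limit
The app has crashed on launch repeatedly and needs repair
The app has crashed on launch repeatedly and repair is in progress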
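As a rough illustration of how the four states relate, the sketch below derives a status from three stored facts. The names DemoBootStatus and DemoResolveStatus are hypothetical; the precedence (checking first, then the threshold, then the fixing flag) follows the order used in the optimized implementation later in this article.

```objc
#import <Foundation/Foundation.h>

// Hypothetical mirror of the four states listed above.
typedef NS_ENUM(NSInteger, DemoBootStatus) {
    DemoBootStatusNormal,          // launched normally
    DemoBootStatusNormalChecking,  // still inside the launch window
    DemoBootStatusNeedFix,         // threshold reached, repair not started
    DemoBootStatusFixing,          // threshold reached, repair in progress
};

// Derives the status from the persisted flags and counter.
static DemoBootStatus DemoResolveStatus(BOOL isChecking,
                                        BOOL isFixing,
                                        NSUInteger crashCount,
                                        NSUInteger fixThreshold) {
    // The launch itself bumps the counter, so a count equal to the threshold
    // during the checking window does not yet mean the app is broken.
    if (isChecking) {
        return DemoBootStatusNormalChecking;
    }
    if (crashCount < fixThreshold) {
        return DemoBootStatusNormal;
    }
    return isFixing ? DemoBootStatusFixing : DemoBootStatusNeedFix;
}
```

A caller would typically switch on the returned value, for example skipping database access while the status is DemoBootStatusNeedFix or DemoBootStatusFixing.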
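2. Key values should be configurable as initializer parameters
The current launch-crash status
The "continuous launch crash count" at which a report is triggered.
The "continuous launch crash count" at which a repair is triggered.
The number of seconds after launch after which the "continuous launch crash count" can be reset to zero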
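To make the role of these values concrete, here is a small sketch that groups them into a configuration and decides which actions a given count triggers. DemoProtectionConfig, DemoLaunchAction and DemoActionsForCrashCount are hypothetical names introduced for illustration; the comparisons (count greater than or equal to each threshold) match the checks in the optimized implementation later in this article.

```objc
#import <Foundation/Foundation.h>

// Hypothetical bundle of the tunable values listed above.
typedef struct {
    NSUInteger reportThreshold;       // count at which a report is sent
    NSUInteger fixThreshold;          // count at which a repair is started
    NSTimeInterval resetAfterSeconds; // survival time that clears the count
} DemoProtectionConfig;

typedef NS_OPTIONS(NSUInteger, DemoLaunchAction) {
    DemoLaunchActionNone   = 0,
    DemoLaunchActionReport = 1 << 0,
    DemoLaunchActionRepair = 1 << 1,
};

// Returns the set of actions to take for the current crash count.
static DemoLaunchAction DemoActionsForCrashCount(NSUInteger crashCount,
                                                 DemoProtectionConfig config) {
    DemoLaunchAction actions = DemoLaunchActionNone;
    if (crashCount >= config.reportThreshold) {
        actions |= DemoLaunchActionReport;
    }
    if (crashCount >= config.fixThreshold) {
        actions |= DemoLaunchActionRepair;
    }
    return actions;
}
```

With a report threshold of 2 and a fix threshold of 3, for instance, the second consecutive crash is only reported, while the third is both reported and repaired.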
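3. The repair and report logic should support asynchronous work
reportBlock: the reporting logic
repairBlock: the repair logic
For example:
typedef void (^ABSBoolCompletionHandler)(BOOL succeeded, NSError *error);
typedef void (^ABSRepairBlock)(ABSBoolCompletionHandler completionHandler);
Once the caller invokes the ABSBoolCompletionHandler, the mechanism knows that the work has finished, so asynchronous operations are supported.
The problems that asynchrony introduces can be handled by monitoring the state in real time through the enum API described above and deciding on other operations accordingly.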
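A minimal sketch of the asynchronous flow, under stated assumptions: DemoRepairCoordinator and DemoMakeAsyncRepair are hypothetical names, the repair work is a placeholder, and state is kept in memory (ARC assumed). The repair block does its work on a background queue and reports back through the completion handler; only a successful repair clears the count and the fixing flag.

```objc
#import <Foundation/Foundation.h>

typedef void (^DemoBoolCompletionHandler)(BOOL succeeded, NSError *error);
typedef void (^DemoRepairBlock)(DemoBoolCompletionHandler completionHandler);

// Builds a repair block that finishes on a background queue.
static DemoRepairBlock DemoMakeAsyncRepair(BOOL willSucceed) {
    return ^(DemoBoolCompletionHandler completionHandler) {
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            // Placeholder for real work, e.g. deleting a corrupted database.
            NSError *error = willSucceed ? nil
                : [NSError errorWithDomain:@"DemoRepairErrorDomain" code:1 userInfo:nil];
            completionHandler(willSucceed, error);
        });
    };
}

// Hypothetical coordinator holding the state the enum API would expose.
@interface DemoRepairCoordinator : NSObject
@property (atomic, assign) NSUInteger crashCount;
@property (atomic, assign) BOOL fixing;
- (void)runRepair:(DemoRepairBlock)repair finished:(dispatch_block_t)finished;
@end

@implementation DemoRepairCoordinator

- (void)runRepair:(DemoRepairBlock)repair finished:(dispatch_block_t)finished {
    self.fixing = YES; // observers see "fixing" until the handler fires
    repair(^(BOOL succeeded, NSError *error) {
        if (succeeded) {
            self.crashCount = 0;
            self.fixing = NO;
        } else {
            NSLog(@"repair failed: %@", error);
        }
        if (finished) {
            finished();
        }
    });
}

@end
```

Because the handler may run on another thread, code that reads the state while a repair is pending should treat the fixing flag as the source of truth rather than assuming the repair has completed.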
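When does this exception occur?
Implementation and analysis of self-repair for continuous launch crashes
The optimized implementation is given below: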
//
//  CYLBootingProtection.h
//
//
//  Created by ChenYilong on 18/01/10.
//  Copyright © 2018 ChenYilong. All rights reserved.
//

#import <Foundation/Foundation.h>

typedef void (^ABSBoolCompletionHandler)(BOOL succeeded, NSError *error);
typedef void (^ABSRepairBlock)(ABSBoolCompletionHandler completionHandler);
typedef void (^ABSReportBlock)(NSUInteger crashCounts);

typedef NS_ENUM(NSInteger, BootingProtectionStatus) {
    BootingProtectionStatusNormal,         /**< The app launched normally */
    BootingProtectionStatusNormalChecking, /**< Checking whether the app will crash within the given time window. Note: in the checking state, the "continuous launch crash count" is less than or equal to the upper limit */
    BootingProtectionStatusNeedFix,        /**< The app has crashed on launch repeatedly and needs repair */
    BootingProtectionStatusFixing,         /**< The app has crashed on launch repeatedly and repair is in progress */
};

/**
 * Protection against continuous launch crashes.
 * If the app crashes within `_crashOnLaunchTimeIntervalThreshold` seconds after launch, and this repeats more than `_continuousCrashOnLaunchNeedToReport` times, a log is reported; if it repeats more than `_continuousCrashOnLaunchNeedToFix` times, the repair operation is started.
 */
@interface CYLBootingProtection : NSObject

/**
 * Entry point for continuous launch crash protection.
 * Precondition: register a crash handler at app launch and call [CYLBootingProtection addCrashCountIfNeeded] on crash.
 * If the app crashes within a certain time after launch (`crashOnLaunchTimeIntervalThreshold` seconds), and this repeats more than a certain number of times (`continuousCrashOnLaunchNeedToReport`), a log is reported; beyond a further number of times (`continuousCrashOnLaunchNeedToFix`), the repair procedure is started. If no crash occurs within `crashOnLaunchTimeIntervalThreshold` seconds, the "continuous launch crash count" is reset to zero. `reportBlock` holds the reporting logic and `repairBlock` the repair logic; `[self setCrashCount:0]` is executed once repair completes.
 */
- (void)launchContinuousCrashProtect;

/*!
 * The current launch-crash status
 */
@property (nonatomic, assign, readonly) BootingProtectionStatus bootingProtectionStatus;

/*!
 * The "continuous launch crash count" at which a report is triggered.
 */
@property (nonatomic, assign, readonly) NSUInteger continuousCrashOnLaunchNeedToReport;

/*!
 * The "continuous launch crash count" at which a repair is triggered.
 */
@property (nonatomic, assign, readonly) NSUInteger continuousCrashOnLaunchNeedToFix;

/*!
 * The number of seconds after launch after which the "continuous launch crash count" can be reset to zero
 */
@property (nonatomic, assign, readonly) NSTimeInterval crashOnLaunchTimeIntervalThreshold;

/*!
 * The context lets multiple modules register events whose blocks run independently without interfering with each other.
 */
@property (nonatomic, copy, readonly) NSString *context;

/*!
 * @details If the app crashes within kCrashOnLaunchTimeIntervalThreshold seconds after launch, and this repeats more than continuousCrashOnLaunchNeedToReport times, a log is reported; beyond continuousCrashOnLaunchNeedToFix times, the repair procedure is started. When all operations have finished, completion is executed. If no crash occurs within crashOnLaunchTimeIntervalThreshold seconds, the kContinuousCrashOnLaunchCounterKey count is reset to zero.
 * @param context The context lets multiple modules register events whose blocks run independently without interfering with each other.
 */
- (instancetype)initWithContinuousCrashOnLaunchNeedToReport:(NSUInteger)continuousCrashOnLaunchNeedToReport
                           continuousCrashOnLaunchNeedToFix:(NSUInteger)continuousCrashOnLaunchNeedToFix
                         crashOnLaunchTimeIntervalThreshold:(NSTimeInterval)crashOnLaunchTimeIntervalThreshold
                                                    context:(NSString *)context;

/*!
 * The current "continuous launch crash" status
 */
+ (BootingProtectionStatus)bootingProtectionStatusWithContext:(NSString *)context
                             continuousCrashOnLaunchNeedToFix:(NSUInteger)continuousCrashOnLaunchNeedToFix;

/*!
 * Sets the reporting logic; the crashCounts parameter is the number of continuous launch crashes
 */
- (void)setReportBlock:(ABSReportBlock)reportBlock;

/*!
 * Sets the repair logic
 */
- (void)setRepairBlock:(ABSRepairBlock)repairBlock;

+ (void)setLogger:(void (^)(NSString *))logger;

@end
//
//  CYLBootingProtection.m
//
//
//  Created by ChenYilong on 18/01/10.
//  Copyright © 2018 ChenYilong. All rights reserved.
//

#import "CYLBootingProtection.h"
#import <UIKit/UIKit.h>

static dispatch_queue_t _exceptionOperationQueue = 0;
void (^Logger)(NSString *log);

@interface CYLBootingProtection ()

@property (nonatomic, assign) NSUInteger continuousCrashOnLaunchNeedToReport;
@property (nonatomic, assign) NSUInteger continuousCrashOnLaunchNeedToFix;
@property (nonatomic, assign) NSTimeInterval crashOnLaunchTimeIntervalThreshold;
@property (nonatomic, copy) NSString *context;
@property (nonatomic, copy) ABSReportBlock reportBlock;
@property (nonatomic, copy) ABSRepairBlock repairBlock;

/*!
 * Sets the "continuous launch crash count"
 */
- (void)setCrashCount:(NSInteger)count;

/*!
 * Sets the "continuous launch crash count"
 */
+ (void)setCrashCount:(NSUInteger)count context:(NSString *)context;

/*!
 * The "continuous launch crash count"
 */
- (NSUInteger)crashCount;

/*!
 * The "continuous launch crash count"
 */
+ (NSUInteger)crashCountWithContext:(NSString *)context;

@end

@implementation CYLBootingProtection

+ (void)initialize {
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        _exceptionOperationQueue = dispatch_queue_create("com.ChenYilong.CYLBootingProtection.fileCacheQueue", DISPATCH_QUEUE_SERIAL);
    });
}

- (instancetype)initWithContinuousCrashOnLaunchNeedToReport:(NSUInteger)continuousCrashOnLaunchNeedToReport
                           continuousCrashOnLaunchNeedToFix:(NSUInteger)continuousCrashOnLaunchNeedToFix
                         crashOnLaunchTimeIntervalThreshold:(NSTimeInterval)crashOnLaunchTimeIntervalThreshold
                                                    context:(NSString *)context {
    if (!(self = [super init])) {
        return nil;
    }
    _continuousCrashOnLaunchNeedToReport = continuousCrashOnLaunchNeedToReport;
    _continuousCrashOnLaunchNeedToFix = continuousCrashOnLaunchNeedToFix;
    _crashOnLaunchTimeIntervalThreshold = crashOnLaunchTimeIntervalThreshold;
    _context = [context copy];
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(applicationWillTerminate:)
                                                 name:UIApplicationWillTerminateNotification
                                               object:[UIApplication sharedApplication]];
    return self;
}

/*!
 * When the user swipes the app away while it is in the foreground, the launch is not counted.
 * When the app is swiped away while in the background, this cannot be detected.
 * See: https://stackoverflow.com/a/35041565/3395008
 */
- (void)applicationWillTerminate:(NSNotification *)note {
    BOOL isNormalChecking = [self isNormalChecking];
    if (isNormalChecking) {
        [self decreaseCrashCount];
    }
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

/*
 Both synchronous and asynchronous repair are supported.
 - Asynchronous repair does not block the main UI, but a crash may be triggered again, or the user may kill the app, before the repair finishes. The caller needs to act selectively based on the repair status, so there should be a callback.
 - Synchronous repair is the simplest and most direct: delete data or download a repair package on the main thread.
 */
- (void)launchContinuousCrashProtect {
    NSAssert(_repairBlock, @"_repairBlock is nil!");
    [[self class] Logger:@"CYLBootingProtection: Launch continuous crash report"];
    [self resetBootingProtectionStatus];

    NSUInteger launchCrashes = [self crashCount];
    // Report
    if (launchCrashes >= self.continuousCrashOnLaunchNeedToReport) {
        NSString *logString = [NSString stringWithFormat:@"CYLBootingProtection: App has continuously crashed for %@ times. Now synchronize uploading crash report and begin fixing procedure.", @(launchCrashes)];
        [[self class] Logger:logString];
        if (_reportBlock) {
            dispatch_async(dispatch_get_main_queue(), ^{
                self.reportBlock(launchCrashes);
            });
        }
    }

    // Repair
    if ([self isUpToBootingProtectionCount]) {
        [[self class] Logger:@"need to repair"];
        [self setIsFixing:YES];
        if (_repairBlock) {
            ABSBoolCompletionHandler completionHandler = ^(BOOL succeeded, NSError *__nullable error) {
                if (succeeded) {
                    [self resetCrashCount];
                } else {
                    [[self class] Logger:error.description];
                }
            };
            dispatch_async(dispatch_get_main_queue(), ^{
                self.repairBlock(completionHandler);
            });
        }
    } else {
        [self increaseCrashCount:launchCrashes];
        // Normal flow, no repair needed
        [[self class] Logger:@"need no repair"];
        // Record the launch time, used to detect continuous launch crashes
        // Reset the launch crash count
        dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(self.crashOnLaunchTimeIntervalThreshold * NSEC_PER_SEC)), dispatch_get_main_queue(), ^(void) {
            // The app survived past the threshold, so reset the crash count
            NSString *logString = [NSString stringWithFormat:@"CYLBootingProtection: long live the app ( more than %@ seconds ), now reset crash counts", @(self.crashOnLaunchTimeIntervalThreshold)];
            [[self class] Logger:logString];
            [self resetCrashCount];
        });
    }
}

// The count is decreased when the user manually swipes the app away
- (void)decreaseCrashCount {
    NSUInteger oldCrashCount = [self crashCount];
    [self decreaseCrashCountWithOldCrashCount:oldCrashCount];
}

- (void)decreaseCrashCountWithOldCrashCount:(NSUInteger)oldCrashCount {
    dispatch_sync(_exceptionOperationQueue, ^{
        if (oldCrashCount > 0) {
            [self setCrashCount:oldCrashCount - 1];
        }
        [self resetBootingProtectionStatus];
    });
}

// The count is reset when repair completes or when the app survives past the threshold
- (void)resetCrashCount {
    [self setCrashCount:0];
    [self resetBootingProtectionStatus];
}

// The count is only increased while it is below the upper limit
- (void)increaseCrashCount:(NSUInteger)oldCrashCount {
    dispatch_sync(_exceptionOperationQueue, ^{
        [self setIsNormalChecking:YES];
        [self setCrashCount:oldCrashCount + 1];
    });
}

- (void)resetBootingProtectionStatus {
    [self setIsNormalChecking:NO];
    [self setIsFixing:NO];
}

- (BootingProtectionStatus)bootingProtectionStatus {
    return [[self class] bootingProtectionStatusWithContext:_context continuousCrashOnLaunchNeedToFix:_continuousCrashOnLaunchNeedToFix];
}

/*!
 *
 @attention `BootingProtectionStatusNormalChecking` has to be checked for the following reason: if `-launchContinuousCrashProtect` runs before `-bootingProtectionStatus`, the following problem arises.
 Suppose n is the upper limit and the app has crashed (n-1) times. `-bootingProtectionStatus` would then conclude that the count is already n, because after (n-1) crashes the normal flow adds 1 to the count, making it n.
 A later call to `-bootingProtectionStatus` would see an abnormal state when the state is in fact normal. `BootingProtectionStatusNormalChecking` is therefore needed to tell the two apart.
 */
+ (BootingProtectionStatus)bootingProtectionStatusWithContext:(NSString *)context continuousCrashOnLaunchNeedToFix:(NSUInteger)continuousCrashOnLaunchNeedToFix {
    BOOL isNormalChecking = [self isNormalCheckingWithContext:context];
    if (isNormalChecking) {
        return BootingProtectionStatusNormalChecking;
    }
    BOOL isUpToBootingProtectionCount = [self isUpToBootingProtectionCountWithContext:context
                                                     continuousCrashOnLaunchNeedToFix:continuousCrashOnLaunchNeedToFix];
    if (!isUpToBootingProtectionCount) {
        return BootingProtectionStatusNormal;
    }
    BootingProtectionStatus type;
    BOOL isFixingCrash = [self isFixingCrashWithContext:context];
    if (isFixingCrash) {
        type = BootingProtectionStatusFixing;
    } else {
        type = BootingProtectionStatusNeedFix;
    }
    return type;
}

- (NSUInteger)crashCount {
    return [[self class] crashCountWithContext:_context];
}

- (void)setCrashCount:(NSInteger)count {
    if (count >= 0) {
        [[self class] setCrashCount:count context:_context];
    }
}

- (void)setIsFixing:(BOOL)isFixingCrash {
    [[self class] setIsFixing:isFixingCrash context:_context];
}

/*!
 * Whether a repair is in progress
 */
- (BOOL)isFixingCrash {
    return [[self class] isFixingCrashWithContext:_context];
}

- (void)setIsNormalChecking:(BOOL)isNormalChecking {
    [[self class] setIsNormalChecking:isNormalChecking context:_context];
}

/*!
 * Whether checking is in progress
 */
- (BOOL)isNormalChecking {
    return [[self class] isNormalCheckingWithContext:_context];
}

+ (NSUInteger)crashCountWithContext:(NSString *)context {
    NSString *continuousCrashOnLaunchCounterKey = [self continuousCrashOnLaunchCounterKeyWithContext:context];
    NSUInteger crashCount = [[NSUserDefaults standardUserDefaults] integerForKey:continuousCrashOnLaunchCounterKey];
    NSString *logString = [NSString stringWithFormat:@"crashCount:%@", @(crashCount)];
    [[self class] Logger:logString];
    return crashCount;
}

+ (void)setCrashCount:(NSUInteger)count context:(NSString *)context {
    NSString *continuousCrashOnLaunchCounterKey = [self continuousCrashOnLaunchCounterKeyWithContext:context];
    NSString *logString = [NSString stringWithFormat:@"setCrashCount:%@", @(count)];
    [[self class] Logger:logString];
    NSUserDefaults *defaults = [NSUserDefaults standardUserDefaults];
    [defaults setInteger:count forKey:continuousCrashOnLaunchCounterKey];
    [defaults synchronize];
}

+ (void)setIsFixing:(BOOL)isFixingCrash context:(NSString *)context {
    NSString *continuousCrashFixingKey = [[self class] continuousCrashFixingKeyWithContext:context];
    NSString *logString = [NSString stringWithFormat:@"setisFixingCrash:{%@}", @(isFixingCrash)];
    [[self class] Logger:logString];
    NSUserDefaults *defaults = [NSUserDefaults standardUserDefaults];
    [defaults setBool:isFixingCrash forKey:continuousCrashFixingKey];
    [defaults synchronize];
}

+ (BOOL)isFixingCrashWithContext:(NSString *)context {
    NSString *continuousCrashFixingKey = [[self class] continuousCrashFixingKeyWithContext:context];
    BOOL isFixingCrash = [[NSUserDefaults standardUserDefaults] boolForKey:continuousCrashFixingKey];
    NSString *logString = [NSString stringWithFormat:@"isFixingCrash:%@", @(isFixingCrash)];
    [[self class] Logger:logString];
    return isFixingCrash;
}

+ (void)setIsNormalChecking:(BOOL)isNormalChecking context:(NSString *)context {
    NSString *continuousCrashNormalCheckingKey = [[self class] continuousCrashNormalCheckingKeyWithContext:context];
    NSString *logString = [NSString stringWithFormat:@"setIsNormalChecking:{%@}", @(isNormalChecking)];
    [[self class] Logger:logString];
    NSUserDefaults *defaults = [NSUserDefaults standardUserDefaults];
    [defaults setBool:isNormalChecking forKey:continuousCrashNormalCheckingKey];
    [defaults synchronize];
}

+ (BOOL)isNormalCheckingWithContext:(NSString *)context {
    NSString *continuousCrashNormalCheckingKey = [[self class] continuousCrashNormalCheckingKeyWithContext:context];
    BOOL isNormalChecking = [[NSUserDefaults standardUserDefaults] boolForKey:continuousCrashNormalCheckingKey];
    NSString *logString = [NSString stringWithFormat:@"isNormalChecking:%@", @(isNormalChecking)];
    [[self class] Logger:logString];
    return isNormalChecking;
}

- (BOOL)isUpToBootingProtectionCount {
    return [[self class] isUpToBootingProtectionCountWithContext:_context
                                continuousCrashOnLaunchNeedToFix:_continuousCrashOnLaunchNeedToFix];
}

+ (BOOL)isUpToBootingProtectionCountWithContext:(NSString *)context continuousCrashOnLaunchNeedToFix:(NSUInteger)continuousCrashOnLaunchNeedToFix {
    return [self crashCountWithContext:context] >= continuousCrashOnLaunchNeedToFix;
}

// The properties are declared `copy`, so the custom setters copy the block explicitly.
- (void)setReportBlock:(ABSReportBlock)block {
    _reportBlock = [block copy];
}

- (void)setRepairBlock:(ABSRepairBlock)block {
    _repairBlock = [block copy];
}

/*!
 * The key for the "continuous launch crash count"
 * Defaults to "_CONTINUOUS_CRASH_COUNTER_KEY"
 */
+ (NSString *)continuousCrashOnLaunchCounterKeyWithContext:(NSString *)context {
    BOOL isValid = [[self class] isValidString:context];
    NSString *validContext = isValid ? context : @"";
    NSString *continuousCrashOnLaunchCounterKey = [NSString stringWithFormat:@"%@_CONTINUOUS_CRASH_COUNTER_KEY", validContext];
    return continuousCrashOnLaunchCounterKey;
}

/*!
 * The key for the "repair in progress" flag
 * Defaults to "_CONTINUOUS_CRASH_FIXING_KEY"
 */
+ (NSString *)continuousCrashFixingKeyWithContext:(NSString *)context {
    BOOL isValid = [[self class] isValidString:context];
    NSString *validContext = isValid ? context : @"";
    NSString *continuousCrashFixingKey = [NSString stringWithFormat:@"%@_CONTINUOUS_CRASH_FIXING_KEY", validContext];
    return continuousCrashFixingKey;
}

/*!
 * The key for the "checking whether the app will crash within the time window" flag
 * Defaults to "_CONTINUOUS_CRASH_CHECKING_KEY"
 */
+ (NSString *)continuousCrashNormalCheckingKeyWithContext:(NSString *)context {
    BOOL isValid = [[self class] isValidString:context];
    NSString *validContext = isValid ? context : @"";
    NSString *continuousCrashNormalCheckingKey = [NSString stringWithFormat:@"%@_CONTINUOUS_CRASH_CHECKING_KEY", validContext];
    return continuousCrashNormalCheckingKey;
}

#pragma mark -
#pragma mark - log and util Methods

+ (void)setLogger:(void (^)(NSString *))logger {
    Logger = [logger copy];
}

+ (void)Logger:(NSString *)log {
    if (Logger) Logger(log);
}

+ (BOOL)isValidString:(id)notValidString {
    if (!notValidString) {
        return NO;
    }
    if (![notValidString isKindOfClass:[NSString class]]) {
        return NO;
    }
    NSInteger stringLength = 0;
    @try {
        stringLength = [notValidString length];
    } @catch (NSException *exception) {}
    if (stringLength == 0) {
        return NO;
    }
    return YES;
}

@end
The corresponding verification steps are as follows:
After waiting 15 seconds, the log shows the count being reset to zero:
2018-01-18 16:25:37.162980+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:CYLBootingProtection: Launch continuous crash report
2018-01-18 16:25:37.163140+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:setIsNormalChecking:{0}
2018-01-18 16:25:37.165738+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:setisFixingCrash:{0}
2018-01-18 16:25:37.166883+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:crashCount:0
2018-01-18 16:25:37.167102+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:crashCount:0
2018-01-18 16:25:37.167253+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:setIsNormalChecking:{1}
2018-01-18 16:25:37.167938+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:setCrashCount:1
2018-01-18 16:25:37.168806+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:need no repair
2018-01-18 16:25:52.225197+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:CYLBootingProtection: long live the app ( more than 15 seconds ), now reset crash counts
2018-01-18 16:25:52.225378+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:setCrashCount:0
2018-01-18 16:25:52.226234+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:setIsNormalChecking:{0}
2018-01-18 16:25:52.226595+0800 BootingProtection[89773:15553277] 类名与方法名:-[AppDelegate onBeforeBootingProtection]_block_invoke(在第45行),描述:setisFixingCrash:{0}