Function signature:
precision_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)
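The most commonly used parameters are:
y_true: the ground-truth labels
y_pred: the predicted labels
average: how the scores are averaged. Accepts None, 'binary' (default), 'micro', 'macro', 'samples', or 'weighted'. This parameter is required for multiclass/multilabel targets. Details follow:
If None, the score of each class is returned. Otherwise, it selects which of the following averaging methods is applied to the data: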
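As a small illustration of average=None, the sketch below uses made-up toy labels (not taken from this post) and prints one precision value per class, in sorted label order.

```python
import numpy as np
from sklearn.metrics import precision_score

# Toy labels invented for illustration only.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

# average=None returns an array with one precision per class (labels 0, 1, 2).
per_class = precision_score(y_true, y_pred, average=None)
print(per_class)  # class 0: 1/2, class 1: 2/3, class 2: 1/1
```

Each entry is the number of correct predictions of that class divided by the number of times that class was predicted.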
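Start with 'micro':
For single-label targets (binary or multiclass), the two expressions below are equivalent, i.e. micro-averaged precision equals accuracy (y_test and y_pred are 1-D NumPy arrays of labels):
import numpy as np
from sklearn.metrics import precision_score
print(precision_score(y_test, y_pred, average='micro'))
print(np.sum(y_test == y_pred) / len(y_test))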
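To check this equivalence on concrete data, here is a minimal self-contained sketch. The label arrays are made up for illustration, and accuracy_score is added only as a third point of comparison.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score

# Toy single-label multiclass data, invented for illustration.
y_test = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

# Micro averaging pools true positives and false positives over all classes.
micro = precision_score(y_test, y_pred, average='micro')
# Fraction of exactly matching labels.
manual = np.sum(y_test == y_pred) / len(y_test)
acc = accuracy_score(y_test, y_pred)

print(micro, manual, acc)  # all three agree (4 correct out of 6)
```

The equality holds because, in the single-label case, every wrong prediction is exactly one false positive for some class, so the pooled TP + FP equals the number of samples.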
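In what follows, P denotes the precision of a single class, computed as in the binary case (that class versus all others). Parts of what follows draw on other references.
'macro' : the classes are not weighted. Class size is ignored, so it is not well suited to imbalanced datasets. It is computed as: (sum of P over all classes) / (number of classes)
'weighted' : the classes are weighted. It is computed as: (sum over all classes of P × the number of samples in that class, counted from the true labels, not the predictions) / (total number of samples)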
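The two formulas can be reproduced by hand from the per-class values. The sketch below assumes made-up toy labels and integer classes 0..2 (so that np.bincount gives the per-class sample counts), then compares against scikit-learn's own result.

```python
import numpy as np
from sklearn.metrics import precision_score

# Toy data invented for illustration; classes are the integers 0, 1, 2.
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 1, 2, 1, 1, 2, 2, 0, 2])

per_class = precision_score(y_true, y_pred, average=None)  # P for each class
support = np.bincount(y_true)  # true sample count of each class: [4, 2, 4]

# 'macro': plain mean of the per-class P.
macro_manual = per_class.mean()
# 'weighted': mean of the per-class P weighted by true class size.
weighted_manual = np.sum(per_class * support) / support.sum()

macro_sklearn = precision_score(y_true, y_pred, average='macro')
weighted_sklearn = precision_score(y_true, y_pred, average='weighted')
print(macro_manual, macro_sklearn)
print(weighted_manual, weighted_sklearn)
```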
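An example:
Suppose the true labels contain 98 samples of class 0, 2 samples of class 1, and 100 samples of class 3: three classes and 200 samples in total.
In the predictions, every class 0 sample is predicted as class 3 (all wrong), every class 1 sample is predicted correctly, and every class 3 sample is predicted as class 0 (all wrong).
Then P_macro = (0 + 1 + 0) / 3 = 0.33333333
P_weighted = (0×98 + 1×2 + 0×100) / 200 = 2/200 = 0.01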
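The numbers in this example can be checked directly. The sketch below builds label arrays that match the description (200 samples over classes 0, 1, and 3); the variable names are my own.

```python
import numpy as np
from sklearn.metrics import precision_score

# True labels: 98 of class 0, 2 of class 1, 100 of class 3.
y_true = np.array([0] * 98 + [1] * 2 + [3] * 100)
# Predictions: class 0 -> 3 (wrong), class 1 -> 1 (right), class 3 -> 0 (wrong).
y_pred = np.array([3] * 98 + [1] * 2 + [0] * 100)

p_macro = precision_score(y_true, y_pred, average='macro')
p_weighted = precision_score(y_true, y_pred, average='weighted')
print(p_macro)     # (0 + 1 + 0) / 3
print(p_weighted)  # (0*98 + 1*2 + 0*100) / 200
```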
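Another example:
Suppose there are 100 samples in total, 98 of class 0 and 2 of class 1;
the model predicts class 0 for every sample.
Class 0 then has P = 98/100 = 0.98, and class 1, which is never predicted, is assigned P = 0. So P_macro = (0.98 + 0) / 2 = 0.49
P_weighted = (98×0.98 + 2×0) / 100 = 0.9604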
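This second example can be reproduced the same way. Because class 1 is never predicted, its precision is formally 0/0; the sketch assumes a scikit-learn version that supports the zero_division argument (0.22 or later) and sets it to 0, which matches the value used in the calculation above.

```python
import numpy as np
from sklearn.metrics import precision_score

# 98 samples of class 0 and 2 samples of class 1; the model always predicts 0.
y_true = np.array([0] * 98 + [1] * 2)
y_pred = np.zeros(100, dtype=int)

# zero_division=0 assigns precision 0 to the never-predicted class 1
# instead of emitting an UndefinedMetricWarning.
p_macro = precision_score(y_true, y_pred, average='macro', zero_division=0)
p_weighted = precision_score(y_true, y_pred, average='weighted', zero_division=0)
print(p_macro)     # (0.98 + 0) / 2
print(p_weighted)  # (98*0.98 + 2*0) / 100
```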
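Conclusion:
For classifiers evaluated on imbalanced classes, the macro average can be strongly biased, while the weighted average gives a better picture of model quality. When a class has only a few samples, a handful of lucky or unlucky predictions can swing its score, so that score does not reliably reflect model quality; a larger number of samples is needed for an accurate estimate. Using the sample count as the weight can be read as a confidence level for each class's score: the more samples, the more trustworthy the score.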
Official documentation:
average : string, [None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted']
This parameter is required for multiclass/multilabel targets. If ``None``, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
``'binary'``: Only report results for the class specified by ``pos_label``. This is applicable only if targets (``y_{true,pred}``) are binary.
``'micro'``: Calculate metrics globally by counting the total true positives, false negatives and false positives.
``'macro'``: Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
``'weighted'``: Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters 'macro' to account for label imbalance; it can result in an F-score that is not between precision and recall.
``'samples'``: Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from :func:`accuracy_score`).
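The 'binary' mode and pos_label are not illustrated elsewhere in this post, so here is a brief sketch on made-up binary labels. It only shows which class the reported precision refers to.

```python
import numpy as np
from sklearn.metrics import precision_score

# Toy binary labels invented for illustration.
y_true = np.array([0, 0, 0, 1, 1, 1])
y_pred = np.array([0, 1, 0, 1, 1, 1])

# Default: average='binary', pos_label=1 -> precision of class 1 only.
p_pos1 = precision_score(y_true, y_pred)
# Same data, but report the precision of class 0 instead.
p_pos0 = precision_score(y_true, y_pred, pos_label=0)
print(p_pos1)  # 3 of the 4 samples predicted as 1 are truly 1
print(p_pos0)  # both samples predicted as 0 are truly 0
```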
One more link for reference: https://www.cnblogs.com/harvey888/p/6964741.html