A Different Way to Visualize Classification Results

I love good data visualizations. Back when I was doing my PhD in particle physics, I was stunned by the histograms my colleagues built and by how much information they packed into a single plot.

Information in Plots

It is really challenging to improve existing visualization methods or to transfer methods from other research fields. You have to think about the dimensions in your plot and about ways to add more of them. A good example is the path from a box plot to a violin plot to a swarm plot. It is a continuous process of adding dimensions and thus information.

The possibilities for adding information or dimensions to a plot are almost endless. Categories can be encoded with different marker shapes, a color map like the one in a heat map can serve as another dimension, and the size of a marker can convey a further parameter.

Plots of Classifier Performance

When it comes to machine learning, there are many ways to plot the performance of a classifier. There is an overwhelming number of metrics for comparing estimators, such as accuracy, precision, recall, or the helpful MCC (Matthews correlation coefficient).

All of the common classification metrics are calculated from the counts of true positives, true negatives, false positives, and false negatives. The most popular plots are the ROC curve, the precision-recall curve (PRC), the CAP curve, and the confusion matrix.
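To make the link between the four counts and the metrics concrete, here is a minimal sketch for the binary case. The function name `binary_metrics` and the example counts are made up for illustration. The formulas are the standard definitions of accuracy, precision, recall, and the Matthews correlation coefficient.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute common binary classification metrics from the four outcome counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (e.g. no positive predictions at all).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Matthews correlation coefficient; set to 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mcc": mcc}


# Hypothetical counts, chosen only for illustration.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the MCC uses all four counts in both numerator and denominator, which is why it tends to stay informative when the classes are imbalanced.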

I won't go into detail on the three curves, but there are many different ways to present the confusion matrix, like adding a heat map.

Figure: A seaborn heatmap of a confusion matrix.
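A heat map like the one described can be produced along the following lines. This is a sketch under assumptions, not the code behind the original figure: the matrix `cm`, the class names in `labels`, and the "Blues" color map are placeholders.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Hypothetical 4-class confusion matrix: rows = true class, columns = predicted class.
labels = ["A", "B", "C", "D"]
cm = np.array([
    [50, 3, 2, 0],
    [4, 30, 5, 1],
    [1, 6, 20, 3],
    [0, 2, 2, 10],
])

fig, ax = plt.subplots(figsize=(5, 4))
# annot=True writes the count into each cell; fmt="d" formats it as an integer.
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=labels, yticklabels=labels, ax=ax)
ax.set_xlabel("Predicted class")
ax.set_ylabel("True class")
fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to display or store the figure.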

A Classification Mosaic Diagram

In many cases, this is probably sufficient, and all relevant information is easy to pick up. For a multi-class problem, however, it can get much harder to do so.

While reading some papers, I stumbled across:

Jakob Raymaekers, Peter J. Rousseeuw, Mia Hubert. Visualizing classification results. arXiv:2007.14495 [stat.ML]

and from there to

Friendly, Michael. “Mosaic Displays for Multi-Way Contingency Tables.” Journal of the American Statistical Association, vol. 89, no. 425, 1994, pp. 190–200. JSTOR, www.jstor.org/stable/2291215. Accessed 13 Aug. 2020.

The authors propose a mosaic diagram to plot discrete values. We can transfer this idea to the field of machine learning by treating the predicted classes as the discrete values.

In a multi-class setting, such a plot would look like the following:

Figure: Mosaic plot of a classification result with four classes.

It has several advantages over a classical confusion matrix. One can easily read the predicted classes on the y-axis and the proportion of each class on the x-axis. The big difference from a simple bar plot is the width of the bars, which gives an idea of the class imbalance.
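The description above can be sketched with plain matplotlib. This is an illustrative reconstruction, not the author's implementation. It assumes a confusion matrix with rows as true classes and columns as predicted classes, and that every true class has at least one sample. The function name `mosaic_plot` and the example data are made up.

```python
import matplotlib.pyplot as plt
import numpy as np


def mosaic_plot(cm, labels, ax=None):
    """Draw a mosaic plot from a confusion matrix (rows = true, columns = predicted)."""
    cm = np.asarray(cm, dtype=float)
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 4))

    class_totals = cm.sum(axis=1)
    # Bar width = share of each true class in the data (shows class imbalance).
    widths = class_totals / class_totals.sum()
    lefts = np.concatenate(([0.0], np.cumsum(widths)[:-1]))
    # Within each true class, the share of every predicted class (each row sums to 1).
    shares = cm / class_totals[:, None]

    bottoms = np.zeros(len(labels))
    for j, pred_label in enumerate(labels):
        # One stacked layer per predicted class, drawn across all true-class bars.
        ax.bar(lefts, shares[:, j], width=widths, bottom=bottoms, align="edge",
               color=f"C{j}", edgecolor="white", label=f"predicted {pred_label}")
        bottoms += shares[:, j]

    ax.set_xticks(lefts + widths / 2)
    ax.set_xticklabels(labels)
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_xlabel("True class (bar width = class share)")
    ax.set_ylabel("Proportion of predicted classes")
    ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))
    return ax


# Hypothetical imbalanced 4-class result, for illustration only.
labels = ["A", "B", "C", "D"]
cm = np.array([
    [50, 3, 2, 0],
    [4, 30, 5, 1],
    [1, 6, 20, 3],
    [0, 2, 2, 10],
])
ax = mosaic_plot(cm, labels)
```

Because every bar is normalized to height 1, the per-class prediction mix is comparable across classes, while the widths carry the class sizes that a plain stacked bar chart would hide.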

You can find the code for such a plot fed with a confusion matrix here:

Have fun plotting your next classification results!

Translated from: https://towardsdatascience.com/a-different-way-to-visualize-classification-results-c4d45a0a37bb
