Draw a Radar (Spider) Plot with R

I’ve tried several different types of NBA analytical articles with my readers, who are a group of true basketball fans. I found that the most popular articles are not the ones that use state-of-the-art machine learning, but the ones with straightforward and meaningful graphs.

At a certain stage of my career as a data scientist, I realized that delivering the information is more important than showing off fancy models. Perhaps that’s why linear regression is still one of the most popular models in the finance world.

In this post, I cover a simple topic: how to draw a spider plot (also called a radar plot), one of the most essential graphs in comparative analysis. The code is written in R.

Data

The data are NBA players’ basic and advanced per-game statistics from the 2019–2020 NBA playoffs, taken from Basketball Reference.

Code

Let’s first visualize James Harden’s stats in a spider plot. We only focus on five stats: PTS (points), TRB (total rebounds), AST (assists), STL (steals), and BLK (blocks).

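# Load the per-game playoff stats
df <- read.csv("playoff_stats.csv")
# Column-wise maxima and minima set the outer and inner bounds of each axis
maxxx <- apply(df[, c("PTS.", "TRB", "AST", "STL", "BLK")], 2, max)
minnn <- apply(df[, c("PTS.", "TRB", "AST", "STL", "BLK")], 2, min)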

This block reads the data into a data frame, “df”, and computes the maximum and minimum of each column. These values define the boundaries of each axis in the spider plot.
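
To make the role of these bounds concrete, here is a minimal sketch on a made-up three-player table (the players and numbers are hypothetical, not taken from the playoff data). Conceptually, each value is placed along its axis according to where it falls between the column minimum and maximum; the exact drawing details are up to the plotting function.

```r
# Hypothetical toy data: three players, two stats (numbers are made up)
toy <- as.matrix(data.frame(
  PTS = c(30, 10, 20),
  AST = c(4, 8, 6),
  row.names = c("A", "B", "C")
))

# Column-wise bounds, computed the same way as in the post
maxxx <- apply(toy, 2, max)
minnn <- apply(toy, 2, min)

# Min-max rescaling: 0 means "at the column minimum", 1 means "at the maximum"
rescaled <- sweep(sweep(toy, 2, minnn, "-"), 2, maxxx - minnn, "/")
print(rescaled)
```

A player sitting at 0.5 on an axis is halfway between the weakest and the strongest value in that column, which is why the bounds are computed over all players rather than only the ones being plotted.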

For our analysis, I extracted the stats of James Harden and LeBron James.

# Rows 3 and 10 of this file hold Harden and LeBron; keep only the five stats
df_sel <- df[c(3, 10), c("PTS.", "TRB", "AST", "STL", "BLK")]
rownames(df_sel) <- c("Harden", "LeBron")
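
Hard-coded row numbers only work for one particular file. As a hedged alternative, the sketch below looks players up by name. It assumes the table has a name column called `Player`, which is an assumption for illustration (the post does not show the file's columns), and it uses made-up placeholder players and numbers.

```r
# Hypothetical table; the "Player" column and all values are assumptions
toy <- data.frame(
  Player = c("Player A", "Player B", "Player C"),
  PTS. = c(28.0, 25.5, 30.1),
  TRB = c(6.0, 10.2, 13.0),
  AST = c(8.1, 9.0, 5.2),
  STL = c(1.5, 1.2, 1.0),
  BLK = c(0.8, 0.9, 1.2)
)

stat_cols <- c("PTS.", "TRB", "AST", "STL", "BLK")
wanted <- c("Player C", "Player A")

# match() returns the row index of each requested name, in the requested order
idx <- match(wanted, toy$Player)
stopifnot(!anyNA(idx))  # fail early if a name is missing from the table

sel <- toy[idx, stat_cols]
rownames(sel) <- wanted
print(sel)
```

The resulting `sel` has the same shape as `df_sel` in the post, so it could be passed to the same plotting function.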

To define the spider plot function, we need the fmsb package, which provides radarchart().

library(fmsb)  # provides radarchart()

comp_plot <- function(data, maxxx, minnn) {
  # radarchart() expects row 1 = axis maxima and row 2 = axis minima
  data <- rbind(maxxx, minnn, data)
  colors_border <- c(rgb(0.2, 0.5, 0.5, 0.9), rgb(0.8, 0.2, 0.5, 0.9), rgb(0.7, 0.5, 0.1, 0.9))
  colors_in <- c(rgb(0.2, 0.5, 0.5, 0.4), rgb(0.8, 0.2, 0.5, 0.4), rgb(0.7, 0.5, 0.1, 0.4))
  radarchart(data, axistype = 1, pcol = colors_border, pfcol = colors_in,
             plwd = 4, plty = 1, cglcol = "grey", cglty = 1, axislabcol = "grey",
             caxislabels = rep("", 5), cglwd = 0.8, vlcex = 0.8)
  # The legend lists the player rows only (the two bound rows are dropped)
  legend(x = 0.5, y = 1.2, legend = rownames(data[-c(1, 2), ]), bty = "n",
         pch = 20, col = colors_in, text.col = "black", cex = 1, pt.cex = 3)
}

Inside the function, radarchart() draws the spider plot. Some of its arguments are explained below.

pcol and pfcol set the line color and the fill color, respectively. plwd and plty set the line width and line type. The grid lines (the web) take their color and type from cglcol and cglty. I don’t want any labels along the center axis of the spider plot, so caxislabels is given empty strings (rep("", 5)).
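
If you do want labels along the center axis, a possible variation is sketched below. It assumes, as far as I understand fmsb, that caxislabels takes one label per grid ring, that is seg + 1 labels, with seg defaulting to 4 (which would explain the five empty strings above). The toy data and the percentage labels are made up for illustration, and the plot is only drawn when fmsb is installed.

```r
# Toy data in the layout radarchart() expects: row 1 = maxima, row 2 = minima,
# remaining rows = one series each. All numbers are made up.
toy <- data.frame(
  PTS = c(35, 0, 28),
  TRB = c(15, 0, 6),
  AST = c(12, 0, 8),
  STL = c(3, 0, 1.5),
  BLK = c(3, 0, 0.8),
  row.names = c("max", "min", "Player A")
)

seg <- 4  # number of grid segments
# One label per ring, from the center (0%) to the outer edge (100%)
tick_labels <- paste0(seq(0, 100, length.out = seg + 1), "%")

if (requireNamespace("fmsb", quietly = TRUE)) {
  fmsb::radarchart(toy, axistype = 1, seg = seg, caxislabels = tick_labels,
                   pcol = rgb(0.2, 0.5, 0.5, 0.9), pfcol = rgb(0.2, 0.5, 0.5, 0.4),
                   plwd = 4, plty = 1, cglcol = "grey", cglty = 1,
                   axislabcol = "grey", cglwd = 0.8, vlcex = 0.8)
}
```

Percent-of-range labels are used here because each axis has its own scale, so a single set of raw numbers along the center axis could be misleading.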

Let’s see how Harden’s stats look.

comp_plot(df_sel[1,],maxxx,minnn)
Figure: Spider plot of James Harden’s playoff stats

From the plot above, we can see that Harden is not only an excellent scorer (high points) but also a playmaker (high assists). These interpretations are perfectly consistent with the James Harden we know.

To compare the stats between James Harden and LeBron James, let’s input both players’ stats to the function.

comp_plot(df_sel,maxxx,minnn)
Figure: Spider plot comparing James Harden and LeBron James

We can see that LeBron posted better rebound and assist numbers than Harden, even though his scoring was not as good as Harden’s.

Pretty straightforward, right?

Let’s do a similar comparison between Giannis Antetokounmpo and Kawhi Leonard using their advanced statistics: offensive box plus/minus (OBPM), defensive box plus/minus (DBPM), offensive win shares (OWS), defensive win shares (DWS), and true shooting percentage (TS%).

# Same workflow on the advanced-stats table
df <- read.csv("playoff_stats_adv.csv")
maxxx <- apply(df[, c("OBPM", "DBPM", "OWS", "DWS", "TS.")], 2, max)
minnn <- apply(df[, c("OBPM", "DBPM", "OWS", "DWS", "TS.")], 2, min)
# Rows 1 and 3 of this file hold Giannis and Kawhi
df_sel <- df[c(1, 3), c("OBPM", "DBPM", "OWS", "DWS", "TS.")]
rownames(df_sel) <- c("Giannis", "Kawhi")

Let’s see Giannis’s stats first.

comp_plot(df_sel[1,],maxxx,minnn)
Figure: Spider plot of Giannis Antetokounmpo’s playoff advanced stats

We can see that Giannis is an all-around star, with good stats in almost every aspect. No wonder he won his second MVP award for the 2019–2020 regular season.

Next, let’s compare Giannis with Kawhi Leonard in their advanced statistics.

comp_plot(df_sel,maxxx,minnn)
Figure: Spider plot comparing Giannis Antetokounmpo and Kawhi Leonard

We can see that Giannis outperformed Kawhi in all aspects of the advanced stats.

You can compare any number of players with this simple function. However, I don’t recommend using a spider plot to compare more than three individuals, because the overlapping shapes quickly become hard to read.
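
Note that comp_plot defines only three colors. If you do plot more players, you would need a larger palette. The helper below is a hypothetical sketch (`make_palette` is not part of the original post or of fmsb) that builds matching border and fill colors for any number of players using base R’s hcl.colors() and adjustcolor(). It assumes R 3.6.0 or later, where hcl.colors() is available.

```r
# Hypothetical helper: n distinct hues, each with an opaque-ish border
# color and a more transparent fill color
make_palette <- function(n, border_alpha = 0.9, fill_alpha = 0.4) {
  base <- hcl.colors(n, palette = "Dark 3")
  list(
    border = adjustcolor(base, alpha.f = border_alpha),
    fill = adjustcolor(base, alpha.f = fill_alpha)
  )
}

pal <- make_palette(5)
print(pal$border)
print(pal$fill)
```

The two vectors could stand in for colors_border and colors_in inside comp_plot, for example by calling make_palette(nrow(data)) before the bound rows are added.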

If you do need to compare a large group of objects, a heatmap could be a better choice for visualization. I covered what I consider the best heatmap function in R in one of my previous posts.
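
As a rough illustration of the heatmap alternative, the sketch below uses base R’s heatmap(), which is a stand-in and not necessarily the function recommended in that earlier post. The players and numbers are made up. Each column is standardized first so that stats on different scales (points versus blocks, for instance) become comparable.

```r
# Hypothetical per-game stats for four players (all numbers are made up)
stats <- matrix(
  c(28, 6, 8, 1.5, 0.8,
    25, 10, 9, 1.2, 0.9,
    30, 13, 5, 1.0, 1.2,
    22, 9, 4, 2.0, 0.6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste("Player", LETTERS[1:4]),
                  c("PTS", "TRB", "AST", "STL", "BLK"))
)

# Column-wise z-scores: each stat gets mean 0 and standard deviation 1
scaled <- scale(stats)

# Rowv = NA and Colv = NA keep the original order (no clustering);
# scale = "none" because the columns were already standardized above
heatmap(scaled, Rowv = NA, Colv = NA, scale = "none", margins = c(5, 8))
```

With one row per player and one column per stat, adding more players only adds rows, so the figure stays readable where a spider plot would not.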

I hope this short article adds something useful to your toolkit as a data scientist!

Photo by Zan on Unsplash

Translated from: https://towardsdatascience.com/draw-a-radar-spider-plot-with-r-4af9693c3237
