Measuring How Well Pitchers Hide Their Pitches

As the baseball community saw with the Astros’ 2017 cheating scandal, knowing what pitch is being thrown gives batters a game-breaking advantage. However, unless you have an intricate system of cameras and trash cans set up, knowing what pitch is about to be thrown is incredibly difficult. Batters have mere fractions of a second to pick up on signals that might indicate the type of pitch coming their way, and even less time to process and act on that information. Such a signal could be a subconscious tell from the pitcher or a sign from a runner on second looking in at the catcher. But these are either situational or easily avoided by experienced pitchers. A pitch indicator that is much harder to avoid is the pitcher’s release point.

Sean Manaea’s release points for pitches thrown in 2018–2019

The figure above shows Oakland A’s pitcher Sean Manaea’s release point for all of his pitches thrown since 2018. As we can see from the plot, the release point differs fairly clearly by pitch type. Manaea tends to release his fastball lower than any other pitch, and his changeup is, on average, released at the highest point. An observant batter could pick up on these signals and use them to their advantage. No trash cans needed, just skill. However, not all pitchers have such distinct differences in their release point. Take Justin Verlander for example:

Justin Verlander’s release points for pitches in 2019

Looking at the graph above, we can see that Verlander’s release points are much more uniform. He doesn’t seem to have a very distinct difference in release point depending on his pitch type. This makes it much harder for a batter to predict what pitch Verlander is throwing based on differences in arm slot.

Quantifying Ability to Hide Pitches:

Now that we have seen this difference in ability to hide pitches, a natural question would be: how can we quantify this difference? I decided to quantify this ability with a classification model. If you don’t know what a classification model is, here is a quick summary. A classification model is a machine learning model that attempts to assign each data point to a class based on certain features. In this particular model, the ‘classes’ are the pitch types and the features are the coordinates of the release point and the count the pitch was thrown in. So the model takes the release point coordinates and the count and does its best to determine what type of pitch was thrown based on that information.
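
To make the idea concrete, here is a minimal sketch of a classifier that guesses pitch type from release point and count. It is not the model from this post: it swaps in a simple nearest-centroid rule written in plain Python, and the pitches, the pitch-type codes and the function names `fit_centroids` and `predict` are all made up for illustration.

```python
from collections import defaultdict
from math import dist


def fit_centroids(rows, labels):
    """Average the feature vectors of each pitch type."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*members))
        for label, members in grouped.items()
    }


def predict(centroids, row):
    """Return the pitch type whose centroid is closest to this pitch."""
    return min(centroids, key=lambda label: dist(centroids[label], row))


# Made-up pitches: (release_x in ft, release_z in ft, balls, strikes)
train_rows = [
    (-2.0, 5.5, 0, 0), (-2.1, 5.6, 1, 0), (-1.9, 5.4, 3, 1),  # fastballs
    (-1.5, 6.1, 0, 2), (-1.4, 6.0, 1, 2), (-1.6, 6.2, 0, 1),  # changeups
]
train_labels = ["FF", "FF", "FF", "CH", "CH", "CH"]

centroids = fit_centroids(train_rows, train_labels)
guess = predict(centroids, (-1.5, 6.05, 1, 2))
print(guess)
```

A real model would be trained on thousands of pitches and evaluated on pitches it has not seen, but the inputs and output have the same shape: release coordinates and count go in, a pitch type comes out.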

Feature Selection:

Of course, if I simply wanted to accurately classify what pitch was thrown I could include spin rate and the movement metrics of the pitch as features to make a much more accurate model. But I want to quantify how well pitchers hide their pitches from batters so I only want information that is available to the batter up until the ball is released from the pitcher’s hand. As a consequence, in this model we only have the release coordinates of the pitch and the count it was thrown in as features.
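
As a rough sketch of that restriction, the snippet below keeps only the fields a batter could observe up to release and discards spin and movement. The field names follow Statcast-style column naming, which is an assumption here (the post does not say which columns were used), and the pitch record and the helper `select_features` are hypothetical.

```python
# Fields available to the batter up to the moment of release (assumed names).
PRE_RELEASE_FEATURES = ("release_pos_x", "release_pos_z", "balls", "strikes")


def select_features(pitch):
    """Keep only pre-release information; spin and movement are left out."""
    return tuple(pitch[name] for name in PRE_RELEASE_FEATURES)


# Made-up pitch record for illustration.
pitch = {
    "pitch_type": "FF",
    "release_pos_x": -2.0,
    "release_pos_z": 5.5,
    "balls": 1,
    "strikes": 2,
    "release_spin_rate": 2400,  # known only after release, so excluded
    "pfx_x": -0.6,              # movement, excluded
    "pfx_z": 1.4,               # movement, excluded
}

features = select_features(pitch)
label = pitch["pitch_type"]
print(features, label)
```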

Evaluating the Model:

Once the model has attempted to classify the data it is given, we need a way to evaluate how well it classified the data. This is how we will measure a pitcher’s ability to hide pitches from batters. The metric used to evaluate the model is called the precision score. This essentially returns the proportion of pitches that were classified correctly so it will range from 0 to 1. If the model is able to classify a large proportion of pitches correctly (a precision score value closer to 1) that tells us that the pitcher has more distinct differences in release points for his pitches and/or he is very predictable in the pitches he throws in certain counts. A precision score closer to 0 indicates that the model could not effectively classify the pitch type based on release point and count which tells us that the pitcher is much better at releasing pitches from the same point and mixes them up well depending on the count. One thing that needs to be kept in mind is that pitchers with a larger pitch repertoire will have a lower score simply because there are more pitches to classify. To counteract this I will be measuring how well the model does based on how much better it performs compared to simply randomly classifying the pitches. For example, if a pitcher has 4 pitches and you randomly guessed the pitch type you would expect to get 25% of them correct. So if the precision score for a pitcher with 4 pitches is 0.5, its adjusted score would be 2 because it is twice as effective compared to randomly guessing.
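
The adjustment described above can be sketched in a few lines. The function names `proportion_correct` and `adjusted_score` are made up for illustration, and the raw score is computed here as the plain share of correct guesses, matching the description in the text rather than any particular library's metric.

```python
def proportion_correct(predicted, actual):
    """Share of pitches whose predicted type matches the actual type."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def adjusted_score(score, n_pitch_types):
    """How many times better the model does than random guessing."""
    random_baseline = 1 / n_pitch_types
    return score / random_baseline


# The example from the text: a score of 0.5 for a four-pitch pitcher.
example = adjusted_score(0.5, 4)
print(example)  # 2.0
```

An adjusted score of 1 means the model does no better than guessing at random, so values near 1 point to a pitcher who hides his pitches well.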

Results:

Now that we have defined our model and evaluation metrics, let’s see the results. Here I picked 16 random pitchers from 2019 and ran their pitch data through the model.

Our ‘winner’ is Blake Snell, who has both the highest precision score and the highest adjusted score. Snell’s high score suggests that he has distinct release points for his different pitches. Let’s look at a plot of his release points to verify this.

Blake Snell’s release points

The graph above seems to fall in line with Snell’s high precision score. Snell appears to have very distinct areas where he releases his pitches with his changeup being released lower and to the right and his curveball and fastball being released higher and to the left.

This difference is made even clearer when compared to Gerrit Cole, who had the lowest precision score of the pitchers I tested.

Gerrit Cole’s release points

Gerrit Cole’s release points are much more muddled and there aren’t clear patterns for where he releases certain pitches. This makes it much more difficult for batters to pick up what pitch is being thrown out of his hand.

Applications:

While this little experiment was mostly for my own sake, I believe there are a couple of ways teams could use this data and model.

The first application would be for batters to find pitchers with distinct differences in release point. Teams and hitters can then look at the release point plots of those pitchers to find the patterns for the different pitches and use that information to help figure out what pitch is being thrown based on the pitcher’s arm slot. Granted, that is much easier said than done but skilled players could use that information to great effect.

The second application would be for pitchers. Pitchers that have high precision scores may want to try and lower their precision score by consistently releasing from the same point regardless of the pitch to make it harder for batters to pick up on what is being thrown.

If you have any more ideas for how this data can be applied feel free to let me know!

The code for the model and data can be found here on my GitHub. This is my first time using classification so any tips or criticisms are greatly appreciated.

Source: https://towardsdatascience.com/measuring-how-well-pitchers-hide-their-pitches-f61f076d91f4
