Measuring How Well Pitchers Hide Their Pitches
As the baseball community recently saw with the Astros’ 2017 cheating scandal, knowing what pitch is coming gives batters a game-breaking advantage. However, unless you have an intricate system of cameras and trash cans set up, knowing what pitch is about to be thrown is incredibly difficult. Batters have mere fractions of a second to pick up on signals that might indicate the type of pitch coming their way, and even less time to process and act on that information. Such a signal could be a subconscious tell from the pitcher or a sign from a runner on second base looking in. But these are either situational or easily avoided by experienced pitchers. A pitch indicator that is much harder to avoid is the pitcher’s release point.
The figure above shows the release points of every pitch Oakland A’s pitcher Sean Manaea has thrown since 2018. As the plot shows, his release point differs fairly distinctly by pitch type. Manaea tends to release his fastball lower than any other pitch, and his changeup is, on average, released at the highest point. An observant batter could pick up on these signals and use them to his advantage. No trash cans needed, just skill. However, not all pitchers have such distinct differences in their release point. Take Justin Verlander for example:
Looking at the graph above, we can see that Verlander’s release points are much more uniform. He doesn’t seem to have a very distinct difference in release point depending on his pitch type. This makes it much harder for a batter to predict what pitch Verlander is throwing based on differences in arm slot.
Quantifying Ability to Hide Pitches:
Now that we have seen this difference in ability to hide pitches, a natural question is: how can we quantify it? I decided to quantify this ability with a classification model. If you don’t know what a classification model is, here is a quick summary. A classification model is a machine learning model that attempts to assign each data point to a class based on certain features. In this particular model, the ‘classes’ are the pitch types, and the features are the coordinates of the release point and the count the pitch was thrown in. So the model takes the release point coordinates and the count and does its best to determine what type of pitch was thrown.
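To make the setup concrete, here is a minimal Python sketch of such a classifier. The article does not say which algorithm was used, so the nearest-centroid rule below is a stand-in chosen for brevity, and the function names, pitch-type codes, and numbers are all made up for illustration.

```python
from collections import defaultdict
from math import dist

# Each row is (release_x, release_z, balls, strikes); each label is a pitch type.
# Illustrative values only, not real pitch data.
train_X = [(-2.0, 5.8, 0, 0), (-2.1, 5.9, 1, 0), (-1.5, 6.3, 0, 2), (-1.4, 6.4, 1, 2)]
train_y = ["FF", "FF", "CH", "CH"]


def fit_centroids(features, labels):
    """Average the feature vectors of each pitch type (the 'classes')."""
    grouped = defaultdict(list)
    for row, label in zip(features, labels):
        grouped[label].append(row)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, row):
    """Return the pitch type whose centroid is closest to this pitch."""
    return min(centroids, key=lambda label: dist(centroids[label], row))


centroids = fit_centroids(train_X, train_y)
guess = predict(centroids, (-2.0, 5.9, 0, 1))  # a low release in a 0-1 count
```

Release coordinates and counts live on different scales, so a real model would likely standardize the features first; the sketch skips that step.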
Feature Selection:
Of course, if I simply wanted to classify pitches as accurately as possible, I could include the spin rate and movement metrics of the pitch as features and get a much more accurate model. But I want to quantify how well pitchers hide their pitches from batters, so I only want information that is available to the batter up until the ball leaves the pitcher’s hand. As a consequence, the only features in this model are the release coordinates of the pitch and the count it was thrown in.
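The restriction above can be sketched as a small filtering step. The column names follow the public Statcast CSV convention, and using exactly two release coordinates plus balls and strikes is an assumption; the article only says "release coordinates" and "count". The example record is made up.

```python
# Assumed column names (Statcast-style); the article does not list its columns.
PRE_RELEASE_FEATURES = ("release_pos_x", "release_pos_z", "balls", "strikes")
TARGET = "pitch_type"


def to_model_row(pitch):
    """Keep only what a batter could know at release, plus the label."""
    features = tuple(pitch[name] for name in PRE_RELEASE_FEATURES)
    return features, pitch[TARGET]


# Made-up record for illustration. Spin rate and movement are dropped on
# purpose: they are only observable after the ball leaves the hand.
example_pitch = {
    "pitch_type": "FF",
    "release_pos_x": -2.1,
    "release_pos_z": 5.9,
    "balls": 1,
    "strikes": 2,
    "release_spin_rate": 2300,
    "pfx_x": -0.6,
    "pfx_z": 1.4,
}

features, label = to_model_row(example_pitch)
```

Keeping the allowed features in one explicit tuple makes it harder to leak post-release information into the model by accident.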
Evaluating the Model:
Once the model has attempted to classify the data it is given, we need a way to evaluate how well it did. This is how we will measure a pitcher’s ability to hide pitches from batters. The metric used to evaluate the model is called the precision score. It essentially returns the proportion of pitches that were classified correctly, so it ranges from 0 to 1. A precision score closer to 1 means the model classified a large proportion of pitches correctly, which tells us that the pitcher has more distinct differences in release point between his pitches and/or is very predictable in the pitches he throws in certain counts. A precision score closer to 0 indicates that the model could not effectively classify the pitch type based on release point and count, which tells us that the pitcher is much better at releasing his pitches from the same point and mixes them up well across counts.
One thing to keep in mind is that pitchers with a larger repertoire will tend to have a lower score simply because there are more pitch types to classify. To counteract this, I measure how much better the model performs than randomly guessing the pitch type. For example, if a pitcher has 4 pitches and you guessed the pitch type at random, you would expect to get 25% of them correct. So if the precision score for a pitcher with 4 pitches is 0.5, his adjusted score is 0.5 / 0.25 = 2, because the model is twice as effective as random guessing.
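The adjustment described above can be written out as a short Python sketch. The function names are hypothetical, and the first function computes the share of correct predictions, which is how the text describes the precision score; the author's actual scoring call is not shown in the article.

```python
def fraction_correct(predicted, actual):
    """Share of pitches whose predicted type matches the actual type."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def adjusted_score(score, n_pitch_types):
    """Score relative to guessing uniformly at random among n pitch types."""
    if n_pitch_types < 1:
        raise ValueError("a pitcher needs at least one pitch type")
    random_baseline = 1 / n_pitch_types  # e.g. 0.25 for a 4-pitch repertoire
    return score / random_baseline


# The worked example from the text: 4 pitches, precision score of 0.5.
example = adjusted_score(0.5, 4)  # 2.0
```

The baseline here assumes every pitch type is equally likely to be guessed. A pitcher who leans heavily on one pitch could be beaten by always guessing that pitch, so a baseline built on his most common pitch would be a stricter comparison.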
Results:
Now that we have defined our model and evaluation metric, let’s see the results. I picked 16 pitchers at random from 2019 and ran their pitch data through the model.
Our ‘winner’ is Blake Snell, who has both the highest precision score and the highest adjusted score. Snell’s high score suggests that he has distinct release points for his different pitches. Let’s look at a plot of his release points to verify this.
The graph above falls in line with Snell’s high precision score. Snell appears to have very distinct areas where he releases his pitches: his changeup is released lower and to the right, while his curveball and fastball are released higher and to the left.
The contrast becomes even clearer when Snell is compared to Gerrit Cole, who had the lowest precision score of the pitchers I tested.
Gerrit Cole’s release points are much more muddled and there aren’t clear patterns for where he releases certain pitches. This makes it much more difficult for batters to pick up what pitch is being thrown out of his hand.
Applications:
While this little experiment was mostly for my own sake, I believe there are a couple of ways teams could use this data and model.
The first application would be for batters to find pitchers with distinct differences in release point. Teams and hitters can then look at the release point plots of those pitchers to find the patterns for the different pitches and use that information to help figure out what pitch is being thrown based on the pitcher’s arm slot. Granted, that is much easier said than done but skilled players could use that information to great effect.
The second application would be for pitchers. Pitchers with high precision scores may want to try to lower them by consistently releasing from the same point regardless of the pitch, making it harder for batters to pick up on what is being thrown.
If you have any more ideas for how this data can be applied feel free to let me know!
The code for the model and data can be found here on my GitHub. This is my first time using classification so any tips or criticisms are greatly appreciated.
Translated from: https://towardsdatascience.com/measuring-how-well-pitchers-hide-their-pitches-f61f076d91f4