Jensen's Inequality: Mind the Jensen Gap

Background

In Kaggle’s M5 Forecasting — Accuracy competition, the square root transformation ruined many of my team’s forecasts and led to a selective patching effort in the eleventh hour. Although it turned out well, we were reminded that “reconstitution bias” can plague predictions on the original scale, even with common transformations such as the square root.

The square root transformation

For Poisson data, the rationale for the square root is that it is a variance-stabilizing transformation; in theory, the square roots of the values are approximately normally distributed, with constant variance and a mean that is approximately the square root of the original mean. It is an approximation, and as Wikipedia puts it, one in which the “convergence to normality (as [the original mean] increases) is far faster than the untransformed variable.”

Imagine you decide to take square roots in a count data scenario, feeling reassured that the convergence to normality is “fast.” You model the mean of the square-root transformed data and get predictions on the square root scale. At some point, especially in a forecasting scenario, you’ll have to get back to the original scale. That probably entails squaring the model-estimated means. The M5 competition served as a reminder that this approach can and will break down.

The Jensen Gap

Jensen’s Inequality states that for convex functions, the function evaluated at the expectation is less than or equal to the expectation of the function, i.e., g(E[Y]) ≤ E[g(Y)]. The inequality is flipped for concave functions.

Similarly, the Jensen Gap is defined as the difference E[g(Y)] - g(E[Y]), which is non-negative for convex functions g. (As an aside, notice that when g(x) is the square function, the Jensen Gap is the variance of Y, which had better be non-negative!)

When considering g(x) as the square function and the square root of Y as the random variable, the Jensen Gap becomes E[Y]-E[sqrt(Y)]². Since that quantity is positive, our reconstituted mean will be biased downward. To learn more about the magnitude of the gap, we turn to the Taylor expansion.
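
To get a feel for the size of this gap, here is a minimal Python sketch that computes E[Y] - E[sqrt(Y)]² for a Poisson variable by summing over its probability mass function. It is not the demonstration code referenced later in this post; the function names (poisson_pmf, expected_sqrt, jensen_gap) and the stopping tolerance are illustrative assumptions.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ Poisson(lam), computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def expected_sqrt(lam: float, tol: float = 1e-15) -> float:
    """E[sqrt(Y)]: sum sqrt(k) * P(Y = k) until the upper tail is negligible."""
    total, k = 0.0, 1  # the k = 0 term contributes nothing
    while True:
        term = math.sqrt(k) * poisson_pmf(k, lam)
        total += term
        if k > lam and term < tol:
            return total
        k += 1


def jensen_gap(lam: float) -> float:
    """E[Y] - E[sqrt(Y)]^2: how far the squared square-root-scale mean falls short."""
    return lam - expected_sqrt(lam) ** 2


for lam in (20.0, 0.2):
    gap = jensen_gap(lam)
    print(f"mean={lam}: gap={gap:.4f}, relative gap={gap / lam:.1%}")
```

The gap is also the variance of sqrt(Y), so for large means it settles near the stabilized variance of about 1/4. What matters for forecasting is the gap relative to the mean, which is small when the mean is 20 and very large when the mean is 0.2.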

Taylor expansion to approximate bias

To the Mathematics StackExchange prompt “Expected Value of Square Root of Poisson Random Variable,” contributor Hernan Gonzalez explains the Taylor expansion of a random variable about its mean, as shown in the screenshot below.

[Screenshot: Taylor expansion of a random variable about its mean, from Hernan Gonzalez’s answer]

Note that the expansion needs at least a few central moments of the original distribution. For the Poisson, the mean, the variance, and the third central moment are all equal to the mean parameter.
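
Since the screenshot is not reproduced here, the following is a sketch of the general third-order form, which may differ in presentation from the answer itself. Expanding g(Y) about \mu = E[Y] and taking expectations gives the first line, where \mu_2 and \mu_3 denote the second and third central moments. Setting g to the square root and using the Poisson moments \mu = \mu_2 = \mu_3 = \lambda gives the second line. The symbols \mu, \mu_2, \mu_3 and \lambda are notation assumed for this sketch.

```latex
E[g(Y)] \approx g(\mu) + \frac{g''(\mu)}{2}\,\mu_2 + \frac{g'''(\mu)}{6}\,\mu_3

E\left[\sqrt{Y}\right] \approx \sqrt{\lambda} - \frac{1}{8}\,\lambda^{-1/2} + \frac{1}{16}\,\lambda^{-3/2}
```

The two correction terms after \sqrt{\lambda} sum to about -0.027 when \lambda = 20 and to about 0.419 when \lambda = 0.2, which matches the two-term Taylor figures quoted in the demonstration.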

Ignoring that the mean estimator is also a random variable, we can run the expectation above through the inverse transformation, i.e., square it, to get an idea of the bias on the original scale for any Poisson mean value (the algebra isn’t shown here, but it’s computed in line 34 of the demonstration code). Similarly, with properties of the square root of the random variable, it’s straightforward to analyze g(x) = x^2 in the same way. That opens up the possibility of bias correction, an interesting proposition, albeit one with assumptions and complexities of its own.

Approximation breakdown

Near the end of his answer, Gonzalez mentions that the approximation “is only useful if” the mean of the original Poisson is quite a bit bigger than 1, clarifying in the comments that this is needed so that “the terms of the sum decrease quickly.” That follows from the mean being raised to negative powers after the original term.

In the M5 competition, mean sales for many items were substantially below one, and thus using the square root transformation was a recipe for poor performance. To get an idea of how this plays out in an actual sample, the next section will investigate this phenomenon via simulation.

Demonstration

In this section, we use the loess smoother to create models on both the original scale and the square root scale, and square the mean estimates of the latter. For simulated Poisson data with a mean of 20 and with a mean of 0.2, we plot the two sets of predictions and examine the bias. The code is under 50 lines and is available in Nousot’s public GitHub repository.
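
The repository code is not reproduced here. As a rough, simplified stand-in, the Python sketch below replaces the loess smoother with a plain sample mean (a constant-mean model) and compares the mean on the original scale with the squared mean on the square root scale. The sampler, the sample size, the seed, and the names sample_poisson and compare_scales are all assumptions made for illustration.

```python
import math
import random


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def compare_scales(lam: float, n: int = 20000, seed: int = 0) -> tuple[float, float]:
    """Return (mean on the original scale, squared mean of the square roots)."""
    rng = random.Random(seed)
    counts = [sample_poisson(lam, rng) for _ in range(n)]
    direct = sum(counts) / n
    sqrt_scale_mean = sum(math.sqrt(c) for c in counts) / n
    return direct, sqrt_scale_mean ** 2


for lam in (20.0, 0.2):
    direct, back_transformed = compare_scales(lam)
    print(f"true mean={lam}: direct={direct:.3f}, back-transformed={back_transformed:.3f}")
```

Because Jensen’s Inequality also applies to the empirical distribution of the sample, the back-transformed estimate can never exceed the direct one. The only question is by how much it falls short.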

When the mean is 20

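[Figure: predictions on the original scale and squared square-root-scale predictions for simulated Poisson data with mean 20]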

For the case where the mean of the Poisson random variable is 20, the retransformation bias is negative (as Jensen’s Inequality said it would be), but also relatively small. In the code, the first two terms of the Taylor expansion are computed and compared to the empirical bias on the square root scale. At -0.027 and -0.023, respectively, they are relatively close.

When the mean is 0.20

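[Figure: predictions on the original scale and squared square-root-scale predictions for simulated Poisson data with mean 0.20]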

For the case where the mean of the Poisson random variable is 0.20, the picture is much different. While Jensen’s Inequality always holds, the Jensen Gap is now large in a relative sense. Furthermore, the Taylor approximation has completely broken down, with the first two bias terms summing to 0.419 while the empirical bias is -0.251 (still on the square root scale).
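
The breakdown can be checked without simulation. The sketch below assumes the two bias terms take the form -λ^(-1/2)/8 + λ^(-3/2)/16 (an assumption, though one that matches the -0.027 and 0.419 figures above) and compares their sum with the bias of E[sqrt(Y)] computed from a truncated sum over the Poisson probability mass function. The function names and the truncation point k_max are illustrative choices.

```python
import math


def taylor_bias(lam: float) -> float:
    """Two-term Taylor approximation of E[sqrt(Y)] - sqrt(lam) (assumed form)."""
    return -lam ** -0.5 / 8 + lam ** -1.5 / 16


def exact_bias(lam: float, k_max: int = 200) -> float:
    """E[sqrt(Y)] - sqrt(lam), with the expectation summed over k = 1..k_max."""
    expected_sqrt = sum(
        math.sqrt(k) * math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        for k in range(1, k_max + 1)
    )
    return expected_sqrt - math.sqrt(lam)


for lam in (20.0, 0.2):
    print(f"mean={lam}: taylor={taylor_bias(lam):+.3f}, exact={exact_bias(lam):+.3f}")
```

With a mean of 20 the two numbers agree closely. With a mean of 0.2 the approximation has the wrong sign, because the second term, which carries the larger negative power of the mean, dominates the first.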

Discussion

David Warton’s 2018 paper “Why You Cannot Transform Your Way Out of Trouble for Small Counts” demonstrates the hopelessness of getting to the standard assumptions for small-mean count data. For the sparse time series in M5, there was nothing to gain and a lot to lose by taking the square root. At the very least, we should have treated those series differently. (Regarding our use of the Kalman Filter, Otto Seiskari’s advice to tune via cross-validation when the model is misspecified is especially compelling).

Warton’s paper has some harsh words for users of transformations in general. I still believe that if a transformation brings you closer to the standard assumptions, where your code runs faster and you enjoy nicer properties, then it’s worth considering. But there needs to be an honest exploration of properties of the transformation in the context of the data, and this does not come for free.

Typically, transformations (and their inverses) are either convex or concave, and thus Jensen’s Inequality will guarantee bias in the form of a Jensen Gap. If you’re wondering why you’ve never heard of it, it’s because it’s often written off as approximation error. According to Gao et al. (2018),

“Computing a hard-to-compute [expectation of a function] appears in theoretical estimates in a variety of scenarios from statistical mechanics to machine learning theory. A common approach to tackle this problem is to … show that the error, i.e., the Jensen gap, would be small enough for the application.”

When using transformations, the work to understand the properties of inverse-transformation (in the context of the data) is worth it. It’s dangerous out there. Watch your step, and mind the Jensen Gap!

Original article: https://towardsdatascience.com/mind-the-jensen-gap-c54e0eb9e1b7
