Sybill Trelawney and the Confidence Interval Conundrum
I’m going to try something a little different today, in which I combine two (completely unrelated) topics I love talking about, and hopefully create something that is interesting and educational.
…Actually, scratch that. If you’ve ever read anything I’ve written on Medium, this is actually pretty on brand (Explaining Linear Regression to Michael Scott, anyone?) In today’s article, I am going to explain what confidence intervals are, how to calculate them, and how to interpret them — all through the use case of Harry Potter’s infamous Divination teacher.
For those of you who are unfamiliar with Harry Potter and Sybill Trelawney (which I’m assuming is nobody currently reading this, because it means you’ve been without internet for 20+ years), she is a witch who can “see into the future” through 99% educated guessing and 1% true prophecies. Confidence intervals, at least in my opinion, are pretty similar: making educated guesses about entire populations based on a small set of observations. However, instead of crystal balls and tea leaves, our tools are standard deviations and sample means.
What is a Confidence Interval, and Why do People Use Them?
Professor Trelawney has had thousands of students during her tenure at Hogwarts, and Dumbledore is doing an audit of her work to see how accurate she really is at predicting the future. However, Dumbledore is a busy guy (trying to hunt down Horcruxes and run a school) and can’t reach out to every student who has ever taken Trelawney’s class. So instead, he randomly selects a sample of 300 previous students and asks them what percentage of her predictions have come true so far.
Dumbledore collects the responses and finds that the distribution of responses is left skewed, which means students are more likely to find Trelawney’s predictions to be inaccurate than accurate. In fact, the sample average is 55.6%, which means that Trelawney’s “predictions” are barely better than random guessing.
This is pretty concerning to Dumbledore, because he does not want a teacher who is essentially guessing when they teach Divination. He wonders what the true mean accuracy of her predictions is, and if it might actually be lower than the sample mean of 55.6% if he had reached out to every student she ever taught.
Luckily, Dumbledore remembers a Muggle concept called the Central Limit Theorem (what can I say, he’s a multifaceted guy), which gives him the ability to approximate the true mean accuracy of Trelawney’s predictions. The Central Limit Theorem applies when you have a sample of data and want to estimate a true population statistic (the mean is the common example, so I will use it in this article, but it can be other measures as well). Imagine repeatedly drawing random samples and finding the mean of each one (e.g. random samples of 30 people at a time, when you have a total sample of 300 people and a total population of 30,000 people). If you have enough samples and the samples fulfill certain criteria (they are random, independent, and large enough), the sample means will fall into an approximately normal distribution (regardless of the original distribution of the data), and the mean of the sampling distribution of the sample mean (I hate statistics sometimes) will approximate the true population mean (the mean you would get if you had actually surveyed all 30,000 people in the population).
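To make the theorem a bit more concrete, here is a minimal simulation sketch. The population is made up for illustration (30,000 skewed scores drawn from a beta distribution, not Dumbledore's real data), and the helper name `sample_means` is my own. It draws many random samples of 30 and compares the mean of the sample means with the population mean.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical skewed population of 30,000 accuracy scores between 0 and 100.
rng = random.Random(42)
population = [100 * rng.betavariate(2, 5) for _ in range(30_000)]

means = sample_means(population, sample_size=30, num_samples=2_000)

print("population mean:       ", round(statistics.fmean(population), 2))
print("mean of sample means:  ", round(statistics.fmean(means), 2))
print("spread of sample means:", round(statistics.stdev(means), 2))
```

Even though the population itself is skewed, a histogram of `means` should look roughly bell shaped, and its center should sit close to the population mean.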
Based on this concept, Dumbledore knows that the 55.6% average accuracy is somewhere on the normally distributed curve of sample means, and the true accuracy is in the middle of this same distribution.
However, he still doesn’t know exactly where the sample mean lies on the distribution, and how close it is to the actual population mean (denoted as μ). This is where confidence intervals come in.
Confidence intervals create an interval centered around the sample mean whose width is measured in standard deviations of the sampling distribution (also called standard errors). Based on the size of the interval, you can declare that intervals built this way will include the true population mean x% of the time. Many researchers and companies use 95% confidence intervals (which span roughly 2 standard errors from the sample mean on each side; 1.96 is the more precise figure), but based on your use case, you can go higher or lower in confidence (e.g. 3 standard errors away from the mean gives about 99.7% confidence).
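If you want to check where the "2" and "3" come from, the standard normal distribution gives the mapping in both directions. This is a small sketch using Python's standard library; the function names `z_for_confidence` and `confidence_for_z` are my own, not from the original article.

```python
from statistics import NormalDist


def z_for_confidence(confidence):
    """Two-sided z multiplier for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


def confidence_for_z(z):
    """Share of a normal distribution within z standard deviations of its mean."""
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_for_confidence(level):.3f}")

for z in (1, 2, 3):
    print(f"z = {z} -> {confidence_for_z(z):.2%} of sample means")
```

A 95% level corresponds to z of about 1.96, while z = 2 covers about 95.45% and z = 3 about 99.73%, which is why "2 standard deviations" is a convenient shorthand.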
Based on the example above, you can see how sample means that fall within 2 standard deviations of the population mean (inside the dark purple lines) will include the population mean (dotted blue line) in their 95% confidence interval. However, the sample means that are located more than 2 standard deviations from the population mean will not include the population mean in their confidence interval.
Since 95% of the values in a normal distribution reside within 2 standard deviations from the mean, we can be confident that 95% of the sample means will be within 2 standard deviations from the population mean. Following this, we can also say that 95% of all random samples taken from this population will include the true population mean in their 95% confidence interval.
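The coverage claim can be checked by brute force. The sketch below assumes a made-up population of 30,000 scores (normally distributed around 55 with a spread of 13, chosen only to resemble the article's numbers). It draws 2,000 samples of 300 and counts how often the interval of 2 standard errors around each sample mean contains the true mean. `interval_covers` is a hypothetical helper name.

```python
import math
import random
import statistics


def interval_covers(sample, true_mean, z=2.0):
    """Return True if the sample's z-interval contains true_mean."""
    mean = statistics.fmean(sample)
    margin = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - margin <= true_mean <= mean + margin


# Hypothetical population; not the real Hogwarts data.
rng = random.Random(7)
population = [rng.gauss(55, 13) for _ in range(30_000)]
true_mean = statistics.fmean(population)

trials = 2_000
hits = sum(
    interval_covers(rng.sample(population, 300), true_mean)
    for _ in range(trials)
)
coverage = hits / trials
print(f"intervals containing the true mean: {coverage:.1%}")
```

The printed share should land near 95%, with the remaining intervals being the unlucky samples that miss the population mean entirely.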
How to Calculate a Confidence Interval
Now that we have a general idea of what confidence intervals are used for and why they work, we can return to Hogwarts and help Dumbledore calculate the confidence interval for Professor Trelawney’s prediction accuracy.
The generic formula for calculating a confidence interval is x̅ ± z * s / √n, where x̅ is the sample mean, z is the number of standard deviations from the mean we want the interval to span, s is the standard deviation of the sample, and n is the number of observations in the sample.
The second half of the formula (z * s / √n) is the margin of error: the actual size of z standard deviations of the sampling distribution (z = 2 is the usual approximation for a 95% confidence interval). You add this value to x̅ and subtract it from x̅ to find the upper and lower bounds of your confidence interval.
If Dumbledore plugs his survey values into this formula (x̅ = 55.6, s = 12.9, n = 300, z = 2), he will get:
Upper-bound
= 55.6 + (2 x (12.9 / √300))
= 57.08%
Lower-bound
= 55.6 - (2 x (12.9 / √300))
= 54.12%
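The same arithmetic can be reproduced in a few lines. This is only a sketch of the formula x̅ ± z * s / √n with the article's values plugged in; the function name `confidence_interval` is my own.

```python
import math


def confidence_interval(sample_mean, sample_sd, n, z=2.0):
    """Return (lower, upper) bounds of sample_mean ± z * sample_sd / sqrt(n)."""
    margin = z * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


lower, upper = confidence_interval(sample_mean=55.6, sample_sd=12.9, n=300)
print(f"{lower:.2f}% to {upper:.2f}%")  # prints "54.11% to 57.09%"
```

Without intermediate rounding the bounds come out a hundredth of a point away from the hand calculation, presumably because the margin was rounded down along the way. Passing `z=1.96` instead of 2 would give a slightly narrower interval.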
This means there is a good chance the true accuracy of Professor Trelawney’s predictions is between 54.12% and 57.08%. However, since Dumbledore only assessed one sample, we cannot say whether this sample average (55.6%) was within 2 standard deviations of the population mean, and therefore whether its confidence interval successfully captured the true population average. There is a chance the sample average happened to be one of the 5% of sample means that are > 2 standard deviations from the population mean (aka one of the BAD BOIS in the red tails of the distribution), and therefore its interval completely missed the population mean.
Professor Trelawney’s Fate (aka Dumbledore’s next steps)
Dumbledore has a few options for next steps:
- He can send out the survey to a few more cohorts of 300 students each, and start plotting the cohort averages in a distribution. This will give him a better idea of where the 55.6% falls (i.e. close to the other values and likely within 2 standard deviations of the population mean, or significantly further from the other values and likely > 2 standard deviations away).
- He can re-run the same analysis, this time taking smaller groups out of his 300 responses and plotting their means in a distribution (e.g. 100 random groups of 30 students each, with replacement). The mean of this distribution will give him a good idea of what the actual population mean is, without needing to use a confidence interval. A sample size of 30 (n=30) is the “rule-of-thumb” for the Central Limit Theorem to work well, so Dumbledore actually went above and beyond with his original sample size of 300.
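The second option is essentially a resampling exercise, and it can be sketched as follows. The 300 responses here are simulated stand-ins (the real survey data is not in the article), generated to roughly match a mean of 55.6 and a standard deviation of 12.9.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical stand-in for the 300 survey responses, clipped to 0-100.
responses = [min(100.0, max(0.0, rng.gauss(55.6, 12.9))) for _ in range(300)]

# 100 random groups of 30 students each, drawn with replacement.
group_means = [
    statistics.fmean(rng.choices(responses, k=30))
    for _ in range(100)
]

print("mean of the 300 responses:", round(statistics.fmean(responses), 2))
print("mean of the group means:  ", round(statistics.fmean(group_means), 2))
print("spread of the group means:", round(statistics.stdev(group_means), 2))
```

One caveat worth keeping in mind: the group means cluster around the mean of the original 300 responses, so this approach inherits whatever luck (good or bad) went into that first sample.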
Dumbledore considers his options, and decides to accept the confidence interval of 54.12% — 57.08% prediction accuracy for now. He is not terribly thrilled with the results, but he is relieved to see that the lower-bound of the interval is still greater than 50%, which means — at the very least — Professor Trelawney’s predictions are better than random guessing!
He has summarized his findings, and will present them to Professor Trelawney in her next performance review. Maybe he will recommend that she add confidence intervals to her syllabus in the coming years 🧙🏼♀️🔮
Translated from: https://towardsdatascience.com/sybill-trelawney-and-the-confidence-interval-conundrum-df7659e3fc59