How many of you use p = 0.05 as an absolute cutoff? p ≥ 0.05 means not significant: no evidence, nothing. And p < 0.05 means great, it's significant. This is a crude way of using p-values, and hopefully I will convince you of that.
What is a p-value?
A lot of us apply this arbitrary cutoff without actually knowing the theory behind a p-value. A p-value is the probability, under the null hypothesis, of observing data at least as extreme as the data actually observed. It is not, for example, the probability that some population parameter x = 0. In a frequentist setting, x either equals 0 or it does not.
So, the smaller the p-value, the less likely it is that data this extreme would have been observed under the null hypothesis. In essence, the smaller the p-value, the stronger the evidence against the null hypothesis.
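To make the definition concrete, here is a minimal sketch using a made-up example that is not from the article: we flip a coin 100 times and ask how surprising the number of heads would be if the coin were fair (the null hypothesis). The function name `binomial_tail_p_value` and the numbers are illustrative assumptions. The p-value here is simply the probability of seeing at least as many heads as we did.

```python
from math import comb

def binomial_tail_p_value(heads: int, flips: int, null_prob: float = 0.5) -> float:
    """One-sided p-value: P(X >= heads) when X ~ Binomial(flips, null_prob)."""
    return sum(
        comb(flips, k) * null_prob**k * (1 - null_prob) ** (flips - k)
        for k in range(heads, flips + 1)
    )

# Hypothetical outcomes: the further from 50 heads, the smaller the p-value.
for observed_heads in (55, 60, 65, 70):
    p = binomial_tail_p_value(observed_heads, 100)
    print(f"{observed_heads} heads out of 100: p = {p:.4f}")
```

Nothing special happens as the p-value crosses 0.05. It just keeps shrinking as the data move further from what the null hypothesis predicts.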
What affects p-values?
Two things, mainly. The first is the strength of the effect: the greater the difference from the null hypothesis, the smaller the p-value will be.
The second is the sample size. The larger the sample, the smaller the p-value will tend to be (provided the null hypothesis is in fact false).
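Both influences can be seen in a small sketch. It assumes a simple one-sample z-test with known variance, which the article does not specify, and it pretends the observed standardized effect equals the true one. The function name `two_sided_p` and the grid of values are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(effect: float, n: int) -> float:
    """Two-sided p-value of a one-sample z-test (assumed test).

    `effect` is the observed difference from the null in standard-deviation units.
    """
    z = effect * sqrt(n)  # the test statistic grows with both effect and sample size
    return 2 * (1 - NormalDist().cdf(abs(z)))

for effect in (0.1, 0.2, 0.5):
    for n in (25, 100, 400):
        print(f"effect = {effect}, n = {n:>3}: p = {two_sided_p(effect, n):.4f}")
```

Reading across the output, the same effect gives a smaller p-value as n grows, and the same n gives a smaller p-value as the effect grows.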
So if p ≥ 0.05, it could be because the effect isn't that strong (or doesn't exist), or because our sample is too small, leaving the test underpowered to detect a difference.
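The idea of an underpowered test can be sketched numerically. The example below again assumes a two-sided one-sample z-test and a hypothetical true effect of 0.2 standard deviations; neither comes from the article. Power is conventionally defined relative to a significance level, so 0.05 appears here only as that convention.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def z_test_power(effect: float, n: int, alpha: float = 0.05) -> float:
    """Probability that a two-sided one-sample z-test gives p < alpha
    when the true standardized effect is `effect` (assumed setup)."""
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n)  # where the test statistic is centred if the effect is real
    upper_tail = 1 - STANDARD_NORMAL.cdf(z_crit - shift)
    lower_tail = STANDARD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail

for n in (25, 50, 100, 200):
    print(f"n = {n:>3}: power = {z_test_power(0.2, n):.2f}")
```

With the smallest sample in this sketch, a real effect would be missed most of the time, so a large p-value there says very little about whether the effect exists.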
Some examples
A deadly drug
Suppose we were looking at adverse events of a new drug. Now suppose we get p = 0.051 for evidence that the drug increases the rate of deaths. If we used p = 0.05 as a cutoff, then great: no evidence that the drug increases the rate of deaths, so let's put it into production. Now imagine instead that we get p = 0.049 for an increase in the rate of deaths. Oh no! There's evidence that the drug is harmful. Let's not put it into production.
Mathematically, there's not really a difference between the two. The strength of evidence is essentially the same. But by using this arbitrary cutoff we reach very different conclusions.
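One way to see how close the two results are is to convert each p-value back into a test statistic. This sketch assumes a two-sided z-test, which the article does not state, and the helper name `z_from_p` is made up for illustration.

```python
from statistics import NormalDist

def z_from_p(p_value: float) -> float:
    """Absolute z statistic that yields this two-sided p-value (assumed z-test)."""
    return NormalDist().inv_cdf(1 - p_value / 2)

z_not_significant = z_from_p(0.051)
z_significant = z_from_p(0.049)

print(f"p = 0.051 corresponds to |z| = {z_not_significant:.3f}")
print(f"p = 0.049 corresponds to |z| = {z_significant:.3f}")
print(f"difference in |z| = {z_significant - z_not_significant:.3f}")
```

Under these assumptions the two test statistics differ by less than 0.02 standard errors, yet a hard cutoff sends them to opposite decisions.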
Does this drug work?
Now imagine another drug. We've got a very large sample (n = 10,000) and we want to know whether this drug cures cancer. We get p = 0.049 for evidence that it does. Great! Significant evidence this drug cures cancer. Let's give it to everyone.
But it's a large sample. Wouldn't we expect p to be smaller if the drug really worked well? This isn't very strong evidence against the null hypothesis: if the drug did nothing, we would still see results at least this extreme roughly one time in twenty. Now suppose this drug is really expensive. Do we really want to start giving it out to everyone based on some fairly weak evidence? Probably not.
Now of course if p = 0.001, results at least this extreme would turn up only about one time in a thousand if the drug did nothing. This would be much stronger evidence that the drug works.
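To put a rough number on why a large sample changes how we read a p-value, the sketch below works out what standardized effect would produce a given p-value at a given sample size. It assumes a two-sided one-sample z-test, which is a simplification and not something the article specifies; `implied_effect` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def implied_effect(p_value: float, n: int) -> float:
    """Standardized effect (in standard deviations) that would give this
    two-sided p-value in a one-sample z-test with n observations (assumed test)."""
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return z / sqrt(n)

for p_value in (0.049, 0.001):
    for n in (100, 10_000):
        effect = implied_effect(p_value, n)
        print(f"p = {p_value}, n = {n:>6}: implied effect = {effect:.4f} SD")
```

In this simplified setting, p = 0.049 with n = 10,000 corresponds to an effect of only about 0.02 standard deviations, which is worth knowing before paying for an expensive drug.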
So how should we interpret p-values?
As a continuous scale. The smaller the p-value is, the stronger the evidence is. But you should take the sample size and effect size into account. You should also consider whether you are looking at something positive or negative. If we are looking at something like our deadly drug example, we should be concerned even if the evidence is very weak. However, with something like wanting to know whether a drug works, we can afford to be much more sceptical about our result.
So, hopefully, in the future you'll stop treating p = 0.05 as some threshold plucked out of thin air and consider the p-value as what it truly is: the weight of evidence against the null hypothesis. And, of course, if you don't have the evidence you need, that isn't necessarily because the effect doesn't exist. It could be that you lack the statistical power to detect it.
Translated from: https://towardsdatascience.com/stop-using-p-0-05-4a059e622c75