The Reasoning Behind Bessel's Correction: n-1


A standard deviation seems like a simple enough concept. It's a measure of dispersion of data, and is the square root of the summed squared differences between the mean and the data points, divided by the number of data points…minus one to correct for bias.

This is, I believe, the most oversimplified and maddening concept for any learner, and the intent of this post is to provide a clear and intuitive explanation for Bessel’s Correction, or n-1.


To start, recall the formula for a population mean:


[Formula] Population Mean: μ = (Σ xᵢ) / N

What about a sample mean?


[Formula] Sample Mean: x̄ = (Σ xᵢ) / n

Well, they look identical, except for the lowercase n. In each case, you just add up each xᵢ and divide by how many x's there are. If we are dealing with an entire population, we use N, instead of n, to indicate the total number of points in the population.
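In code, the two formulas are literally the same operation applied to different sets of points. A minimal NumPy sketch (the variable names here are my own, not from the original notebook):

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(loc=0.0, scale=1.0, size=1_000_000)  # N points

# Population mean: add every x_i, divide by N
mu = population.sum() / population.size

# Sample mean: the same formula, applied to an n-point subset
sample = rng.choice(population, size=100, replace=False)
x_bar = sample.sum() / sample.size

print(mu, x_bar)  # both land near 0.0; x_bar varies more from sample to sample
```

The only thing that changes between the two is which points you feed in.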

Now, what is standard deviation σ (called sigma)?


If a population contains N points, then the standard deviation is the square root of the variance, which is the summed-and-averaged squared difference between each data point and the population mean, μ:

[Formula] Population Standard Deviation: σ = √( Σ (xᵢ − μ)² / N )

But what about a sample standard deviation, s, with n data points and sample mean x-bar:


[Formula] Sample Standard Deviation: s = √( Σ (xᵢ − x̄)² / (n − 1) )

Alas, the dreaded n-1 appears. Why? Shouldn’t it be the same formula? It was virtually the same formula for population mean and sample mean!


The short answer is: this is very complex, to such an extent that most instructors explain n-1 by saying the sample standard deviation will be 'a biased estimator' if you don't do it.
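Before unpacking the "why", it's worth noting that this is exactly the switch NumPy exposes through its `ddof` ("delta degrees of freedom") parameter: `ddof=0` divides by n, `ddof=1` applies Bessel's Correction. A small hand-rolled check:

```python
import numpy as np

sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = sample.size
x_bar = sample.mean()
sum_sq = ((sample - x_bar) ** 2).sum()

s_biased = np.sqrt(sum_sq / n)           # population-style formula
s_corrected = np.sqrt(sum_sq / (n - 1))  # Bessel's Correction

# NumPy's ddof parameter selects between the two denominators
assert np.isclose(s_biased, sample.std(ddof=0))
assert np.isclose(s_corrected, sample.std(ddof=1))
print(s_biased, s_corrected)  # 2.0 vs ~2.138: the corrected value is always larger
```

Since n − 1 < n, the corrected estimate is always the larger of the two for any sample with nonzero spread.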

What is Bias, and Why is it There?

The Wikipedia explanation can be found here.


It’s not helpful.


Really understanding n-1, like any other brief attempt to explain Bessel's Correction, requires holding a lot in your head at once. I'm not talking about a proof, either. I'm talking about truly understanding the differences between a sample and a population.

What is a sample?


A sample is always a subset of the population it's intended to represent (a subset can be the same size as the original set, in the unusual case of sampling an entire population without replacement). This alone is a massive leap. Once a sample is taken, there are presumed, hypothetical parameters and distributions built into that sample-representation.

The very word statistic refers to some piece of information about a sample (such as a mean, or median) which corresponds to some piece of analogous information about the population (again, such as mean, or median) called a parameter. The field of ‘Statistics’ is named as such, instead of ‘Parametrics’, to convey this attitude of inference from smaller to larger, and this leap, again, has many assumptions built into it. For example, if prior assumptions about a sample’s population are actually quantified, this leads to Bayesian statistics. If not, this leads to frequentism, both outside the scope of this post, but nevertheless important angles to consider in the context of Bessel’s correction. (in fact, in Bayesian inference Bessel’s Correction is not used, since prior probabilities about population parameters are intended to handle bias in a different way, upfront. Variance and standard deviation are calculated with plain old n).


But let’s not lose focus. Now that we’ve stated the important fundamental difference between a sample and a population, let’s consider the implications of sampling. I will be using the Normal distribution for the following examples for the sake of simplicity, as well as this Jupyter notebook which contains one-million simulated, Normally distributed data points for visualizing intuitions about samples. I highly recommend playing with it yourself, or simply using from sklearn.datasets import make_gaussian_quantiles to get a hands-on feel for what’s really going on with sampling.


Here is an image of one million randomly-generated, Normally distributed points. We will call it our population:


[Figure: one million randomly-generated, Normally distributed points]
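A population like this takes only a couple of lines to simulate; presumably the linked notebook does something similar (this is my own sketch, not the author's code):

```python
import numpy as np

rng = np.random.default_rng(42)
N = 1_000_000
population = rng.normal(loc=0.0, scale=1.0, size=N)

# With a million points, the empirical values sit right on the
# theoretical parameters: mean ~ 0.0, standard deviation ~ 1.0
print(population.mean(), population.std())
```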

To further simplify things, we will only be considering mean, variance, standard deviation, etc., based on the x-values. (That is, I could have used a mere number line for these visualizations, but having the y-axis more effectively displays the distribution across the x axis).


This is a population, so N = 1,000,000. It’s Normally distributed, so the mean is 0.0, and the standard deviation is 1.0.


I took two random samples, the first only 10 points and the second 100 points:


[Figure: 100-point sample in black, 10-point sample in orange; red lines mark one standard deviation from the mean]

Now, let’s take a look at these two samples, without and with Bessel’s Correction, along with their standard deviations (biased and unbiased, respectively). The first sample is only 10 points, and the second sample is 100 points.


[Figure: standard deviations of both samples, with and without the correction. The Correction Seems to Help!]
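The comparison in the figure can be reproduced along these lines (my own sketch; the exact numbers depend on the random seed):

```python
import numpy as np

rng = np.random.default_rng(7)
population = rng.normal(0.0, 1.0, size=1_000_000)  # true sigma = 1.0

results = {}
for n in (10, 100):
    sample = rng.choice(population, size=n, replace=False)
    biased = sample.std(ddof=0)     # divide by n
    corrected = sample.std(ddof=1)  # divide by n - 1
    results[n] = (biased, corrected)
    print(f"n={n}: biased={biased:.3f}, corrected={corrected:.3f}")
```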

Take a good long look at the above image. Bessel's Correction does seem to be helping. It makes sense: very often the sample standard deviation will be lower than the population standard deviation, especially if the sample is small, because unrepresentative points ('biased' points, i.e. farther from the mean) will have more of an impact on the calculation of variance. Because each difference is measured from the sample mean, and the sample mean is exactly the point that minimizes the sum of squared differences, that sum will be smaller than it would be if the true population mean were used. Furthermore, taking a square root is a concave function, and therefore introduces 'downward bias' in estimations.

Another way of thinking about it is this: the larger your sample, the more of an opportunity you have to run into more population-representative points, i.e. points that are close to the mean. Therefore, you have less of a chance of getting a sample mean which results in differences which are too small, leading to a too-small variance, and you’re left with an undershot standard deviation.


On average, samples of a Normally-distributed population will produce a variance which is biased downward by a factor of (n-1)/n. (Incidentally, the distribution of the sample variances themselves is described by a scaled chi-squared distribution with n-1 degrees of freedom.) Therefore, by dividing the sum of squared differences by n-1 instead of n, we make the denominator smaller, thereby making the result larger and leading to a so-called 'unbiased' estimate of the variance.
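That factor is easy to verify by simulation: averaged over many samples from a Normal population, the divide-by-n variance settles near (n-1)/n times the true variance, while the divide-by-(n-1) version settles near the true variance itself (a sketch under those assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 10, 200_000

samples = rng.normal(0.0, 1.0, size=(trials, n))    # true variance = 1.0
var_biased = samples.var(axis=1, ddof=0).mean()     # expect ~ (n-1)/n = 0.9
var_corrected = samples.var(axis=1, ddof=1).mean()  # expect ~ 1.0

print(var_biased, var_corrected)
```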

The key point to emphasize here is that Bessel's Correction, or dividing by n-1, doesn't always actually help! Because the potential sample variances are themselves random (chi-squared distributed, for a Normal population), you will unwittingly run into cases where n-1 will overshoot the real population standard deviation. It just so happens that n-1 is the best tool we have to correct for bias most of the time.

To prove this, check out the same Jupyter notebook where I’ve merely changed the random seed until I found some samples whose standard deviation was already close to the population standard deviation, and where n-1 added more bias:

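You can hunt for such a case yourself by varying the seed until a small sample happens to spread at least as wide as the true sigma (a hypothetical sketch; any seed producing an already-accurate sample will do):

```python
import numpy as np

true_sigma = 1.0
found = None
for seed in range(100):
    rng = np.random.default_rng(seed)
    sample = rng.normal(0.0, 1.0, size=10)
    biased, corrected = sample.std(ddof=0), sample.std(ddof=1)
    # If the uncorrected estimate already meets or exceeds the true sigma,
    # dividing by n - 1 pushes the estimate even further from the truth.
    if biased >= true_sigma:
        found = (seed, biased, corrected)
        print(f"seed={seed}: biased={biased:.3f}, corrected={corrected:.3f}")
        break
```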


In this case, Bessel’s Correction actually hurt us!


Thus, Bessel's Correction is not always a correction. It's called such because most of the time, when sampling, we don't know the population parameters. We don't know the real mean or variance or standard deviation. Thus, we are relying on the fact that, because we know the expected rate of bad luck (undershooting, or downward bias), we can counteract it by dividing by n-1 instead of n.

But what if you get lucky? Just like in the cells above, this can happen sometimes. Your sample can occasionally produce the correct standard deviation, or even overshoot it, in which case n-1 ironically adds bias.


Nevertheless, it’s the best tool we have for bias correction in a state of ignorance. The need for bias correction doesn’t exist from a God’s-eye point of view, where the parameters are known.


At the end of the day, this fundamentally comes down to understanding the crucial difference between a sample and a population, as well as why Bayesian Inference is such a different approach to classical problems, where guesses about the parameters are made upfront via prior probabilities, thus removing the need for Bessel’s Correction.


I’ll focus on Bayesian statistics in future posts. Thanks for reading!


Translated from: https://towardsdatascience.com/the-reasoning-behind-bessels-correction-n-1-eeea25ec9bc9
