Why Is Gaussian the King of All Distributions?

Gaussian distribution and its key characteristics:

  • The Gaussian distribution is a continuous probability distribution that is symmetric about its mean.
  • Its mean, median, and mode are equal.
  • Its density is bell-shaped: most of the data points cluster around the mean, and the tails approach zero asymptotically.

[Figure: bell-shaped curve of the Gaussian distribution]

Interpretation:

  • ~68% of the values drawn from a normal distribution lie within 1𝜎 of the mean
  • ~95% of the values drawn from a normal distribution lie within 2𝜎 of the mean
  • ~99.7% of the values drawn from a normal distribution lie within 3𝜎 of the mean
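These percentages can be checked directly from the Gaussian cumulative distribution function. The following is a minimal sketch using only Python's standard library; the helper name `prob_within` is illustrative and not part of the original post.

```python
import math

def prob_within(k: float) -> float:
    """P(|X - mu| <= k * sigma) for a Gaussian X, via the error function."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```

The result does not depend on 𝜇 or 𝜎, because the interval is expressed in units of 𝜎 around the mean.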

Where do we find the Gaussian distribution?

ML practitioner or not, almost all of us have come across this most popular of distributions at some point. Many of the quantities we see around us are commonly modeled as approximately Gaussian, e.g. age, height, IQ, memory, etc.

On a lighter note, there is one well-known example of a Gaussian lurking around all of us: the 'bell curve' at appraisal time 😊

Yes, the Gaussian distribution is often referred to as the bell curve, and its probability density function is given by the following formula:

f(x) = 1 / (𝜎√(2π)) · exp(−(x − 𝜇)² / (2𝜎²))
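As a rough sketch, the density can be written as a small Python function and sanity-checked numerically. The function name `gaussian_pdf` and the chosen values of `mu` and `sigma` are assumptions made for illustration only.

```python
import math

def gaussian_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) evaluated at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Riemann-sum check that the density integrates to ~1 over mu +/- 8 sigma.
mu, sigma = 2.0, 1.5
n_steps = 24_000
step = 16 * sigma / n_steps
area = step * sum(
    gaussian_pdf(mu - 8 * sigma + i * step, mu, sigma) for i in range(n_steps)
)
print(round(area, 6))

# The peak sits at the mean, where the density equals 1 / (sigma * sqrt(2 * pi)).
print(gaussian_pdf(mu, mu, sigma))
```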

Notation:

A random variable X that is normally distributed with mean 𝜇 and variance 𝜎² is denoted as:

X ~ N(𝜇, 𝜎²)

What is so special about the Gaussian distribution? Why do we find Gaussians almost everywhere?

Whenever we need to model a real-valued random variable whose distribution is not known, the Gaussian is the usual default assumption.

This is largely owed to the Central Limit Theorem (CLT), which concerns the sum of many random variables.

As per the CLT: the normalized sum of independent, identically distributed random variables with finite variance converges to a Gaussian distribution as the number of terms in the sum increases, regardless of the distribution the variables originally come from.

An important point to note is that the CLT is a statement about a limit; it does not become exactly valid at any particular sample size. A common rule of thumb is that about 30 observations are enough for the sampling distribution of the mean to be treated as approximately Gaussian, although strongly skewed or heavy-tailed data may need more.

Therefore, a physical quantity that is the sum of many independent processes is often assumed to be Gaussian. For example, in a typical machine learning setting there are multiple possible sources of error: data entry error, data measurement error, classification error, etc. The cumulative effect of all such errors is likely to be approximately normally distributed.

Let's check this using Python:

Steps:

  • Draw n samples from an exponential distribution
  • Normalize the sum of the n samples (subtract its mean and divide by its standard deviation)
  • Repeat the above steps N times
  • Store each normalized sum in sum_list
  • Finally, plot the histogram of sum_list
  • The output closely follows a Gaussian distribution:

[Figure: The normalized sum of 30 samples drawn from an exponential distribution follows a Gaussian distribution]
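The original code for this experiment is not reproduced here, so the following is only a sketch of the steps above under stated assumptions: it uses the standard library instead of NumPy/matplotlib, an exponential rate of 1, n = 30, N = 10,000, and a coarse text histogram in place of a plot. The function name `normalized_sums` is hypothetical; `sum_list` follows the name used in the steps.

```python
import math
import random
import statistics

def normalized_sums(n=30, N=10_000, rate=1.0, seed=0):
    """Return N normalized sums of n exponential draws each."""
    rng = random.Random(seed)
    mu = 1.0 / rate      # mean of Exp(rate)
    sigma = 1.0 / rate   # standard deviation of Exp(rate)
    sum_list = []
    for _ in range(N):
        total = sum(rng.expovariate(rate) for _ in range(n))
        # Normalize: subtract the mean of the sum, divide by its std deviation.
        sum_list.append((total - n * mu) / (sigma * math.sqrt(n)))
    return sum_list

sum_list = normalized_sums()
print("mean :", statistics.mean(sum_list))   # close to 0
print("stdev:", statistics.stdev(sum_list))  # close to 1

# Fraction within one standard deviation; roughly 0.68 for a Gaussian.
within_1 = sum(abs(z) <= 1 for z in sum_list) / len(sum_list)
print("within 1 sigma:", within_1)

# Coarse text histogram over [-3, 3) with half-unit bins.
bins = [0] * 12
for z in sum_list:
    if -3 <= z < 3:
        bins[min(int((z + 3) * 2), 11)] += 1
for i, count in enumerate(bins):
    print(f"{-3 + i / 2:+.1f} {'#' * (count // 50)}")
```

With n = 30 the histogram is already close to bell-shaped, though a slight right skew inherited from the exponential distribution is still visible; increasing n reduces it.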

Similarly, several other distributions, such as the Student t distribution, the chi-squared distribution, and the F distribution, are closely tied to the Gaussian. For example, the t distribution can be obtained as an infinite mixture of Gaussians that share a mean but differ in variance, which gives it heavier tails than a Gaussian.

Properties of the Gaussian distribution:

1) Affine transformation:

This is the simple transformation of multiplying the random variable by a scalar 'a' and adding another scalar 'b' to it.

The resulting distribution is again Gaussian, with mean a𝜇 + b and variance a²𝜎²:

If X ~ N(𝜇, 𝜎²), then for any a, b ∈ ℝ (for a = 0 the result degenerates to the constant b),

aX + b ~ N(a𝜇 + b, a²𝜎²)

Note that not every transformation yields a Gaussian; for example, the square of a Gaussian random variable is not Gaussian.
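The affine property can be checked by simulation. This is a minimal sketch, assuming illustrative values for `mu`, `sigma`, `a`, and `b` that do not come from the original post.

```python
import random
import statistics

rng = random.Random(42)
mu, sigma = 5.0, 2.0
a, b = -3.0, 1.0

x = [rng.gauss(mu, sigma) for _ in range(50_000)]
y = [a * xi + b for xi in x]

# Sample moments of aX + b versus the predicted a*mu + b and a^2 * sigma^2.
print(statistics.mean(y), a * mu + b)
print(statistics.variance(y), a ** 2 * sigma ** 2)

# Squaring is not affine: the result is right-skewed (mean well above the
# median), so it cannot be Gaussian, whose mean and median coincide.
z = [rng.gauss(0.0, 1.0) ** 2 for _ in range(50_000)]
print(statistics.mean(z), statistics.median(z))
```

The square of a standard Gaussian follows a chi-squared distribution with one degree of freedom, which is why its mean (about 1) sits far from its median (about 0.45).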

2) Standardization:

Suppose we have two observations, each drawn from a normal distribution with a different mean and sigma. How do we compare them in terms of where each sits within its own population?

To do so, we convert each observation into a Z score. This process is called standardization: it adjusts the raw observation by the mean and sigma of the population it was generated from, bringing it onto a common scale:

Z = (X − 𝜇) / 𝜎
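As an illustration, the sketch below standardizes two hypothetical exam scores from differently scaled tests; all numbers are made up for the example. `statistics.NormalDist` (Python 3.8+) supplies the standard normal CDF.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize x with respect to a population N(mu, sigma^2)."""
    return (x - mu) / sigma

# Hypothetical scores: 85 on a test with mean 70 and sigma 10,
# and 620 on a test with mean 500 and sigma 100.
z_a = z_score(85, mu=70, sigma=10)
z_b = z_score(620, mu=500, sigma=100)

std_normal = NormalDist()  # N(0, 1)
print(z_a, std_normal.cdf(z_a))  # z = 1.5, about the 93rd percentile
print(z_b, std_normal.cdf(z_b))  # z = 1.2, about the 88th percentile
```

On the common Z scale, the first score ranks higher within its own population even though the raw numbers are not directly comparable.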

3) Conditional distribution: An important property of the multivariate Gaussian is that if two sets of variables are jointly Gaussian, then the conditional distribution of one set given the other is again Gaussian.

4) Marginal distribution: The marginal distribution of either set is also Gaussian.

5) Gaussian distributions are self-conjugate: given a Gaussian likelihood function, choosing a Gaussian prior on its mean results in a Gaussian posterior.

6) The sum and the difference of two independent Gaussian random variables are each Gaussian.
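Property 6 can be illustrated by simulation: for independent X ~ N(𝜇₁, 𝜎₁²) and Y ~ N(𝜇₂, 𝜎₂²), the sum has mean 𝜇₁ + 𝜇₂, the difference has mean 𝜇₁ − 𝜇₂, and both have variance 𝜎₁² + 𝜎₂². The sketch below only checks these moments, with parameter values chosen arbitrarily for the example.

```python
import random
import statistics

rng = random.Random(7)
n = 50_000
x = [rng.gauss(1.0, 2.0) for _ in range(n)]   # X ~ N(1, 4)
y = [rng.gauss(-3.0, 1.5) for _ in range(n)]  # Y ~ N(-3, 2.25), independent of X

s = [xi + yi for xi, yi in zip(x, y)]  # X + Y, expected N(-2, 6.25)
d = [xi - yi for xi, yi in zip(x, y)]  # X - Y, expected N(4, 6.25)

print(statistics.mean(s), statistics.variance(s))
print(statistics.mean(d), statistics.variance(d))
```

Note that the variances add for both the sum and the difference; subtracting Y does not subtract its variance.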

Limitations of Gaussian distributions:

1) A single Gaussian distribution fails to capture structure such as the following:

[Figure: Mixture of Gaussians, from Pattern Recognition and Machine Learning by Christopher Bishop]

Such structure is better characterized by a linear combination of two Gaussians (also known as a mixture of Gaussians). However, estimating the parameters of such a mixture is more complex.

2) The Gaussian distribution is unimodal, so it cannot provide a good approximation to multimodal distributions, which restricts the range of distributions it can represent adequately.

3) The number of free parameters grows quadratically with the number of dimensions: a D-dimensional Gaussian with a full covariance matrix has D(D + 3)/2 of them. For large D, storing and inverting such a covariance matrix becomes computationally expensive.
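To make the quadratic growth concrete, the following sketch counts the free parameters of a D-dimensional Gaussian with a full covariance matrix: D for the mean plus D(D + 1)/2 for the symmetric covariance. The function name is illustrative.

```python
def gaussian_free_parameters(d: int) -> int:
    """Free parameters of a d-dimensional Gaussian with full covariance."""
    mean_params = d
    cov_params = d * (d + 1) // 2  # symmetric matrix: upper triangle only
    return mean_params + cov_params

for d in (1, 2, 10, 100, 1000):
    print(d, gaussian_free_parameters(d))
# 1 2
# 2 5
# 10 65
# 100 5150
# 1000 501500
```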

I hope this post gives you a sneak peek into the world of Gaussian distributions.

Happy reading!

Translated from: https://towardsdatascience.com/why-is-gaussian-the-king-of-all-distributions-c45e0fe8a6e5
