Causal Inference, Part VI

Causal Inference

This is the sixth post in the series in which we work our way through “Causal Inference in Statistics”, a nice primer co-authored by Judea Pearl himself.

You can find the previous post here and all the relevant Python code in the companion GitHub repository:

While I will do my best to introduce the content in a clear and accessible way, I highly recommend that you get the book yourself and follow along. So, without further ado, let’s get started!

In the previous post (Part V), we started on Chapter II of the book, where Pearl begins to build up the machinery of Causal Inference. In this post we look at colliders, one of the most powerful ideas in the analysis of graphical models.

2.3 Colliders

The third, and perhaps the most important, graph motif that we will cover is known as the Collider and is illustrated in this figure:

Fig. 2.3: A simple collider

Collider nodes are nodes that receive inputs from 2 or more other variables. Applying the rules we are already familiar with, we can immediately conclude:

  1. X and Z are dependent: P(X|Z) ≠ P(X)

  2. Y and Z are dependent: P(Y|Z) ≠ P(Y)

  3. X and Y are independent: P(X|Y) = P(X)

  4. X and Y are dependent conditional on Z: P(X|Y, Z) ≠ P(X|Z)

Points 1 and 2 follow directly from Rule 0: any two variables with a directed edge between them are dependent. Point 3 follows from the fact that there is no directed path between X and Y (neither is an ancestor or a descendant of the other) and that they share no common cause.
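
The four statements can be checked by brute force on a small example. The sketch below is not from the book: it assumes a hypothetical collider in which X and Y are independent fair coins and Z = X OR Y, and the helper `prob` is an illustrative name. It enumerates the joint distribution exactly and compares the conditional probabilities.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy collider X -> Z <- Y: X and Y are independent fair coins
# and Z = X OR Y (an assumption made only for this illustration).
joint = {(x, y, x | y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}


def prob(event, given=lambda x, y, z: True):
    """Return P(event | given) by summing over the joint distribution."""
    num = sum(p for o, p in joint.items() if given(*o) and event(*o))
    den = sum(p for o, p in joint.items() if given(*o))
    return num / den


x_is_1 = lambda x, y, z: x == 1
y_is_1 = lambda x, y, z: y == 1
z_is_1 = lambda x, y, z: z == 1

p_x = prob(x_is_1)                       # P(X=1)
p_y = prob(y_is_1)                       # P(Y=1)
p_x_given_z = prob(x_is_1, z_is_1)       # point 1: P(X=1 | Z=1)
p_y_given_z = prob(y_is_1, z_is_1)       # point 2: P(Y=1 | Z=1)
p_x_given_y = prob(x_is_1, y_is_1)       # point 3: P(X=1 | Y=1)
p_x_given_yz = prob(x_is_1, lambda x, y, z: y == 1 and z == 1)  # point 4

print("P(X=1)            =", p_x)           # 1/2
print("P(X=1 | Z=1)      =", p_x_given_z)   # 2/3, differs from P(X=1)
print("P(Y=1 | Z=1)      =", p_y_given_z)   # 2/3, differs from P(Y=1)
print("P(X=1 | Y=1)      =", p_x_given_y)   # 1/2, same as P(X=1)
print("P(X=1 | Y=1, Z=1) =", p_x_given_yz)  # 1/2, differs from P(X=1 | Z=1)
```

In this toy model, learning that Y = 1 already explains why Z = 1, so X falls back to its prior probability: the two causes become dependent only once the collider is observed.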

Point 4 is the most interesting case, but it can be understood with a simple algebraic example. Consider the mathematical expression:

Z = X + Y

This relationship determines the value of Z and is valid for any possible values of X and Y. But as soon as I fix the value of Z, say Z = 10, the possible values of X and Y are immediately constrained such that Y = 10 - X (or, graphically, they lie on the intersection of the two planes Z = X + Y and Z = 10):

The effect of conditioning on a collider node, Z

This is a simple illustration of the most fundamental definition of conditioning: filtering by the value of the conditioning variable.
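
To make the filtering picture concrete, here is a minimal sketch under an assumption that is not in the book: X and Y are independent and each takes the integer values 0 to 10 with equal probability. The names `pairs`, `filtered`, and `covariance` are illustrative. Before filtering, the covariance of X and Y is zero; after keeping only the pairs with Z = X + Y = 10, every remaining point satisfies Y = 10 - X and the covariance is strongly negative.

```python
from itertools import product

# Assumed toy setup: X and Y independent, each uniform on {0, 1, ..., 10}.
pairs = list(product(range(11), repeat=2))  # 121 equally likely (x, y) pairs


def covariance(points):
    """Population covariance of equally weighted (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in points) / n


# Unconditionally, X and Y carry no information about each other.
cov_all = covariance(pairs)

# Conditioning on the collider: keep only the outcomes where Z = X + Y = 10.
filtered = [(x, y) for x, y in pairs if x + y == 10]
cov_z10 = covariance(filtered)

print("covariance without conditioning:", cov_all)   # 0.0
print("points left after fixing Z = 10:", len(filtered))  # 11
print("covariance given Z = 10:", cov_z10)           # -10.0
```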

The book further illustrates this idea using the example of the well-known Monty Hall problem. For this game, our probability table would be:

[Image: joint probability table for Choice, Car, and Monty]

The 0.0555 entries (that is, 1/18 = 1/3 × 1/3 × 1/2) correspond to the fact that Monty has a 50/50 chance of opening either goat door if I happen to choose the door where the car is.

From this table, it’s easy to see that P(Choice | Car) = P(Choice), or, in other words:

[Image: table of P(Choice | Car)]

Essentially, I’m choosing one of the three doors at random. Now, let’s take a look at P(Choice | Car, Monty):

[Image: table of P(Choice | Car, Monty)]

It is now clear that, depending on the door that Monty has opened, P(Choice | Car, Monty) differs from P(Choice | Car).
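
The probabilities discussed above can be rebuilt by enumeration. The following is a sketch rather than the code from the companion repository: it assumes doors labelled 1 to 3, a uniformly random choice and car position, and a Monty who opens a goat door that is not my choice, picking at random when he has two options. The names `joint` and `prob` are illustrative.

```python
from fractions import Fraction
from itertools import product

doors = (1, 2, 3)

# Joint distribution over (choice, car, monty), built from the game rules.
joint = {}
for choice, car in product(doors, repeat=2):
    # Monty may open any door that is neither my choice nor the car.
    options = [d for d in doors if d != choice and d != car]
    for monty in options:
        joint[(choice, car, monty)] = Fraction(1, 9) / len(options)


def prob(event, given=lambda choice, car, monty: True):
    """Return P(event | given) by summing over the joint distribution."""
    num = sum(p for o, p in joint.items() if given(*o) and event(*o))
    den = sum(p for o, p in joint.items() if given(*o))
    return num / den


# P(Choice = c | Car = 1): the car's position says nothing about my choice.
choice_given_car = {
    c: prob(lambda ch, car, m, c=c: ch == c, lambda ch, car, m: car == 1)
    for c in doors
}

# P(Choice = c | Car = 1, Monty = 2): adding Monty's door changes the picture.
choice_given_car_monty = {
    c: prob(lambda ch, car, m, c=c: ch == c,
            lambda ch, car, m: car == 1 and m == 2)
    for c in doors
}

print(joint[(1, 1, 2)], joint[(1, 2, 3)])  # 1/18 and 1/9
print(choice_given_car)         # every door has probability 1/3
print(choice_given_car_monty)   # door 1: 1/3, door 2: 0, door 3: 2/3
```

Monty's door is a collider between Choice and Car, which is why conditioning on it makes the two otherwise independent variables informative about each other.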

This is also a clear example of a non-causal dependency between two variables, illustrating the point that correlation does not imply causation. Here the relationship between the two variables (Car and Choice) arises only because we limited our space of possibilities by adding the extra information about Monty’s choice, an effect that is already familiar to us from our discussion of Bayes’ Theorem.

From these examples, it is easy to extract a new general rule:

Rule 3 (Conditional Independence in Colliders): If a variable Z is the collision node between two variables X and Y, and there is only one path between X and Y, then X and Y are unconditionally independent but are dependent conditional on Z or on any descendant of Z.
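
The part of the rule about descendants can be illustrated with a small enumeration. This is a sketch under assumptions that are not in the book: X and Y are independent fair coins, Z = X OR Y, and W is a hypothetical noisy child of Z that copies Z with probability 9/10 and flips it otherwise. Conditioning on W alone, without ever observing Z, is enough to make X and Y dependent.

```python
from fractions import Fraction
from itertools import product

# Assumed model: X -> Z <- Y with Z = X OR Y, plus a noisy descendant Z -> W.
copy_prob = Fraction(9, 10)  # hypothetical chance that W equals Z

joint = {}
for x, y, w in product((0, 1), repeat=3):
    z = x | y
    p_w = copy_prob if w == z else 1 - copy_prob
    joint[(x, y, z, w)] = Fraction(1, 4) * p_w


def prob(event, given=lambda x, y, z, w: True):
    """Return P(event | given) by summing over the joint distribution."""
    num = sum(p for o, p in joint.items() if given(*o) and event(*o))
    den = sum(p for o, p in joint.items() if given(*o))
    return num / den


x_is_1 = lambda x, y, z, w: x == 1

p_x = prob(x_is_1)
p_x_given_y = prob(x_is_1, lambda x, y, z, w: y == 1)
p_x_given_w = prob(x_is_1, lambda x, y, z, w: w == 1)
p_x_given_yw = prob(x_is_1, lambda x, y, z, w: y == 1 and w == 1)

print("P(X=1)            =", p_x)            # 1/2
print("P(X=1 | Y=1)      =", p_x_given_y)    # 1/2: unconditionally independent
print("P(X=1 | W=1)      =", p_x_given_w)    # 9/14
print("P(X=1 | Y=1, W=1) =", p_x_given_yw)   # 1/2: dependent given W
```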

Congratulations on following along with yet another blog post in this series. I sincerely hope that you continue to enjoy reading them as much as I enjoy writing them.

Just a quick reminder that you can find the code for all the examples above in our GitHub repository:

And if you would like to be notified when the next post comes out, you can subscribe to The Sunday Briefing newsletter:

Translated from: https://medium.com/data-for-science/causal-inference-part-vi-colliders-af07301c9a15
