Applied Spatial Statistics（二）统计推断：排列测试

本笔记本演示了如何执行假设检验和 p 值计算的排列检验。

相关系数
Moran’s I

相关性

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import *url = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/mpg.csv'df = pd.read_csv(url)
df = df.dropna()

让我们抽取一个非常小的样本，只有 10 个点。

small_sample_car = df.sample(10,random_state=222)
small_sample_car

	mpg	cylinders	displacement	horsepower	weight	acceleration	model_year	origin	name
165	20.0	8	262.0	110.0	3221	13.5	75	usa	chevrolet monza 2+2
380	36.0	4	120.0	88.0	2160	14.5	82	japan	nissan stanza xe
16	18.0	6	199.0	97.0	2774	15.5	70	usa	amc hornet
261	18.1	6	258.0	120.0	3410	15.1	78	usa	amc concord d/l
19	26.0	4	97.0	46.0	1835	20.5	70	europe	volkswagen 1131 deluxe sedan
99	18.0	6	232.0	100.0	2945	16.0	73	usa	amc hornet
206	26.5	4	140.0	72.0	2565	13.6	76	usa	ford pinto
86	14.0	8	304.0	150.0	3672	11.5	73	usa	amc matador
267	27.5	4	134.0	95.0	2560	14.2	78	japan	toyota corona
232	16.0	8	351.0	149.0	4335	14.5	77	usa	ford thunderbird

plt.scatter(small_sample_car.horsepower, small_sample_car.acceleration)
plt.xlabel("HP")
plt.ylabel("Acceleration")

Text(0, 0.5, 'Acceleration')

在这里插入图片描述

基于此小样本的 Pearson 相关性，请注意此处的分析 p 值为 ~0.03

pearson_rslt = pearsonr(small_sample_car.horsepower, small_sample_car.acceleration)print("Pearson:",pearson_rslt)

Pearson: PearsonRResult(statistic=-0.6832264558183371, pvalue=0.02941732498278316)

np.random.permutation(small_sample_car.acceleration)

array([20.5, 13.6, 11.5, 15.5, 14.5, 13.5, 14.5, 15.1, 16. , 14.2])

定义相关性的排列测试函数：

def permutation_cor(sample_car):null_cor_list = []for i in range(10000):#permutate one variable, while hold the other constant.subsample_Y = np.random.permutation(sample_car.acceleration)#compute correlationsample_cor = pearsonr(sample_car.horsepower, subsample_Y)[0]null_cor_list.append(sample_cor)return null_cor_list

这给了我们零假设下的相关系数。

null_dist = permutation_cor(small_sample_car)

plt.hist(null_dist)plt.vlines(pearson_rslt[0],ymin=0,ymax=2000,color="red")

<matplotlib.collections.LineCollection at 0x15ff5d6d0>

在这里插入图片描述

让我们计算一下 unll 下的相关性（在幅度上）比样本相关性大多少倍。

这为我们提供了一个模拟 p 值。

np.sum(np.abs(np.array(null_dist)) >= np.abs(pearson_rslt[0]))

import numpy as np
p_sim = np.sum(np.abs(np.array(null_dist)) >= np.abs(pearson_rslt[0]))/10000

print("The simulated p-value (2-tail): ", p_sim)print("The actual p-value (2-tail) : ", pearson_rslt[1])

The simulated p-value (2-tail):  0.0323
The actual p-value (2-tail) :  0.02941732498278316

Moran’s I

让我们看看伊利诺伊州 ALAND 的空间自相关。

import geopandas as gpd

IL = gpd.read_file("https://raw.githubusercontent.com/Ziqi-Li/GIS5122/main/data/IL_acs/IL_acs.shp")

IL.columns

Index(['state', 'county', 'NAME_x', 'total_pop', 'pct_65_ove', 'pct_black','pct_hispan', 'males', 'females', 'pct_bach', 'GEOID_x', 'STATEFP','COUNTYFP', 'COUNTYNS', 'AFFGEOID', 'GEOID_y', 'NAME_y', 'LSAD','ALAND', 'AWATER', 'GEOID_int', 'geometry'],dtype='object')

IL.plot(column="ALAND",legend=True)

<Axes: >

在这里插入图片描述

使用 esda 包计算 Moran’s I 值。

from libpysal.weights.contiguity import Queenvalue = IL['ALAND'].values #Get the data we are interested.w = Queen.from_dataframe(IL) #Generate W matrix using Queen contiguityw.transform = 'r' #Row-standardization of the weight matrix

/var/folders/mp/9px298sd6vs8xccb_3sql0dr0000gp/T/ipykernel_45186/246861888.py:5: FutureWarning: `use_index` defaults to False but will default to True in future. Set True/False directly to control this behavior and silence this warningw = Queen.from_dataframe(IL) #Generate W matrix using Queen contiguity

from esda.moran import Moran
from splot.esda import plot_moranil_moran = Moran(value, w, permutations=1000, two_tailed=True)print("Moran's I:", il_moran.I)print("analytical p-value:",il_moran.p_norm)print("Simulated p-value:",il_moran.p_sim)

Moran's I: 0.16857257061968103
analytical p-value: 0.0035814725750113125
Simulated p-value: 0.004995004995004995

plot_moran(il_moran, zstandard=True, figsize=(10,4))
plt.show()

/Users/ziqili/anaconda3/lib/python3.11/site-packages/splot/_viz_esda_mpl.py:354: FutureWarning: `shade` is now deprecated in favor of `fill`; setting `fill=True`.
This will become an error in seaborn v0.14.0; please update your code.sbn.kdeplot(moran.sim, shade=shade, color=color, ax=ax, **kwargs)

在这里插入图片描述

#Define a permutation test function for Moran's I:
def permutation_moran(geodataframe):w = Queen.from_dataframe(IL) null_moran_list = []for i in range(1000):value = np.random.permutation(geodataframe['ALAND'].values) #Get the data we are interested.#compute moran's Imoran = Moran(value, w)null_moran_list.append(moran.I)return null_moran_list

我们为每个县随机分配一个值，并计算每个地图的 Moran’s I。

import numpy as np
IL.plot(column=np.random.permutation(IL['ALAND'].values),legend=True)

<Axes: >

在这里插入图片描述

原假设下的 Moran’s I

null_moran = permutation_moran(IL)

/var/folders/mp/9px298sd6vs8xccb_3sql0dr0000gp/T/ipykernel_45186/1735371652.py:4: FutureWarning: `use_index` defaults to False but will default to True in future. Set True/False directly to control this behavior and silence this warningw = Queen.from_dataframe(IL)

plt.hist(null_moran)
plt.vlines(il_moran.I,ymin=0,ymax=300,color="red")

<matplotlib.collections.LineCollection at 0x2a57a3f10>

在这里插入图片描述

p_sim = np.sum(np.abs(np.array(null_moran)) >= np.abs(il_moran.I))/10000

print("Analytical p-value:",il_moran.p_norm)
print("Simulated p-value from esda's permutation:",il_moran.p_sim)print("Simulated p-value from permutation test:",p_sim)

Analytical p-value: 0.0035814725750113125
Simulated p-value from esda's permutation: 0.004995004995004995
Simulated p-value from permutation test: 0.0005

转存中…(img-XgiyYpYB-1713438830006)]

p_sim = np.sum(np.abs(np.array(null_moran)) >= np.abs(il_moran.I))/10000

print("Analytical p-value:",il_moran.p_norm)
print("Simulated p-value from esda's permutation:",il_moran.p_sim)print("Simulated p-value from permutation test:",p_sim)

Analytical p-value: 0.0035814725750113125
Simulated p-value from esda's permutation: 0.004995004995004995
Simulated p-value from permutation test: 0.0005

本文来自互联网用户投稿，该文观点仅代表作者本人，不代表本站立场。本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如若转载，请注明出处：http://www.mzph.cn/bicheng/118.shtml

如若内容造成侵权/违法违规/事实不符，请联系多彩编程网进行投诉反馈email:809451989@qq.com，一经查实，立即删除！