Granger and Granger Causality
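Clive Granger was born in September 1934 in Swansea, Wales. He studied at the University of Nottingham, where he enrolled in what was then Britain's first joint degree in economics and mathematics, and he stayed on to teach in 1955. In 1957 he published his first paper, "A Statistical Model for Sunspot Activity", in an astronomy journal. He received his PhD in statistics from Nottingham in 1959. In the early 1960s he won a Harkness Fellowship, which supports British scholars pursuing further study in the United States, and went to Princeton as a visiting scholar, working under John Tukey and Oskar Morgenstern. He moved to the United States in 1974 to become a professor in the Department of Economics at the University of California, San Diego, where he started the department's econometrics research program and built it into one of the best centers for econometrics in the world. He later became professor emeritus there. Granger was elected a fellow of the International Institute of Forecasters in 1991 and received honorary doctorates from the Stockholm School of Economics and Universidad Carlos III de Madrid. He also served as president of the Western Economic Association and was named a Distinguished Fellow of the American Economic Association, an honor given to only two people each year. His research interests were mainly statistics and econometrics (especially time series analysis), forecasting, finance, demography, and methodology.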
Professor Clive W. J. Granger of the University of California, San Diego, won the 2003 Nobel Prize in Economics for his outstanding contribution to time series analysis through the theory of cointegration.
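The Granger causality test, pioneered by Granger himself, is the standard tool economists use to analyze Granger-causal relationships between economic variables. Granger defined the concept in terms of "the variance of the optimal least-squares prediction that uses all the information available at certain past points in time". Roughly speaking, x Granger-causes y if adding the past of x to the information set reduces that prediction variance for y.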
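To make the prediction-variance idea concrete, here is a minimal MATLAB sketch that compares a restricted regression (y on its own lag) with an unrestricted one (y on its own lag and the lag of x). The simulated process, the coefficients 0.5, 0.3 and 0.8, and all variable names are assumptions made for illustration. It uses a classical F statistic, not the HAC-based Wald test with bootstrap used by the program later in this post.

```matlab
% Bivariate one-step Granger causality check by nested OLS (illustration only).
T = 500;
x = zeros(T, 1);
y = zeros(T, 1);
e = randn(T, 2);
for t = 2:T
    x(t) = 0.5 * x(t-1) + e(t, 1);
    y(t) = 0.3 * y(t-1) + 0.8 * x(t-1) + e(t, 2);  % past x helps predict y
end

Y  = y(2:T);                      % left-hand side
Zr = [ones(T-1, 1), y(1:T-1)];    % restricted: constant and own lag
Zu = [Zr, x(1:T-1)];              % unrestricted: add the lag of x

res_r = Y - Zr * (Zr \ Y);        % OLS residuals, restricted model
res_u = Y - Zu * (Zu \ Y);        % OLS residuals, unrestricted model
rss_r = sum(res_r .^ 2);
rss_u = sum(res_u .^ 2);

q = 1;                            % number of restrictions (one lag of x)
n = T - 1;                        % usable observations
k = size(Zu, 2);                  % regressors in the unrestricted model
F = ((rss_r - rss_u) / q) / (rss_u / (n - k));
fprintf('RSS restricted = %.2f, unrestricted = %.2f, F = %.2f\n', rss_r, rss_u, F);
```

A large F means the past of x reduces the prediction error of y by more than chance would explain, which is evidence against the null "x does not Granger-cause y".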
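Professor Granger is regarded as one of the greatest econometricians in the world. The Royal Swedish Academy of Sciences once said, "He is not only a shining example for researchers, but also a model for financial analysts." Countless economics students work their whole lives just to catch a glimpse of him from afar. His empirical research on analyzing time series data with mathematical models opened a door for the whole world onto how the economy works, and especially onto how financial markets work. Because of it, we can sort through and analyze the vast sea of stock-market and foreign-exchange data and try to forecast where things are heading.
Ever since his visit to Princeton in the early 1960s, Granger was a highly influential scholar of time series econometrics. His papers cover nearly every major advance in the field over the past 40 years, and without his methods, empirical work in time series econometrics would be almost impossible. As for his academic influence, admirers sum it up by calling him a gifted researcher and a gifted writer.
Granger's academic writing has two striking features: the ideas are closely tied to practical problems, and the writing is highly readable, so much of it has become classic, frequently cited material. Beyond his own talent, these strengths may also have something to do with his schooling. He attended two grammar schools during his secondary education. He also enjoyed pure mathematical reasoning, and when he first studied economics he felt sorry for the economists of the day who could only describe things in words.
The Nobel committee held that Granger's work changed the way economists handle time series data and that it is of great importance for studying the relationships between wealth and consumption, exchange rates and prices, and short-term and long-term interest rates. The US Federal Reserve and the central banks of many countries now use these methods for assessment and forecasting.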
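Econometric Methods for Mixed-Frequency Data
Here is a brief review of econometric models for mixed-frequency data.
A typical econometric regression works with variables sampled at the same frequency. To keep frequencies aligned, researchers either aggregate the high-frequency observations down to the lowest frequency or interpolate the low-frequency data up to the highest frequency. In empirical work the former is the most common: high-frequency data are reduced to the lowest frequency by averaging or by taking a representative value (for example, the last month of each quarter). This "pre-filtering", which forces the left-hand and right-hand variables of the forecasting equation onto the same frequency, has a potential problem: it can destroy a great deal of useful information in the high-frequency data. Modeling the mixed-frequency data directly is therefore well worth doing.
Let me put it more plainly. Suppose you originally built a quarterly model, for example one that uses quarterly GDP growth, and suppose you measure inflation with CPI, which is available monthly. The traditional approach is to convert monthly CPI into quarterly data with some algorithm (some fiddling). Converting the frequency inevitably loses a lot of information, and it is also controversial: ask ten people to turn a monthly CPI series into a quarterly one and you may get fifteen different results. With mixed-frequency modeling techniques you can skip this step and model monthly CPI and quarterly GDP together directly.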
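To see why the conversion step is contentious, here is a small MATLAB sketch that turns one year of monthly values into quarterly values in two common ways: the quarterly average and the last month of each quarter. The numbers are made up for illustration and are not real CPI data.

```matlab
% Hypothetical monthly index values for one year (not real data).
cpi_monthly = [100.0 100.4 100.9 101.1 101.0 101.6 ...
               102.0 102.3 102.1 102.8 103.0 103.5];
m = 3;                               % months per quarter

M = reshape(cpi_monthly, m, []);     % each column holds one quarter
q_avg  = mean(M, 1);                 % convention 1: quarterly average
q_last = M(end, :);                  % convention 2: last month of the quarter

disp([q_avg; q_last]);               % the two "quarterly CPI" series differ
```

Both rows are defensible quarterly series, yet they are not the same numbers, and either one discards the within-quarter movement that a mixed-frequency model keeps.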
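Applied econometric research on mixed-frequency data is broad. For example:
1 Bridge equations
2 Mixed data sampling (MIDAS) methods, including MIDAS weight functions, the AR-MIDAS model, the CoMIDAS model, and their extensions
3 Mixed-frequency vector autoregression (MF-VAR) models
4 Mixed-frequency factor models, including small-scale mixed-frequency factor models, large-scale mixed-frequency factor models, and mixed-frequency state space models
5 Factor-MIDAS models, including the smoothed Factor-MIDAS model and the unrestricted Factor-MIDAS model
6 Real-time forecasting with ragged-edge data
7 NowCasting, and so on
8 And one more that just came to mind: mixed-frequency GARCH models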
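As a small illustration of item 2, the sketch below computes the exponential Almon lag weights, one commonly used MIDAS weight function, and uses them to collapse several high-frequency lags into a single regressor. The lag count K, the parameter values in theta, and the random placeholder data are assumptions chosen for illustration. In an actual MIDAS regression theta is estimated jointly with the regression coefficients.

```matlab
% Exponential Almon lag weights: w_k = exp(th1*k + th2*k^2) / sum_j exp(th1*j + th2*j^2)
K = 12;                      % number of high-frequency lags (assumed)
theta = [0.05, -0.02];       % hypothetical weight parameters
k = (1:K)';

w = exp(theta(1) * k + theta(2) * k .^ 2);
w = w / sum(w);              % normalize so the weights sum to one

x_hf  = randn(K, 1);         % placeholder for the K most recent high-frequency lags
x_agg = w' * x_hf;           % weighted aggregate used as one low-frequency regressor

disp(w');
```

With only two parameters the weights can decline slowly or quickly across lags, which is how MIDAS avoids estimating a separate coefficient for every high-frequency lag.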
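A Brief Mathematical Formulation of the Mixed-Frequency Granger Causality Test
The stacked high-frequency (HF) and low-frequency (LF) variables are:
Assumption 1: X(L) follows a VAR(p)
Assumption 2: all roots of the polynomial lie outside the unit circle
Assumption 3: (p,h)-autoregression
Properties of the covariance matrix:
Derivation of Dp(h):
Derivation of Tau and Delta p,s(h):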
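For orientation, the first items above can be written out as follows. This is a hedged sketch in notation assumed here (low-frequency time index \tau_L, m high-frequency observations per low-frequency period, K_H high-frequency and K_L low-frequency variables). It is consistent with how the MATLAB program below stacks X = [x1, x2, x3, y, z]' and fits a VAR(p) iterated h times, but it is not a reproduction of the original formulas.

```latex
% Stacked mixed-frequency vector, of dimension K = K_L + m K_H
X(\tau_L) = \bigl[\, x_H(\tau_L,1)',\ \dots,\ x_H(\tau_L,m)',\ x_L(\tau_L)' \,\bigr]'

% Assumption 1: X(\tau_L) follows a VAR(p)
X(\tau_L) = \sum_{k=1}^{p} A_k\, X(\tau_L - k) + \epsilon(\tau_L)

% Assumption 2: stability, all roots lie outside the unit circle
\det\Bigl( I_K - \sum_{k=1}^{p} A_k z^k \Bigr) = 0 \ \Longrightarrow\ |z| > 1

% (p,h)-autoregression: the VAR(p) iterated h times
X(\tau_L + h) = \sum_{k=1}^{p} A_k^{(h)}\, X(\tau_L + 1 - k) + u^{(h)}(\tau_L + h)

% Special case p = 1: the h-step coefficient is a matrix power
A_1^{(h)} = A_1^{\,h}
```

Non-causality at horizon h is then a set of zero restrictions on the relevant blocks of the matrices A_k^{(h)}, which is what a Wald test checks.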
Main MATLAB program for the example (the comments are quite clear)
%%%%%%%% Ten required codes in order to run this main code %%%%%%%%%%%%%%%%%%%%%%%
% 1. VAR_est1.m: Fit (p,h)-autoregression (i.e. VAR(p) model iterated h-times)
%    with Newey and West's (1987) HAC estimator. Newey and West's (1994)
%    automatic bandwidth selection is available.
% 2. irf3.m: Compute impulse response function at horizon 0, 1, ..., hmax
%    along with bootstrapped confidence intervals.
%    Cholesky decomposition is used.
% 3. var_decomp.m: Forecast error variance decomposition at horizon 0, ..., hmax-1.
%    Cholesky decomposition is used.
% 4. MFCTGK_all1.m: Implement bilateral mixed frequency Granger causality tests
%    for all possible pairs.
% 5. CTGK_all1.m: Implement bilateral Granger causality tests for all
%    possible pairs.
% 6. sim_VAR.m: Simulate VAR(p) processes.
% 7. sim_phauto.m: Simulate (p,h)-autoregression (i.e. VAR(p) iterated h-times)
% 8. causality_test_GK4.m: Implement Granger causality tests.
%    Goncalves and Kilian's (2004) bootstrap is available.
% 9. mf_causal_test_GK4.m: Implement mixed frequency Granger causality tests.
%    Goncalves and Kilian's (2004) bootstrap is available.
% 10. Wald_test.m: Implement Wald tests.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Scenario
% We analyze three variables x, y, z.
% x is a monthly variable while y and z are quarterly variables.
% The ratio of sampling frequencies, m, is equal to 3.
% Construct a 5 x 1 mixed frequency vector X = [x1, x2, x3, y, z]'.
% Assume DGP is MF-VAR(1) with coefficient:
A = [   0,  0.1, 0.4,   0,   0;
      0.1, -0.1, 0.2,   0,   0;
        0,    0, 0.1,   0,   0;
      0.0, -0.9, 0.9, 0.2,   0;
        0,    0,   0, 0.9, 0.6];
% Evidently, (1) x causes y and (2) y causes z.
% See A(4, 1:3) for (1) and A(5,4) for (2).
% Let's generate data from this DGP and fit MF-VAR(1) to see what happens.
%% Step 1: Initial setting
% general setting
T = 80; % sample size is 80 quarters, a realistic size.
m = 3; % ratio of sampling frequencies (month vs. quarter)
K_H = 1; % one high frequency variable x
K_L = 2; % two low frequency variables y, z
K = K_L + m*K_H; % dimension of MF-VAR
p = 1; % VAR lag length included.
       % true lag order is 1.
lambda = 'NW'; % use Newey and West's (1994) automatic bandwidth selection
% Impulse response functions
irfhmax = 6; % maximum horizon
figureflag = 1; % draw figure
irfalpha = 0.05; % draw 95% bootstrapped confidence interval
bsnum = 500; % # of bootstrap replications
labels = char('x1', 'x2', 'x3', 'y', 'z'); % labels
% forecast error variance decomposition
vdhmax = 6; % maximum horizon
% Granger causality tests
gcbs = 1999; % # of bootstrap replications
dispflag = 1; % display p-values
gclabels = char('x', 'y', 'z'); % labels
%% Step 2: Mixed frequency analysis
% generate normal error
E = 0.1 * randn(T, K);
% generate VAR(1) process
Data = sim_VAR(E, A);
% fit MF-VAR(1)
result1 = VAR_est1(Data, p, 1, lambda);
% impulse response function
[IRF, lb, ub] = irf3(result1, irfhmax, figureflag, irfalpha, bsnum, labels);
% variance decomposition
vd_mf = var_decomp(result1, vdhmax);
% causality test
disp('%%%%% Mixed Frequency, horizon = 1 %%%%%')
pval_mat1 = MFCTGK_all1(result1, m, K_H, gcbs, dispflag, gclabels);
disp('%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%');
disp(blanks(3)');
%%%%%%%%%%%%%%%% REMARK %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% ... What can you tell from these results?
% Causality test says (1) x causes y, (2) y causes z, and there is no other causality.
%
% IRF verifies (1) and (2). IRF gives you one more important implication, though.
% It seems that x does have a significant impact on z! How is this ever possible?
%
% ... This is a typical example of "causal chain". x does cause z via y.
% To see this point, note that:
%
% A^2 = [ 0.01, -0.01, 0.06,    0,    0;
%        -0.01,  0.02, 0.04,    0,    0;
%            0,     0, 0.01,    0,    0;
%        -0.09, -0.09, 0.09, 0.04,    0;
%            0, -0.81, 0.81, 0.72, 0.36]
%
% The lower-left block is no longer zeros.
% In bivariate case causal chains are never possible, but in more general
% cases causal chains are of great importance.
% To capture causality from x to z, we need to run two-step-ahead causality test.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
result2 = VAR_est1(Data, p, 2, lambda);
disp('%%%%% Mixed frequency, horizon = 2 %%%%%')
pval_mat2 = MFCTGK_all1(result2, m, K_H, gcbs, dispflag, gclabels);
disp('%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%');
disp(blanks(3)');
% ... Now you can see x does cause z at horizon 2.
%% Step 3: Low frequency analysis
% for comparison, aggregate x into quarterly frequency (flow sampling)
Data_ = [mean(Data(:,1:3), 2), Data(:,4), Data(:,5)];
% fit VAR(1)
result_ = VAR_est1(Data_, p, 1, lambda);
% IRF
[IRF_, lb_, ub_] = irf3(result_, irfhmax, figureflag, irfalpha, bsnum, gclabels);
% variance decomposition
vd_lf = var_decomp(result_, vdhmax);
% causality test
disp('%%%%% Low Frequency, horizon = 1 %%%%%')
pval_mat_lf = CTGK_all1(result_, gcbs, dispflag, gclabels);
disp('%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%');
%%%%%%%%%%%%%%%%%%%%%% REMARK %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Based on low frequency model, you cannot observe x causing y.
% This is because the positive impact of x3 on y and the negative impact of
% x2 on y offset each other after flow aggregation.
% This highlights an advantage of mixed frequency approach.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
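The ten helper files are not reproduced in this post. As a rough stand-in for the simulate-then-fit part of Step 2, the sketch below generates data from the VAR(1) DGP and recovers the coefficient matrix by plain OLS. It is an assumed reimplementation for illustration only (first observation set to the first error draw, no intercept, no HAC covariance, no bootstrap), and the names X_sim and A_hat are hypothetical. It is not the code of sim_VAR.m or VAR_est1.m.

```matlab
% DGP coefficient matrix from the main program
A = [   0,  0.1, 0.4,   0,   0;
      0.1, -0.1, 0.2,   0,   0;
        0,    0, 0.1,   0,   0;
        0, -0.9, 0.9, 0.2,   0;
        0,    0,   0, 0.9, 0.6];
T = 80;
K = size(A, 1);

% Simulate X(t) = A * X(t-1) + e(t); rows are quarters, columns are variables
E = 0.1 * randn(T, K);
X_sim = zeros(T, K);
X_sim(1, :) = E(1, :);                      % assumed initial condition
for t = 2:T
    X_sim(t, :) = (A * X_sim(t-1, :)')' + E(t, :);
end

% OLS fit of a VAR(1) without intercept: regress X(t) on X(t-1)
Y = X_sim(2:end, :);
Z = X_sim(1:end-1, :);
A_hat = (Z \ Y)';                           % Z\Y estimates A', so transpose

disp(round(A_hat * 100) / 100);             % compare with A (noisy at T = 80)
```

With only 80 quarters the estimate is noisy, which is one reason the program relies on bootstrapped p-values and not on asymptotic ones.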
Output
%%%%% Mixed Frequency, horizon = 1 %%%%%
H_0:y does not cause x
p-value = 0.83
H_0:z does not cause x
p-value = 0.9015
H_0:x does not cause y
p-value = 0.0005
H_0:z does not cause y
p-value = 0.1565
H_0:x does not cause z
p-value = 0.9005
H_0:y does not cause z
p-value = 0.0005
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%% Mixed frequency, horizon = 2 %%%%%
H_0:y does not cause x
p-value = 0.0425
H_0:z does not cause x
p-value = 0.819
H_0:x does not cause y
p-value = 0.302
H_0:z does not cause y
p-value = 0.8695
H_0:x does not cause z
p-value = 0.0005
H_0:y does not cause z
p-value = 0.0005
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%% Low Frequency, horizon = 1 %%%%%
H_0:y does not cause x
p-value = 0.275
H_0:z does not cause x
p-value = 0.924
H_0:x does not cause y
p-value = 0.789
H_0:z does not cause y
p-value = 0.029
H_0:x does not cause z
p-value = 0.5655
H_0:y does not cause z
p-value = 0.0005
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
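The two remarks in the program can be checked from the coefficient matrix alone, without any of the helper files. The sketch below shows that x has no direct (one-step) effect on z but does have a two-step effect through y, and that the coefficients of x on y sum to zero. The last check is only a rough heuristic for why averaging x over the quarter hides the x to y link, not a formal derivation of the aggregated model.

```matlab
% DGP coefficient matrix from the main program
A = [   0,  0.1, 0.4,   0,   0;
      0.1, -0.1, 0.2,   0,   0;
        0,    0, 0.1,   0,   0;
        0, -0.9, 0.9, 0.2,   0;
        0,    0,   0, 0.9, 0.6];

A2 = A * A;                       % two-step-ahead coefficients for a VAR(1)

x_to_z_h1 = A(5, 1:3);            % one-step effect of x1, x2, x3 on z
x_to_z_h2 = A2(5, 1:3);           % two-step effect of x1, x2, x3 on z
x_to_y_h1 = A(4, 1:3);            % one-step effect of x1, x2, x3 on y

disp(x_to_z_h1);                  % all zeros: no direct causality from x to z
disp(x_to_z_h2);                  % approximately [0 -0.81 0.81]: causal chain via y
disp(sum(x_to_y_h1));             % approximately 0: effects of x2 and x3 offset
```

This matches the p-values above: "x does not cause z" is not rejected at horizon 1 (0.9005) but is rejected at horizon 2 (0.0005), and "x does not cause y" is not rejected in the low-frequency model (0.789).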
Local asymptotic power of the mixed-frequency and low-frequency causality tests
Modeling with MIDAS and NowCasting will be covered in later posts on this account, so stay tuned.