Stata Study Notes

. reg lw s iq expr tenure rns smsa,r

Linear regression                               Number of obs     =        758
                                                F( 6, 751)        =      71.89
                                                Prob > F          =     0.0000
                                                R-squared         =     0.3600
                                                Root MSE          =     .34454

------------------------------------------------------------------------------
             |               Robust
          lw |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           s |   .0927874   .0069763    13.30   0.000     .0790921    .1064826
          iq |   .0032792   .0011321     2.90   0.004     .0010567    .0055016
        expr |   .0393443   .0066603     5.91   0.000     .0262692    .0524193
      tenure |    .034209   .0078957     4.33   0.000     .0187088    .0497092
         rns |  -.0745325   .0299772    -2.49   0.013    -.1333815   -.0156834
        smsa |   .1367369   .0277712     4.92   0.000     .0822186    .1912553
       _cons |   3.895172   .1159286    33.60   0.000     3.667589    4.122754
------------------------------------------------------------------------------

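Using iq as a measure of ability involves measurement error, so iq is an endogenous variable. Consider using med, kww, mrt, and age as instruments for iq, and run a 2SLS regression with robust standard errors.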

. ivregress 2sls lw s expr tenure rns smsa (iq=med kww mrt age),r

Instrumental variables (2SLS) regression        Number of obs     =        758
                                                Wald chi2(6)      =     355.73
                                                Prob > chi2       =     0.0000
                                                R-squared         =     0.2002
                                                Root MSE          =     .38336

------------------------------------------------------------------------------
             |               Robust
          lw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          iq |  -.0115468   .0056376    -2.05   0.041    -.0225962   -.0004974
           s |   .1373477   .0174989     7.85   0.000     .1030506    .1716449
        expr |   .0338041   .0074844     4.52   0.000      .019135    .0484732
      tenure |    .040564   .0095848     4.23   0.000     .0217781      .05935
         rns |  -.1176984   .0359582    -3.27   0.001    -.1881751   -.0472216
        smsa |    .149983   .0322276     4.65   0.000     .0868182    .2131479
       _cons |   4.837875   .3799432    12.73   0.000       4.0932     5.58255
------------------------------------------------------------------------------
Instrumented:  iq
Instruments:   s expr tenure rns smsa med kww mrt age

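The estimated return to schooling rises, while the coefficient on iq turns out to be negative, so this result is not credible. IV estimation is only as good as its instruments, so run an overidentification test to check whether all of the instruments are exogenous.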

. estat overid

  Test of overidentifying restrictions:

  Score chi2(3)          =  51.5449  (p = 0.0000)

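The test above rejects the null hypothesis, which indicates that some of the instruments are invalid, i.e. correlated with the error term. Suspecting that mrt and age fail the exogeneity condition, use only med and kww as instruments for iq, run 2SLS again, and display the first-stage regression results as well.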

. ivregress 2sls lw s expr tenure rns smsa (iq=med kww),r first

First-stage regressions
-----------------------

                                                Number of obs     =        758
                                                F( 7, 750)        =      47.74
                                                Prob > F          =     0.0000
                                                R-squared         =     0.3066
                                                Adj R-squared     =     0.3001
                                                Root MSE          =    11.3931

------------------------------------------------------------------------------
             |               Robust
          iq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           s |   2.467021   .2327755    10.60   0.000     2.010052     2.92399
        expr |  -.4501353   .2391647    -1.88   0.060    -.9196471    .0193766
      tenure |   .2059531    .269562     0.76   0.445    -.3232327    .7351388
         rns |  -2.689831   .8921335    -3.02   0.003    -4.441207    -.938455
        smsa |   .2627416   .9465309     0.28   0.781    -1.595424    2.120907
         med |   .3470133   .1681356     2.06   0.039     .0169409    .6770857
         kww |   .3081811   .0646794     4.76   0.000     .1812068    .4351553
       _cons |   56.67122   3.076955    18.42   0.000     50.63075    62.71169
------------------------------------------------------------------------------

Instrumental variables (2SLS) regression        Number of obs     =        758
                                                Wald chi2(6)      =     370.04
                                                Prob > chi2       =     0.0000
                                                R-squared         =     0.2775
                                                Root MSE          =     .36436

------------------------------------------------------------------------------
             |               Robust
          lw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          iq |   .0139284   .0060393     2.31   0.021     .0020916    .0257653
           s |   .0607803   .0189505     3.21   0.001      .023638    .0979227
        expr |   .0433237   .0074118     5.85   0.000     .0287968    .0578505
      tenure |   .0296442    .008317     3.56   0.000     .0133432    .0459452
         rns |  -.0435271   .0344779    -1.26   0.207    -.1111026    .0240483
        smsa |   .1272224   .0297414     4.28   0.000     .0689303    .1855146
       _cons |   3.218043   .3983683     8.08   0.000     2.437256    3.998831
------------------------------------------------------------------------------
Instrumented:  iq
Instruments:   s expr tenure rns smsa med kww

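In the output above, the first part is the first-stage regression of the endogenous regressor on the instruments (together with the exogenous regressors); the second part regresses the dependent variable on the fitted values from the first stage (again together with the exogenous regressors).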
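The two stages described above can be reproduced by hand. The following is a minimal sketch, assuming the same wage data set is already loaded in memory; the variable name iq_hat is hypothetical and chosen only for illustration.

```stata
* Stage 1: regress the endogenous regressor iq on the excluded instruments
* (med, kww) plus all exogenous regressors, then save the fitted values.
quietly regress iq s expr tenure rns smsa med kww
predict double iq_hat, xb

* Stage 2: regress lw on the fitted values and the exogenous regressors.
regress lw iq_hat s expr tenure rns smsa
```

The coefficients from the second regression should match the ivregress 2sls point estimates, but its standard errors are not valid, because the residuals are computed from iq_hat rather than iq. In practice ivregress should be used for inference.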

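In the output above the return to schooling is more plausible, and the coefficient on iq is now positive. Run the overidentification test again.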

. estat overid

  Test of overidentifying restrictions:

  Score chi2(1)          =  .151451  (p = 0.6972)

The test does not reject the null hypothesis that the instruments are exogenous.

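Next, check the second condition for a valid instrument: relevance, i.e. correlation between the instruments and the endogenous variable. The first-stage regression shows that med and kww explain iq reasonably well, but to be safe, also estimate the model by limited-information maximum likelihood (LIML), which is less sensitive to weak instruments.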

. ivregress liml lw s expr tenure rns smsa (iq=med kww),r

Instrumental variables (LIML) regression        Number of obs     =        758
                                                Wald chi2(6)      =     369.62
                                                Prob > chi2       =     0.0000
                                                R-squared         =     0.2768
                                                Root MSE          =     .36454

------------------------------------------------------------------------------
             |               Robust
          lw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          iq |   .0139764   .0060681     2.30   0.021     .0020831    .0258697
           s |   .0606362    .019034     3.19   0.001     .0233303    .0979421
        expr |   .0433416   .0074185     5.84   0.000     .0288016    .0578816
      tenure |   .0296237    .008323     3.56   0.000     .0133109    .0459364
         rns |  -.0433875    .034529    -1.26   0.209    -.1110631    .0242881
        smsa |   .1271796   .0297599     4.27   0.000     .0688512     .185508
       _cons |   3.214994   .4001492     8.03   0.000     2.430716    3.999272
------------------------------------------------------------------------------
Instrumented:  iq
Instruments:   s expr tenure rns smsa med kww

These estimates are very close to the 2SLS results, which indirectly suggests that weak instruments are not a problem here.
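Instrument relevance can also be checked more directly with first-stage diagnostics. The sketch below assumes the same data set is in memory and uses Stata's estat firststage postestimation command after ivregress; it is offered as a complement to the LIML comparison, not as a step taken in the original notes.

```stata
* Refit the 2SLS model quietly, then report first-stage summary statistics,
* including the F statistic for the joint significance of med and kww.
quietly ivregress 2sls lw s expr tenure rns smsa (iq = med kww), vce(robust)
estat firststage
```

A common rule of thumb treats a first-stage F statistic above 10 as evidence against weak instruments.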

In addition, IV estimation presupposes that an endogenous regressor actually exists, so run a Hausman test.

. qui reg lw iq s expr tenure rns smsa
. estimates store ols
. qui ivregress 2sls lw s expr tenure rns smsa (iq=med kww)
. estimates store iv
. hausman iv ols,constant sigmamore

Note: the rank of the differenced variance matrix (1) does not equal the number of
        coefficients being tested (7); be sure this is what you expect, or there may
        be problems computing the test. Examine the output of your estimators for
        anything unexpected and possibly consider scaling your variables so that the
        coefficients are on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       iv          ols         Difference          S.E.
-------------+----------------------------------------------------------------
          iq |    .0139284     .0032792        .0106493        .0054318
           s |    .0607803     .0927874        -.032007        .0163254
        expr |    .0433237     .0393443        .0039794        .0020297
      tenure |    .0296442      .034209       -.0045648        .0023283
         rns |   -.0435271    -.0745325        .0310054        .0158145
        smsa |    .1272224     .1367369       -.0095145        .0048529
       _cons |    3.218043     3.895172       -.6771285        .3453751
------------------------------------------------------------------------------
          b = consistent under Ho and Ha; obtained from ivregress
          B = inconsistent under Ha, efficient under Ho; obtained from regress

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =        3.84
                Prob>chi2 =      0.0499
                (V_b-V_B is not positive definite)

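The test rejects the null hypothesis at the 5% level (p = 0.0499), so iq is treated as endogenous. Because the traditional Hausman test is not valid under heteroskedasticity, next run the heteroskedasticity-robust DWH test: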

  Tests of endogeneity
  Ho: variables are exogenous

  Durbin (score) chi2(1)          =  3.87962  (p = 0.0489)
  Wu-Hausman F(1,750)             =  3.85842  (p = 0.0499)

The DWH p-values are below 0.05, so iq can be regarded as an endogenous regressor.
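The command that produced the endogeneity test output is not shown in these notes. A plausible reconstruction, offered as an assumption, is estat endogenous run after ivregress 2sls. As far as I know, the Durbin and Wu-Hausman statistics are what Stata reports after a fit with the default (non-robust) VCE, while a fit with vce(robust) yields a robust score test and a robust regression-based test instead, so both variants are sketched here:

```stata
* Default VCE: estat endogenous reports Durbin (score) and Wu-Hausman statistics.
quietly ivregress 2sls lw s expr tenure rns smsa (iq = med kww)
estat endogenous

* Robust VCE: the reported tests are the heteroskedasticity-robust versions.
quietly ivregress 2sls lw s expr tenure rns smsa (iq = med kww), vce(robust)
estat endogenous
```

If heteroskedasticity is a concern, the second variant is the one whose p-values should be relied on.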

Furthermore, if heteroskedasticity is present, GMM is more efficient than 2SLS, so compute the optimal GMM estimates:

. ivregress gmm lw s expr tenure rns smsa (iq=med kww)

Instrumental variables (GMM) regression         Number of obs     =        758
                                                Wald chi2(6)      =     372.75
                                                Prob > chi2       =     0.0000
                                                R-squared         =     0.2750
GMM weight matrix: Robust                       Root MSE          =     .36499

------------------------------------------------------------------------------
             |               Robust
          lw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          iq |   .0140888   .0060357     2.33   0.020     .0022591    .0259185
           s |   .0603672   .0189545     3.18   0.001     .0232171    .0975174
        expr |   .0431117   .0074112     5.82   0.000     .0285861    .0576373
      tenure |   .0299764   .0082728     3.62   0.000      .013762    .0461908
         rns |   -.044516   .0344404    -1.29   0.196    -.1120179    .0229859
        smsa |   .1267368   .0297633     4.26   0.000     .0684018    .1850718
       _cons |   3.207298    .398083     8.06   0.000     2.427069    3.987526
------------------------------------------------------------------------------
Instrumented:  iq
Instruments:   s expr tenure rns smsa med kww

The output above shows that two-step optimal GMM is very close to 2SLS. Run the overidentification test once more:

. estat overid

  Test of overidentifying restriction:

  Hansen's J chi2(1) = .151451 (p = 0.6972)

The test does not reject the null hypothesis, which is consistent with all of the instruments being exogenous.

Finally, run iterative GMM. Its coefficient estimates (output not reproduced here) differ little from the two-step GMM estimates.
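Since the iterative GMM output is not included in these notes, the following sketch shows one way to produce it and compare it side by side with two-step GMM. It assumes the same data set is in memory; the stored-estimate names gmm2 and gmm_iter are hypothetical.

```stata
* Two-step (optimal) GMM, stored for comparison.
quietly ivregress gmm lw s expr tenure rns smsa (iq = med kww)
estimates store gmm2

* Iterative GMM: the igmm option repeats the weighting-matrix update
* until the estimates converge.
quietly ivregress gmm lw s expr tenure rns smsa (iq = med kww), igmm
estimates store gmm_iter

* Coefficients and standard errors of both fits in one table.
estimates table gmm2 gmm_iter, b(%9.6f) se
```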
