The Product and Difference Fallacies for Indirect Effects (2/2)

腦fficial Pragmatist 2022. 10. 4. 00:27
Scatterplot of percent women in government (WG) and income (measured as log GDP). The solid points are the Muslim countries (IRT = 1). The open circles are the remaining non-Muslim countries (IRT = 0) after non-Muslim countries with less comparable incomes were pruned. The dashed line represents the linear regression for the non-Muslim countries. The open triangles represent the predicted percent of women in government for the Muslim countries if they had not been Muslim. Due to the pruning based on income, the overlap between Muslim and non-Muslim countries on the income variable is fairly good for the prediction of WG. However, the scatterplot also represents the predictor space for the second-stage prediction of democracy, and overlap problems are apparent for this prediction. In general, there are very few solid points near the dashed line, and for some ranges (e.g., income greater than 3.6), there are none.

Conclusion

 This article has demonstrated that even when a linear model is appropriate at the individual level, and when control variables are sufficient to approximate randomized experiments, the product and difference heuristics can produce highly misleading estimates of the indirect effect. It was shown that stratification and the inclusion of interactions can ameliorate some of this problem, but unless we are willing to make untestable assumptions about the covariance terms in (8), (18), (19), and (24), we will be unable to identify the average indirect effects.
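The role of such covariance terms can be illustrated with a toy simulation. The sketch below is not taken from the article: it assumes a randomized binary X, a mediator Z, an outcome Y with no direct effect of X, and hypothetical unit-level coefficients `b` (effect of X on Z) and `g` (effect of Z on Y) that are positively correlated. In this simplified setting the average indirect effect is the mean of `b * g`, which equals the product of the means plus the covariance of `b` and `g`; the product heuristic, computed from two regressions, does not recover it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical unit-level effects (assumed for illustration only):
# b_i = effect of X on Z, g_i = effect of Z on Y, correlated across units.
cov_bg = 0.8
effects = rng.multivariate_normal(
    mean=[1.0, 1.0], cov=[[1.0, cov_bg], [cov_bg, 1.0]], size=n
)
b, g = effects[:, 0], effects[:, 1]

x = rng.integers(0, 2, size=n)       # randomized binary treatment
z = rng.normal(size=n) + b * x       # mediator: Z_i = a_i + b_i * X_i
y = g * z + rng.normal(size=n)       # outcome: Y_i = g_i * Z_i + noise


def ols(design, response):
    """Least-squares coefficients for response regressed on design."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


ones = np.ones(n)
b_hat = ols(np.column_stack([ones, x]), z)[1]      # coefficient on X in Z ~ X
g_hat = ols(np.column_stack([ones, x, z]), y)[2]   # coefficient on Z in Y ~ X + Z
product_est = b_hat * g_hat                        # product heuristic

true_aie = np.mean(b * g)                          # average of unit-level b_i * g_i
sample_cov = np.cov(b, g, ddof=0)[0, 1]            # covariance of the unit effects

print("product heuristic:        ", round(product_est, 3))
print("mean(b) * mean(g):        ", round(b.mean() * g.mean(), 3))
print("cov(b, g):                ", round(sample_cov, 3))
print("true avg. indirect effect:", round(true_aie, 3))
```

Because the unit-level effects are known only inside a simulation, the covariance is observable here but not in real data, which is why assumptions about it cannot be tested.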

 Furthermore, this article has demonstrated that by restricting inference to subpopulations, some of the assumptions underlying the estimation of indirect effects can be checked. In particular, by restricting the population of interest to the treated units, balance and overlap can be more easily achieved for the regression of Z on X (by pruning noncomparable control units), and balance and overlap can be more easily checked for the regression of Y on Z and X. Unfortunately, if the effect of X on Z is strong, it may be impossible to achieve balance and overlap for the regression of Y on Z and X.
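A minimal sketch of the pruning step, assuming a single covariate and a simple range rule, is given below. The rule (keep control units whose covariate falls inside the range observed among treated units), the function name `prune_controls`, and the data are all hypothetical stand-ins chosen for illustration; they are not the procedure or the data used in the article.

```python
import numpy as np


def prune_controls(covariate, treated):
    """Return a boolean mask keeping all treated units and only those
    control units whose covariate lies within the treated units' range."""
    covariate = np.asarray(covariate, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    low = covariate[treated].min()
    high = covariate[treated].max()
    in_range = (covariate >= low) & (covariate <= high)
    return treated | in_range


# Made-up values of a covariate (e.g., log income) for nine units;
# the first four are treated, the remaining five are controls.
log_income = [2.5, 2.8, 3.0, 3.3, 2.0, 2.9, 3.1, 3.9, 4.2]
treated = [1, 1, 1, 1, 0, 0, 0, 0, 0]

keep = prune_controls(log_income, treated)
print("units kept:", int(keep.sum()), "of", len(keep))
```

The same mask does not guarantee overlap for the second-stage regression of Y on Z and X, since treated and control units may still occupy different regions of the (X, Z) predictor space.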

 The implications of this work for future research design are threefold. First, as demonstrated in this article, it is easy enough to stratify and include interactions so as to reduce the problems associated with effect heterogeneity and the untestable covariance terms. Furthermore, although linear models were used in this article in order to simplify presentation, the procedures described do not depend on the use of linear regression. See Pearl (2011) and Imai et al. (2010b) (with its associated R package [Imai et al. 2010c]) for approaches with nonlinear models.

 Second, if the explanatory variable is binary, inference should be restricted to the treated (or control) units, even if only as a first step. It is especially important to assess overlap problems that may occur for the regression of Y on X and Z when the effect of X on Z is strong. If the explanatory variable is continuous, model dependence will likely be unavoidable, and this should be acknowledged.

 Third, because randomization is not sufficient to identify average indirect effects (see Bullock et al. 2010; Bullock and Ha 2011; Green et al. 2010; Robins and Greenland 1992; and Sobel 2008 for additional discussion), the analyst should explicitly acknowledge the additional assumptions implicit in the analysis of indirect effects. In this article, these assumptions were presented in terms of the covariances in (8), (18), (19), and (24). If these assumptions are suspect, then it may be necessary to perform a sensitivity analysis on the basis of these covariances. Recent work provides an alternative approach to sensitivity analysis and bounding (Imai et al. 2010a, 2010b, 2010c, 2010d).
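One simple form such a sensitivity analysis might take is sketched below. It rests on a simplifying assumption that is not drawn from the article: that the average indirect effect can be written as the product of two mean effects plus a single covariance between the unit-level effects. Since a covariance is bounded in absolute value by the product of the standard deviations, the sketch sweeps an assumed correlation `rho` from -1 to 1. The function name and all input values are hypothetical, and the standard deviations of unit-level effects are themselves not identified from data, so they must also be posited.

```python
import numpy as np


def indirect_effect_sensitivity(mean_b, mean_g, sd_b, sd_g, rhos):
    """Average indirect effect implied by each assumed correlation rho,
    using mean_b * mean_g + rho * sd_b * sd_g (toy decomposition)."""
    rhos = np.asarray(rhos, dtype=float)
    return mean_b * mean_g + rhos * sd_b * sd_g


# Hypothetical inputs: mean effects of 1.0 and unit-level SDs of 1.0.
rhos = np.linspace(-1.0, 1.0, 5)
aie = indirect_effect_sensitivity(
    mean_b=1.0, mean_g=1.0, sd_b=1.0, sd_g=1.0, rhos=rhos
)

for rho, value in zip(rhos, aie):
    print(f"rho = {rho:+.1f}: implied average indirect effect = {value:.2f}")
```

Reporting the full range of implied values, rather than the single value at `rho = 0`, makes the dependence on the untestable covariance explicit.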
