Abstract
In this paper, we discuss the identifiability of the mediation, direct, and indirect effects of treatment on outcome. The mediation effects are represented by a causal mediation model that includes an unobserved confounder (i.e., a common cause of the mediator and the outcome variable), and the direct and indirect effects are expressed in terms of the mediation effects. We do not require the sequential ignorability assumption or the exclusion restriction assumption (i.e., the absence of a direct effect of treatment on outcome); we require only that treatment is randomized and that the degree of equation nonlinearity for the treatment effect on the mediator is higher than that for the outcome. If the nonlinearity requirement is not satisfied, a covariate may be used as an instrumental variable to improve identifiability. We focus on the identifiability of parameters, although we also describe estimation approaches to illustrate our identifiability results. Simulations show good estimation performance of our approach compared with the standard mediation approach.
Acknowledgements
This research was supported by NSFC (11171365, 11021463, 10931002), the 863 Program of China (2015AA020507), and a project funded by Merck (China).
Appendices
Appendix 1: Proof of Theorem 1
We show necessity and sufficiency separately for the identifiability of the parameters in model (13.7). For necessity, suppose that the non-linearity condition does not hold, that is, \(|\rho(E(M \mid X), X)| = 1\). This implies that there exist some \(a_0\) and \(a_1\) satisfying \(E(M \mid X) = a_1 X + a_0\) almost everywhere. Then from model (13.7) we have
$$\displaystyle{E(Y \mid X) = b_0 + b_1 E(M \mid X) + b_2 X = (b_0 + a_0 b_1) + (a_1 b_1 + b_2)X.}$$
The above equation implies that \(Y\) is marginally linear with respect to \(X\). For this linear model, only the intercept \((b_0 + a_0 b_1)\) and the slope \((a_1 b_1 + b_2)\) are identifiable as a whole, while the parameters \(b_0\), \(b_1\), and \(b_2\) cannot be distinguished from each other.
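The necessity argument can be illustrated numerically. The following is a minimal sketch, not part of the chapter: it assumes the mean structure \(E(Y \mid X) = b_0 + b_1 E(M \mid X) + b_2 X\) used in this proof, and the values of `a0`, `a1` and both parameter triples are made up for illustration. Two different triples that share the same intercept and slope give identical outcome means at every level of \(X\).

```python
from fractions import Fraction as F

def mean_outcome(b0, b1, b2, mean_mediator, x):
    """E(Y | X = x) = b0 + b1 * E(M | X = x) + b2 * x (mean structure assumed here)."""
    return b0 + b1 * mean_mediator(x) + b2 * x

# Illustrative values (assumptions): E(M | X) = a1 * X + a0 is linear in X.
a0, a1 = F(1), F(2)

def linear_mediator(x):
    return a1 * x + a0

# Two different triples (b0, b1, b2) with the same intercept b0 + a0*b1 = 2
# and the same slope a1*b1 + b2 = 3.
params_a = (F(1), F(1), F(1))
params_b = (F(0), F(2), F(-1))

levels = [F(0), F(1), F(2), F(5)]
curve_a = [mean_outcome(*params_a, linear_mediator, x) for x in levels]
curve_b = [mean_outcome(*params_b, linear_mediator, x) for x in levels]

# The two curves coincide, so observed means cannot separate the two triples.
print(curve_a == curve_b)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality check is not affected by floating-point rounding.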
For sufficiency, if \(E(M \mid X)\) is not linear in \(X\), then we can find three levels \(x_1\), \(x_2\), and \(x_3\) that satisfy \([E(M \mid x_1) - E(M \mid x_2)]/(x_1 - x_2) \neq [E(M \mid x_2) - E(M \mid x_3)]/(x_2 - x_3)\). Hence the matrix in (13.10) has full rank. Thus the parameters \(b_1\) and \(b_2\) can be identified, and then the parameter \(b_0\) can be identified from \(b_0 = E(Y \mid x_i) - b_1 E(M \mid x_i) - b_2 x_i\).
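The sufficiency argument can be sketched in code. This is a hypothetical illustration rather than the chapter's estimator: the function name `identify`, the three levels, the quadratic \(E(M \mid x)\), and the "true" parameter values are all assumptions, and the system is written in differenced 2 × 2 form, which may differ in layout from the matrix in (13.10).

```python
from fractions import Fraction as F

def identify(levels, mean_m, mean_y):
    """Recover (b0, b1, b2) from E(M | x) and E(Y | x) at three levels of X.

    Assumes E(Y | x) = b0 + b1 * E(M | x) + b2 * x and differences adjacent levels.
    """
    x1, x2, x3 = levels
    m1, m2, m3 = (mean_m[x] for x in levels)
    y1, y2, y3 = (mean_y[x] for x in levels)
    # 2x2 linear system in (b1, b2) obtained by differencing adjacent levels.
    a11, a12, r1 = m1 - m2, x1 - x2, y1 - y2
    a21, a22, r2 = m2 - m3, x2 - x3, y2 - y3
    det = a11 * a22 - a12 * a21
    if det == 0:
        # Equal slopes between levels: E(M | X) is linear over these levels.
        raise ValueError("E(M | X) is linear over these levels: not identifiable")
    b1 = (r1 * a22 - a12 * r2) / det   # Cramer's rule
    b2 = (a11 * r2 - a21 * r1) / det
    b0 = y1 - b1 * m1 - b2 * x1
    return b0, b1, b2

levels = (F(0), F(1), F(2))
mean_m = {x: x * x for x in levels}      # nonlinear E(M | x), assumed for illustration
true_b = (F(1), F(2), F(-3))             # assumed "true" (b0, b1, b2)
mean_y = {x: true_b[0] + true_b[1] * mean_m[x] + true_b[2] * x for x in levels}

recovered = identify(levels, mean_m, mean_y)
print(recovered == true_b)
```

The determinant `det` is nonzero exactly when the two slopes of \(E(M \mid x)\) between adjacent levels differ, which is the condition stated in the proof.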
Appendix 2: Proof of Theorem 2
For sufficiency, when \(E(M \mid X, Z) \neq cX + \psi(Z)\), there are two situations: (i) \(E(M \mid X, Z) = \Psi(X) + \psi(Z)\), where \(\Psi(\cdot)\) is a nonlinear function of \(X\); (ii) \(E(M \mid X, Z)\) is not additive with respect to \(X\) and \(Z\).
For situation (i), since \(\Psi(\cdot)\) is not a linear function, we can choose three levels of \(X\) (say \(x_1\), \(x_2\), \(x_3\)) and some \(z\) satisfying \([E(M \mid x_1, z) - E(M \mid x_2, z)]/(x_1 - x_2) \neq [E(M \mid x_2, z) - E(M \mid x_3, z)]/(x_2 - x_3)\). Then the following equation from model (13.12) has a unique solution because the coefficient matrix has full rank:
Thus the parameters can be identified.
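One linear system consistent with situation (i) can be sketched as follows. It assumes, as in the necessity part of this proof, that model (13.12) implies \(E(Y \mid X, Z) = b_1 E(M \mid X, Z) + b_2 X + E[\phi(U, Z, \varepsilon_Y) \mid Z]\); the differenced form is an assumption of this sketch and may differ from the layout of the original display.

$$\displaystyle{\left [\begin{array}{c} E(Y \mid x_1, z) - E(Y \mid x_2, z) \\ E(Y \mid x_2, z) - E(Y \mid x_3, z)\end{array} \right ] = \left [\begin{array}{cc} E(M \mid x_1, z) - E(M \mid x_2, z) & x_1 - x_2 \\ E(M \mid x_2, z) - E(M \mid x_3, z) & x_2 - x_3\end{array} \right ]\left [\begin{array}{c} b_1 \\ b_2\end{array} \right ].}$$

The determinant of the coefficient matrix is \([E(M \mid x_1, z) - E(M \mid x_2, z)](x_2 - x_3) - [E(M \mid x_2, z) - E(M \mid x_3, z)](x_1 - x_2)\), which is nonzero exactly when the two slopes differ.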
For situation (ii), since \(E(M \mid X, Z)\) is not additive with respect to \(X\) and \(Z\), we can find two levels of \(X\) (say \(x_1\), \(x_2\)) and two levels of \(Z\) (say \(z_1\), \(z_2\)) satisfying \(E(M \mid x_1, z_1) - E(M \mid x_2, z_1) \neq E(M \mid x_1, z_2) - E(M \mid x_2, z_2)\). The following equation derived from model (13.12) has a unique solution because the coefficient matrix has full rank:
Thus the parameters can be identified.
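Situation (ii) can be illustrated with a small numerical sketch. It is not the chapter's code: it assumes that model (13.12) implies \(E(Y \mid X, Z) = b_1 E(M \mid X, Z) + b_2 X + g(Z)\) for some unknown function \(g\), and the function name `identify_with_covariate`, the cell means, and the "true" values are hypothetical.

```python
from fractions import Fraction as F

def identify_with_covariate(x_levels, z_levels, mean_m, mean_y):
    """Recover (b1, b2) from cell means at two levels of X and two levels of Z."""
    x1, x2 = x_levels
    z1, z2 = z_levels
    # Differencing over X within a level of Z removes the unknown function of Z.
    dm1 = mean_m[(x1, z1)] - mean_m[(x2, z1)]
    dm2 = mean_m[(x1, z2)] - mean_m[(x2, z2)]
    dy1 = mean_y[(x1, z1)] - mean_y[(x2, z1)]
    dy2 = mean_y[(x1, z2)] - mean_y[(x2, z2)]
    dx = x1 - x2
    det = dx * (dm1 - dm2)  # determinant of [[dm1, dx], [dm2, dx]]
    if det == 0:
        raise ValueError("E(M | X, Z) is additive over these cells: not identifiable")
    b1 = (dy1 - dy2) / (dm1 - dm2)
    b2 = (dy1 - b1 * dm1) / dx
    return b1, b2

x_levels, z_levels = (F(0), F(1)), (F(0), F(1))
cells = [(x, z) for x in x_levels for z in z_levels]
mean_m = {(x, z): x * z + z for x, z in cells}   # non-additive in (x, z), assumed
b1_true, b2_true = F(2), F(-1)                   # assumed "true" values
g = {z: 5 + 7 * z for z in z_levels}             # stand-in for the unknown function of Z
mean_y = {(x, z): b1_true * mean_m[(x, z)] + b2_true * x + g[z] for x, z in cells}

recovered = identify_with_covariate(x_levels, z_levels, mean_m, mean_y)
print(recovered == (b1_true, b2_true))
```

The function never uses `g`, which reflects the point of the argument: the unobserved term depending on \(Z\) cancels in the within-\(z\) differences.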
For necessity, if \(E(M \mid X, Z) = cX + \psi(Z)\) for some constant \(c\) and some function \(\psi(\cdot)\), then from model (13.12), we have
$$\displaystyle{E(Y \mid X, Z) = (b_1 c + b_2)X + \Phi(Z),}$$
where \(\Phi(Z) = b_1\psi(Z) + E[\phi(U, Z, \varepsilon_Y) \mid Z]\). We can easily see that only \(c\), \(b_1 c + b_2\), and \(\Phi(Z)\) can be identified given observed data of \((Z, X, M, Y)\). The parameters \(b_1\) and \(b_2\) cannot be identified because (1) \(E(M \mid X, Z)\) is linear with respect to \(X\), and (2) \(E[\phi(U, Z, \varepsilon_Y) \mid Z]\) cannot be identified since \(U\) and \(\varepsilon_Y\) are never observed. Thus the parameters in model (13.12) are identifiable only if \(E(M \mid X, Z) \neq cX + \psi(Z)\).
Appendix 3: Proof of the Equivalence of Different Choices of f(⋅) in Eq. (13.15) for Estimation When the Identifiability Condition in Theorem 1 Holds
We want to show that an arbitrary vector function \(\mathbf{f}(\cdot)\) that identifies \(\beta\) via Eq. (13.15) leads to the same estimator as that based on the function \(\mathbf{f}^{\ast}(\cdot)\). For an arbitrary vector function \(\mathbf{f}(\cdot) = (f_1(\cdot), f_2(\cdot), \ldots, f_K(\cdot))'\) with \(K > 2\), we can write it as
Let \(Q\) denote the \(K \times 3\) matrix on the right-hand side. Equation (13.15) can be rewritten as \(G\beta = H\), where \(G = E[\mathbf{f}(X), M\mathbf{f}(X), X\mathbf{f}(X)]\) and \(H = E[Y\mathbf{f}(X)]\). Then the estimating equation for \(\beta\) is \(\widehat{G}\hat{\beta} = \widehat{H}\). From (13.21), we have
where \(\widehat{E}(\cdot)\) denotes the sample mean of the corresponding variable. Similarly, we have
Then, with the function \(\mathbf{f}(\cdot)\), the estimating equation for \(\beta\) is equivalent to
Since \(\hat{\beta}^{\ast}\) satisfies the equation \(\widehat{G^{\ast}}\hat{\beta}^{\ast} - \widehat{H^{\ast}} = 0\), it also satisfies \(\widehat{G}\hat{\beta}^{\ast} - \widehat{H} = 0\). When \(Q\) has full rank, the above equation for \(\hat{\beta}\) has a unique solution, and hence \(\hat{\beta} = \hat{\beta}^{\ast}\).
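The last step can be made explicit under one assumption that the argument suggests but that is not displayed here: the sample moments built from \(\mathbf{f}\) factor through \(Q\), that is, \(\widehat{G} = Q\widehat{G^{\ast}}\) and \(\widehat{H} = Q\widehat{H^{\ast}}\). A sketch of the algebra under that assumption:

$$\displaystyle{\widehat{G}\beta - \widehat{H} = Q\left (\widehat{G^{\ast}}\beta - \widehat{H^{\ast}}\right ).}$$

If \(Q\) has full column rank (rank 3), then \(Qv = 0\) only for \(v = 0\), so every solution of \(\widehat{G}\beta = \widehat{H}\) also solves \(\widehat{G^{\ast}}\beta = \widehat{H^{\ast}}\). If in addition \(\widehat{G^{\ast}}\) is invertible, that solution is unique and equals \(\hat{\beta}^{\ast}\).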
Appendix 4: Proof That the Matrix \(G^{\mathrm{eff}}\) in Sect. 13.4.2 Equals \(E[\mathbf{f}^{\mathrm{eff}}(X)\mathbf{f}^{\mathrm{eff}}(X)']\) and Has Full Rank When the Non-linearity Condition in Theorem 1 Holds
(i) We show that \(G^{\mathrm{eff}} = E[\mathbf{f}^{\mathrm{eff}}(X)\mathbf{f}^{\mathrm{eff}}(X)']\). It is obvious that
$$\displaystyle\begin{array}{rcl} E[\mathbf{f}^{\mathrm{eff}}(X)\mathbf{f}^{\mathrm{eff}}(X)']& =& E\left \{\left [\begin{array}{ccc} 1 & E(M\vert X) & X \\ E(M\vert X)& E(M\vert X)^{2} & E(M\vert X)X \\ X &XE(M\vert X)& X^{2}\end{array} \right ]\right \} \\ & =& \left [\begin{array}{ccc} 1 & E(M) & E(X) \\ E(M)&E[E(M\vert X)^{2}]&E(XM) \\ E(X) & E(XM) & E(X^{2})\end{array} \right ] = G^{\mathrm{eff}}.\end{array}$$
(ii) We prove that \(G^{\mathrm{eff}}\) has full rank when the non-linearity condition in Theorem 1 holds. To do so, we only need to show that \(\mathrm{det}(G^{\mathrm{eff}}) \neq 0\) when \(|\rho(X, E(M \mid X))| < 1\). We have
$$\displaystyle\begin{array}{rcl} \mathrm{det}(G^{\mathrm{eff}})& =& \left \vert \begin{array}{ccc} 1 & E(M) & E(X) \\ E(M)&E[E(M\vert X)^{2}]&E(XM) \\ E(X) & E(XM) & E(X^{2})\end{array} \right \vert \\ & =& \left \vert \begin{array}{ccc} 1& E(M) & E(X) \\ 0&E[E(M\vert X)^{2}] - [E(M)]^{2} & E(XM) - E(X)E(M) \\ 0& E(XM) - E(X)E(M) & E(X^{2}) - [E(X)]^{2}\end{array} \right \vert.\end{array}$$
Since
$$\displaystyle{\mathrm{var}[E(M\vert X)] = E[E(M\vert X)^{2}] - [E(M)]^{2},\quad \mathrm{var}(X) = E(X^{2}) - [E(X)]^{2}}$$
and
$$\displaystyle{\mathrm{cov}(X,E(M\vert X)) = E[XE(M\vert X)]-E(X)E[E(M\vert X)] = E(XM)-E(X)E(M),}$$
we have
$$\displaystyle\begin{array}{rcl} \mathrm{det}(G^{\mathrm{eff}})& =& \left \vert \begin{array}{ccc} 1& E(M) & E(X) \\ 0& \mathrm{var}[E(M\vert X)] &\mathrm{cov}(X,E(M\vert X)) \\ 0&\mathrm{cov}(X,E(M\vert X))& \mathrm{var}(X)\end{array} \right \vert \\ & =& \mathrm{var}[E(M\vert X)]\,\mathrm{var}(X)\left (1 - [\rho (X,E(M\vert X))]^{2}\right ) > 0,\end{array}$$
since \(|\rho(X, E(M \mid X))| < 1\).
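The determinant identity can be checked numerically. The following is a minimal sketch under stated assumptions: \(X\) is taken to be discrete and uniform on three levels, \(E(M \mid X)\) is a made-up function of \(X\), and the helper names `det3`, `g_eff`, and `var_cov_form` are hypothetical. The product \(\mathrm{var}[E(M \mid X)]\,\mathrm{var}(X)(1 - \rho^{2})\) is computed in the equivalent form \(\mathrm{var}[E(M \mid X)]\,\mathrm{var}(X) - \mathrm{cov}(X, E(M \mid X))^{2}\) to avoid square roots.

```python
from fractions import Fraction as F

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def g_eff(probs, mean_m):
    """G_eff = E[f f'] with f(X) = (1, E(M | X), X)' for a discrete X."""
    def expect(fn):
        return sum(p * fn(x) for x, p in probs.items())
    e_m = expect(lambda x: mean_m[x])
    e_x = expect(lambda x: x)
    e_mm = expect(lambda x: mean_m[x] ** 2)
    e_xm = expect(lambda x: x * mean_m[x])
    e_xx = expect(lambda x: x * x)
    return [[F(1), e_m, e_x], [e_m, e_mm, e_xm], [e_x, e_xm, e_xx]]

def var_cov_form(probs, mean_m):
    """var[E(M|X)] * var(X) - cov(X, E(M|X))^2, equal to var * var * (1 - rho^2)."""
    g = g_eff(probs, mean_m)
    var_m = g[1][1] - g[0][1] ** 2
    var_x = g[2][2] - g[0][2] ** 2
    cov = g[1][2] - g[0][1] * g[0][2]
    return var_m * var_x - cov ** 2

# Illustrative distribution (assumption): X uniform on {0, 1, 2}.
probs = {F(0): F(1, 3), F(1): F(1, 3), F(2): F(1, 3)}
nonlinear_m = {x: x * x for x in probs}      # |rho| < 1
linear_m = {x: 2 * x + 1 for x in probs}     # |rho| = 1

det_nonlinear = det3(g_eff(probs, nonlinear_m))
det_linear = det3(g_eff(probs, linear_m))
print(det_nonlinear, det_linear)
```

With a nonlinear \(E(M \mid X)\) the determinant is strictly positive and matches the variance-covariance form, while a linear \(E(M \mid X)\) gives a zero determinant, in line with part (ii).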
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
He, P., Wu, Z., Zhang, X.D., Geng, Z. (2016). Identification of Causal Mediation Models with an Unobserved Pre-treatment Confounder. In: He, H., Wu, P., Chen, DG. (eds) Statistical Causal Inferences and Their Applications in Public Health Research. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-41259-7_13
DOI: https://doi.org/10.1007/978-3-319-41259-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41257-3
Online ISBN: 978-3-319-41259-7
eBook Packages: Mathematics and Statistics (R0)