Identification of Causal Mediation Models with an Unobserved Pre-treatment Confounder

Statistical Causal Inferences and Their Applications in Public Health Research

Part of the book series: ICSA Book Series in Statistics (ICSABSS)

Abstract

In this paper, we discuss the identifiability of the mediation, direct, and indirect effects of treatment on outcome. The mediation effects are represented by a causal mediation model that includes an unobserved confounder (i.e., a common cause of the mediator and the outcome variable), and the direct and indirect effects are represented in terms of the mediation effects. We require neither the sequential ignorability assumption nor the exclusion restriction assumption (i.e., the absence of a direct effect of treatment on outcome); we require only that treatment is randomized and that the degree of equation nonlinearity for the treatment effect on the mediator is higher than that for the outcome. If the requirement on the degree of nonlinearity is not satisfied, a covariate may be used as an instrumental variable to improve identifiability. We focus on the identifiability of parameters, although we also describe estimation approaches to illustrate our identifiability results. Simulations show good estimation performance of our approach compared with the standard mediation approach.

Acknowledgements

This research was supported by NSFC (11171365, 11021463, 10931002), the 863 Program of China (2015AA020507), and a project funded by Merck (China).

Author information

Corresponding author

Correspondence to Xiaohua Douglas Zhang.

Appendices

Appendix 1: Proof of Theorem 1

We separately show the necessity and sufficiency for the identifiability of parameters in model (13.7). For necessity, suppose that the non-linearity condition does not hold, that is, \(\vert \rho (E(M\vert X),X)\vert = 1\). This implies that there exist some \(a_{0}\) and \(a_{1}\) satisfying \(E(M\vert X) = a_{1}X + a_{0}\) almost everywhere. Then from model (13.7) we have

$$\displaystyle\begin{array}{rcl}E(Y \vert X)& =& b_{0} + b_{1}E(M\vert X) + b_{2}X{}\\ & =& (b_{0} + a_{0}b_{1}) + (a_{1}b_{1} + b_{2})X. {}\\\end{array}$$

The above equation implies that Y is marginally linear with respect to X. For this linear model, only the intercept \((b_{0} + a_{0}b_{1})\) and the slope \((a_{1}b_{1} + b_{2})\) are identifiable as a whole, while the parameters \(b_{0}\), \(b_{1}\), and \(b_{2}\) cannot be distinguished from each other.

For sufficiency, if M is not marginally linearly related to X, then we can find three levels \(x_{1}\), \(x_{2}\), and \(x_{3}\) that satisfy \([E(M\vert x_{1}) - E(M\vert x_{2})]/(x_{1} - x_{2})\neq [E(M\vert x_{2}) - E(M\vert x_{3})]/(x_{2} - x_{3})\). Hence the matrix in (13.10) has full rank. Thus the parameters \(b_{1}\) and \(b_{2}\) can be identified, and then the parameter \(b_{0}\) can be identified from \(b_{0} = E(Y \vert x_{i}) - b_{1}E(M\vert x_{i}) - b_{2}x_{i}\).
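
As a small numerical illustration of both directions of this argument, the following Python sketch uses made-up values. It assumes a 2 × 2 system built from differences across three treatment levels, mirroring the form of the system displayed in Appendix 2; the function names `solve_b` and `mean_y`, the levels 1, 2, 3, and all parameter values are hypothetical and do not come from the chapter.

```python
from fractions import Fraction as F

def solve_b(x, em, ey):
    """Solve the 2x2 difference system for (b1, b2), then recover b0.

    x, em, ey hold three treatment levels and the matching values of
    E(M | x) and E(Y | x). Returns None if the coefficient matrix is singular.
    """
    a11, a12 = em[0] - em[1], x[0] - x[1]
    a21, a22 = em[1] - em[2], x[1] - x[2]
    r1, r2 = ey[0] - ey[1], ey[1] - ey[2]
    det = a11 * a22 - a12 * a21
    if det == 0:
        return None  # E(M | X) is linear in X on these three levels
    b1 = (r1 * a22 - a12 * r2) / det
    b2 = (a11 * r2 - a21 * r1) / det
    b0 = ey[0] - b1 * em[0] - b2 * x[0]
    return b0, b1, b2

def mean_y(b, x, em):
    """E(Y | x) = b0 + b1 E(M | x) + b2 x at each level."""
    b0, b1, b2 = b
    return [b0 + b1 * m + b2 * xi for xi, m in zip(x, em)]

x = [F(1), F(2), F(3)]
true_b = (F(1), F(2), F(-1, 2))      # hypothetical (b0, b1, b2)

# Sufficiency: E(M | x) = x^2 is nonlinear, so the system has a unique solution.
em_nonlinear = [F(1), F(4), F(9)]
recovered = solve_b(x, em_nonlinear, mean_y(true_b, x, em_nonlinear))

# Necessity: E(M | x) = 2x + 1 is linear, so the matrix is singular ...
em_linear = [F(3), F(5), F(7)]
singular = solve_b(x, em_linear, mean_y(true_b, x, em_linear))

# ... and a different parameter vector reproduces exactly the same E(Y | x).
alt_b = (F(0), F(3), F(-5, 2))
same_means = mean_y(alt_b, x, em_linear) == mean_y(true_b, x, em_linear)
```

Exact rational arithmetic is used so that the singular case is detected by an exact zero determinant rather than by a floating-point tolerance.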

Appendix 2: Proof for Theorem 2

For sufficiency, when \(E(M\vert X,Z)\neq cX +\psi (Z)\), there are two situations: (i) \(E(M\vert X,Z) =\Psi (X) +\psi (Z)\), where \(\Psi (\cdot )\) is a nonlinear function of X; (ii) \(E(M\vert X,Z)\) is not additive with respect to X and Z.

For situation (i), since \(\Psi (\cdot )\) is not a linear function, we can choose three levels of X (say \(x_{1}\), \(x_{2}\), \(x_{3}\)) and some z satisfying \([E(M\vert x_{1},z) - E(M\vert x_{2},z)]/(x_{1} - x_{2})\neq [E(M\vert x_{2},z) - E(M\vert x_{3},z)]/(x_{2} - x_{3})\). Then the following equation from model (13.12) has a unique solution because the coefficient matrix has full rank:

$$\displaystyle\begin{array}{rcl} \left [\begin{array}{cc} E(M\vert x_{1},z) - E(M\vert x_{2},z)&x_{1} - x_{2} \\ E(M\vert x_{2},z) - E(M\vert x_{3},z)&x_{2} - x_{3}\\ \end{array} \right ]\left [\begin{array}{c} b_{1} \\ b_{2} \end{array} \right ]& =& \left [\begin{array}{c} E(Y \vert x_{1},z) - E(Y \vert x_{2},z) \\ E(Y \vert x_{2},z) - E(Y \vert x_{3},z) \end{array} \right ].{}\\ \end{array}$$

Thus the parameters can be identified.

For situation (ii), since \(E(M\vert X,Z)\) is not additive with respect to X and Z, we can find two levels of X (say \(x_{1}\), \(x_{2}\)) and two levels of Z (say \(z_{1}\), \(z_{2}\)) satisfying \(E(M\vert x_{1},z_{1}) - E(M\vert x_{2},z_{1})\neq E(M\vert x_{1},z_{2}) - E(M\vert x_{2},z_{2})\). The following equation derived from model (13.12) has a unique solution because the coefficient matrix has full rank:

$$\displaystyle\begin{array}{rcl} \left [\begin{array}{cc} E(M\vert x_{1},z_{1}) - E(M\vert x_{2},z_{1})&x_{1} - x_{2} \\ E(M\vert x_{1},z_{2}) - E(M\vert x_{2},z_{2}) &x_{1} - x_{2}\\ \end{array} \right ]\left [\begin{array}{c} b_{1} \\ b_{2} \end{array} \right ]& =& \left [\begin{array}{c} E(Y \vert x_{1},z_{1}) - E(Y \vert x_{2},z_{1}) \\ E(Y \vert x_{1},z_{2}) - E(Y \vert x_{2},z_{2}) \end{array} \right ].{}\\ \end{array}$$

Thus the parameters can be identified.
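
To make situation (ii) concrete, the sketch below evaluates the 2 × 2 system at two levels of X and two levels of Z for made-up conditional means. Everything named here is an assumption for illustration: the function `nuisance` stands in for the unidentified term E[ϕ(U, Z, ɛ Y ) | Z], and `em_interact`, `em_additive`, and the parameter values are hypothetical. Differencing over X at a fixed z removes both the intercept and the Z-only term, which is why neither needs to be known.

```python
from fractions import Fraction as F

B0, B1, B2 = F(1), F(3, 2), F(-2)    # hypothetical true parameters

def nuisance(z):
    """Stand-in for E[phi(U, Z, eps_Y) | Z]; any function of z works."""
    return 5 * z * z - z

def em_interact(x, z):
    """E(M | x, z) with an X-by-Z interaction (not additive)."""
    return x * z

def em_additive(x, z):
    """E(M | x, z) = cX + psi(Z) with c = 2, the non-identifiable case."""
    return 2 * x + z * z

def make_ey(em):
    """Build E(Y | x, z) = b0 + b1 E(M | x, z) + b2 x + nuisance(z)."""
    return lambda x, z: B0 + B1 * em(x, z) + B2 * x + nuisance(z)

def identify_b1_b2(em, ey, xs, zs):
    """Solve the two-level difference system; None if it is singular."""
    (x1, x2), (z1, z2) = xs, zs
    a11, a12 = em(x1, z1) - em(x2, z1), x1 - x2
    a21, a22 = em(x1, z2) - em(x2, z2), x1 - x2
    r1 = ey(x1, z1) - ey(x2, z1)
    r2 = ey(x1, z2) - ey(x2, z2)
    det = a11 * a22 - a12 * a21
    if det == 0:
        return None
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - a21 * r1) / det

xs, zs = (F(0), F(1)), (F(1), F(2))
identified = identify_b1_b2(em_interact, make_ey(em_interact), xs, zs)
not_identified = identify_b1_b2(em_additive, make_ey(em_additive), xs, zs)
```

With the interaction, the two rows of the coefficient matrix differ and (b 1, b 2) is recovered exactly; with the additive mean, both rows coincide and the solver reports a singular system.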

For necessity, if \(E(M\vert X,Z) = cX +\psi (Z)\) for some constant c and some function \(\psi (\cdot )\), then from model (13.12), we have

$$\displaystyle\begin{array}{rcl} E(Y \vert X,Z)& =& b_{0} + b_{1}E(M\vert X,Z) + b_{2}X + E[\phi (U,Z,\varepsilon _{Y })\vert Z] {}\\ & =& b_{0} + (b_{1}c + b_{2})X +\varPhi (Z), {}\\ \end{array}$$

where \(\varPhi (Z) = b_{1}\psi (Z) + E[\phi (U,Z,\varepsilon _{Y })\vert Z]\). We can easily see that only c, \(b_{1}c + b_{2}\), and \(\varPhi (Z)\) can be identified given observed data on (Z, X, M, Y). The parameters \(b_{1}\) and \(b_{2}\) cannot be identified because (1) \(E(M\vert X,Z)\) is linear with respect to X, and (2) \(E[\phi (U,Z,\varepsilon _{Y })\vert Z]\) cannot be identified since U and \(\varepsilon _{Y }\) are never observed. Thus the parameters in model (13.12) are identifiable only if \(E(M\vert X,Z)\neq cX +\psi (Z)\).

Appendix 3: Proof for the Equivalence of Different Choices of f(⋅ ) in Eq. (13.15) for the Estimation When the Identifiability Condition in Theorem 1 Holds

We want to show that an arbitrary vector function \(\mathbf{f}(\cdot )\) that identifies β via Eq. (13.15) leads to the same estimator as that based on the function \(\mathbf{f}^{{\ast}}(\cdot )\). An arbitrary vector function \(\mathbf{f}(\cdot ) = (f_{1}(\cdot ),f_{2}(\cdot ),\cdots \,,f_{K}(\cdot ))'\) (\(K > 2\)) can be written as

$$\displaystyle\begin{array}{rcl} \mathbf{f}(X)& =& \left [\begin{array}{ccc} f_{1}(1) & f_{1}(2) & f_{1}(3)\\ \vdots & \vdots & \vdots \\ f_{K}(1)&f_{K}(2)&f_{K}(3) \end{array} \right ]\left [\begin{array}{c} \delta (X = 1) \\ \delta (X = 2) \\ \delta (X = 3) \end{array} \right ].{}\end{array}$$
(13.21)

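Let Q denote the K × 3 matrix on the right-hand side of (13.21), so that \(\mathbf{f}(X) = Q\mathbf{f}^{{\ast}}(X)\) with \(\mathbf{f}^{{\ast}}(X) = (\delta (X = 1),\delta (X = 2),\delta (X = 3))'\). Equation (13.15) can be rewritten as \(G\beta = H\), where \(G = E[\mathbf{f}(X),M\mathbf{f}(X),X\mathbf{f}(X)]\) and \(H = E[Y \mathbf{f}(X)]\). Then the estimation equation for β is \(\widehat{G}\hat{\beta } =\widehat{ H}\). From (13.21), we have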

$$\displaystyle\begin{array}{rcl} \widehat{G}& =& \widehat{E}[\mathbf{f}(X),M\mathbf{f}(X),X\mathbf{f}(X)] {}\\ & =& \widehat{E}[Q\mathbf{f}^{{\ast}}(X),MQ\mathbf{f}^{{\ast}}(X),XQ\mathbf{f}^{{\ast}}(X)] {}\\ & =& Q\widehat{E}[\mathbf{f}^{{\ast}}(X),M\mathbf{f}^{{\ast}}(X),X\mathbf{f}^{{\ast}}(X)] {}\\ & =& Q\widehat{G^{{\ast}}}, {}\\ \end{array}$$

where \(\widehat{E}(\cdot )\) denotes the sample mean of the corresponding variable. Similarly, we have

$$\displaystyle{\widehat{H} = Q\widehat{H^{{\ast}}}.}$$

Then by the function f(⋅ ), the estimation equation for β is equivalent to

$$\displaystyle\begin{array}{rcl} \widehat{G}\hat{\beta } -\widehat{ H} = Q(\widehat{G^{{\ast}}}\hat{\beta }-\widehat{ H^{{\ast}}}) = 0.& & {}\\ \end{array}$$

Since \(\hat{\beta }^{{\ast}}\) satisfies the equation \(\widehat{G^{{\ast}}}\hat{\beta }^{{\ast}}-\widehat{ H^{{\ast}}} = 0\), it also satisfies \(\widehat{G}\hat{\beta }^{{\ast}}-\widehat{ H} = 0\). Thus \(\hat{\beta }=\hat{\beta } ^{{\ast}}\) whenever Q has full column rank, because the above equation for \(\hat{\beta }\) then has a unique solution.
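
The invariance argument can be checked on a toy sample. The sketch below is only an illustration under stated assumptions: the six (X, M, Y) triples, the 4 × 3 matrix `Q`, and the helper names are invented, the indicator vector is taken as the reference instruments, and the overdetermined system for the general instruments is solved through its normal equations, which is a convenience chosen here rather than a step taken from the chapter.

```python
from fractions import Fraction as F

def solve(a, b):
    """Solve the square system a @ beta = b by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def moments(data, f):
    """Sample versions of G = E[f, M f, X f] and H = E[Y f]."""
    w = F(1, len(data))
    k = len(f(data[0][0]))
    g = [[F(0)] * 3 for _ in range(k)]
    h = [F(0)] * k
    for x, m, y in data:
        fx = f(x)
        for i in range(k):
            g[i][0] += w * fx[i]
            g[i][1] += w * m * fx[i]
            g[i][2] += w * x * fx[i]
            h[i] += w * y * fx[i]
    return g, h

def f_star(x):
    """Indicator instruments (delta(X=1), delta(X=2), delta(X=3))."""
    return [int(x == level) for level in (1, 2, 3)]

Q = [[1, 0, 2], [0, 1, 1], [3, 1, 0], [1, 1, 1]]   # hypothetical, rank 3

def f_general(x):
    """Arbitrary instruments written as Q times the indicator vector."""
    return [sum(q * s for q, s in zip(row, f_star(x))) for row in Q]

# Toy sample of (X, M, Y); group means of M are 2, 6, 7 (not linear in X).
data = [(1, 1, 2), (1, 3, 3), (2, 5, 4), (2, 7, 9), (3, 6, 8), (3, 8, 7)]

g_star, h_star = moments(data, f_star)
beta_star = solve(g_star, h_star)

g, h = moments(data, f_general)
g_check = [[sum(Q[i][k] * g_star[k][j] for k in range(3)) for j in range(3)]
           for i in range(len(Q))]                  # Q times G-hat-star
residual = [sum(gij * bj for gij, bj in zip(row, beta_star)) - hi
            for row, hi in zip(g, h)]

# Normal equations G'G beta = G'H give the solution of the K x 3 system.
rows = range(len(g))
gtg = [[sum(g[r][i] * g[r][j] for r in rows) for j in range(3)] for i in range(3)]
gth = [sum(g[r][i] * h[r] for r in rows) for i in range(3)]
beta_general = solve(gtg, gth)
```

Because the system for the general instruments is consistent, its least-squares solution is its exact solution, so comparing `beta_general` with `beta_star` tests the claimed equality directly.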

Appendix 4: Proof That the Matrix \(G^{\mathrm{eff}}\) in Sect. 13.4.2 Equals \(E[\mathbf{f}^{\mathrm{eff}}(X)\mathbf{f}^{\mathrm{eff}}(X)']\) and Has Full Rank When the Non-linearity Condition in Theorem 1 Holds

  (i) We show that \(G^{\mathrm{eff}} = E[\mathbf{f}^{\mathrm{eff}}(X)\mathbf{f}^{\mathrm{eff}}(X)']\). It is obvious that

    $$\displaystyle\begin{array}{rcl} E[\mathbf{f}^{\mathrm{eff}}(X)\mathbf{f}^{\mathrm{eff}}(X)']& =& E\left \{\left [\begin{array}{ccc} 1 & E(M\vert X) & X \\ E(M\vert X)& E(M\vert X)^{2} & E(M\vert X)X \\ X &XE(M\vert X)& X^{2}\end{array} \right ]\right \} {}\\ & =& \left [\begin{array}{ccc} 1 & E(M) & E(X) \\ E(M)&E[E(M\vert X)^{2}]&E(XM) \\ E(X) & E(XM) & E(X^{2})\end{array} \right ] = G^{\mathrm{eff}}.{}\\ \end{array}$$
  (ii) We prove that \(G^{\mathrm{eff}}\) has full rank when the non-linearity condition in Theorem 1 holds. To do so, we only need to show that \(\mathrm{det}(G^{\mathrm{eff}})\neq 0\) when \(\vert \rho (X,E(M\vert X))\vert < 1\). We have

    $$\displaystyle\begin{array}{rcl} \mathrm{det}(G^{\mathrm{eff}})& =& \left \vert \begin{array}{ccc} 1 & E(M) & E(X) \\ E(M)&E[E(M\vert X)^{2}]&E(XM) \\ E(X) & E(XM) & E(X^{2})\end{array} \right \vert {}\\ & =& \left \vert \begin{array}{ccc} 1& E(M) & E(X) \\ 0&E[E(M\vert X)^{2}] - [E(M)]^{2} & E(XM) - E(X)E(M) \\ 0& E(XM) - E(X)E(M) & E(X^{2}) - [E(X)]^{2}\end{array} \right \vert.{}\\ \end{array}$$

    Since

    $$\displaystyle{\mathrm{var}[E(M\vert X)] = E[E(M\vert X)^{2}] - [E(M)]^{2},\quad \mathrm{var}(X) = E(X^{2}) - [E(X)]^{2}}$$

    and

    $$\displaystyle{\mathrm{cov}(X,E(M\vert X)) = E[XE(M\vert X)]-E(X)E[E(M\vert X)] = E(XM)-E(X)E(M),}$$

    we have

    $$\displaystyle\begin{array}{rcl} \mathrm{det}(G^{\mathrm{eff}})& =& \left \vert \begin{array}{ccc} 1& E(M) & E(X) \\ 0& \mathrm{var}[E(M\vert X)] &\mathrm{cov}(X,E(M\vert X)) \\ 0&\mathrm{cov}(X,E(M\vert X))& \mathrm{var}(X)\end{array} \right \vert {}\\ & =& \mathrm{var}[E(M\vert X)]\mathrm{var}(X)(1 - [\rho (X,E(M\vert X))]^{2}) > 0, {}\\ \end{array}$$

    since | ρ(X, E(M | X)) |  < 1.
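
The determinant identity can be verified exactly for a discrete treatment. In the sketch below, the three-point distribution of X, the values of E(M | X), and the helper names are all assumptions made for illustration. The product var[E(M | X)] var(X)(1 − ρ²) is computed in the algebraically equivalent form var[E(M | X)] var(X) − cov(X, E(M | X))², which avoids taking a square root.

```python
from fractions import Fraction as F

def expect(p, values):
    """Expectation of a function of X under the probabilities p."""
    return sum(pi * v for pi, v in zip(p, values))

def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def det_and_formula(p, x, em):
    """Return det(G_eff) and var[E(M|X)] var(X) - cov(X, E(M|X))^2."""
    e_x, e_m = expect(p, x), expect(p, em)
    e_xx = expect(p, [xi * xi for xi in x])
    e_mm = expect(p, [mi * mi for mi in em])          # E[E(M|X)^2]
    e_xm = expect(p, [xi * mi for xi, mi in zip(x, em)])
    g_eff = [[F(1), e_m, e_x],
             [e_m, e_mm, e_xm],
             [e_x, e_xm, e_xx]]
    var_x = e_xx - e_x ** 2
    var_m = e_mm - e_m ** 2
    cov = e_xm - e_x * e_m
    return det3(g_eff), var_m * var_x - cov ** 2

p = [F(1, 2), F(1, 4), F(1, 4)]      # hypothetical P(X = 1), P(X = 2), P(X = 3)
x = [F(1), F(2), F(3)]

# Nonlinear E(M | X): the determinant is strictly positive.
det_nonlinear, formula_nonlinear = det_and_formula(p, x, [F(2), F(6), F(7)])

# Linear E(M | X) = 2X + 1: |rho| = 1 and the determinant vanishes.
det_linear, formula_linear = det_and_formula(p, x, [F(3), F(5), F(7)])
```

The linear case shows the boundary of the result: when E(M | X) is an exact linear function of X, the correlation has absolute value one and the matrix loses rank.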

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

He, P., Wu, Z., Zhang, X.D., Geng, Z. (2016). Identification of Causal Mediation Models with an Unobserved Pre-treatment Confounder. In: He, H., Wu, P., Chen, DG. (eds) Statistical Causal Inferences and Their Applications in Public Health Research. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-41259-7_13
