Imagine we are interested in the causal effect of x on y, and we estimate it using a simple linear regression where y is the outcome (i.e. response) variable and x is the predictor variable. There is another variable, Z, that we have collected data on. We need to decide whether Z should be added to the model, so that we estimate the effect of x on y when the value of Z is held constant. Below are five descriptions of what kind of variable Z might be. For each one, decide whether or not you should add Z to the model.
1. You should add Z to the model.
2. You should not add Z to the model.
Possible answers
A. Z is a confounding variable; it has a causal influence on both x and y.
B. Z is a post-treatment variable; x has a causal effect on both Z and y, but Z has no causal effect on y.
C. Z is a mediator variable, such that x has a causal influence on both Z and y, and Z has an additional causal influence on y. We are interested in the total effect of x on y, which includes the direct effect and the indirect effect that passes through Z.
D. Z is a mediator variable, such that x has a causal influence on both Z and y, and Z has an additional causal influence on y. We are interested in the direct effect of x on y, independent of Z.
E. Z is a collider variable; x and y both have a causal influence on Z.
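
One way to check your reasoning is by simulation. The following is a minimal sketch, not part of the original exercise: it assumes NumPy, linear relationships with Gaussian noise, and arbitrary illustrative effect sizes (a true effect of x on y of 1.0). The helper `slope_on_x` and all variable names are hypothetical. It simulates the confounder case (A) and the collider case (E) and fits the regression with and without the third variable.

```python
import numpy as np

def slope_on_x(y, x, z=None):
    """Return the OLS coefficient on x, optionally adjusting for z."""
    cols = [np.ones_like(x), x] if z is None else [np.ones_like(x), x, z]
    design = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]  # column 0 is the intercept, column 1 is x

rng = np.random.default_rng(0)
n = 100_000
true_effect = 1.0  # assumed effect of x on y, chosen for illustration

# Case A: the third variable causes both x and y (confounder).
z_conf = rng.normal(size=n)
x_a = z_conf + rng.normal(size=n)
y_a = true_effect * x_a + 2.0 * z_conf + rng.normal(size=n)

# Case E: x and y both cause the third variable (collider).
x_e = rng.normal(size=n)
y_e = true_effect * x_e + rng.normal(size=n)
z_coll = x_e + y_e + rng.normal(size=n)

results = {
    "confounder, unadjusted": slope_on_x(y_a, x_a),
    "confounder, adjusted": slope_on_x(y_a, x_a, z_conf),
    "collider, unadjusted": slope_on_x(y_e, x_e),
    "collider, adjusted": slope_on_x(y_e, x_e, z_coll),
}

for name, estimate in results.items():
    print(f"{name}: {estimate:.3f} (true effect {true_effect})")
```

Comparing each printed estimate with the true effect shows, for these two cases, whether adjusting for the third variable helps or harms the estimate. The same pattern can be extended to the post-treatment and mediator cases by changing how the third variable is generated.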