This post builds on a previous post on Testing Indirect Effects/Mediation in R [.html].

What is mediation?

There are many ways to define mediation and mediators. Here’s one way: Mediation is the process by which one variable transmits its effect to another through one or more mediating variables. For example, as room temperature increases, people get thirstier, and then they drink more water. In this case, thirst transmits the effect of room temperature on water drinking.

What is an indirect effect?

The indirect effect quantifies a mediation effect, if such an effect exists. Referring to the thirst example above, in statistical terms, the indirect effect quantifies the extent to which room temperature is associated with water drinking indirectly through thirstiness. If you’re familiar with interpreting regression coefficients and the idea of controlling for other variables, then you might find it intuitive to think of the indirect effect as the decrease in the relationship between room temperature and water drinking after you’ve partialed out the association between room temperature and thirstiness. In other words, how much does the coefficient for room temperature decrease when you control for thirstiness?
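
As a rough illustration of that idea, the sketch below simulates data for the thirst example and computes the indirect effect two ways: as the product of the two paths (room temperature to thirst, thirst to water drinking), and as the drop in the room temperature coefficient once thirst is controlled. The sample size, coefficients, and noise are made-up assumptions, not values from MacKinnon's tables. With ordinary least squares fit to the same cases, the two quantities coincide.

```r
# Simulated data; every number here is an illustrative assumption
set.seed(1)
n <- 200
room_temp <- rnorm(n, mean = 70, sd = 2)
thirst <- 0.4 * room_temp + rnorm(n)
consume <- 0.5 * thirst + 0.1 * room_temp + rnorm(n)

# a path: room temperature -> thirst
a <- coef(lm(thirst ~ room_temp))[["room_temp"]]

# b path (thirst -> consume) and direct effect c', both from the same model
fit_y <- lm(consume ~ room_temp + thirst)
b <- coef(fit_y)[["thirst"]]
c_prime <- coef(fit_y)[["room_temp"]]

# total effect c: room temperature -> consume, ignoring thirst
c_total <- coef(lm(consume ~ room_temp))[["room_temp"]]

# Indirect effect as a product of paths
indirect <- a * b

# The same quantity as a drop in the room temperature coefficient
print(c(product = indirect, difference = c_total - c_prime))
```

This only gives a point estimate; judging whether the indirect effect differs from zero needs an inferential step such as bootstrapping.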

What is moderation?

Moderation refers to how some variable modifies the direction or the strength of the association between two variables. In other words, a moderator variable qualifies the relation between two variables. A moderator is not a part of some proposed causal process; instead, it interacts with the relation between two variables in such a way that their relation is stronger, weaker, or opposite in direction, depending on values of the moderator. For example, as room temperature increases, people may report feeling thirstier. But that may depend on how physically fit people are. Maybe physically fit people don’t report feeling thirsty as room temperature increases, or maybe physically fit people, compared to less physically fit people, have a higher room temperature threshold at which they start feeling thirstier. In this example, the product of one predictor variable and the moderator (their interaction) quantifies the moderator’s effect. Statistically, the product term accounts for variability in thirst or water drinking independently of either predictor variable by itself.
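
To make the product term concrete, here is a minimal sketch with simulated data in which physical fitness weakens the relation between room temperature and thirst. The group sizes and slopes are assumptions chosen for illustration. The interaction coefficient is the difference between the two groups' slopes, and each group's slope can be recovered from the fitted coefficients.

```r
# Simulated data; every number here is an illustrative assumption
set.seed(2)
n_per_group <- 100
phys_fit <- rep(c(-0.5, 0.5), each = n_per_group)  # Unfit = -0.5, Fit = 0.5
room_temp_c <- rnorm(2 * n_per_group, sd = 2)       # roughly centered at 0

# Assumed true slopes: 0.6 for the unfit group, 0.2 for the fit group
thirst <- (0.4 - 0.4 * phys_fit) * room_temp_c + rnorm(2 * n_per_group)

# room_temp_c * phys_fit expands to both main effects plus their product
mod <- lm(thirst ~ room_temp_c * phys_fit)
b <- coef(mod)

# Simple slopes: the room temperature slope at each value of the moderator
slope_unfit <- b[["room_temp_c"]] + b[["room_temp_c:phys_fit"]] * -0.5
slope_fit   <- b[["room_temp_c"]] + b[["room_temp_c:phys_fit"]] *  0.5

print(c(unfit = slope_unfit, fit = slope_fit,
        interaction = b[["room_temp_c:phys_fit"]]))
```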

What is a conditional indirect effect (i.e., moderated mediation)?

The conditional indirect effect combines moderation and mediation. Think back to the idea behind a simple indirect effect: It quantifies the extent to which two variables are related through a third variable, the mediator. Conceptually, the conditional indirect effect quantifies the indirect effect at different values of a moderator. In this sense, an indirect effect may be stronger, weaker, or opposite in sign, depending on values of a moderator. Importantly, a moderator may qualify any relation that’s a part of some proposed mediation model. For example, physical fitness might qualify the association between room temperature and thirstiness, between thirstiness and water drinking, or both.
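
Here is a minimal sketch of one such case, assuming physical fitness qualifies only the association between room temperature and thirstiness. The data are simulated and all coefficients are illustrative assumptions; the names a1, a3, and b1 are labels chosen for this example. The conditional indirect effect is the room temperature slope at a given fitness value multiplied by the thirst coefficient.

```r
# Simulated data; every number here is an illustrative assumption
set.seed(3)
n_per_group <- 150
phys_fit <- rep(c(-0.5, 0.5), each = n_per_group)  # Unfit = -0.5, Fit = 0.5
room_temp_c <- rnorm(2 * n_per_group, sd = 2)
thirst <- (0.4 - 0.4 * phys_fit) * room_temp_c + rnorm(2 * n_per_group)
consume <- 0.5 * thirst + 0.1 * room_temp_c + rnorm(2 * n_per_group)

# Mediator model: fitness moderates the room temperature -> thirst path
m_model <- lm(thirst ~ room_temp_c * phys_fit)
# Outcome model: thirst -> consume, controlling for room temperature
y_model <- lm(consume ~ room_temp_c + thirst)

a1 <- coef(m_model)[["room_temp_c"]]
a3 <- coef(m_model)[["room_temp_c:phys_fit"]]
b1 <- coef(y_model)[["thirst"]]

# Indirect effect of room temperature at a given moderator value w
conditional_indirect <- function(w) (a1 + a3 * w) * b1

indirect_unfit <- conditional_indirect(-0.5)
indirect_fit <- conditional_indirect(0.5)
print(c(unfit = indirect_unfit, fit = indirect_fit))
```

These are point estimates only; comparing them formally calls for an inferential method such as bootstrapped confidence intervals.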

What is the Index of Moderated Mediation?

Just as the product (interaction) term in a linear regression analysis quantifies how the relation between a predictor and an outcome changes with a moderator, the index of moderated mediation quantifies how the indirect effect changes with a moderator. An index of moderated mediation that is significantly different from zero implies that the conditional indirect effects at any two different levels of the moderator differ from each other: one is smaller, larger, or opposite in sign relative to the other.
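
For the simplest case, where a moderator W qualifies only the path from the predictor X to the mediator M, the index can be written out explicitly. The symbols below are labels assumed for illustration, not notation taken from this post.

```latex
\begin{align*}
M &= i_M + a_1 X + a_2 W + a_3 XW + e_M \\
Y &= i_Y + c' X + b M + e_Y \\
\omega(W) &= (a_1 + a_3 W)\, b = a_1 b + a_3 b\, W
\end{align*}
```

Under these assumptions the conditional indirect effect \omega(W) is a straight line in W, and the index of moderated mediation is its slope, a_3 b. The conditional indirect effects at two moderator values w_1 and w_2 differ by a_3 b (w_1 - w_2), which is nonzero whenever the index is nonzero and the two values differ.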

Model and Conceptual Assumptions

  • Correct functional form. Your model variables share linear relationships and don’t interact with each other.
  • No omitted influences. This one is hard: Your model accounts for all relevant influences on the variables included. All models are wrong, but how wrong is yours?
  • Accurate measurement. Your measurements are valid and reliable. Note that unreliable measures can’t be valid, and reliable measures don’t necessarily measure just one construct or even your construct.
  • Well-behaved residuals. Residuals (i.e., prediction errors) aren’t correlated with predictor variables or each other, and residuals have constant variance across values of your predictor variables. Also, residual error terms aren’t correlated across regression equations. This could happen if, for example, some omitted variable causes both thirst and water drinking.




I combined the data from Table 3.1 in MacKinnon (2008, p. 56) [.csv] with those from Table 10.1 in MacKinnon (2008, p. 291) [.csv].

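library(tidyverse) # provides %>%, read_csv(), mutate(), bind_rows(), write_csv()
# Supply the path or URL of each CSV file between the quotes
thirst_norm <- "" %>% read_csv()
thirst_fit <- "" %>% read_csv()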

Code new IDs for fit data

thirst_fit <- thirst_fit %>% mutate(id = 51:100)

Add column in both datasets that identifies fitness group

Unfit = -0.5 and Fit = 0.5

thirst_norm <- thirst_norm %>% mutate(phys_fit = -0.5)
thirst_fit <- thirst_fit %>% mutate(phys_fit = 0.5)
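
The choice of -0.5 and 0.5 rather than 0 and 1 changes what the room temperature coefficient means once an interaction is in the model. The sketch below uses simulated data (all values are illustrative assumptions, and fit_dummy is a hypothetical column that is not part of this post's dataset) to compare the two codings. The interaction coefficient is the same either way; with 0/1 coding the room temperature coefficient is the slope for the unfit group, whereas with -0.5/0.5 coding it is the slope midway between the two groups.

```r
# Simulated data; every number here is an illustrative assumption
set.seed(4)
n_per_group <- 100
fit_dummy <- rep(c(0, 1), each = n_per_group)  # hypothetical dummy code: Unfit = 0, Fit = 1
phys_fit <- fit_dummy - 0.5                    # same groups coded -0.5 / 0.5
room_temp_c <- rnorm(2 * n_per_group, sd = 2)
thirst <- (0.6 - 0.4 * fit_dummy) * room_temp_c + rnorm(2 * n_per_group)

b_dummy <- coef(lm(thirst ~ room_temp_c * fit_dummy))
b_half <- coef(lm(thirst ~ room_temp_c * phys_fit))

# The interaction is unaffected by the coding choice
interaction_dummy <- b_dummy[["room_temp_c:fit_dummy"]]
interaction_half <- b_half[["room_temp_c:phys_fit"]]

# The room temperature coefficient is not
slope_dummy <- b_dummy[["room_temp_c"]]  # slope for the unfit group (fit_dummy = 0)
slope_half <- b_half[["room_temp_c"]]    # slope halfway between the two groups

print(rbind(dummy = c(slope = slope_dummy, interaction = interaction_dummy),
            half = c(slope = slope_half, interaction = interaction_half)))
```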

Bind unfit and fit data by rows

Imagine stacking these datasets on top of each other

thirst_data <- bind_rows(thirst_norm, thirst_fit)

Mean-center predictors

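That is, mean-center room temperature and thirst, but not the outcome, consume. The fitness code is left as is; with equal-sized groups, -0.5 and 0.5 already average to zero.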

thirst_data <- thirst_data %>% mutate(room_temp_c = room_temp - mean(room_temp),
                                      thirst_c = thirst - mean(thirst))

Compute interaction terms

thirst_data <- thirst_data %>% mutate(tmp_fit = room_temp_c * phys_fit,
                                      thrst_fit = thirst_c * phys_fit)
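
As a quick sanity check on what a hand-built product column does, the sketch below uses a small simulated data frame (toy is a hypothetical stand-in, not the thirst data, and its values are assumptions) to show that entering the product as its own column gives the same coefficients as letting the formula interface build the interaction. An explicit column can be handy when a modeling tool expects the product term as a variable in the data.

```r
# Toy data; every number here is an illustrative assumption
set.seed(5)
toy <- data.frame(
  room_temp = rnorm(100, mean = 70, sd = 2),
  phys_fit = rep(c(-0.5, 0.5), each = 50)
)
toy$thirst <- 0.3 * toy$room_temp + rnorm(100)

# Mean-center the predictor, then build the product term by hand
toy$room_temp_c <- toy$room_temp - mean(toy$room_temp)
toy$tmp_fit <- toy$room_temp_c * toy$phys_fit

# Product entered as its own column versus built by the formula
by_hand <- coef(lm(thirst ~ room_temp_c + phys_fit + tmp_fit, data = toy))
by_formula <- coef(lm(thirst ~ room_temp_c * phys_fit, data = toy))

print(cbind(by_hand = unname(by_hand), by_formula = unname(by_formula)))
```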

Save to data folder

thirst_data %>% write_csv("data/thirst_data.csv")

Visualize relationships

It’s always a good idea to look at your data. Check some assumptions.

thirst_data %>% 
  select(room_temp, room_temp_c, thirst, thirst_c, consume, phys_fit, tmp_fit, thrst_fit) %>%