This post builds on a previous post, Testing Indirect Effects/Mediation in R.

There are many ways to define mediation and mediators. Here's one way: Mediation is the process by which one variable transmits an effect onto another through one or more mediating variables. For example, as room temperature increases, people get thirstier, and then they drink more water. In this case, thirst transmits the effect of room temperature on water drinking.

The indirect effect quantifies a mediation effect, if such an effect exists. Referring to the thirst example above, in statistical terms, the indirect effect quantifies the extent to which room temperature is associated with water drinking indirectly through thirstiness. If you're familiar with interpreting regression coefficients and the idea of controlling for other variables, then you might find it intuitive to think of the indirect effect as the decrease in the relationship between room temperature and water drinking after you've partialed out the association between room temperature and thirstiness. In other words, how much does the coefficient for room temperature decrease when you control for thirstiness?
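As a rough illustration of the "decrease in the coefficient" idea, here is a minimal sketch using simulated data. The variable names mirror the thirst example, but the sample size and effect sizes are made up for illustration; these are not the data analyzed in this post. With ordinary least squares fit to the same complete cases, the drop in the room temperature coefficient after controlling for thirst equals the product of the two paths that run through thirst.

```r
# Simulated data (assumed values, for illustration only)
set.seed(1)
n <- 200
room_temp <- rnorm(n, mean = 70, sd = 1)
thirst <- 0.5 * room_temp + rnorm(n)
consume <- 0.4 * thirst + 0.1 * room_temp + rnorm(n)

# Total effect: room temperature predicting water drinking by itself
total_effect <- coef(lm(consume ~ room_temp))[["room_temp"]]

# Direct effect: room temperature predicting water drinking, controlling for thirst
fit_consume <- lm(consume ~ room_temp + thirst)
direct_effect <- coef(fit_consume)[["room_temp"]]

# a path (room temperature -> thirst) and b path (thirst -> water drinking)
a_path <- coef(lm(thirst ~ room_temp))[["room_temp"]]
b_path <- coef(fit_consume)[["thirst"]]

# Two ways to express the same indirect effect
indirect_difference <- total_effect - direct_effect
indirect_product <- a_path * b_path

print(c(difference = indirect_difference, product = indirect_product))
```

The two printed values agree up to floating point error, which is why the indirect effect is often reported as the product of the two path coefficients.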

Moderation refers to how some variable modifies the direction or the strength of the association between two variables. In other words, a moderator variable qualifies the relation between two variables. A moderator is not a part of some proposed causal process; instead, it interacts with the relation between two variables in such a way that their relation is stronger, weaker, or opposite in direction, depending on values of the moderator. For example, as room temperature increases, people may report feeling thirstier. But that may depend on how physically fit people are. Maybe physically fit people don't report feeling thirsty as room temperature increases, or maybe physically fit people, compared to less physically fit people, have a higher room temperature threshold at which they start feeling thirstier. In this example, the product of one predictor variable and the moderator (their interaction) quantifies the moderator's effect. Statistically, the product term accounts for variability in thirst or water drinking independently of either predictor variable by itself.
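To make the product term concrete, here is a small sketch with simulated data. The effect sizes are assumptions chosen for illustration, and the -0.5/0.5 fitness coding simply anticipates the coding used later in this post. The coefficient for the product term is the difference between the room temperature slopes of the two fitness groups.

```r
# Simulated data (assumed values, for illustration only)
set.seed(2)
n <- 200
phys_fit <- rep(c(-0.5, 0.5), each = n / 2)  # Unfit = -0.5, Fit = 0.5
room_temp_c <- rnorm(n)                      # stands in for centered room temperature
thirst <- 0.6 * room_temp_c - 0.4 * room_temp_c * phys_fit + rnorm(n)

# room_temp_c * phys_fit expands to both predictors plus their product term
fit_mod <- lm(thirst ~ room_temp_c * phys_fit)
b <- coef(fit_mod)

# Simple slopes: the room temperature slope at each value of the moderator
slope_unfit <- b[["room_temp_c"]] + b[["room_temp_c:phys_fit"]] * -0.5
slope_fit <- b[["room_temp_c"]] + b[["room_temp_c:phys_fit"]] * 0.5

print(c(unfit = slope_unfit, fit = slope_fit, interaction = b[["room_temp_c:phys_fit"]]))
```

Because the two fitness codes are one unit apart, the interaction coefficient equals the slope for fit people minus the slope for unfit people.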

The conditional indirect effect combines moderation and mediation. Think back to the idea behind a simple indirect effect: It quantifies the extent to which two variables are related through a third variable, the mediator. Conceptually, the conditional indirect effect quantifies the indirect effect at different values of a moderator. In this sense, an indirect effect may be stronger, weaker, or opposite in sign, depending on values of a moderator. Importantly, a moderator may qualify any relation that's a part of some proposed mediation model. For example, physical fitness might qualify the association between room temperature and thirstiness, between thirstiness and water drinking, or both.

Much like the product or interaction term in a linear regression analysis quantifies how the relation between a predictor and an outcome depends on a moderator, the index of moderated mediation quantifies the relationship between the indirect effect and a moderator. An index of moderated mediation that is significantly different from zero implies that any two conditional indirect effects estimated at different levels of the moderator differ from each other: one is smaller, larger, or opposite in sign relative to the other.
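One way to see where the index comes from is to write out a simplified case. Assume, purely for illustration, that a moderator $W$ qualifies only the path from the predictor $X$ to the mediator $M$; the symbols below are illustrative and this is not necessarily the model fit in this post.

$$
\begin{aligned}
M &= i_M + a_1 X + a_2 W + a_3 XW + e_M \\
Y &= i_Y + c' X + b M + e_Y \\
\text{conditional indirect effect: } \omega(W) &= (a_1 + a_3 W)\, b = a_1 b + a_3 b\, W \\
\text{index of moderated mediation} &= a_3 b
\end{aligned}
$$

Under this assumption the indirect effect is a linear function of $W$, and the index is its slope. With a moderator coded $-0.5$ and $0.5$, the index equals the difference between the two conditional indirect effects: $\omega(0.5) - \omega(-0.5) = a_3 b$.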

Correct functional form. Your model variables share linear relationships and don't interact with each other.

No omitted influences. This one is hard: Your model accounts for all relevant influences on the variables included. All models are wrong, but how wrong is yours?

Accurate measurement. Your measurements are valid and reliable. Note that unreliable measures can't be valid, and reliable measures don't necessarily measure just one construct or even your construct.

Well-behaved residuals. Residuals (i.e., prediction errors) aren't correlated with predictor variables or each other, and residuals have constant variance across values of your predictor variables. Also, residual error terms aren't correlated across regression equations. This could happen if, for example, some omitted variable causes both thirst and water drinking.

```
library(tidyverse)
library(knitr)
library(lavaan)
library(psych)
```

I combined the data from Table 3.1 in MacKinnon (2008, p. 56) with those from Table 10.1 in MacKinnon (2008, p. 291).

```
thirst_norm <- "https://raw.githubusercontent.com/nmmichalak/nicholas_michalak/master/blog_entries/2018/nrg01/data/mackinnon_2008_t3.1.csv" %>% read_csv()
thirst_fit <- "https://raw.githubusercontent.com/nmmichalak/nicholas_michalak/master/blog_entries/2018/nrg02/data/mackinnon_2008_t10.1.csv" %>% read_csv()
```

`thirst_fit <- thirst_fit %>% mutate(id = 51:100)`

Code physical fitness so that Unfit = -0.5 and Fit = 0.5.

```
thirst_norm <- thirst_norm %>% mutate(phys_fit = -0.5)
thirst_fit <- thirst_fit %>% mutate(phys_fit = 0.5)
```

Imagine stacking these datasets on top of each other.

`thirst_data <- bind_rows(thirst_norm, thirst_fit)`

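Mean-center the predictors, i.e., mean-center everything but the consume variable. The fitness code is already centered at zero because it takes the values -0.5 and 0.5.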

```
thirst_data <- thirst_data %>% mutate(room_temp_c = room_temp - mean(room_temp),
thirst_c = thirst - mean(thirst))
```

```
# Product (interaction) terms: each centered predictor times the fitness code
thirst_data <- thirst_data %>% mutate(tmp_fit = room_temp_c * phys_fit,
                                      thrst_fit = thirst_c * phys_fit)
```

```
thirst_data %>%
headTail() %>%
kable()
```

id | room_temp | thirst | consume | phys_fit | room_temp_c | thirst_c | tmp_fit | thrst_fit |
---|---|---|---|---|---|---|---|---|
1 | 70 | 4 | 3 | -0.5 | -0.13 | 0.87 | 0.06 | -0.44 |
2 | 71 | 4 | 3 | -0.5 | 0.87 | 0.87 | -0.44 | -0.44 |
3 | 69 | 1 | 3 | -0.5 | -1.13 | -2.13 | 0.56 | 1.06 |
4 | 70 | 1 | 3 | -0.5 | -0.13 | -2.13 | 0.06 | 1.06 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
97 | 71 | 4 | 4 | 0.5 | 0.87 | 0.87 | 0.44 | 0.44 |
98 | 71 | 4 | 5 | 0.5 | 0.87 | 0.87 | 0.44 | 0.44 |
99 | 70 | 3 | 3 | 0.5 | -0.13 | -0.13 | -0.06 | -0.06 |
100 | 71 | 4 | 3 | 0.5 | 0.87 | 0.87 | 0.44 | 0.44 |

`thirst_data %>% write_csv("data/thirst_data.csv")`

It's always a good idea to look at your data. Check some assumptions.

```
thirst_data %>%
select(room_temp, room_temp_c, thirst, thirst_c, consume, phys_fit, tmp_fit, thrst_fit) %>%
pairs.panels()
```