What is moderation?

Moderation refers to how some variable modifies the direction or the strength of the association between two variables. In other words, a moderator variable qualifies the relation between two variables. A moderator is not a part of some proposed causal process; instead, it interacts with the relation between two variables in such a way that their relation is stronger, weaker, or opposite in direction—depending on values of the moderator. For example, as room temperature increases, people may report feeling thirstier. But that may depend on how physically fit people are. Maybe physically fit people don’t report feeling thirsty as room temperature increases, or maybe physically fit people—compared to less physically fit people—have a higher room temperature threshold at which they start feeling thirstier. In this example, the product of one predictor variable and the moderator—their interaction—quantifies the moderator’s effect. Statistically, the product term accounts for variability in thirst or water drinking independently of either predictor variable by itself.
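As a rough illustration of the thirst example, here is a minimal base R sketch. The data are simulated, and the sample size, coefficients, and noise level are assumptions made up for illustration; none of it comes from MacKinnon (2008).

```r
# Simulated (made-up) data: sample size and coefficients are assumptions for illustration
set.seed(2018)
n <- 200

# Moderator: unfit = -0.5, fit = 0.5
phys_fit <- rep(c(-0.5, 0.5), each = n / 2)

# Predictor: room temperature, mean-centered
room_temp <- rnorm(n, mean = 70, sd = 5)
room_temp_c <- room_temp - mean(room_temp)

# Outcome: thirst rises with temperature, but less steeply for fit people
# (the true product-term coefficient is -0.4)
thirst <- 3 + 0.3 * room_temp_c - 0.2 * phys_fit -
  0.4 * room_temp_c * phys_fit + rnorm(n, sd = 0.5)

# In an R formula, room_temp_c * phys_fit expands to both predictors plus their product
moderation_fit <- lm(thirst ~ room_temp_c * phys_fit)

# The room_temp_c:phys_fit row estimates the moderator's effect
round(coef(summary(moderation_fit)), 3)
```

If the room_temp_c:phys_fit estimate is reliably different from zero, the relation between room temperature and thirst depends on fitness.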

What is a simple slope?

In a 2-way interaction, a simple slope represents the relation between two variables (e.g., x and y) at a specific value of a third variable (e.g., a moderator variable). In this sense, a simple slope is a conditional relationship between two variables. For example, if participants are physically fit, then as room temperature increases, thirst also increases.
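One way to see where a simple slope comes from is to regroup the terms of a regression that includes a product term. The symbols here are generic notation assumed for illustration: x is the predictor, z is the moderator, and b_0 through b_3 are the regression coefficients.

```latex
\begin{aligned}
\hat{y} &= b_0 + b_1 x + b_2 z + b_3 x z \\
        &= (b_0 + b_2 z) + (b_1 + b_3 z)\, x
\end{aligned}
```

The simple slope of x at a chosen value of z is b_1 + b_3 z. If fitness is coded -0.5 for unfit and 0.5 for fit participants, the simple slope of room temperature is b_1 - 0.5 b_3 for unfit participants and b_1 + 0.5 b_3 for fit participants.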

Model and Conceptual Assumptions

  • Correct functional form. Your model variables share linear relationships.
  • No omitted influences. This one is hard: Your model accounts for all relevant influences on the variables included. All models are wrong, but how wrong is yours?
  • Accurate measurement. Your measurements are valid and reliable. Note that unreliable measures can’t be valid, and reliable measures don’t necessarily measure just one construct or even your construct.
  • Well-behaved residuals. Residuals (i.e., prediction errors) aren’t correlated with predictor variables or each other, and residuals have constant variance across values of your predictor variables.
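A minimal sketch of how the functional-form and constant-variance assumptions might be checked graphically in base R. The data are simulated and the variable names are placeholders. The omitted-influences and measurement assumptions are conceptual, so plots like these can't check them.

```r
# Simulated placeholder data (not from MacKinnon, 2008)
set.seed(10)
n <- 100
x <- rnorm(n)
z <- rep(c(-0.5, 0.5), each = n / 2)
y <- 1 + 0.5 * x + 0.3 * z + 0.4 * x * z + rnorm(n)

fit <- lm(y ~ x * z)

# Residuals vs. fitted values: a curved trend suggests the wrong functional form
plot(fit, which = 1)

# Scale-location plot: a funnel shape or trend suggests non-constant residual variance
plot(fit, which = 3)
```

Note that least-squares residuals from a model with an intercept are uncorrelated with the predictors in the sample by construction, so computing that correlation is not a useful check. Whether the errors are related to the predictors is a question about the study design and about what the model leaves out.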



# Load the tidyverse for %>%, read_csv(), mutate(), and bind_rows()
library(tidyverse)
# In the code below, I want select() from the dplyr package (part of the tidyverse)
select <- dplyr::select

Data: Example 1 (categorical x continuous interaction)

I combined the data from Table 3.1 in MacKinnon (2008, p. 56) [.csv] with those from Table 10.1 in MacKinnon (2008, p. 291) [.csv].

thirst_norm <- "https://raw.githubusercontent.com/nmmichalak/nicholas_michalak/master/blog_entries/2018/nrg01/data/mackinnon_2008_t3.1.csv" %>% read_csv()
thirst_fit <- "https://raw.githubusercontent.com/nmmichalak/nicholas_michalak/master/blog_entries/2018/nrg02/data/mackinnon_2008_t10.1.csv" %>% read_csv()

Code new IDs for fit data

thirst_fit <- thirst_fit %>% mutate(id = 51:100)

Add a column to both datasets that identifies fitness group

Unfit = -0.5 and Fit = 0.5

thirst_norm <- thirst_norm %>% mutate(phys_fit = -0.5)
thirst_fit <- thirst_fit %>% mutate(phys_fit = 0.5)

Bind unfit and fit data by rows

Imagine stacking these datasets on top of each other

thirst_data <- bind_rows(thirst_norm, thirst_fit)

Mean-center predictors

That is, mean-center room_temp and thirst. Leave the consume variable as is; phys_fit is already coded -0.5 and 0.5.

thirst_data <- thirst_data %>% mutate(room_temp_c = room_temp - mean(room_temp),
                                      thirst_c = thirst - mean(thirst))
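To see what mean-centering does and does not change, here is a small base R sketch with simulated data. The values and the outcome name y are made up for illustration and are not the models fit in this post. Centering a predictor leaves the product-term coefficient unchanged. What changes is the coefficient of the other predictor, which becomes its effect at the average of the centered variable instead of at zero.

```r
# Simulated (made-up) data for illustration only
set.seed(42)
n <- 100
room_temp <- rnorm(n, mean = 70, sd = 5)
phys_fit <- rep(c(-0.5, 0.5), each = n / 2)
y <- 2 + 0.1 * room_temp + 0.5 * phys_fit + 0.3 * room_temp * phys_fit + rnorm(n)

# Mean-center room temperature
room_temp_c <- room_temp - mean(room_temp)

fit_raw <- lm(y ~ room_temp * phys_fit)
fit_centered <- lm(y ~ room_temp_c * phys_fit)

# The product-term coefficient is the same in both models
coef(fit_raw)[["room_temp:phys_fit"]]
coef(fit_centered)[["room_temp_c:phys_fit"]]

# The phys_fit coefficient differs: the group difference at room_temp = 0
# versus the group difference at the mean room temperature
coef(fit_raw)[["phys_fit"]]
coef(fit_centered)[["phys_fit"]]
```

A room temperature of zero is far outside the data, which is why the centered version is usually easier to interpret.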

Visualize relationships

It’s always a good idea to look at your data. Check some assumptions.

thirst_data %>% 
  select(room_temp, room_temp_c, thirst, thirst_c, consume, phys_fit) %>%