The Three Most Common Types of Hypotheses

In this post, I discuss three of the most common hypotheses in psychology research, and what statistics are often used to test them.

By sean · September 28, 2013


Simple main effects (i.e., X leads to Y) are usually not going to get you published. Main effects can be exciting in the early stages of research to show the existence of a new effect, but as a field matures, the questions scientists try to answer tend to become more nuanced and specific. In this post, I'll briefly describe the three most common kinds of hypotheses that expand upon simple main effects – at least, the most common ones I've seen in my research career in psychology – and provide some resources to help you learn how to test these hypotheses using statistics.

Incremental Validity

“Can X predict Y over and above other important predictors?”

[Figure: incremental validity diagram]

This is probably the simplest of the three hypotheses I propose. Basically, you attempt to rule out potential confounding variables by controlling for them in your analysis.  We do this because (in many cases) our predictor variables are correlated with each other. This is undesirable from a statistical perspective, but is common with real data. The idea is that we want to see if X can predict unique variance in Y over and above the other variables you include.

In terms of analysis, you are probably going to use some variation of multiple regression or partial correlations. For example, in my own work I've shown that friendship intimacy coded from autobiographical narratives can predict concern for the next generation over and above numerous other variables, such as optimism, depression, and relationship status (Mackinnon et al., 2011).
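In R, incremental validity is often framed as a hierarchical (nested) model comparison. Here is a minimal sketch; the data frame `dat` and all column names are hypothetical placeholders echoing the example above:

```r
# Does intimacy predict generativity over and above the covariates?
m0 <- lm(generativity ~ optimism + depression + relationship_status, data = dat)
m1 <- update(m0, . ~ . + intimacy)  # add the focal predictor

anova(m0, m1)  # F-test: does intimacy explain unique variance in the outcome?
summary(m1)    # the intimacy slope is its effect controlling for the rest
```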

Moderation

“Under what conditions does X lead to Y?”

Of the three techniques I describe, moderation is probably the trickiest to understand. Essentially, it proposes that the size of a relationship between two variables changes depending upon the value of a third variable, known as a “moderator.” For example, in the diagram below you might find a simple main effect that is moderated by sex. That is, the relationship is stronger for women than for men:

[Figure: moderation diagram]

With moderation, it is important to note that the moderating variable can be a category (e.g., sex) or it can be a continuous variable (e.g., scores on a personality questionnaire).  When a moderator is continuous, usually you’re making statements like: “As the value of the moderator increases, the relationship between X and Y also increases.”
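In regression terms, moderation is tested with a product (interaction) term. A minimal R sketch, with `x`, `m`, and `y` as placeholder variable names in a hypothetical data frame `dat`:

```r
# x * m expands to x + m + x:m; the x:m coefficient tests moderation
fit <- lm(y ~ x * m, data = dat)
summary(fit)  # a significant x:m slope means the x-y relationship
              # changes with the level of the moderator m
```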

Mediation

“Does X predict M, which in turn predicts Y?”

We might know that X leads to Y, but a mediation hypothesis proposes a mediating, or intervening, variable. That is, X leads to M, which in turn leads to Y. The diagram below uses a different visual style, consistent with how mediation models are typically reported in path analysis.

[Figure: mediation path diagram]

I use mediation a lot in my own research. For example, I’ve published data suggesting the relationship between perfectionism and depression is mediated by relationship conflict (Mackinnon et al., 2012). That is, perfectionism leads to increased conflict, which in turn leads to heightened depression. Another way of saying this is that perfectionism has an indirect effect on depression through conflict.
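For readers who want to try this themselves, here is a hedged sketch using the `mediation` package in R; `dat`, `x`, `m`, and `y` are placeholders, not the variables from the study above:

```r
library(mediation)

model.m <- lm(m ~ x, data = dat)      # path a: predictor -> mediator
model.y <- lm(y ~ x + m, data = dat)  # paths b and c': mediator and predictor -> outcome

med <- mediate(model.m, model.y, treat = "x", mediator = "m",
               boot = TRUE, sims = 5000)  # bootstrapped indirect effect
summary(med)  # ACME is the indirect (mediated) effect
```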

Helpful links to get you started testing these hypotheses

Depending on the nature of your data, there are multiple ways to address each of these hypotheses using statistics. They can also be combined together (e.g., mediated moderation). Nonetheless, a core understanding of these three hypotheses and how to analyze them using statistics is essential for any researcher in the social or health sciences.  Below are a few links that might help you get started:

Are you a little rusty with multiple regression? The basics of this technique are required for most common tests of these hypotheses. You might check out this guide as a helpful resource:

https://statistics.laerd.com/spss-tutorials/multiple-regression-using-spss-statistics.php

David Kenny’s Mediation Website provides an excellent overview of mediation and moderation for the beginner.

http://davidakenny.net/cm/mediate.htm

http://davidakenny.net/cm/moderation.htm

Preacher and Hayes’s INDIRECT macro is a great, easy way to implement mediation in SPSS, and their MODPROBE macro is a useful tool for testing moderation.

http://afhayes.com/spss-sas-and-mplus-macros-and-code.html

If you want to graph the results of your moderation analyses, the Excel calculators provided on Jeremy Dawson’s webpage are fantastic, easy-to-use tools:

http://www.jeremydawson.co.uk/slopes.htm


37 replies on “The Three Most Common Types of Hypotheses”

I want to see clearly the three types of hypotheses

Thanks for your information. I really like this

Thank you so much, writing up my masters project now and wasn’t sure whether one of my variables was mediating or moderating….Much clearer now.

Thank you for simplified presentation. It is clearer to me now than ever before.

Thank you. Concise and clear

hello there

I would like to ask about mediation relationships: if I have three variables (X-M-Y), how many hypotheses should I write down? Should I have 2 or 3? In other words, should I have hypotheses for the mediating relationship? What about questions and objectives? Should there be 3? Thank you.

Hi Osama. It’s really a stylistic thing. You could write it out as 3 separate hypotheses (X -> Y; X -> M; M -> Y) or you could just write out one mediation hypothesis: “X will have an indirect effect on Y through M.” Usually I’d write just the one because it conserves space, but either would be appropriate.

Hi Sean, according to the three-steps model (Dudley, Benuzillo and Carrico, 2004; Pardo and Román, 2013), we can test the hypothesis of a mediator variable in three steps: (X -> Y; X -> M; X and M -> Y). Then, we must use the Sobel test to make sure that the effect is significant after adding the mediator variable.

Yes, but this is older advice. Best practice now is to calculate an indirect effect and use bootstrapping, rather than the causal steps approach and the more outdated Sobel test. I’d recommend reading Hayes’s (2018) book for more info:

Hayes, A. F. (2018). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (2nd ed.). Guilford Publications.

Hi! It’s been really helpful but I still don’t know how to formulate the hypothesis with my mediating variable.

I have one dependent variable DV which is formed by DV1 and DV2, then I have MV (mediating variable), and then 2 independent variables IV1, and IV2.

How many hypotheses should I write? I hope you can help me 🙂

Thank you so much!!

If I’m understanding you correctly, I guess 2 mediation hypotheses:

IV1 –> Med –> DV1&2
IV2 –> Med –> DV1&2

Thank you so much for your quick answer! ^^

Could you help me formulate my research question? English is not my mother language and I have trouble choosing the right words. My X = psychopathy, Y = aggression, M = deficits in emotion recognition.

thank you in advance

I have a mediator and a moderator. How should I make my hypotheses?

Can you have a negative partial effect? IV – M – DV. That is, my M will have a negative effect on the DV – e.g., social media usage (M) will partially (and negatively) mediate the relationship between father status (IV) and social connectedness (DV)?

Thanks in advance

Hi Ashley. Yes, this is possible, but often it means you have a condition known as “inconsistent mediation” which isn’t usually desirable. See this entry on David Kenny’s page:

Or look up “inconsistent mediation” in this reference:

MacKinnon, D. P., Fairchild, A. J., & Fritz, M. S. (2007). Mediation analysis. Annual Review of Psychology, 58, 593-614.

This is a very interesting presentation. I love it.

This is very interesting and educative. I love it.

Hello, you mentioned that the moderator changes the relationship between the IV and DV depending on its strength. How would one describe a situation where, when the IV is high, the IV-DV relationship is the opposite of when the IV is low? And then a third variable, maybe the moderator, increases the DV when the IV is low and decreases the DV when the IV is high?

This isn’t problematic for moderation. Moderation just proposes that the magnitude of the relationship changes as levels of the moderator change. If the sign flips, probably the original relationship was small. Sometimes people call this a “cross-over” effect, but really, it’s nothing special and can happen in any moderation analysis.

I want to use an independent variable as a moderator. After this I will have 3 independent variables and 1 dependent variable. My confusion is: do I need past evidence that the X variable moderates the relationship between the Y independent variable and the Z dependent variable?

Dear Sean, it is really helpful, as my research model will use mediation. Because I still face difficulty in developing hypotheses, can you give examples? Thank you

Hi! Is it possible to have all three pathways negative? My regression analysis showed significant negative relationships between X and Y, X and M, and M and Y.

Hi, I have 1 independent variable, 1 dependent variable, and 4 mediating variables. May I know how many hypotheses I should develop?

Hello, I have 4 IVs, 1 mediating variable, and 1 DV.

My model says that the 4 IVs, when mediated by the 1 MV, lead to the 1 DV.

Please tell me how to set the hypotheses for mediation.

Hi, I have 4 IVs, 2 mediating variables, 1 DV and 3 outcomes (criterion variables).

Please can you tell me how many hypotheses to set.

Thank you in advance



What if the hypothesis and moderator are significant in regression but insignificant in moderation?

Thank you so much!! Your slide on the mediator variable let me understand!

Very informative material. The author has used very clear language and I would recommend this for any student of research.

Hi Sean, thanks for the nice material. I have a question: for the second type of hypothesis, you state “That is, the relationship is stronger for men than for women”. Based on the illustration, wouldn’t the opposite be true?

Yes, you’re right! I updated the post to fix the typo, thank you!

I have 3 independent variables, one mediator, and 2 dependent variables. How many hypotheses do I have to write?

Sounds like 6 mediation hypotheses total:

X1 -> M -> Y1
X2 -> M -> Y1
X3 -> M -> Y1
X1 -> M -> Y2
X2 -> M -> Y2
X3 -> M -> Y2

Clear explanation! Thanks!


Statistics: Data analysis and modelling

Chapter 6 Moderation and mediation

In this chapter, we will focus on two ways in which one predictor variable may affect the relation between another predictor variable and the dependent variable. Moderation means the strength of the relation (in terms of the slope) of a predictor variable is determined by the value of another predictor variable. For instance, while physical attractiveness is generally positively related to mating success, for very rich people physical attractiveness may not be so important. This is also called an interaction between the two predictor variables. Mediation is a different way in which two predictors affect a dependent variable. It is best thought of as a causal chain, where one predictor variable determines the value of another predictor variable, which then in turn determines the value of the dependent variable. The difference between moderation and mediation is illustrated in Figure 6.1.

Figure 6.1: Graphical depiction of the difference between moderation and mediation. Moderation means that the effect of a predictor (\(X_1\)) on the dependent variable (\(Y\)) depends on the value of another predictor (\(X_2\)). Mediation means that a predictor (\(X_1\)) affects the dependent variable (\(Y\)) indirectly, through its relation to another predictor (\(X_2\)) which is directly related to the dependent variable.

6.1 Moderation

6.1.1 Physical attractiveness and intelligence in speed dating

Fisman, Iyengar, Kamenica, & Simonson (2006) conducted a large-scale experiment^15 on dating behaviour. They placed their participants in a speed-dating context, where they were randomly matched with a number of potential partners (between 5 and 20) and could converse for four minutes. As part of the study, after each meeting, participants rated how much they liked their speed-dating partners, as well as more specifically their attractiveness, sincerity, intelligence, fun, and ambition. We will focus in particular on ratings of physical attractiveness, fun, and intelligence, and how these are related to the general liking of a person. Ratings were given on a 10-point scale, from 1 (“awful”) to 10 (“great”). A multiple regression analysis predicting general liking from attractiveness, fun, and intelligence (Table 6.1) shows that all three predictors have a significant and positive relation with general liking.

Table 6.1: Multiple regression predicting liking from attractiveness, intelligence, and fun.
| | \(\hat{\beta}\) | \(\text{SE}(\hat{\beta})\) | \(t\) | \(p(\geq \lvert t \rvert)\) |
|---|---|---|---|---|
| Intercept | -0.458 | 0.160 | -2.85 | 0.004 |
| Attractiveness | 0.345 | 0.019 | 17.90 | 0.000 |
| Intelligence | 0.266 | 0.023 | 11.82 | 0.000 |
| Fun | 0.379 | 0.021 | 18.05 | 0.000 |

6.1.2 Conditional slopes

If we were to model the relation between overall liking and physical attractiveness and intelligence, we might use a multiple regression model such as:^16 \[\texttt{like}_i = \beta_0 + \beta_{\texttt{attr}} \times \texttt{attr}_i + \beta_\texttt{intel} \times \texttt{intel}_i + \epsilon_i \quad \quad \epsilon_i \sim \mathbf{Normal}(0,\sigma_\epsilon)\] which is estimated as \[\texttt{like}_i = -0.0733 + 0.527 \times \texttt{attr}_i + 0.392 \times \texttt{intel}_i + \hat{\epsilon}_i \quad \quad \hat{\epsilon}_i \sim \mathbf{Normal}(0, 1.25)\] The estimates indicate a positive relation to liking of both attractiveness and intelligence. Note that the values of the slopes are different from those in Table 6.1. The reason for this is that the model in the Table also includes fun as a predictor. Because the slopes reflect unique effects, these depend on all predictors included in the model. When there is dependence between the predictors (i.e. there is multicollinearity), both the estimates of the slopes and the corresponding significance tests will vary when you add or remove predictors from the model.
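As an illustration, this model could be fit in R along the following lines (a sketch; I’m assuming a data frame, here called `speeddate`, with columns `like`, `attr`, and `intel`):

```r
mod <- lm(like ~ attr + intel, data = speeddate)
coef(mod)   # intercept and the two slopes; compare with the estimates above
sigma(mod)  # estimate of the standard deviation of the errors
```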

In the model above, a relative lack in physical attractiveness can be overcome by high intelligence, because in the end, the general liking of someone depends on the sum of both attractiveness and intelligence (each “scaled” by their corresponding slope). For example, someone with an attractiveness rating of \(\texttt{attr}_i = 8\) and an intelligence rating of \(\texttt{intel}_i = 2\) would be expected to be liked as much as a partner as someone with an attractiveness rating of \(\texttt{attr}_i = 3.538\) and an intelligence rating of \(\texttt{intel}_i = 8\): \[\begin{aligned} \texttt{like}_i &= -0.073 + 0.527 \times 8 + 0.392 \times 2 = 4.924 \\ \texttt{like}_i &= -0.073 + 0.527 \times 3.538 + 0.392 \times 8 = 4.924 \end{aligned}\]

But what if, for those lucky people who are very physically attractive, their intelligence doesn’t matter that much, or even at all? And what if, for those lucky people who are very intelligent, their physical attractiveness doesn’t really matter much, or at all? In other words, what if the more attractive people are, the less intelligence determines how much other people like them as a potential partner, and conversely, the more intelligent people are, the less attractiveness determines how much others like them as a potential partner? This implies that the effect of attractiveness on liking depends on intelligence, and that the effect of intelligence on liking depends on attractiveness. Such dependence is not captured by the multiple regression model above. While a relative lack of intelligence might be overcome by a relative abundance of attractiveness, for any level of intelligence, the additional effect of attractiveness is the same (i.e., an increase in attractiveness by one unit will always result in an increase of the predicted liking of 0.527).

Let’s define \(\beta_{\texttt{attr}|\texttt{intel}_i}\) as the slope of \(\texttt{attr}\) conditional on the value of \(\texttt{intel}_i\). That is, we allow the slope of \(\texttt{attr}\) to vary as a function of \(\texttt{intel}\). Similarly, we can define \(\beta_{\texttt{intel}|\texttt{attr}_i}\) as the slope of \(\texttt{intel}\) conditional on the value of \(\texttt{attr}\). Our regression model can then be written as: \[\begin{equation} \texttt{like}_i = \beta_0 + \beta_{\texttt{attr}|\texttt{intel}_i} \times \texttt{attr}_i + \beta_{\texttt{intel} | \texttt{attr}_i} \times \texttt{intel}_i + \epsilon_i \tag{6.1} \end{equation}\] That’s a good start, but what would the value of \(\beta_{\texttt{attr}|\texttt{intel}_i}\) be? Estimating the slope of \(\texttt{attr}\) for each value of \(\texttt{intel}\) by fitting regression models to each subset of data with a particular value of \(\texttt{intel}\) is not really doable. We’d need lots and lots of data, and furthermore, we wouldn’t be able to simultaneously estimate the value of \(\beta_{\texttt{intel} | \texttt{attr}_i}\). We need to supply some structure to \(\beta_{\texttt{attr}|\texttt{intel}_i}\) that allows us to estimate its value without overcomplicating things.

6.1.3 Modeling slopes with linear models

One idea is to define \(\beta_{\texttt{attr}|\texttt{intel}_i}\) with a linear model: \[\beta_{\texttt{attr}|\texttt{intel}_i} = \beta_{\texttt{attr},0} + \beta_{\texttt{attr},1} \times \texttt{intel}_i\] This is just like a simple linear regression model, but now the “dependent variable” is the slope of \(\texttt{attr}\) . Defined in this way, the slope of \(\texttt{attr}\) is \(\beta_{\texttt{attr},0}\) when \(\texttt{intel}_i = 0\) , and for every one-unit increase in \(\texttt{intel}_i\) , the slope of \(\texttt{attr}\) increases (or decreases) by \(\beta_{\texttt{attr},1}\) . For example, let’s assume \(\beta_{\texttt{attr},0} = 1\) and \(\beta_{\texttt{attr},1} = 0.5\) . For someone with an intelligence rating of \(\texttt{intel}_i = 0\) , the slope of \(\texttt{attr}\) is \[\beta_{\texttt{attr}|\texttt{intel}_i} = 1 + 0.5 \times 0 = 1\] For someone with an intelligence rating of \(\texttt{intel}_i = 1\) , the slope of \(\texttt{attr}\) is \[\beta_{\texttt{attr}|\texttt{intel}_i} = 1 + 0.5 \times 1 = 1.5\] For someone with an intelligence rating of \(\texttt{intel}_i = 2\) , the slope of \(\texttt{attr}\) is \[\beta_{\texttt{attr}|\texttt{intel}_i} = 1 + 0.5 \times 2 = 2\] As you can see, for every increase in intelligence rating by 1 point, the slope of \(\texttt{attr}\) increases by 0.5. In such a model, there will be values of \(\texttt{intel}\) which result in a negative slope of \(\texttt{attr}\) . For instance, for \(\texttt{intel}_i = -4\) , the slope of \(\texttt{attr}\) is \[\beta_{\texttt{attr}|\texttt{intel}_i} = 1 + 0.5 \times (-4) = - 1\]

We can define the slope of \(\texttt{intel}\) in a similar manner as \[\beta_{\texttt{intel}|\texttt{attr}_i} = \beta_{\texttt{intel},0} + \beta_{\texttt{intel},1} \times \texttt{attr}_i\] When we plug these definitions into Equation (6.1) , we get \[\begin{aligned} \texttt{like}_i &= \beta_0 + (\beta_{\texttt{attr},0} + \beta_{\texttt{attr},1} \times \texttt{intel}_i) \times \texttt{attr}_i + (\beta_{\texttt{intel},0} + \beta_{\texttt{intel},1} \times \texttt{attr}_i) \times \texttt{intel}_i + \epsilon_i \\ &= \beta_0 + \beta_{\texttt{attr},0} \times \texttt{attr}_i + \beta_{\texttt{intel},0} \times \texttt{intel}_i + (\beta_{\texttt{attr},1} + \beta_{\texttt{intel},1}) \times (\texttt{attr}_i \times \texttt{intel}_i) + \epsilon_i \end{aligned}\]

Looking carefully at this formula, you can recognize a multiple regression model with three predictors: \(\texttt{attr}\), \(\texttt{intel}\), and a new predictor \(\texttt{attr}_i \times \texttt{intel}_i\), which is computed as the product of these two variables. While it is thus related to both variables, we can treat this product as just another predictor in the model. The slope of this new predictor is the sum of two terms, \(\beta_{\texttt{attr},1} + \beta_{\texttt{intel},1}\). Although we have defined these as different things (i.e. as the effect of \(\texttt{intel}\) on the slope of \(\texttt{attr}\), and the effect of \(\texttt{attr}\) on the slope of \(\texttt{intel}\), respectively), their values cannot be estimated uniquely. We can only estimate their summed value. That means that moderation in regression is “symmetric”, in the sense that each predictor determines the slope of the other one. We cannot say that it is just intelligence that determines the effect of attraction on liking, nor can we say that it is just attraction that determines the effect of intelligence on liking. The two variables interact and each determines the other’s effect on the dependent variable.

With that in mind, we can simplify the notation of the resulting model somewhat, by renaming the slopes of the two predictors to \(\beta_{\texttt{attr}} = \beta_{\texttt{attr},0}\) and \(\beta_{\texttt{intel}} = \beta_{\texttt{intel},0}\) , and using a single parameter for the sum \(\beta_{\texttt{attr} \times \texttt{intel}} = \beta_{\texttt{attr},1} + \beta_{\texttt{intel},1}\) :

\[\begin{equation} \texttt{like}_i = \beta_0 + \beta_{\texttt{attr}} \times \texttt{attr}_i + \beta_{\texttt{intel}} \times \texttt{intel}_i + \beta_{\texttt{attr} \times \texttt{intel}} \times (\texttt{attr} \times \texttt{intel})_i + \epsilon_i \end{equation}\]

Estimating this model gives \[\texttt{like}_i = -0.791 + 0.657 \times \texttt{attr}_i + 0.488 \times \texttt{intel}_i - 0.0171 \times \texttt{(attr}\times\texttt{intel)}_i + \hat{\epsilon}_i \] The estimate of the slope of the interaction, \(\hat{\beta}_{\texttt{attr} \times \texttt{intel}} = -0.017\) , is negative. That means that the higher the value of \(\texttt{intel}\) , the less steep the regression line relating \(\texttt{attr}\) to \(\texttt{like}\) . At the same time, the higher the value of \(\texttt{attr}\) , the less steep the regression line relating \(\texttt{intel}\) to \(\texttt{like}\) . You can interpret this as meaning that for more intelligent people, physical attractiveness is less of a defining factor in their liking by a potential partner. And for more attractive people, intelligence is less important.
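In R, the interaction predictor does not need to be computed by hand; a formula like the following has `lm()` construct the product term (again assuming the hypothetical `speeddate` data frame):

```r
# attr * intel expands to attr + intel + attr:intel
mod_int <- lm(like ~ attr * intel, data = speeddate)
coef(mod_int)  # intercept, the two simple slopes, and the interaction slope
```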

A graphical view of this model, and the earlier one without moderation, is provided in Figure 6.2 . The plot on the left represents the model which does not allow for interaction. You can see that, for different values of intelligence, the model predicts parallel regression lines for the relation between attractiveness and liking. While intelligence affects the intercept of these regression lines, it does not affect the slope. In the plot on the right – although subtle – you can see that the regression lines are not parallel. This is a model with an interaction between intelligence and attractiveness. For different values of intelligence, the model predicts a linear relation between attractiveness and liking, but crucially, intelligence determines both the intercept and slope of these lines.

Figure 6.2: Liking as a function of attractiveness (intelligence) for different levels of intelligence (attractiveness), either without moderation or with moderation of the slope of attractiveness by intelligence. Note that the actual values of liking, attractiveness, and intelligence are whole numbers (ratings on a scale between 1 and 10). For visualization purposes, the values have been randomly jittered by adding a Normally-distributed displacement term.

Note that we have constructed this model by simply including a new predictor in the model, which is computed by multiplying the values of \(\texttt{attr}\) and \(\texttt{intel}\) . While including such an “interaction predictor” has important implications for the resulting relations between \(\texttt{attr}\) and \(\texttt{like}\) for different values of \(\texttt{intel}\) , as well as the relations between \(\texttt{intel}\) and \(\texttt{like}\) for different values of \(\texttt{attr}\) , the model itself is just like any other regression model. Thus, parameter estimation and inference are exactly the same as before. Table 6.2 shows the results of comparing the full MODEL G (with three predictors) to different versions of MODEL R, where in each we fix one of the parameters to 0. As you can see, these comparisons indicate that we can reject the null hypothesis \(H_0\) : \(\beta_0 = 0\) , as well as \(H_0\) : \(\beta_{\texttt{attr}} = 0\) and \(H_0\) : \(\beta_{\texttt{intel}} = 0\) . However, as the p-value is above the conventional significance level of \(\alpha=.05\) , we would not reject the null hypothesis \(H_0\) : \(\beta_{\texttt{attr} \times \texttt{intel}} = 0\) . That implies that, in the context of this model, there is not sufficient evidence that there is an interaction. That may seem a little disappointing. We’ve done a lot of work to construct a model where we allow the effect of attractiveness to depend on intelligence, and vice versa. And now the hypothesis test indicates that there is no evidence that this moderation is present. As we will see later, there is evidence of this moderation when we also include \(\texttt{fun}\) in the model. I have left this predictor out of the model for now to keep things as simple as possible.

Table 6.2: Multiple regression predicting liking from attractiveness, intelligence, and their interaction.
| | \(\hat{\beta}\) | \(\text{SS}\) | \(\text{df}\) | \(F\) | \(p(\geq \lvert F \rvert)\) |
|---|---|---|---|---|---|
| Intercept | -0.791 | 4.89 | 1 | 3.14 | 0.077 |
| \(\texttt{attr}\) | 0.657 | 113.65 | 1 | 72.91 | 0.000 |
| \(\texttt{intel}\) | 0.488 | 103.20 | 1 | 66.21 | 0.000 |
| \(\texttt{intel} \times \texttt{attr}\) | -0.017 | 4.74 | 1 | 3.04 | 0.081 |
| Error | | 2345.89 | 1505 | | |

6.1.4 Simple slopes and centering

It is very important to realise that in a model with interactions, there is no single slope for any of the predictors involved in an interaction that is particularly meaningful in principle. An interaction means that the slope of one predictor varies as a function of another predictor. Depending on which value of that other predictor you focus on, the slope of the predictor can be positive, negative, or zero. Let’s consider the model we estimated again: \[\texttt{like}_i = -0.791 + 0.657 \times \texttt{attr}_i + 0.488 \times \texttt{intel}_i - 0.0171 \times \texttt{(attr}\times\texttt{intel)}_i + \hat{\epsilon}_i \] If we fill in a particular value for intelligence, say \(\texttt{intel} = 1\), we can write this as

\[\begin{aligned} \texttt{like}_i &= -0.791 + 0.657 \times \texttt{attr}_i + 0.488 \times 1 - 0.017 \times (\texttt{attr} \times 1)_i + \epsilon_i \\ &= (-0.791 + 0.488) + (0.657 - 0.017) \times \texttt{attr}_i + \epsilon_i \\ &= -0.303 + 0.64 \times \texttt{attr}_i + \epsilon_i \end{aligned}\]

If we pick a different value, say \(\texttt{intel} = 10\), then the model becomes \[\begin{aligned} \texttt{like}_i &= -0.791 + 0.657 \times \texttt{attr}_i + 0.488 \times 10 - 0.017 \times (\texttt{attr} \times 10)_i + \epsilon_i \\ &= (-0.791 + 0.488 \times 10) + (0.657 - 0.017 \times 10) \times \texttt{attr}_i + \epsilon_i \\ &= 4.09 + 0.486 \times \texttt{attr}_i + \epsilon_i \end{aligned}\] This shows that the higher the value of intelligence, the lower the slope of \(\texttt{attr}\) becomes. If you’d pick \(\texttt{intel} = 38.337\), the slope would be exactly equal to 0.^17 Because there is not just a single value of the slope, testing whether “the” slope of \(\texttt{attr}\) is equal to 0 doesn’t really make sense, because there is no single value to represent “the” slope. What, then, does \(\hat{\beta}_\texttt{attr} = 0.657\) represent? Well, it is the (estimated) slope of \(\texttt{attr}\) when \(\texttt{intel}_i = 0\). Similarly, \(\hat{\beta}_\texttt{intel} = 0.488\) is the estimated slope of \(\texttt{intel}\) when \(\texttt{attr}_i = 0\).

A significance test of the null hypothesis \(H_0\): \(\beta_\texttt{attr} = 0\) is thus a test of whether, when \(\texttt{intel} = 0\), the slope of \(\texttt{attr}\) is 0. This test is easy enough to perform, but is it interesting to know whether liking is related to attractiveness for people whose intelligence was rated as 0? Perhaps not. For one thing, the ratings were on a scale from 1 to 10, so no one could actually receive a rating of 0. Because the slope depends on \(\texttt{intel}\), and we know that for some value of \(\texttt{intel}\) the slope of \(\texttt{attr}\) will equal 0, the hypothesis test will not be significant for some values of \(\texttt{intel}\), and will be significant for others. At which value of \(\texttt{intel}\) we might want to perform such a test is up to us, but the result seems somewhat arbitrary.

That said, we might be interested in assessing whether there is an effect of \(\texttt{attr}\) for particular values of \(\texttt{intel}\). For instance, whether, for someone with an average intelligence rating, their physical attractiveness matters for how much someone likes them as a potential partner. We can obtain this test by centering the predictors. Centering is basically just subtracting the sample mean from each value of a variable. So, for example, we can center \(\texttt{attr}\) as follows: \[\texttt{attr_cent}_i = \texttt{attr}_i - \overline{\texttt{attr}}\] Centering does not affect the relation between variables. You can view it as a simple relabelling of the values, where the value which was the sample mean is now \(\texttt{attr_cent}_i = \overline{\texttt{attr}} - \overline{\texttt{attr}} = 0\), all values below the mean are now negative, and all values above the mean are now positive. The important part of this is that the centered predictor is 0 where the original predictor was at the sample mean. In a model with centered predictors \[\begin{align} \texttt{like}_i =& \beta_0 + \beta_{\texttt{attr_cent}} \times \texttt{attr_cent}_i + \beta_{\texttt{intel_cent}} \times \texttt{intel_cent}_i \\ &+ \beta_{\texttt{attr_cent} \times \texttt{intel_cent}} \times (\texttt{attr_cent} \times \texttt{intel_cent})_i + \epsilon_i \end{align}\] the slope \(\beta_{\texttt{attr_cent}}\) is, as usual, the slope of \(\texttt{attr_cent}\) whenever \(\texttt{intel_cent}_i = 0\). We know that \(\texttt{intel_cent}_i = 0\) when \(\texttt{intel}_i = \overline{\texttt{intel}}\). Hence, \(\beta_{\texttt{attr_cent}}\) is the slope of \(\texttt{attr}\) when \(\texttt{intel} = \overline{\texttt{intel}}\), i.e. it represents the effect of \(\texttt{attr}\) for those with an average intelligence rating.
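A sketch of centering and refitting in R (continuing the assumed `speeddate` data frame from before):

```r
speeddate$attr_cent  <- speeddate$attr  - mean(speeddate$attr)
speeddate$intel_cent <- speeddate$intel - mean(speeddate$intel)

mod_cent <- lm(like ~ attr_cent * intel_cent, data = speeddate)
coef(mod_cent)  # the interaction slope is unchanged; the simple slopes
                # now refer to cases with average ratings
```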

Figure 6.3 shows the resulting model after centering both attractiveness and intelligence. When you compare this to the corresponding plot in Figure 6.2, you can see that the only real difference is in the labels for the x-axis and the scale for intelligence. In all other respects, the uncentered and centered models predict the same relations between attractiveness and liking, and the models provide an equally good account, with the same prediction errors.

Figure 6.3: Liking as a function of centered attractiveness for different levels of (centered) intelligence in a model including an interaction between attractiveness and intelligence. Note that the actual values of liking, attractiveness, and intelligence are whole numbers (ratings on a scale between 1 and 10). For visualization purposes, the values have been randomly jittered by adding a Normally-distributed displacement term.

The results of all model comparisons after centering are given in Table 6.3. A first important thing to notice is that centering does not affect the estimate and test of the interaction term. The slope of the interaction predictor reflects the increase in the slope relating \(\texttt{attr}\) to \(\texttt{like}\) for every one-unit increase in \(\texttt{intel}\). Such changes to the steepness of the relation between \(\texttt{attr}\) and \(\texttt{like}\) should not be – and are not – affected by changing the 0-point of the predictors through centering. A second thing to notice is that centering changes the estimates and tests of the “simple slopes” and intercept. In the centered model, the simple slope \(\hat{\beta}_\texttt{attr_cent}\) reflects the effect of \(\texttt{attr}\) on \(\texttt{like}\) for cases with an average rating on \(\texttt{intel}\). In Figure 6.3, this is (approximately) the regression line in the middle. In the uncentered model, the simple slope \(\hat{\beta}_\texttt{attr}\) reflects the effect of \(\texttt{attr}\) on \(\texttt{like}\) for cases with \(\texttt{intel} = 0\). In the top right plot in Figure 6.2, this is (approximately) the lower regression line. This latter regression line is quite far removed from most of the data, because there are no cases with an intelligence rating of 0. The regression line for people with an average intelligence rating lies much more “within the cloud of data points”, and reflects the model predictions for many more cases in the data. As a result, the reduction in the SSE that can be attributed to the simple slope is much higher in the centered model (Table 6.3) than in the uncentered one (Table 6.2). This results in a much higher \(F\) statistic. You can also think of this as follows: because there are hardly any cases with an intelligence rating close to 0, estimating the effect of attractiveness on liking for these cases is rather difficult and unreliable. Estimating the effect of attractiveness on liking for cases with an average intelligence rating is much more reliable, because there are many more cases with a close-to-average intelligence rating.

Table 6.3: Null-hypothesis significance tests after centering both predictors.
| | \(\hat{\beta}\) | \(\text{SS}\) | \(\text{df}\) | \(F\) | \(p(\geq \lvert F \rvert)\) |
|---|---|---|---|---|---|
| Intercept | 6.213 | 53011.95 | 1 | 34009.69 | 0.000 |
| \(\texttt{attr_cent}\) | 0.528 | 1354.68 | 1 | 869.09 | 0.000 |
| \(\texttt{intel_cent}\) | 0.380 | 384.96 | 1 | 246.97 | 0.000 |
| \(\texttt{intel_cent} \times \texttt{attr_cent}\) | -0.017 | 4.74 | 1 | 3.04 | 0.081 |
| Error | | 2345.89 | 1505 | | |

6.1.5 Don’t forget about fun! A model with multiple interactions

Up to now, we have looked at a model with two predictors, attractiveness and intelligence, and have allowed for an interaction between these. To simplify the discussion a little, we have not included \(\texttt{fun}\) in the model. It is relatively straightforward to extend this idea to multiple predictors. For instance, it might also be the case that the effect of \(\texttt{fun}\) is moderated by \(\texttt{intel}\) . To investigate this, we can estimate the following regression model:

\[\begin{aligned} \texttt{like}_i =& \beta_0 + \beta_{\texttt{attr}} \times \texttt{attr}_i + \beta_{\texttt{intel}} \times \texttt{intel}_i + \beta_{\texttt{fun}} \times \texttt{fun}_i \\ &+ \beta_{\texttt{attr} \times \texttt{intel}} \times (\texttt{attr} \times \texttt{intel})_i + \beta_{\texttt{fun} \times \texttt{intel}} \times (\texttt{fun} \times \texttt{intel})_i + \epsilon_i \end{aligned}\]
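A sketch of fitting this extended model in R, with all predictors centered (continuing the assumed `speeddate` data frame; `fun_cent` is constructed like the other centered predictors):

```r
speeddate$fun_cent <- speeddate$fun - mean(speeddate$fun)

mod_fun <- lm(like ~ attr_cent * intel_cent + fun_cent * intel_cent,
              data = speeddate)
summary(mod_fun)  # two interaction terms: attr_cent:intel_cent and fun_cent:intel_cent
```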

The results, having centered all predictors, are given in Table 6.4. As you can see there, the simple slopes of \(\texttt{attr}\), \(\texttt{intel}\), and \(\texttt{fun}\) are all positive. Each of these represents the effect of that predictor when the other predictors have the value 0. Because the predictors are centered, that means that e.g. the slope of \(\texttt{attr}\) reflects the effect of attractiveness for people with an average rating on intelligence and fun. As before, the estimated interaction between \(\texttt{attr}\) and \(\texttt{intel}\) is negative, indicating that attractiveness has less of an effect on liking for those seen as more intelligent, and that intelligence has less of an effect for those seen as more attractive. The hypothesis test of this effect is now also significant, indicating that we have reliable evidence for this moderation. This shows that by including more predictors in a model, it is possible to increase the reliability of the estimates for other predictors. There is also a significant interaction between \(\texttt{fun}\) and \(\texttt{intel}\). The estimated interaction is positive here. This indicates that fun has more of an effect on liking for those seen as more intelligent, and that intelligence has more of an effect for those seen as more fun. Perhaps you can think of a reason why intelligence appears to lessen the effect of attractiveness, but appears to strengthen the effect of fun…

Table 6.4: A model predicting liking from attractiveness, intelligence, and fun, and their interactions. All predictors are centered.
| | \(\hat{\beta}\) | \(\text{SS}\) | \(\text{df}\) | \(F\) | \(p(\geq \lvert F \rvert)\) |
|---|---|---|---|---|---|
| Intercept | 6.196 | 49585.8 | 1 | 38655.19 | 0.000 |
| \(\texttt{attr}\) | 0.345 | 414.1 | 1 | 322.80 | 0.000 |
| \(\texttt{intel}\) | 0.258 | 154.4 | 1 | 120.35 | 0.000 |
| \(\texttt{fun}\) | 0.383 | 429.0 | 1 | 334.41 | 0.000 |
| \(\texttt{attr} \times \texttt{intel}\) | -0.043 | 17.6 | 1 | 13.69 | 0.000 |
| \(\texttt{fun} \times \texttt{intel}\) | 0.032 | 10.0 | 1 | 7.83 | 0.005 |
| Error | | 1888.2 | 1472 | | |

6.2 Mediation

6.2.1 Legacy motives and pro-environmental behaviours

Zaval, Markowitz, & Weber (2015) investigated whether there is a relation between individuals’ motivation to leave a positive legacy in the world and their pro-environmental behaviours and intentions. The authors reasoned that long time horizons and social distance are key psychological barriers to pro-environmental action, particularly regarding climate change. But if people with a legacy motivation put more emphasis on future others than those without such motivation, they may also be motivated to behave more pro-environmentally in order to benefit those future others. In a pilot study, they recruited a diverse sample of 245 U.S. participants through Amazon’s Mechanical Turk. Participants answered three sets of questions: one assessing individual differences in legacy motives, one assessing their beliefs about climate change, and one assessing their willingness to take pro-environmental action. Following these sets of questions, participants were told they would be entered into a lottery to win a $10 bonus. They were then given the option to donate part (between $0 and $10) of their bonus to an environmental cause (Trees for the Future). This last measure was meant to test whether people actually act on any intention to act pro-environmentally.

For ease of analysis, the three sets of questions measuring legacy motive, belief about the reality of climate change, and intention to take pro-environmental action were transformed into three overall scores by computing the average over the items in each set. After eliminating participants who did not answer all questions, we have data from \(n = 237\) participants. Figure 6.4 depicts the pairwise relations between the four variables. As can be seen, all variables are significantly correlated. The relation is most obvious for \(\texttt{belief}\) and \(\texttt{intention}\). Looking at the histogram of \(\texttt{donation}\), you can see that although all whole amounts between $0 and $10 have been chosen at least once, three values were particularly popular, namely $0, $5, and to a lesser extent $10. This results in what looks like a tri-modal distribution. This is not necessarily an issue when modelling \(\texttt{donation}\) with a regression model, as the assumptions in a regression model concern the prediction errors, and not the dependent variable itself.

Figure 6.4: Pairwise plots for legacy motives, climate change belief, intention for pro-environmental action, and donations.

According to the Theory of Planned Behavior (Ajzen, 1991), attitudes and norms shape a person’s behavioural intentions, which in turn result in behaviour itself. In the context of the present example, that could mean that legacy motive and climate change beliefs do not directly determine whether someone behaves in a pro-environmental way. Rather, these factors shape a person’s intentions towards pro-environmental behaviour, which in turn may actually lead to said pro-environmental behaviour. This is an example of an assumed causal chain, where legacy motive (partly) determines behavioural intention, and intention determines behaviour. Mediation analysis is aimed at detecting an indirect effect of a predictor (e.g. \(\texttt{legacy}\)) on the dependent variable (e.g. \(\texttt{donation}\)), via another variable called the mediator (e.g. \(\texttt{intention}\)), which is the middle variable in the causal chain.

6.2.2 Causal steps

A traditional method to assess mediation is the so-called causal steps approach (Baron & Kenny, 1986). The basic idea behind the causal steps approach is as follows: if there is a causal chain from predictor (\(X\)) to mediator (\(M\)) to dependent variable (\(Y\)), then, ignoring the mediator for the moment, we should be able to see a relation between the predictor and the dependent variable. This relation reflects the indirect effect of the predictor on the dependent variable. We should also be able to detect an effect of the predictor on the mediator, as well as an effect of the mediator on the dependent variable. Crucially, if there is a true causal chain, then the predictor should not offer any additional predictive power over the mediator. Because the effect of the predictor is assumed to go only “through” the mediator, once we know the value of the mediator, this should be all we need to predict the dependent variable. In more fancy statistical terms, this means that conditional on the mediator, the dependent variable is independent of the predictor, i.e. \(p(Y \mid M, X) = p(Y \mid M)\). In the context of a multiple regression model, we could say that in a model where we predict \(Y\) from \(M\), the predictor \(X\) would not have a unique effect on \(Y\) (i.e. its slope would equal \(\beta_X = 0\)).

The causal steps approach (Figure 6.5) involves assessing a pattern of significant relations in three different regression models. The first model is a simple regression model where we predict \(Y\) from \(X\). In this model, we should find evidence for a relation between \(X\) and \(Y\), meaning that we can reject the null hypothesis that the slope of \(X\) on \(Y\) (referred to here as \(c\)) equals 0. The second model is a simple regression model where we predict \(M\) from \(X\). In this model, we should find evidence for a relation between \(X\) and \(M\), meaning that we can reject the null hypothesis that the slope of \(X\) on \(M\) (referred to here as \(a\)) equals 0. The third model is a multiple regression model where we predict \(Y\) from both \(M\) and \(X\). In this model, we should find evidence for a unique relation between \(M\) and \(Y\), meaning that we can reject the null hypothesis that the slope of \(M\) on \(Y\) (referred to here as \(b\)) equals 0. Controlling for the effect of \(M\) on \(Y\), in a true causal chain there should no longer be evidence for a relation between \(X\) and \(Y\) (as any relation between \(X\) and \(Y\) is captured through \(M\)). Hence, we should not be able to reject the null hypothesis that the slope of \(X\) on \(Y\) in this model (referred to here as \(c'\), to distinguish it from the relation between \(X\) and \(Y\) in the first model, which was labelled \(c\)) equals 0. If this is so, then we speak of full mediation. When there is still evidence of a unique relation between \(X\) and \(Y\) in the model that includes \(M\), but the relation is reduced (i.e. \(|c'| < |c|\)), we speak of partial mediation.

Figure 6.5: Assessing mediation with the causal steps approach involves testing parameters of three models. MODEL 1 is a simple regression model predicting \(Y\) from \(X\); the slope of \(X\) (\(c\)) should be significant. MODEL 2 is a simple regression model predicting \(M\) from \(X\); the slope of \(X\) (\(a\)) should be significant. MODEL 3 is a multiple regression model predicting \(Y\) from both \(X\) and \(M\). The slope of \(M\) (\(b\)) should be significant. The slope of \(X\) (\(c'\)) should not be significant (“full” mediation) or should be substantially smaller in absolute value (“partial” mediation).

6.2.2.1 Testing mediation of legacy motive by intention with the causal steps approach

Let’s see how the causal steps approach works in practice by assessing whether the relation between \(\texttt{legacy}\) and \(\texttt{donation}\) is mediated by \(\texttt{intention}\).
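The three models can be fit with three calls to `lm()`; a sketch, assuming a data frame (here called `legacy2015`) with columns `donation`, `legacy`, and `intention`:

```r
mod1 <- lm(donation  ~ legacy,             data = legacy2015)  # slope of legacy = c
mod2 <- lm(intention ~ legacy,             data = legacy2015)  # slope of legacy = a
mod3 <- lm(donation  ~ legacy + intention, data = legacy2015)  # slopes = c' and b
```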

In MODEL 1 (Table 6.5 ), we assess the relation between \(\texttt{legacy}\) and \(\texttt{donation}\) . In this model, we find a significant and positive relation between legacy motives and donations, such that people with stronger legacy motives donate more of their potential bonus to a pro-environmental cause. The question is now whether this is a direct effect of legacy motive, or an indirect effect “via” behavioural intent.

Table 6.5: Model 1: Simple regression model predicting donations from legacy motive
| | \(\hat{\beta}\) | \(\text{SE}(\hat{\beta})\) | \(t\) | \(p(\geq \lvert t \rvert)\) |
|---|---|---|---|---|
| Intercept | -0.325 | 0.833 | -0.39 | 0.697 |
| Legacy motive | 0.733 | 0.198 | 3.70 | 0.000 |

In MODEL 2 (Table 6.6 ), we assess the relation between \(\texttt{legacy}\) and \(\texttt{intention}\) . In this model, we find a significant and positive relation between legacy motives and intention to act pro-environmentally, such that people with stronger legacy motives have a stronger intention to act pro-environmentally.

Table 6.6: Model 2: Simple regression model predicting pro-environmental intent from legacy motive
| | \(\hat{\beta}\) | \(\text{SE}(\hat{\beta})\) | \(t\) | \(p(\geq \lvert t \rvert)\) |
|---|---|---|---|---|
| Intercept | 1.785 | 0.246 | 7.25 | 0.000 |
| Legacy motive | 0.267 | 0.059 | 4.56 | 0.000 |

In MODEL 3 (Table 6.7), we assess the relation between \(\texttt{legacy}\), \(\texttt{intention}\), and \(\texttt{donation}\). In this model, we find a significant and positive relation between intention to act pro-environmentally and donation to a pro-environmental cause, such that people with stronger intentions donate more. We also find evidence of a unique and positive effect of legacy motive on donation, such that people with stronger legacy motives donate more. Because there is still evidence of an effect of legacy motive on donations after controlling for the effect of behavioural intent, we would not conclude that the effect of legacy motive is fully mediated by intent. When you compare the slope of \(\texttt{legacy}\) in MODEL 3 to that in MODEL 1, you can however see that the (absolute) value is smaller. Hence, when controlling for the effect of behavioural intent, a one-unit increase in \(\texttt{legacy}\) is estimated to increase the amount donated less than in a model where \(\texttt{intention}\) is not taken into account.

Table 6.7: Model 3: Multiple regression model predicting donations from legacy motive and pro-environmental intent.
| | \(\hat{\beta}\) | \(\text{SE}(\hat{\beta})\) | \(t\) | \(p(\geq \lvert t \rvert)\) |
|---|---|---|---|---|
| Intercept | -1.961 | 0.889 | -2.21 | 0.028 |
| Legacy motive | 0.488 | 0.200 | 2.45 | 0.015 |
| Behavioral intent | 0.917 | 0.213 | 4.30 | 0.000 |

In conclusion, the causal steps approach indicates that the effect of legacy motive on pro-environmental action (donations) is partially mediated by pro-environmental behavioural intentions. There is a residual direct effect of legacy motive on donations that is not captured by behavioural intentions.

6.2.3 Estimating the mediated effect

One potential problem with the causal steps approach is that it is based on a pattern of significance in four hypothesis tests (one for each parameter \(a\), \(b\), \(c\), and \(c'\)). This can result in a rather low power of the procedure (MacKinnon, Fairchild, & Fritz, 2007), which seems particularly related to the requirement of a significant \(c\) (the direct effect of \(X\) on \(Y\) in the model without the mediator).

An alternative to the causal steps approach is to estimate the mediated (indirect) effect of the predictor on the dependent variable directly. Algebraically, this mediated effect can be worked out as (MacKinnon et al., 2007):

\[\begin{equation} \text{mediated effect} = a \times b \end{equation}\]

The rationale behind this is reasonably straightforward. The slope \(a\) reflects the increase in the mediator \(M\) for every one-unit increase in the predictor \(X\). The slope \(b\) reflects the increase in the dependent variable \(Y\) for every one-unit increase in the mediator. So a one-unit increase in \(X\) implies an increase in \(M\) by \(a\) units, which in turn implies an increase in \(Y\) of \(a \times b\) units. Hence, the mediated effect can be expressed as \(a \times b\).

In a single mediator model such as the one looked at here, the mediated effect \(a \times b\) turns out to be equal to \(c - c'\) , i.e. the difference between the direct effect of \(X\) on \(Y\) in a model without the mediator, and the unique direct effect of \(X\) on \(Y\) in a model which includes the mediator.

To test whether the mediated effect differs from 0, we can try to work out the sampling distribution of the estimated effect \(\hat{a} \times \hat{b}\) under the null hypothesis that in reality \(a \times b = 0\). Note that this null hypothesis can be true when \(a = 0\), \(b = 0\), or both \(a = b = 0\). In the so-called Sobel-Aroian test, this sampling distribution is assumed to be Normal. However, it has been found that this assumption is often inaccurate. As there is no method to derive an accurate sampling distribution analytically, modern procedures rely on simulation. There are different ways to do this, but we’ll focus on one, namely the nonparametric bootstrap approach (Preacher & Hayes, 2008). This involves generating a large number (e.g. \(>1000\)) of simulated datasets by randomly sampling \(n\) cases with replacement from the original dataset. This means that any given case (i.e. a row in the dataset) can occur 0, 1, 2, or more times in a simulated dataset. For each simulated dataset, we can estimate \(\hat{a} \times \hat{b}\) by fitting the two corresponding regression models. The variance in these estimates over the different datasets forms an estimate of the variance of the sampling distribution. A 95% confidence interval can then also be computed by determining the 2.5 and 97.5 percentiles. Because only the original data is used, there is no direct assumption made about the distribution of the variables, apart from the assumption that the original data is a representative sample from the Data Generating Process. Applying this procedure (with 1000 simulated datasets) provides a 95% confidence interval for \(a \times b\) of \([0.104, 0.446]\). As this interval does not contain the value 0, we reject the null hypothesis that the mediated effect of \(\texttt{legacy}\) on \(\texttt{donation}\) “via” \(\texttt{intention}\) equals 0.
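A minimal, self-contained version of this bootstrap in base R (continuing the assumed `legacy2015` data frame; the exact interval endpoints will vary with the random seed):

```r
set.seed(2015)
ab <- replicate(1000, {
  d <- legacy2015[sample(nrow(legacy2015), replace = TRUE), ]  # resample cases
  a <- coef(lm(intention ~ legacy, data = d))["legacy"]
  b <- coef(lm(donation ~ legacy + intention, data = d))["intention"]
  a * b  # the mediated effect in this bootstrap sample
})
quantile(ab, c(.025, .975))  # percentile 95% confidence interval for a*b
```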

Note that in solely focusing on the mediated effect, we do not address the issue of full vs partial mediation. Using our simulated datasets, we can however also compute a bootstrap confidence interval for \(c'\). For the present set of simulations, the 95% confidence interval for \(c'\) is \([0.189, 0.807]\). As this interval does not contain the value 0, we reject the null hypothesis that the unique direct effect of \(\texttt{legacy}\) on \(\texttt{donation}\) equals 0. This thus provides a similar conclusion to the causal steps approach.

15. Here, we analyse only a subset of their data.

16. Note that I’m using more descriptive labels here. If you prefer the more abstract version, then you can replace \(Y_i = \texttt{like}_i\), \(\beta_1 = \beta_{\texttt{attr}}\), \(X_{1,i} = \texttt{attr}_i\), \(\beta_2 = \beta_{\texttt{intel}}\), \(X_{2,i} = \texttt{intel}_i\).

17. The value for which the slope is 0 is easily worked out as \(\frac{\hat{\beta}_\texttt{attr}}{-\hat{\beta}_{\texttt{attr} \times \texttt{intel}}}\).

Recoding Introduction to Mediation, Moderation, and Conditional Process Analysis

8 Extending the fundamental principles of moderation analysis

As Hayes opened, “in this chapter, [we’ll see] how [the] principles of moderation analysis are applied when the moderator is dichotomous (rather than a continuum, as in the previous chapter) as well as when both focal antecedent and moderator are continuous (p. 267).”

8.1 Moderation with a dichotomous moderator

Here we load a couple of necessary packages, load the data, and take a glimpse().
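The code chunks did not survive the trip to this page, so here is a sketch of that setup (the file path to Hayes's disaster data is my assumption):

```r
library(tidyverse)
library(brms)

disaster <- read_csv("data/disaster/disaster.csv")
glimpse(disaster)
```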

Regardless of whether the antecedent variables are continuous or binary, the equation for the simple moderation is still

\[Y = i_Y + b_1 X + b_2 W + b_3 XW + e_Y.\]

We can use that equation to fit our first moderation model with a binary \(W\) (i.e., frame) like so.
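A sketch of the fit, with `justify` as \(Y\), `skeptic` as \(X\), and the dichotomous `frame` as \(W\) (the model name `model8.1` is my own label):

```r
model8.1 <- brm(justify ~ 1 + skeptic + frame + skeptic:frame,
                data = disaster, seed = 8)
```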

Check the summary.
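That might look something like this, using the label from the sketch above:

```r
print(model8.1)
```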

We’ll compute our Bayesian \(R^2\) in the typical way.
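With brms, that is a one-liner:

```r
bayes_R2(model8.1)
```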

This model should look familiar to you because it is exactly the same model estimated in the analysis presented in Chapter 7 (see Table 7.4, model 3). The only differences between these two analyses are how the corresponding question is framed, meaning which variable is deemed the focal antecedent and which is the moderator, and how these variables are symbolically labeled as \(X\) and \(W\) . In the analysis in Chapter 7, the focal antecedent variable was a dichotomous variable coding the framing of the cause of the disaster (labeled \(X\) then, but \(W\) now), whereas in this analysis, the focal antecedent is a continuous variable placing each person on a continuum of climate change skepticism (labeled \(W\) then, but \(X\) now), with the moderator being a dichotomous variable coding experimental condition. So this example illustrates the symmetry property of interactions introduced in section 7.1. (p. 272)

8.1.1 Visualizing and probing the interaction.

For the plots in this chapter, we’ll take our color palette from the ochRe package, which provides Australia-inspired colors. We’ll also use a few theme settings from good-old ggthemes. As in the last chapter, we’ll save our adjusted theme settings as an object, theme_08.

Happily, the ochRe package has a handy convenience function, viz_palette(), that makes it easy to preview the colors available in a given palette. We’ll be using “olsen_qual” and “olsen_seq”.
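A sketch of that preview (assuming ochRe is installed, e.g. from GitHub):

```r
# remotes::install_github("ropenscilabs/ochRe")
library(ochRe)

viz_palette(ochre_palettes[["olsen_qual"]])
viz_palette(ochre_palettes[["olsen_seq"]])
```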


Behold our Figure 8.3.

[Figure 8.3]

In addition to our fancy Australia-inspired colors, we’ll also play around a bit with spaghetti plots in this chapter. To my knowledge, this use of spaghetti plots is uniquely Bayesian. If you’re trying to wrap your head around what on earth we just did, take a look at the first few rows of the posterior_samples() object, post.

The head() function returned six rows, each one corresponding to the credible parameter values from a given posterior draw. The lp__ is uniquely Bayesian and beyond the scope of this project. You might think of sigma as the Bayesian analogue to what the OLS folks often refer to as error or the residual variance. Hayes doesn’t tend to emphasize it in this text, but it’s something you’ll want to pay increasing attention to as you move along in your Bayesian career. All the columns starting with b_ are the regression parameters, the model coefficients or the fixed effects. But anyways, notice that those b_ columns correspond to the four parameter values in Formula 8.2 on page 270. Here they are, but reformatted to more closely mimic the text:

  • \(\hat{Y} = 2.171 + 0.181 X - 0.621 W + 0.181 XW\)
  • \(\hat{Y} = 2.113 + 0.193 X - 0.535 W + 0.175 XW\)
  • \(\hat{Y} = 2.58 + 0.047 X - 0.339 W + 0.166 XW\)
  • \(\hat{Y} = 2.609 + 0.046 X - 0.394 W + 0.159 XW\)
  • \(\hat{Y} = 2.408 + 0.127 X - 0.591 W + 0.196 XW\)
  • \(\hat{Y} = 2.384 + 0.115 X - 0.596 W + 0.23 XW\)

Each row of post, each iteration or posterior draw, yields a full model equation that is a credible description of the data–or at least as credible as we can get within the limits of the model we have specified, our priors (which we typically cop out on and just use defaults in this project), and how well those fit when applied to the data at hand. So when we use brms convenience functions like fitted(), we pass specific predictor values through those 4000 unique model equations, which returns 4000 similar but distinct expected \(Y\)-values. So although a nice way to summarize those 4000 values is with summaries such as the posterior mean/median and 95% intervals, another way is to just plot an individual regression line for each of the iterations. That is what’s going on when we depict our models with a spaghetti plot.

The thing I like about spaghetti plots is that they give a three-dimensional sense of the posterior. Note that each individual line is very skinny and semitransparent. When you pile a whole bunch of them atop each other, the peaked or most credible regions of the posterior are the most saturated in color. Less credible posterior regions almost seamlessly merge into the background. Also, note how the combination of many similar but distinct straight lines results in a bowtie shape. Hopefully this clarifies where that shape’s been coming from when we use geom_ribbon() to plot the 95% intervals.
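Here's a minimal sketch of that workflow, assuming the disaster data, the model8.1 fit, and the tidyverse loaded. It plots a random subset of 100 posterior draws rather than all 4,000.

```r
# straight lines only need their two endpoints
nd <- crossing(frame = 0:1, skeptic = c(1, 9))

fitted(model8.1, newdata = nd, summary = F) %>%  # draws in rows, cases in columns
  data.frame() %>%
  set_names(str_c(nd$frame, "_", nd$skeptic)) %>%
  mutate(iter = 1:n()) %>%
  slice_sample(n = 100) %>%                      # keep a subset of the draws
  pivot_longer(-iter) %>%
  separate(name, into = c("frame", "skeptic"), sep = "_", convert = T) %>%
  ggplot(aes(x = skeptic, y = value, group = iter)) +
  geom_line(alpha = 1/10, size = 1/4, color = "orange3") +
  facet_wrap(~ frame)
```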

Back to the text, on the bottom of page 274, Hayes pointed out the conditional effect of skeptic when frame == 1 is \(b_1 + b_3 = 0.306\) . We can show that with a little arithmetic followed up with tidybayes::mean_qi() .
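For example, assuming brms named the relevant columns b_skeptic and b_frame:skeptic:

```r
library(tidybayes)

post %>%
  transmute(`b1 + b3` = b_skeptic + `b_frame:skeptic`) %>%
  mean_qi()
```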

But anyways, you could recode frame in a number of ways, including if_else() or, in this case, by simple arithmetic.
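A sketch of the arithmetic approach, assuming the data live in a data frame called disaster:

```r
disaster <-
  disaster %>%
  mutate(frame_ep = 1 - frame)
```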

With frame_ep in hand, we’re ready to refit the model.

Our results match nicely with the formula on page 275.

If you want to follow along with Hayes on page 276 and isolate the 95% credible intervals for the skeptic parameter, you can use the posterior_interval() function.
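For example, the following returns percentile-based intervals for all the model parameters, from which you can pick out the b_skeptic row. The name model8.2 for the refit is just a guess at the running convention.

```r
posterior_interval(model8.2) %>%
  round(digits = 3)
```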

8.2 Interaction between two quantitative variables

Here’s the glbwarm data.

In this section we add three covariates (i.e., \(C\) variables) to the basic moderation model. Although Hayes made a distinction between the \(X\), \(W\), and \(C\) variables in the text, that distinction is conceptual and doesn't impact the way we enter them into brm(). Rather, the brm() formula clarifies they're all just predictors.
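As a sketch, using the glbwarm variable names from Hayes's text (govact as \(Y\), negemot as \(X\), age as \(W\), and the covariates posemot, ideology, and sex; the name model8.3 is a guess at the running convention):

```r
model8.3 <-
  brm(data = glbwarm,
      family = gaussian,
      govact ~ 1 + negemot + age + negemot:age + posemot + ideology + sex,
      cores = 4,
      seed = 8)

print(model8.3, digits = 3)
```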

Our results cohere nicely with the Hayes’s formula in the middle of page 278 or with the results he displayed in Table 8.2.

Here’s the \(R^2\) summary.

As the \(R^2\) is a good bit away from the boundaries, it’s nicely Gaussian.


8.2.1 Visualizing and probing the interaction.

For our version of Figure 8.5, we’ll need to adjust our nd data for fitted() .

Our fitted() and ggplot2 code will be quite similar to the last spaghetti plot. Only this time we’ll use filter() to reduce the number of posterior draws we show in the plot.


When we reduce the number of lines depicted in the plot, we lose some of the three-dimensional illusion. It's nice, however, to get a closer look at each individual line. To each their own.

We’ll continue with our spaghetti plot approach for Figure 8.7. Again, when we made the JN technique plot for Chapter 7, we computed values for the posterior mean and the 95% intervals. Because the intervals follow a bowtie shape, we had to compute the \(Y\) -values for many values across the x-axis in order to make the curve look smooth. But as long as we stick with the spaghetti plot approach, all we need are the values at the endpoints of each iteration. Although each line is straight, the combination of many lines is what produces the bowtie effect.


In other words, each of those orange lines is a credible expression of \(\theta_{X \rightarrow Y}\) (i.e., \(b_1 + b_3 W\) ) across a continuous range of \(W\) values.

8.3 Hierarchical versus simultaneous entry

Many investigators test a moderation hypothesis in regression analysis using a method that on the surface seems different than the procedure described thus far. This alternative approach is to build a regression model by adding the product of \(X\) and \(W\) to a model already containing \(X\) and \(W\) . This procedure is sometimes called hierarchical regression or hierarchical variable entry (and easily confused by name with hierarchical linear modeling , which is an entirely different thing). The goal using this method is to determine whether allowing \(X\) ’s effect to be contingent on \(W\) produces a better fitting model than one in which the effect of \(X\) is constrained to be unconditional on \(W\) . According to the logic of hierarchical entry, if the contingent model accounts for more of the variation in Y than the model that forces \(X\) ’s effect to be independent of \(W\) , then the better model is one in which \(W\) is allowed to moderate \(X\) ’s effect. Although this approach works, it is a widely believed myth that it is necessary to use this approach in order to test a moderation hypothesis. (p. 289, emphasis in the original)

Although this method is not necessary, it can be handy to slowly build your model. This method can also serve nice rhetorical purposes in a paper. Anyway, here’s our multivariable but non-moderation model, model8.4 .

Here we’ll compute the corresponding \(R^2\) and compare it with the one for the original interaction model with a difference score.

Note that the Bayesian \(R^2\) performed differently than the \(F\) -test in the text.


We can also compare these with the LOO, which, as is typical of information criteria, corrects for model complexity. First, we compute them and attach the results to the model fit objects.

Now use the loo_compare() function to compare them directly.
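A sketch of both steps:

```r
# compute and attach the LOO estimates
model8.1 <- add_criterion(model8.1, criterion = "loo")
model8.4 <- add_criterion(model8.4, criterion = "loo")

# compare the models
loo_compare(model8.1, model8.4, criterion = "loo") %>%
  print(simplify = F)
```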

As a reminder, we generally prefer models with lower information criteria, which in this case is clearly the moderation model (i.e., model8.1 ). However, the standard error value (i.e., se_diff ) for the difference (i.e., elpd_diff ) is quite large, which suggests that the model with the lowest value isn’t the clear winner. Happily, these results match nicely with the Bayesian \(R^2\) difference score. The moderation model appears somewhat better than the multivariable model, but its superiority is hardly decisive.

8.4 The equivalence between moderated regression analysis and a \(2 \times 2\) factorial analysis of variance

I’m just not going to encourage ANOVA \(F\) -testing methodology. However, I will show the Bayesian regression model. First, here are the data.

Fit the moderation model.

Those results don’t look anything like what Hayes reported in Tables 8.3 or 8.4. However, a little deft manipulation of the posterior samples can yield equivalent results to Hayes’s Table 8.3.

Here are the cell-specific means in Table 8.3.

And here are the marginal means from Table 8.3.
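As a sketch of the general approach, suppose model8.5 follows the formula interest ~ 1 + policy + kerry + policy:kerry. Then each cell mean is a sum of coefficients, and each marginal mean is an average of two cell means. The Y_bar subscripts below are my own labeling, not necessarily Hayes's.

```r
post <- posterior_samples(model8.5)

post <-
  post %>%
  mutate(Y_bar_00 = b_Intercept,                                          # policy = 0, kerry = 0
         Y_bar_10 = b_Intercept + b_policy,                               # policy = 1, kerry = 0
         Y_bar_01 = b_Intercept + b_kerry,                                # policy = 0, kerry = 1
         Y_bar_11 = b_Intercept + b_policy + b_kerry + `b_policy:kerry`)  # policy = 1, kerry = 1

# cell-specific means
post %>%
  pivot_longer(starts_with("Y_bar")) %>%
  group_by(name) %>%
  mean_qi(value)

# e.g., the marginal mean for policy = 1, collapsed across kerry
post %>%
  transmute(Y_bar_policy_1 = (Y_bar_10 + Y_bar_11) / 2) %>%
  mean_qi()
```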

For kicks and giggles, here's what the cell-specific means look like in box plots.


And here are the same for the marginal means. This time we’ll show the shapes of the posteriors with violin plots with horizontal lines depicting the median and interquartile ranges.


On page 294, Hayes used point estimates to compute the simple effect of policy information among Kerry supporters and then the same thing among Bush supporters. Here’s how we’d do that when working with the full vector of posterior draws.

So then computing the main effect for policy information using the simple effects is little more than an extension of those steps.

And we get the same results by strategically subtracting the marginal means.

The main effect for candidate is similarly computed using either approach.

We don’t have an \(F\) -test for our Bayesian moderation model. But we do have an interaction term. Here’s its distribution.


Following Hayes’s work on the bottom of page 295, here’s how you’d reproduce that by manipulating our \(\overline Y\) vectors.

Extending that logic, we also get the answer this way.

8.4.1 Simple effects parameterization.

We might reacquaint ourselves with the formula from model8.5 .

The results cohere nicely with the “Model 1” results at the top of Table 8.5.

The Bayesian \(R^2\) portion looks on point, too.

Our various Y_bar transformations from before continue to cohere with the coefficients, above, just like in the text. E.g., the policy coefficient may be returned like so.

We can continue to use Hayes’s Y_bar transformations to return the kerry coefficient, too.

Here we compute \(b_3\) with the difference between the simple effects of \(X\) at levels of \(W\) .

And now \(b_{3}\) with the difference between the simple effects of \(W\) at levels of \(X\) .

8.4.2 Main effects parameterization.

A nice feature of brms is you can transform your data right within the brm() or update() functions. Here we’ll make our two new main-effects-coded variables, policy_me and kerry_me , with the mutate() function right within update() .

Transforming your data within the brms functions won’t change the original data structure. However, brms will save the data used to fit the model within the brm() object. You can access that data like so.
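Here's a sketch of both points, assuming the caskets data and a -.5/.5 main-effects coding (my guess at the coding scheme). Note that in brms the formula argument to update() is formula., with the trailing dot.

```r
model8.6 <-
  update(model8.5,
         newdata = caskets %>%
           mutate(policy_me = policy - .5,
                  kerry_me  = kerry  - .5),
         formula. = interest ~ 1 + policy_me + kerry_me + policy_me:kerry_me,
         cores = 4,
         seed = 8)

# brms saved the data used to fit the model within the fit object
model8.6$data %>%
  head()
```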

But we digress. Here’s our analogue to the “Model 2” portion of Table 8.5.

Like with model8.5, above, we'll need a bit of algebra to compute our \(\overline Y_i\) vectors.

With our post for model8.6 in hand, we'll follow the formulas at the top of page 298 to compute our \(b_1\) and \(b_2\) distributions.

Hayes pointed out that the interaction effect, \(b_3\), is the same across his OLS Models 1 and 2. This is largely true for our Bayesian HMC models, model8.5 and model8.6.

However, the results aren’t exactly the same because of simulation error. If you were working on a project requiring high precision, increase the number of posterior iterations. To demonstrate, here we’ll increase each chain’s post-warmup iteration count by an order of magnitude, resulting in 80,000 post-warmup iterations rather than the default 4,000.

Now they’re quite a bit closer.

And before you fixate on the differences that remain even after 80,000 iterations apiece, consider comparing the two density plots.


8.4.3 Conducting a \(2 \times 2\) between-participants factorial ANOVA using another regression model with brms, rather than PROCESS.

Since we’re square in single-level regression land with our brms approach, there’s no direct analogue for us, here. However, notice the post-ANOVA \(t\) -tests Hayes presented on page 300. If we just want to consider the \(2 \times 2\) structure of our two dummy variables as indicative of four groups, we have one more coding system available for the job. With the handy str_c() function, we’ll concatenate the policy and kerry values into a nominal variable, policy_kerry . Here’s what that looks like:

Now check out what happens if we reformat our formula to interest ~ 0 + policy_kerry .
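A sketch of both steps, assuming the caskets data; the name model8.7 is a guess at the running convention.

```r
caskets <-
  caskets %>%
  mutate(policy_kerry = str_c(policy, kerry))

model8.7 <-
  brm(data = caskets,
      family = gaussian,
      interest ~ 0 + policy_kerry,
      cores = 4,
      seed = 8)
```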

The brm() function recognized policy_kerry was a character vector and treated it as a nominal variable. The 0 + part of the formula removed the model intercept. Here's how that affects the output.

Without the typical intercept, brm() estimated the means for each of the four policy_kerry groups. It’s kinda like an intercept-only model, but with four intercepts. Here’s what their densities look like:


Since each of the four primary vectors in our post object corresponds to a group mean, it's trivial to compute difference scores. To compute the difference scores analogous to Hayes's two \(t\)-tests, we'd do the following.
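With the intercept-free parameterization above, brms names the four group-mean coefficients after the levels of policy_kerry (e.g., b_policy_kerry00). A sketch of two such contrasts, the simple effects of policy information within each group of candidate supporters:

```r
post <- posterior_samples(model8.7)

post %>%
  transmute(`policy effect | kerry = 0` = b_policy_kerry10 - b_policy_kerry00,
            `policy effect | kerry = 1` = b_policy_kerry11 - b_policy_kerry01) %>%
  mean_qi()
```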

Hayes, A. F. (2018). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. (2nd ed.). New York, NY, US: The Guilford Press.



Section 7.3: Moderation Models, Assumptions, Interpretation, and Write Up

Learning Objectives

At the end of this section you should be able to answer the following questions:

  • What are some basic assumptions behind moderation?
  • What are the key components of a write up of moderation analysis?

Moderation Models 

Difference between mediation & moderation.

The main difference between moderation, which is a simple interaction as in ANOVA models, and mediation is that mediation implies a causal sequence. In this case, we know that stress causes ill effects on health, so stress would be the causal factor.

Some predictor variables interact in a sequence, rather than impacting the outcome variable singly or as a group (as in ordinary regression).

Moderation and mediation are forms of regression that allow researchers to analyse how a third variable affects the relationship between the predictor and the outcome variable.

Moderation analyses imply an interaction: the effect of the predictor on the outcome differs across the levels of M.

PowerPoint: Basic Moderation Model

Consider the model below:

  • Chapter Seven – Basic Moderation Model

Would the muscle percentage be the same for young, middle-aged, and older participants after training? We know that it is harder to build muscle as we age, so would training have a lower effect on muscle growth in older people?

Example Research Question:

Does cyberbullying moderate the relationship between perceived stress and mental distress?

Moderation Assumptions

  • The dependent and independent variables should be measured on a continuous scale.
  • There should be a moderator variable that is a nominal variable with at least two groups.
  • The variables of interest (the dependent variable and the independent and moderator variables) should have a linear relationship, which you can check with a scatterplot.
  • The data must not show multicollinearity (see Multiple Regression).
  • There should be no significant outliers, and the distribution of the variables should be approximately normal.

Moderation Interpretation

PowerPoint: Moderation menu, results and output

Please have a look at the following link for the Moderation Menu and Output:

  • Chapter Seven – Moderation Output

Interpretation

The effects of cyberbullying can be seen in blue, with the perceived stress in green. These are the main effects of the X and M variables on the outcome variable (Y). The interaction effect can be seen in purple. This will tell us whether perceived stress is affecting mental distress equally at average, lower-than-average, and higher-than-average levels of cyberbullying. If this is significant, then there is a difference in that effect. As can be seen in yellow and grey, cyberbullying has an effect on mental distress, but the effect is stronger for those who report higher levels of cyberbullying (see graph).

Simple slope plot

Moderation Write Up

The following text represents a moderation write up:

A moderation test was run, with perceived stress as the predictor, mental distress as the dependent variable, and cyberbullying as a moderator. There was a significant main effect of perceived stress on mental distress, b = 1.23, BCa CI [1.11, 1.34], z = 21.38, p < .001, and a significant main effect of cyberbullying on mental distress, b = 1.05, BCa CI [0.72, 1.38], z = 6.28, p < .001. There was a significant interaction of cyberbullying with perceived stress on mental distress, b = 0.05, BCa CI [0.01, 0.09], z = 2.16, p = .031. Participants who reported higher than average levels of cyberbullying experienced a greater effect of perceived stress on mental distress (b = 1.35, BCa CI [1.19, 1.50], z = 17.1, p < .001) when compared to average or lower than average levels of cyberbullying (b = 1.23, BCa CI [1.11, 1.34], z = 21.3, p < .001; b = 1.11, BCa CI [0.95, 1.27], z = 13.8, p < .001, respectively). From these results, it can be concluded that the effect of perceived stress on mental distress is moderated by cyberbullying.

Statistics for Research Students Copyright © 2022 by University of Southern Queensland is licensed under a Creative Commons Attribution 4.0 International License , except where otherwise noted.


Moderation Analysis in SPSS

Discover Moderation Analysis in SPSS! Learn how to perform it, understand SPSS output, and report results in APA style. Check out this simple, easy-to-follow guide below for a quick read!


Introduction

Moderation analysis is a valuable tool in research, allowing researchers to understand how the relationship between two variables changes depending on a third variable, known as the moderator. This analysis is crucial for gaining insights into complex relationships and identifying conditions under which certain effects occur. As the field of data analysis grows, the ability to perform and interpret moderation analysis becomes increasingly important.

Using SPSS, a widely-used statistical software, can simplify the process of conducting moderation analysis. This blog post aims to provide a comprehensive guide on performing moderation analysis in SPSS. We will cover the fundamental concepts, differentiate between mediation and moderation, and outline the steps and assumptions involved in testing moderation. Additionally, we will explore practical examples, interpret SPSS output, and provide guidance on reporting results in APA format.

PS: This post explains the traditional regression method in SPSS for moderation analysis. If you prefer to use the Hayes PROCESS Macro, please visit our guide on “ Moderation Analysis with Hayes PROCESS Macro in SPSS .”

What is Moderation Analysis?

Moderation analysis examines how the relationship between an independent variable (X) and a dependent variable (Y) changes as a function of a third variable, called the moderator (M). The moderator can either strengthen, weaken, or reverse the effect of the independent variable on the dependent variable. By including a moderator, researchers can capture more nuanced relationships and better understand the conditions under which certain effects are stronger or weaker.


This type of analysis is particularly useful in social sciences, where the impact of one variable on another often depends on additional contextual factors. For instance, the effect of stress on performance might vary depending on levels of social support. Moderation analysis helps researchers identify these conditional effects, providing deeper insights into the dynamics of the studied relationships.

What are Steps in Testing Moderation?

  • Center the Moderator and Independent Variable: Mean-center the independent variable and the moderator to reduce multicollinearity and simplify the interpretation of the interaction term.
  • Create Interaction Term: Multiply the centered independent variable and the centered moderator to create an interaction term.
  • Run Regression Analysis: Enter the independent variable, moderator, and interaction term into a multiple regression model predicting the dependent variable.
  • Plot Interaction: Plot the interaction to visualise how the relationship between the independent variable and the dependent variable changes at different levels of the moderator (see the R sketch after this list).
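To make the four steps concrete, here is a small R sketch with simulated data; the variable names and the numbers are invented for illustration.

```r
library(tidyverse)

set.seed(42)
n <- 200
d <- tibble(stress  = rnorm(n),
            support = rnorm(n)) %>%
  mutate(performance = -0.5 * stress + 0.3 * support +
                        0.4 * stress * support + rnorm(n))

d <- d %>%
  mutate(stress_c  = stress  - mean(stress),   # step 1: mean-center
         support_c = support - mean(support),
         inter     = stress_c * support_c)     # step 2: interaction term

fit <- lm(performance ~ stress_c + support_c + inter, data = d)  # step 3
summary(fit)

# step 4: plot simple slopes at low, average, and high support
nd <- crossing(stress_c  = range(d$stress_c),
               support_c = c(-1, 0, 1) * sd(d$support_c))
nd$yhat <- predict(fit, newdata = nd)

ggplot(nd, aes(x = stress_c, y = yhat, color = factor(support_c))) +
  geom_line() +
  labs(color = "support (centered)")
```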

Which Method is Better: the Hayes PROCESS Macro or Traditional Regression for Moderation Analysis?

Choosing between Hayes PROCESS Macro and traditional regression for moderation analysis depends on your research needs and statistical expertise. The Hayes PROCESS Macro offers a user-friendly interface, automating many steps of the moderation analysis and providing bootstrap confidence intervals for the interaction effects. This method reduces human error and enhances result reliability, making it a preferred choice for those who seek convenience and precision.

In contrast, traditional regression requires manual computation of interaction terms and more steps in the analysis process. While it offers flexibility and a deeper understanding of the moderation process, it demands a higher level of statistical knowledge. The regression might be better suited for researchers who prefer customising their analyses and exploring the underlying data in more detail. Both methods have their advantages, and the choice ultimately depends on the research context and the user’s familiarity with statistical tools.

In this blog, we will give details about regression for moderation analysis, but you can visit the Hayes PROCESS post to see details of the method.

What are the Assumptions of Moderation Analysis?

  • Linearity: The relationships between the independent variable, moderator, and dependent variable must be linear.
  • Independence of Errors: The error terms in the regression equations should be independent of each other.
  • No Multicollinearity: The independent variable, moderator, and their interaction term should not be highly correlated with each other.
  • Homoscedasticity: The variance of the error terms should be constant across all levels of the independent variable and the moderator.
  • Normality: The residuals of the regression equations should be normally distributed.
  • Measurement without Error: The variables involved in the moderation analysis should be measured accurately without error.

What is the Hypothesis of Moderation Analysis?

The primary hypothesis in moderation analysis posits that the strength or direction of the relationship between an independent variable (X) and a dependent variable (Y) depends on the level of a third variable, the moderator (M).

  • H0 (the null hypothesis): The interaction term does not significantly predict the dependent variable (meaning there is no moderation effect).
  • H1 (the alternative hypothesis): The interaction term significantly predicts the dependent variable (indicating the presence of a moderation effect).

Testing these hypotheses involves examining the interaction term in the regression model to determine if the moderation effect is statistically significant.

An Example of Moderation Analysis

Consider a study examining the impact of work stress (X) on job performance (Y) and how this relationship is moderated by social support (M). The hypothesis posits that the negative effect of work stress on job performance will be weaker for employees with high social support compared to those with low social support. To test this, researchers would first mean-center the variables of work stress and social support.

Next, researchers would create an interaction term by multiplying the centered work stress and social support variables. By entering work stress, social support, and the interaction term into a regression model predicting job performance, researchers can assess the main effects and the interaction effect. If the interaction term is significant, it indicates that social support moderates the relationship between work stress and job performance.

How to Perform Moderation Analysis in SPSS


Step by Step: Running Moderation Analysis in SPSS Statistics

Let’s embark on a step-by-step guide on performing the Moderation Analysis using SPSS

– Open your dataset in SPSS, ensuring it includes the independent variable (X), dependent variable (Y), and moderator (M).

Center the Variables

– Compute the mean of the independent variable and the moderator, then subtract these means from their respective variables to create centered variables.

Create Interaction Term

– Multiply the centered independent variable by the centered moderator to create an interaction term.

Run Regression Analysis

– Navigate to `Analyze > Regression > Linear`.

– Enter the dependent variable (Y) into the Dependent box.

– Move the centered independent variable (X) and the centered moderator (M) into the Independent(s) box, then click Next and, in Block 2, enter the interaction term.

– Click OK to run the regression analysis.

Interpret the Output

– Examine the coefficients table to assess the significance of the independent variable, moderator, and interaction term.

– A significant interaction term indicates moderation.

Note: Conducting Moderation Analysis in SPSS provides a robust foundation for understanding the key features of your data. Always ensure that you consult the documentation corresponding to your SPSS version, as steps might slightly differ based on the software version in use. This guide is tailored for SPSS version 25 , and for any variations, it’s recommended to refer to the software’s documentation for accurate and updated instructions.

SPSS Output for Moderation Analysis


How to Interpret SPSS Output of Moderation Analysis

When interpreting the SPSS output of your moderation analysis, focus on three key tables: Model Summary, ANOVA, and Coefficients.

Model Summary Table:

  • R: This represents the correlation between the observed and predicted values of the dependent variable. Higher values indicate a stronger relationship.
  • R Square (R²): This value indicates the proportion of variance in the dependent variable explained by the independent, moderator, and interaction variables. An R² value closer to 1 suggests a better fit.
  • Adjusted R Square: Adjusts the R² value for the number of predictors in the model. This value is useful for comparing models with different numbers of predictors.

ANOVA Table:

  • F-Statistic: This tests the overall significance of the model. A significant F-value (p < 0.05) indicates that the model significantly predicts the dependent variable.
  • Sig. (p-value): If the p-value is less than 0.05, the model is considered statistically significant, meaning the independent variable, moderator, and interaction term together significantly predict the dependent variable.

Coefficients Table:

  • Unstandardized Coefficients (B): The expected change in the dependent variable for a one-unit increase in the predictor, holding the other predictors constant.
  • Constant (Intercept): The expected value of the dependent variable when all predictors are zero.
  • Standardized Coefficients (Beta): These coefficients are useful for comparing the relative strength of each predictor in the model.
  • t-Statistic and Sig. (p-value): Indicates whether each predictor is significantly contributing to the model. If the p-value is less than 0.05, the predictor is considered statistically significant.

By focusing on these tables, you can effectively interpret the results of your moderation analysis in SPSS, identifying the main effects and the interaction effect, as well as the overall model significance.

How to Report Results of Moderation Analysis in APA

Reporting the results of moderation analysis in APA (American Psychological Association) format requires a structured presentation. Here’s a step-by-step guide in list format:

  • Introduction : Briefly describe the purpose of the moderation analysis and the variables involved.
  • Descriptive Statistics : Report the means and standard deviations of the independent variable, moderator, and dependent variable.
  • Main Effects : Provide the regression coefficients, standard errors, and p-values for the independent variable and moderator.
  • Interaction Effect : Report the regression coefficient, standard error, and p-value for the interaction term.
  • Model Summary : Include R² and adjusted R² values to indicate the model fit.
  • Significance Tests : Present the results of the F-test and the significance levels for the overall model.
  • Plot Interaction : Include a plot illustrating the interaction effect, showing how the relationship between the independent variable and the dependent variable changes at different levels of the moderator.
  • Figures and Tables : Provide tables and figures to visually represent the statistical results and interaction effects.
  • Conclusion : Summarise the key results and suggest directions for future research.



Understanding and Using Mediators and Moderators

  • Published: 06 June 2007
  • Volume 87, pages 367–392 (2008)


Amery D. Wu & Bruno D. Zumbo


Mediation and moderation are two theories for refining and understanding a causal relationship. Empirical investigation of mediators and moderators requires an integrated research design rather than the data analyses driven approach often seen in the literature. This paper described the conceptual foundation, research design, data analysis, as well as inferences involved in a mediation and/or moderation investigation in both experimental and non-experimental (i.e., correlational) contexts. The essential distinctions between the investigation of mediators and moderators were summarized and juxtaposed in an example of a causal relationship between test difficulty and test anxiety. In addition, the more elaborate models, moderated mediation and mediated moderation, the use of structural equation models, and the problems with model misspecification were discussed conceptually.



Path analysis and factor analysis are special cases of SEM. A path analysis is a type of SEM in which each variable has only one indicator and the relationships among the variables are specified. Hence, a path analysis approach to mediation and moderation does not deal with the problem of measurement error; however, it can handle multiple univariate regression analyses in one model. A factor analysis is a type of SEM in which each latent variable has multiple indicators, and hence deals with the measurement error problem, but no relationships are specified among the latent variables. A full SEM incorporates and integrates path analysis and factor analysis; the latent variables have multiple indicators and their relationships are modeled.


Author information

Authors and affiliations.

Department of ECPS, University of British Columbia, Scarfe Building, 2125 Main Mall, Vancouver, BC, Canada, V6T 1Z4

Amery D. Wu & Bruno D. Zumbo


Corresponding author

Correspondence to Bruno D. Zumbo .


About this article

Wu, A.D., Zumbo, B.D. Understanding and Using Mediators and Moderators. Soc Indic Res 87 , 367–392 (2008). https://doi.org/10.1007/s11205-007-9143-1


Received : 08 May 2007

Accepted : 14 May 2007

Published : 06 June 2007

Issue Date : July 2008


  • Moderated mediation
  • Mediated moderation
  • Cause and effect
  • Structural equation model
  • Experimental design

Moderator Variables

  • Matthew S. Fritz, Department of Educational Psychology, University of Nebraska - Lincoln
  • Ann M. Arthur, Department of Educational Psychology, University of Nebraska - Lincoln
  • https://doi.org/10.1093/acrefore/9780190236557.013.86
  • Published online: 25 January 2017

Moderation occurs when the magnitude and/or direction of the relation between two variables depend on the value of a third variable called a moderator variable. Moderator variables are distinct from mediator variables, which are intermediate variables in a causal chain between two other variables, and confounder variables, which can cause two otherwise unrelated variables to be related. Determining whether a variable is a moderator of the relation between two other variables requires statistically testing an interaction term. When the interaction term contains two categorical variables, analysis of variance (ANOVA) or multiple regression may be used, though ANOVA is usually preferred. When the interaction term contains one or more continuous variables, multiple regression is used. Multiple moderators may be operating simultaneously, in which case higher-order interaction terms can be added to the model, though these higher-order terms may be challenging to probe and interpret. In addition, interaction effects are often small in size, meaning most studies may have inadequate statistical power to detect these effects.

When multilevel models are used to account for the nesting of individuals within clusters, moderation can be examined at the individual level, the cluster level, or across levels in what is termed a cross-level interaction. Within the structural equation modeling (SEM) framework, multiple group analyses are often used to test for moderation. Moderation in the context of mediation can be examined using a conditional process model, while moderation of the measurement of a latent variable can be examined by testing for factorial invariance. Challenges faced when testing for moderation include the need to test for treatment by demographic or context interactions, the need to account for excessive multicollinearity, and the need for care when testing models with multiple higher-order interactions terms.

  • interaction
  • multilevel moderation
  • latent variable interactions
  • conditional process

Overview of Current Status

When the strength of the association between two variables is conditional on the value of a third variable, this third variable is called a moderator variable . That is, the magnitude and even the direction of the relation between one variable, usually referred to as a predictor or independent variable , and a second variable, often called an outcome or dependent variable , depends on the value of the moderator variable. Consider baking bread in an oven. In general, the higher the temperature of the oven (independent variable), the faster the bread will finish baking (dependent variable). But consider a baker making two different types of bread dough, one with regular white flour and the other with whole-wheat flour. Keeping the temperature constant, if the bread made with whole-wheat flour took longer to finish baking than the bread made with white flour, then the type of flour would be a moderator variable, because the relation between temperature and cooking time differs depending on the type of flour that was used. Note that moderating variables are not necessarily assumed to directly cause the outcome to change, only to be associated with change in the strength and/or the direction of the association between the predictor and the outcome.

Moderator variables are extremely important to psychologists because they provide a more detailed explanation of the specific circumstances under which an observed association between two variables holds and whether this association is the same for different contexts or groups of people. This is one reason why contextual variables and demographic variables, such as age, gender, ethnicity, socioeconomic status, and education, are some of the most commonly examined moderator variables in psychology. Moderator variables are particularly useful in experimental psychology to explore whether a specific treatment always has the same effect or if differential effects appear when another condition, context, or type of participant is introduced. That is, moderator variables advance our understanding of the effect. For example, Avolio, Mhatre, Norman, and Lester (2009) conducted a meta-analysis of leadership intervention studies and found that the effect of leadership interventions on a variety of outcome variables differed depending on whether the participants were all- or majority-male compared to when the participants were all- or majority-female.

The most important issue to consider when deciding whether a variable is a moderator of the relation between two other variables is the word different , because if the relation between two variables does not differ when the value of the third variable changes, the third variable is not a moderator variable and therefore must be playing some other role, if any. As illustrated in Figure 1 , a third variable is a confounder variable when it explains all or part of the relation between an independent variable and an outcome, but unlike a moderating variable, the magnitude of the relation between the independent and dependent variable does not change as the value of the confounder variable changes. A classic example of a confounding effect is the significant positive relation between ice cream consumption and violent crime. Ice cream consumption does not cause an increase in violent crime or vice versa; rather, the rise in both can be explained in part by a third variable—warmer temperatures (Le Roy, 2009 ). Moderator variables are also often confused with mediator variables , which are intermediate variables in a causal chain, such that changes in the independent variable (or antecedent ) cause changes in the mediator variable, which then cause changes in the outcome variable (or consequent ). For example, receiving cognitive-behavioral therapy (CBT; independent variable) has been found to cause reductions in negative thinking (mediating variable), and the reduction in negative thinking in turn reduces depressive symptoms (outcome variable; Kaufman, Rohde, Seeley, Clarke, & Stice, 2005 ). Moderator variables are not assumed to be part of a causal chain.


Figure 1. Path model diagrams for mediator, confounding, and moderator variables.

Interaction Models

When a moderator variable is present, such that the strength of the relation between an independent and dependent variable differs depending on the value of the moderator variable, the moderator variable is said to moderate the relation between the other two variables. The combined effect of the moderator variable with the independent variable is also called an interaction to reflect the interplay between the two variables, which differs from the individual effects of the independent and moderator variables on the dependent variable. This means that although the moderator variable changes the relation between the independent variable and outcome, the strength of the relation between the moderator variable and the outcome in turn differs depending on the values of the independent variable. Hence, the independent and moderator variables simultaneously moderate the relation between the other variable and the outcome. When an interaction term is statistically significant, it is not possible to interpret the effect of the independent variable alone because the effect depends on the level of the moderator variable.

Categorical by Categorical (2x2)

To illustrate the idea of an interaction, consider the finding by Revelle, Humphreys, Simon, and Gilliland (1980) that the relation between caffeine consumption and performance on a cognitive ability task is moderated by personality type. Specifically, Revelle et al. (1980) used a 2x2 between-subjects analysis of variance (ANOVA) design to examine the impact of consuming caffeine (independent variable; 0 mg or 200 mg) and personality type (moderator variable; introvert vs. extrovert) on cognitive performance (outcome; score on a practice GRE test). Examination of the mean performance for the main effect of caffeine, which is the effect of caffeine collapsing across the personality type factor and shown in Figure 2a, demonstrates that the participants who received caffeine performed better than those who did not receive caffeine. Hence, one might categorically conclude that caffeine improves performance for everyone. In turn, the mean performance for the main effect of personality, which is the effect of personality type collapsing across the caffeine factor (Figure 2b), shows that introverts performed better than extroverts. When the means are plotted for the four cross-factor groups in the study (Figure 2c), however, it is apparent that although caffeine increased the performance of the extroverts, it actually decreased the performance of the introverts. Therefore, personality moderates the relation between caffeine and performance. In turn, caffeine moderates the relation between personality and performance because although introverts performed better than the extroverts regardless of caffeine consumption, the difference in performance between introverts and extroverts is larger for those who did not receive caffeine than those who did. Note that the vertical axis only shows a limited range of observed outcome values, so the response scale may have limited the real differences.


Figure 2. 2x2 Interaction: (a) Main effect of caffeine; (b) Main effect of personality type (black = introvert, white = extrovert); (c) Interaction between caffeine and personality type on Day 1 (black/solid = introvert, white/dotted = extrovert); and (d) Interaction between caffeine and personality on Day 2.

Finding a statistically significant interaction term in an ANOVA model tells us that moderation is occurring, but provides no further information about the specific form of the interaction (unless one looks at the coefficient for the interaction, which is usually ignored in ANOVA, but will be considered when moderator variables are discussed in the multiple regression context). Full understanding of the relation between the independent and moderator variables requires examination of the interaction in more detail, a process called probing (Aiken & West, 1991 ). Probing an interaction in ANOVA typically involves testing each of the simple main effects , which are the effects of the independent variable at each level of the moderator. In the caffeine example, there are two simple main effects of the independent variable at levels of the moderator variable: the simple main effect of caffeine for introverts, represented by the solid line in Figure 2c , and the simple main effect of caffeine for extroverts, represented by the dashed line. The plot makes it clear that caffeine had a larger effect on performance for the extroverts than the introverts (i.e., the ends of the dashed line are farther apart vertically than the ends of the solid line), but the plot alone cannot show whether there is a significant effect of caffeine in either of the personality groups; hence the need for statistical tests.

Another way to conceptualize moderation is to say that moderation occurs when the simple main effects of an independent variable on an outcome are not the same for all levels of the moderator variable. If the effect of caffeine on performance was the same for both introverts and extroverts, the two simple main effects would be the same and the two lines in Figure 2c would be parallel. Instead, the two simple main effect lines are not parallel, indicating different simple main effects (i.e., moderation). Despite the moderating effect of personality on the relation between caffeine and performance illustrated in Figure 2c , the introverts always performed better than the extroverts in this study. As a result, though the lines are not parallel and must cross at some point, the lines do not intersect in the figure. When the simple main effect lines do not intersect within the observed range of values, the interaction is said to be ordinal (Lubin, 1961 ) because the groups maintain their order (e.g., introverts always outperform extroverts). When the simple main effect lines cross within the observed range of values, the interaction is said to be disordinal because the groups do not have the same order for all values of the moderator. A disordinal interaction is illustrated in Figure 2d , which again shows the same simple main effects of caffeine on performance for the different personality types, but for individuals who completed the same protocol the following day (Revelle et al., 1980 ).

What is important to consider when probing an interaction is what effect the moderator has on the relation between the other two variables. For example, the relation between the independent and dependent variables may have the same sign and be statistically significant for all values of the moderator, in which case the moderator only changes the magnitude of the relation. Alternatively, the relation between the independent and dependent variables may not be statistically significant at all values of the moderator, indicating that the relation exists only for specific values of the moderator. A third possibility is that the relation between the independent and dependent variables is statistically significant, but opposite in sign for different values of the moderator. This would indicate that the direction of the relation between the variables depends on the moderator. These are very different interaction effects that the statistical significance of the interaction term alone will not differentiate between, which is why probing interactions is essential to describing the effect of a moderator variable.

There are two additional issues to consider. First, the labeling of one variable as the independent variable and the other variable as the moderator is guided by theory. Because a significant interaction means that caffeine is also moderating the effect of personality on performance, the simple main effects of personality at levels of caffeine may also be considered; in this case, the simple main effect of personality type on performance for the 0 mg caffeine group and the simple main effect of personality type on performance for the 200 mg caffeine group. Since the statistical model is the same regardless of whether personality is the independent variable and caffeine is the moderator or vice versa, the assignment of roles to these variables is left up to the researcher. Second, while the 2x2 ANOVA framework is a simple design that lends itself to probing interactions, splitting a continuous variable at its mean or median in order to force continuous variables to fit into the ANOVA framework is a very bad idea, as it not only results in a loss of information that decreases statistical power, but also increases the likelihood of finding spurious interaction effects (Maxwell & Delaney, 1993 ).

Categorical by Categorical (3x3)

Probing a significant interaction in a 2x2 ANOVA is relatively straightforward because there are only two levels of each factor. When a simple main effect is statistically significant, there is a difference in the average score on the dependent variable between the two levels of the independent variable for that specific value of the moderator. The significant overall interaction then tells us that the difference in the means for the two levels of the independent variable is not the same for both values of the moderator. When there are more than two levels, probing an interaction in ANOVA becomes more complicated. For example, imagine if Revelle et al. (1980) had employed a 3x3 ANOVA design, where participants were randomized to one of three levels of caffeine (e.g., 0, 100, and 200 mg) and personality type was also allowed to have three levels (e.g., introvert, neutral, extrovert). In this case, a significant main effect of caffeine would only tell us that the mean performance in at least one of the caffeine groups differed from the mean performance in at least one other group, collapsing across personality type, but not specifically which caffeine groups differed in mean performance. Determining which groups differed requires a main effect contrast, also called a main comparison, which specifically compares two or more of the groups. For example, a main effect contrast could be used to examine the mean difference in performance between just the 100 mg and 200 mg groups.

The same issue extends to probing the interaction because a significant interaction in the 3x3 ANOVA case only demonstrates that the simple main effects of caffeine are not the same for all levels of personality type (and vice versa), but not specifically how the simple main effects of caffeine differ or for which of the three personality types. One way to probe a 3x3 (or larger) interaction is to first individually test all simple main effects for significance. Then, for any simple main effect that is found to be significant (e.g., the effect of caffeine just for introverts), a comparison can be used to test for differences between specific levels of the independent variable for that simple main effect (e.g., 100 mg vs. 200 mg just for introverts), called a simple effect contrast or simple comparison. Alternatively, instead of starting with simple main effects, a significant interaction effect can be probed by beginning with a main comparison (e.g., 100 mg vs. 200 mg). If the main comparison is significant, then one can test whether the main comparison effect differs as a function of personality type (e.g., does the difference in performance between 100 mg and 200 mg differ between any of the personality types?), which is called a main effect contrast by factor interaction. If the main effect contrast by factor interaction is significant, the effect can be further examined by testing whether the main effect contrast on the independent variable (e.g., 100 mg vs. 200 mg) differs at specific levels of the moderator (e.g., neutral vs. extrovert). That is, a contrast by contrast interaction specifies contrasts on both factors. For example, testing can show whether the difference in mean performance between the 100 mg and 200 mg caffeine groups differs for neutrals compared to extroverts, which essentially reduces the question back to a 2x2 interaction.

Probing interactions in ANOVA when the factors have more than a few levels can lead to a large number of statistical tests. When a large number of these post hoc tests are examined, there is a danger that the probability of falsely finding a significant mean difference (i.e., making a Type I error) increases beyond a reasonable level (e.g., 0.05). When that happens, a Type I error correction needs to be applied to bring the probability of falsely finding a significant difference across all of the post hoc tests, called the experimentwise Type I error rate, back down to an appropriate level. The best known of these corrections is the Bonferroni, but Maxwell and Delaney (2004) show that the Bonferroni overcorrects when the number of post hoc tests is more than about nine. Alternatives to the Bonferroni include the Dunnett correction, for comparing one reference level to each other level of the factor; the Tukey correction, for all pairwise comparisons of levels; and the Scheffé correction, for all possible post hoc tests. A Tukey-corrected follow-up is sketched below.
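As one illustration, the snippet below (with invented dose groups and means) applies the Tukey correction to all pairwise comparisons among three caffeine doses:

```python
# Sketch: Tukey-corrected pairwise comparisons among three (invented) doses
import numpy as np
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "dose": np.repeat(["0mg", "100mg", "200mg"], 50),
    "performance": np.concatenate([rng.normal(m, 5, 50) for m in (50, 56, 57)]),
})

# All pairwise mean differences, holding the experimentwise error rate at .05
result = pairwise_tukeyhsd(df["performance"], df["dose"], alpha=0.05)
print(result.summary())
```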

Continuous by Categorical

Although not discussed in detail here, interactions between categorical variables can also be assessed using multiple regression rather than ANOVA. When one or both of the variables involved in the interaction is continuous, however, multiple regression must be used to test moderation hypotheses (Blalock, 1965; Cohen, 1968). The regression framework permits a moderation hypothesis to be specified with any combination of categorical and continuous variables. Consider the continuous by categorical variable interaction from Sommet, Darnon, and Butera (2015), who examined interpersonal conflict regulation strategies in social situations. 2 When faced with a disagreeing partner, people generally either employ a competitive strategy or conform to their partner's point of view. Sommet et al. found that the relation between performance-approach goals (e.g., "did you try to show the partner was wrong"; continuous predictor) and competitive regulation scores (continuous outcome) differs depending on the person's relative academic competence compared to their partner (same, superior, or unspecified; categorical moderator). The significant interaction indicates that the association between performance-approach goals and competitive regulation is stronger when the partner is of superior or unspecified competence than when the partner has the same competence (Figure 3).


Figure 3. Categorical by continuous variable moderation.

Probing a significant interaction in multiple regression when the predictor is continuous and the moderator variable is categorical differs from probing interactions in ANOVA, but it can be straightforward depending on how the categorical moderator is incorporated into the regression model. There are many methods for representing nominal or ordinal variables in regression equations (e.g., Cohen, Cohen, West, & Aiken, 2003; Pedhazur, 1997), though this article focuses only on dummy codes. Creating dummy codes for a categorical variable with k levels requires k − 1 dummy variables (D_1, D_2, ... D_{k−1}). Using the Sommet et al. (2015) example, where competence has three groups (k = 3), two dummy variables are needed: D_1 and D_2. Dummy variables are created by first selecting a reference group, which receives a zero on all of the dummy variables. Each of the non-reference groups receives a one for one dummy variable (though not the same dummy variable as any other non-reference group) and a zero for all other dummy variables. If same-competence is selected as the reference group, then one potential set of dummy codes is: D_1 = {same = 0, superior = 1, unspecified = 0} and D_2 = {same = 0, superior = 0, unspecified = 1}. Both dummy variables are then entered into the regression model as predictors. To create the interaction between the predictor and the dummy variables, each of the dummy variables must be multiplied by the continuous predictor and added into the regression model as well. For the interpersonal conflict example, the overall regression model for computing predicted competitive regulation scores (^ denotes a predicted score) from the number of performance-approach goals, relative academic competence, and the interaction between goals and competence is:

$$\widehat{Competitive} = b_0 + b_1 Goals + b_2 D_1 + b_3 D_2 + b_4 (Goals \times D_1) + b_5 (Goals \times D_2) \quad (1)$$

If regression coefficient b_4, b_5, or both are significant, then there is a significant interaction between goals and competence.

Interpreting and probing the interaction between a continuous predictor and a categorical moderator in regression is much easier when using the overall regression equation. Consider what happens when the dummy-code values for the competence reference group (i.e., same competence) are substituted into the overall regression model:

$$\widehat{Competitive} = b_0 + b_1 Goals + b_2(0) + b_3(0) + b_4(Goals \times 0) + b_5(Goals \times 0) = b_0 + b_1 Goals \quad (2)$$

Since the same-competence group has 0's for D_1 and D_2, the overall regression equation reduces to just b_0 and b_1. This reduced regression equation represents the relation between performance-approach goals and competitive regulation scores for individuals who have the same academic competence as their partners. Equation 2 is called a simple regression equation because it is analogous to the simple main effect in ANOVA. The b_1 coefficient, which represents the relation between goals and competitive regulation for individuals with the same competence, is called the simple slope. But what do the other coefficients in the overall regression model represent?

If the dummy variable values for the superior-competence group are substituted into the equation and the terms are rearranged, the result is:

$$\widehat{Competitive} = (b_0 + b_2) + (b_1 + b_4)\,Goals \quad (3)$$

Since b_0 and b_1 are the intercept and simple slope for the same-competence group, b_2 is the difference in the intercept and b_4 is the difference in the simple slope between the same- and superior-competence groups. This means that if b_4 is significantly different from zero, the simple slopes for the same- and superior-competence groups differ from one another, and academic competence therefore moderates the relation between goals and competitive regulation. The simple regression equation can also be computed for the unspecified-competence group. These three simple regression lines are illustrated in Figure 3 and show that higher performance-approach goal scores are associated with greater competitive regulation, but that the strength of this association differs based on the relative competence of the partner.

The significance of b_4 and b_5 demonstrates whether or not the relation between the predictor and outcome variable is moderated by the categorical moderator variable, but a significant interaction does not reveal whether the relation between the predictor and the outcome is statistically significant in any of the groups. Since b_1 is automatically tested for significance by most statistical software packages, there is no need to worry about testing the simple slope for the reference group. Aiken and West (1991) provide equations for computing the standard errors for testing the other two simple slopes, [b_1 + b_4] and [b_1 + b_5], for significance. Alternatively, the dummy coding can be revised to make another group the reference category (e.g., superior-competence), the complete model re-estimated, and the significance of the new b_1 value used to test the simple slope for the new reference group, as in the sketch below.
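A sketch of this workflow follows, using simulated stand-ins for the Sommet et al. (2015) variables; the names goals, competence, and competitive and all effect sizes are invented for illustration.

```python
# Sketch: continuous predictor x categorical moderator with dummy coding
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 300
df = pd.DataFrame({
    "goals": rng.normal(4, 1.5, n),
    "competence": rng.choice(["same", "superior", "unspecified"], n),
})
slope = df["competence"].map({"same": 0.2, "superior": 0.6, "unspecified": 0.5})
df["competitive"] = 1 + slope * df["goals"] + rng.normal(0, 1, n)

# 'goals * C(...)' expands to the dummies, the predictor, and the products;
# Treatment(reference='same') makes same-competence the reference group, so
# the 'goals' coefficient is b1 (the reference group's simple slope) and the
# interaction coefficients are the b4/b5 slope differences from the text.
m_same = smf.ols("competitive ~ goals * C(competence, Treatment(reference='same'))",
                 data=df).fit()
print(m_same.params)

# To test another group's simple slope directly, re-reference and refit:
m_sup = smf.ols("competitive ~ goals * C(competence, Treatment(reference='superior'))",
                data=df).fit()
print(m_sup.params["goals"])  # simple slope for the superior group
```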

Another characteristic of the simple regression equations that may be of interest is the intersection point of two simple regression lines, which is the value of the predictor variable at which the predicted value of the outcome variable is the same for two different values of the moderator variable. Looking at Figure 3, the superior- and same-competence simple regression lines appear to intersect at around 5 on the performance-approach goals variable. The exact value of the intersection point can be calculated by setting the simple regression equations for these two groups equal to each other and solving for the value of goals, as shown below. While the intersection point is where the predicted scores for two simple regression equations are exactly the same, the points at which the predicted scores for two simple regression lines begin to differ statistically from each other can also be computed. Called regions of significance (Potthoff, 1964), these are conceptually similar to a confidence interval centered around the intersection point of the two simple regression lines. For any value of the predictor closer to the intersection point than the boundaries of the regions of significance, the predicted outcome values for the two simple regression lines are not statistically significantly different from one another. For any value of the predictor farther from the intersection point than the boundaries, the predicted outcome values are statistically significantly different from one another.
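As a worked example under the coefficient labels used above, setting the same-competence line (Equation 2) equal to the superior-competence line (Equation 3) gives the intersection point:

```latex
% Solve for the value of Goals at which the two predicted scores are equal
\begin{aligned}
b_0 + b_1\,Goals &= (b_0 + b_2) + (b_1 + b_4)\,Goals \\
0 &= b_2 + b_4\,Goals \\
Goals &= -\frac{b_2}{b_4}
\end{aligned}
```

That is, the lines cross where the intercept difference is exactly offset by the accumulated slope difference.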

Continuous by Continuous

Interactions between a continuous predictor and a continuous moderator variable can also be examined using the multiple regression framework. An example of a continuous by continuous variable interaction is that although injustice (continuous predictor) has positive relations with retaliatory responses such as ruminative thoughts and negative emotions (continuous outcomes), mindfulness (continuous moderator) reduces these associations (Long & Christian, 2015). That is, high levels of mindfulness reduce rumination and negative emotions (e.g., anger) by decoupling the self from experiences and disrupting the automaticity of reactive processing. Long and Christian administered measures of mindfulness, perceived unfairness at work, ruminative thoughts, outward-focused anger, and retaliation behavior. They found that the positive relation between injustice and anger was strong at lower levels of mindfulness but substantially weaker at higher levels of mindfulness (see Figure 4).


Figure 4. Continuous by continuous interaction.

Similar to continuous predictor by categorical moderator interactions in multiple regression, with continuous predictor by continuous moderator interactions each variable is entered into the regression model, and then the product of the two variables is entered as a separate predictor representing the interaction. For the anger example, the overall regression model predicting anger from perceived injustice, mindfulness, and the interaction between injustice and mindfulness is:

$$\widehat{Anger} = b_0 + b_1 Injustice + b_2 Mindfulness + b_3 (Injustice \times Mindfulness)$$

As with a continuous by categorical interaction, interactions between two continuous variables are probed by investigating the simple regression equations of the outcome variable on the predictor at different levels of the moderator. Unlike categorical moderator variables, where one can show how the simple slopes differ between the groups, a continuous moderator variable may not have specific values of interest. If there are specific values of the continuous moderator that are of interest to the researcher, then the simple regression equations can be computed by substituting those values into the overall regression equation. In the absence of specific values of interest, Aiken and West (1991) recommend examining the mean of the moderator, one standard deviation above the mean, and one standard deviation below the mean. While these values may seem somewhat arbitrary, they provide information about what is happening at the average score on the moderator, as well as a good range of moderator values without going too far into the tails, where there are likely to be very few observations.

A trick that makes interpreting a continuous by continuous variable interaction easier is to mean center the predictor and moderator variables, but not the outcome variable, prior to creating the interaction term. When injustice and mindfulness are mean centered before they are entered into the complete regression equation and the simple regression equation is calculated at the mean of the moderator, which is zero when the moderator is mean centered, the overall regression model reduces to:

$$\widehat{Anger} = b_0 + b_1 Injustice$$

Then b_0 and b_1 in the overall regression model are equal to the intercept and simple slope for participants with an average level of mindfulness, rather than for a person with zero mindfulness.
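A minimal sketch of this centering-and-probing workflow follows, with simulated stand-ins for the Long and Christian (2015) variables; all names and effect sizes are invented.

```python
# Sketch: continuous x continuous moderation with mean centering, probing
# simple slopes at the mean and +/- 1 SD of the moderator (invented data)
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 400
df = pd.DataFrame({"injustice": rng.normal(3, 1, n),
                   "mindfulness": rng.normal(3, 1, n)})
df["anger"] = (2 + 0.8 * df["injustice"] - 0.3 * df["mindfulness"]
               - 0.4 * df["injustice"] * df["mindfulness"] + rng.normal(0, 1, n))

# Mean center the predictor and moderator, but not the outcome
df["inj_c"] = df["injustice"] - df["injustice"].mean()
df["mind_c"] = df["mindfulness"] - df["mindfulness"].mean()

fit = smf.ols("anger ~ inj_c * mind_c", data=df).fit()
b = fit.params  # b["inj_c"] is the simple slope at mean mindfulness

# Simple slope of injustice at chosen values of the centered moderator
sd = df["mind_c"].std()
for label, m in [("-1 SD", -sd), ("mean", 0.0), ("+1 SD", sd)]:
    print(f"simple slope at {label}: {b['inj_c'] + b['inj_c:mind_c'] * m:.3f}")
```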

One issue not yet considered is the signs of the regression coefficients themselves. There are two possibilities. When the regression coefficients for the predictor and the interaction are opposite in sign, buffering or dampening interactions occur, in which larger moderator values decrease the relation between the predictor and the outcome. The distinction is based on whether a beneficial phenomenon is being decreased (dampening) or a harmful phenomenon is being decreased (buffering). The mindfulness effect in Figure 4 is a buffering moderator because it reduces the effect of a harmful phenomenon, injustice. Alternatively, if the regression coefficients for the predictor and interaction term have the same sign, positive or negative, then increasing values of the moderator are related to a stronger relation between the predictor and the outcome variable. This is called a synergistic or exacerbating interaction depending on whether the phenomenon being examined is beneficial or harmful to the individual, respectively. Mathematically, buffering and dampening interactions (or synergistic and exacerbating interactions) are identical, so the distinction is based purely on theory.

Standardized Interaction Coefficients

Given many psychologists' preference for reporting standardized regression coefficients, researchers should be aware that when regression models include higher-order terms (e.g., interaction terms or curvilinear terms), the standardized coefficients produced by most statistical software packages are incorrect. Consider the unstandardized regression equation for a dependent variable Y and two predictors X_1 and X_2:

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$$

The standardized coefficients can be calculated by multiplying each unstandardized coefficient by the standard deviation of the corresponding predictor divided by the standard deviation of Y (Cohen et al., 2003), or equivalently by creating z-scores for Y, X_1, and X_2 (i.e., standardizing the variables by mean centering each variable, then dividing by its standard deviation) and then estimating the model using the standardized variables (Z_Y, Z_X1, and Z_X2) such that:

$$\hat{Z}_Y = b_1^* Z_{X_1} + b_2^* Z_{X_2}$$

where a standardized regression coefficient is denoted with an asterisk.

As previously described, in order to test whether X_2 moderates the relation between Y and X_1, a new variable must be created in the data set that is the product of the two predictors, X_1 X_2, and entered into the regression model as a separate predictor, resulting in the equation:

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_1 X_2$$

The software program is unaware, however, that this new predictor X_1 X_2 is an interaction term and not just another continuous predictor. This means that, when the software calculates the standardized coefficients, it converts all of the variables in the model into z-scores, so the standardized coefficients come from the following regression equation:

$$\hat{Z}_Y = b_1^* Z_{X_1} + b_2^* Z_{X_2} + b_3^* Z_{X_1 X_2}$$

Unfortunately, Z_X1X2, the z-score of the product term, is not equal to the product of the z-scores, Z_X1 Z_X2. Hence, b_3* is not the correct estimate of the standardized interaction coefficient. To obtain the correct estimate, a researcher must manually create Z_Y, Z_X1, Z_X2, and Z_X1 Z_X2 and fit the model:

$$\hat{Z}_Y = b_{0Z} + b_{1Z} Z_{X_1} + b_{2Z} Z_{X_2} + b_{3Z} Z_{X_1} Z_{X_2}$$

and then use the unstandardized value b_3Z. While using the unstandardized solution from a regression of standardized variables to obtain the correct standardized regression coefficients seems counterintuitive, comparing the unstandardized coefficient b_3Z computed from the standardized variables with the standardized coefficient b_3* from the unstandardized variables makes the discrepancy evident in the output. And though the difference in the coefficients may be small, it can lead to large differences in inference and interpretation (Aiken & West, 1991; Cohen et al., 2003; Friedrich, 1982).
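The following sketch illustrates the discrepancy with invented data: the coefficient on the product of z-scores (the correct b_3Z) differs from the coefficient the software would report after standardizing the raw product term (the incorrect b_3*).

```python
# Sketch: correct vs. incorrect standardized interaction coefficients
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 500
df = pd.DataFrame({"x1": rng.normal(10, 2, n), "x2": rng.normal(5, 1, n)})
df["y"] = (1 + 0.5 * df["x1"] + 0.3 * df["x2"]
           + 0.2 * df["x1"] * df["x2"] + rng.normal(0, 2, n))

z = lambda s: (s - s.mean()) / s.std()
zd = pd.DataFrame({"zy": z(df["y"]), "zx1": z(df["x1"]), "zx2": z(df["x2"])})

# Correct: unstandardized coefficient on the product of z-scores (b3Z)
correct = smf.ols("zy ~ zx1 * zx2", data=zd).fit().params["zx1:zx2"]

# Incorrect: z-scoring the raw product term itself (what software reports, b3*)
zd["zprod"] = z(df["x1"] * df["x2"])
incorrect = smf.ols("zy ~ zx1 + zx2 + zprod", data=zd).fit().params["zprod"]

print(f"b3Z (correct) = {correct:.3f}, b3* (incorrect) = {incorrect:.3f}")
```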

Curvilinear

Though not always included in discussions of moderator variables, curvilinear change that can be described with a polynomial regression model (i.e., quadratic, cubic, etc.) is a form of moderation, albeit one where a variable moderates itself. Consider the classic finding in psychology that the relation between physiological arousal and task performance is shaped like an inverted U (i.e., quadratic; Yerkes & Dodson, 1908), illustrated in Figure 5. If the relation between arousal and performance for very low levels of arousal were described using a straight line, the result would be a regression line with a very steep positive slope. That is, when someone has low arousal, even small increases in arousal can lead to large increases in predicted performance. Describing the same relation for medium levels of arousal would result in a regression line with a very shallow slope, such that a slight increase in arousal would be met with only a slight increase in predicted performance. For very high levels of arousal, the regression line would again have a very steep slope, but now the slope is negative, such that small increases in arousal lead to large decreases in predicted performance. Therefore, the relation between arousal and performance differs depending on the level of arousal, so arousal is both the predictor and the moderator variable. This dual role is shown clearly in the regression equation for the quadratic relation between performance and arousal:

$$\widehat{Performance} = b_0 + b_1 Arousal + b_2 Arousal^2$$

because the squared quadratic term that produces the curvature is the product of arousal times arousal, the same form as the interaction terms between the predictor and the moderator variable in the two previous examples.


Figure 5. Quadratic relation between arousal and performance.
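A quadratic term is entered like any other product term. A minimal sketch with invented arousal and performance data:

```python
# Sketch: fitting an inverted-U (quadratic) relation; data are invented
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
arousal = rng.uniform(0, 10, 300)
performance = 20 + 8 * arousal - 0.8 * arousal**2 + rng.normal(0, 3, 300)
df = pd.DataFrame({"arousal": arousal, "performance": performance})

# I() protects the squared term inside the formula
fit = smf.ols("performance ~ arousal + I(arousal**2)", data=df).fit()
b1, b2 = fit.params.iloc[1], fit.params.iloc[2]  # linear and quadratic terms

# Arousal moderates itself: the instantaneous slope is b1 + 2*b2*arousal
for a in (2, 5, 8):
    print(f"slope at arousal = {a}: {b1 + 2 * b2 * a:.2f}")
```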

Three-Way Interactions

Up until this point in the discussion of moderators, the focus has been only on the interaction between two variables, an independent variable and a single moderator, known as a two-way interaction. But there is no reason why two or more moderator variables cannot be considered simultaneously. Returning to the Revelle et al. (1980) example, the researchers believed that time of day also had an impact on the relation between caffeine and performance, so they collected data from participants in the morning on the first day and in the afternoon on the second day. Figures 2c and 2d clearly show that the interaction between caffeine and personality type differs depending on whether the participants completed the study in the morning (Day 1) or in the afternoon (Day 2). That is, personality type moderates the relation between caffeine and performance, but time of day moderates the interaction between personality and caffeine. The moderation of a two-way interaction by another moderator variable is called a three-way interaction. As with two-way interactions in ANOVA, a significant three-way interaction is probed by testing a combination of post hoc effects including simple main effects, simple comparisons, contrast by factor interactions, and contrast by contrast interactions (Keppel & Wickens, 2004). In regression, probing a significant three-way interaction involves selecting values for both moderator variables and entering these values simultaneously into the overall regression equation to compute the simple regression equations (Aiken & West, 1991). Three-way interactions can also come into play with curvilinear relations. For example, the relation between two variables may be cubic, necessitating an X^3 term, or the quadratic relation between two variables may vary as a function of a third variable.

There are two very important considerations when examining three-way interactions. First, whenever a higher-order interaction is tested in a model, all lower-order effects must be included in the model. For a three-way interaction, this means that all two-way interactions as well as all main effects must be included in the model (Cohen, 1978). This is most easily illustrated in regression. For example, suppose the two-way interaction between injustice and mindfulness in the Long and Christian (2015) example was found to differ depending on the person's gender. 3 The correct regression equation would be:

$$\begin{aligned} \widehat{Anger} = b_0 &+ b_1 Injustice + b_2 Mindfulness + b_3 Gender \\ &+ b_4 (Injustice \times Mindfulness) + b_5 (Injustice \times Gender) + b_6 (Mindfulness \times Gender) \\ &+ b_7 (Injustice \times Mindfulness \times Gender) \end{aligned}$$

which includes the three-way interaction between injustice, mindfulness, and gender, the three two-way interactions between these variables, as well as the three first-order effects. As described before, when the highest-order term is significant, no lower-order terms should be interpreted without consideration of the levels of the other variables.
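In formula-based software, the expansion to all lower-order terms can be automated. A sketch with invented data and effect sizes, with gender coded as a two-level factor:

```python
# Sketch: a three-way interaction; 'a * b * c' expands to all main effects,
# all two-way products, and the three-way product (invented data)
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 600
df = pd.DataFrame({
    "injustice": rng.normal(0, 1, n),    # assumed already mean centered
    "mindfulness": rng.normal(0, 1, n),
    "gender": rng.choice(["male", "female"], n),
})
g = (df["gender"] == "female").astype(float)
df["anger"] = (0.8 * df["injustice"] - 0.4 * df["injustice"] * df["mindfulness"]
               - 0.3 * df["injustice"] * df["mindfulness"] * g
               + rng.normal(0, 1, n))

fit = smf.ols("anger ~ injustice * mindfulness * C(gender)", data=df).fit()
print(fit.params)
```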

Current Trends in Moderation

After defining moderator variables, providing an overview of the different types of interactions most likely to be encountered by psychologists, and discussing how to probe significant interactions between variables, the next section summarizes current trends in moderation analysis. Recent advances in moderation research have been focused in three areas: (1) moderator variables in the context of clustered data, (2) moderation with latent variables, and (3) models that have both moderator and mediator variables.

Multilevel and Cross-Level Moderation

Multilevel models (Raudenbush & Bryk, 2002; Snijders & Bosker, 2012), also called hierarchical linear models, mixed models, and random effects models, are a type of regression model used when participants are nested or clustered within organizational hierarchies, such as patients within hospitals, students within classrooms, or even repeated measurements within individuals. Nesting is of interest because nested data violate the assumption of independence between participants, which causes the estimates of the standard errors for the regression coefficients to be too small. For example, two children in the same classroom might be expected to be more alike than two children in different classrooms. The degree of similarity of participants within a group or cluster is quantified by the intraclass correlation coefficient, which is the proportion of the total variance that is shared between groups. Multilevel models work by dividing the total variability in scores on the outcome variable into different levels that reflect the nested structure of the data. Two-level models are most commonly used, although any number of levels is possible, such as students (Level 1) nested within teachers (Level 2) nested within schools (Level 3) nested within school districts (Level 4), and so on. Once the variability in the outcome has been attributed to the different levels of nesting, predictors, moderators, and interactions can be added to the model to explain the variability at the different levels in the same manner as in single-level regression models.

Where multilevel models differ from single-level regression models regarding moderation, however, is that multilevel models can specify how variables occurring at one level influence relations involving variables at another level. Seaton, Marsh, and Craven (2010) use the Big-Fish-Little-Pond effect to illustrate this concept, which states that although individual mathematics ability has a positive relation with mathematics self-concept, higher school-average ability reduces this association. Here a two-level model is used because the students (Level 1) are nested within schools (Level 2). 4 In a simplified version of their model, Seaton et al. predicted individual mathematics self-concept (outcome variable) from individual mathematics ability (Level 1 predictor):

$$SelfConcept_{ij} = \beta_{0j} + \beta_{1j}(MathAbility_{ij}) + r_{ij}$$

where i indexes individuals, j indexes schools, r_ij is the Level 1 residual, and individual mathematics ability has been centered at the mean for each school.

The Level 1 model is at the student level and predicts self-concept for student i in school j. This model has an intercept (β_0j) representing self-concept at mean mathematics ability and a slope (β_1j) representing the effect of mathematics ability on self-concept. It is possible, however, that the effect of mathematics ability on mathematics self-concept is not the same for all schools. To allow the relation between mathematics ability and self-concept to differ across schools, β_0j and β_1j are allowed to vary across schools, hence the subscript j and why they are called random coefficients. In other words, each school is allowed to have its own intercept and slope. To model the variability in the intercept and slope of the Level 1 model between schools, two Level 2 models are created at the school level:

$$\beta_{0j} = \gamma_{00} + u_{0j}$$
$$\beta_{1j} = \gamma_{10} + u_{1j}$$

The Level 1 intercept (β_0j) is partitioned into a mean intercept across schools (γ_00) and a random effect (u_0j), which represents the difference between the mean intercept across schools and the specific intercept for each school. In the same way, the Level 1 slope (β_1j) is partitioned into the mean slope across schools (γ_10) and a random effect (u_1j), which represents the difference between the effect of individual mathematics ability averaged across schools and the effect of individual mathematics ability for a specific school.

Since β_0j and β_1j are allowed to vary by school, this variability in the random coefficients may be explained by adding school-level predictors to the Level 2 equations. For example, Seaton et al. (2010) added average school mathematics ability, centered at the grand mean, as a Level 2 predictor of both the Level 1 intercept and slope:

$$\beta_{0j} = \gamma_{00} + \gamma_{01}(SchoolAbility_j) + u_{0j}$$
$$\beta_{1j} = \gamma_{10} + \gamma_{11}(SchoolAbility_j) + u_{1j}$$

While a complete dissection of this model is beyond the scope of the current discussion, when the Level 2 equations are substituted into the Level 1 equation to get:

$$SelfConcept_{ij} = \gamma_{00} + \gamma_{01}(SchoolAbility_j) + \gamma_{10}(MathAbility_{ij}) + \gamma_{11}(SchoolAbility_j \times MathAbility_{ij}) + u_{0j} + u_{1j}(MathAbility_{ij}) + r_{ij}$$

the interaction between student-level mathematics ability and school-level mathematics ability becomes obvious.

When a multilevel model contains a moderating variable from one level and an independent variable from another level, it is called a cross-level interaction (Raudenbush & Bryk, 2002). For the current example, students of all abilities had lower mathematics self-concepts if they attended high-ability schools compared to students of similar ability who attended average- or low-ability schools, and the decrease in mathematics self-concept was more dramatic for higher-ability students. This phenomenon led Davis (1966) to warn parents against sending their children to "better" schools where the child would be at the bottom of the class. In software that specifies the Level 1 and Level 2 equations separately, it is not necessary to create a product term to estimate a cross-level moderation effect; rather, if a Level 2 variable has a significant effect on the Level 1 slope, the moderation hypothesis is supported. Interactions between variables at the same level (e.g., a Level 1 predictor and Level 1 moderator) must still be entered manually.
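In combined-equation mixed-model software, by contrast, the cross-level effect γ_11 appears as the product of the Level 1 and Level 2 predictors. A sketch with simulated data loosely patterned on the Big-Fish-Little-Pond example (all names, effect sizes, and sample sizes are invented):

```python
# Sketch: a cross-level interaction in a two-level mixed model
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
schools, per = 50, 30
school_ability = rng.normal(0, 1, schools)   # Level 2, grand-mean centered
rows = []
for j in range(schools):
    ability = rng.normal(0, 1, per)           # Level 1, group-mean centered
    slope = 0.5 - 0.2 * school_ability[j]     # cross-level moderation built in
    selfcon = slope * ability - 0.4 * school_ability[j] + rng.normal(0, 1, per)
    rows.append(pd.DataFrame({"school": j, "ability": ability,
                              "school_ability": school_ability[j],
                              "selfconcept": selfcon}))
df = pd.concat(rows, ignore_index=True)

# The ability:school_ability product estimates the cross-level interaction;
# re_formula="~ability" gives each school a random intercept and slope
fit = smf.mixedlm("selfconcept ~ ability * school_ability", df,
                  groups=df["school"], re_formula="~ability").fit()
print(fit.summary())
```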

Moderator variables in multilevel models share many of the challenges of moderators in single-level regression. For example, centering is recommended in multilevel models to facilitate interpretation, unless the predictors have a meaningful zero point. When adding Level 1 explanatory variables, centering becomes especially important. There are two ways to center Level 1 variables: grand mean centering (centering individuals' scores around the overall mean) and group mean centering (centering individuals' scores around their group's mean). To avoid confusing a within-group relation with a between-group relation, it is generally recommended to group mean center Level 1 predictors and grand mean center Level 2 predictors, as in the sketch below. For more about centering in multilevel applications, see Enders and Tofighi (2007).
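Continuing the previous sketch, both kinds of centering take one line each with pandas (column names carried over from the sketch above):

```python
# Level 1: center each student's ability around their own school's mean
df["ability_gmc"] = df["ability"] - df.groupby("school")["ability"].transform("mean")
# Level 2: center school-average ability around the overall mean
df["school_ability_c"] = df["school_ability"] - df["school_ability"].mean()
```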

Moderation in Structural Equation Models

Structural equation modeling (SEM) is a collection of techniques that can be used to examine the relations between combinations of observed variables ( manifest ; e.g., height) and unobservable construct variables ( latent ; e.g., depression). As such, SEM can be used for examining many research questions, including: theory testing, prediction, estimating effect sizes, mediation, group differences, and longitudinal differences (Kline, 2011 ). SEMs can include both a measurement model , which describes the relation between each latent construct and the observed items used to measure individuals’ scores on that latent construct, and a structural model , which specifies the relations between latent constructs, as well as manifest variables.

Multiple-Group Analysis.

Testing for moderation in SEM can be conducted in multiple ways. If both the predictor and the moderator are manifest variables, then an interaction term can be computed by taking the product of the predictor and moderator, which is then added to the SEM as a new variable, just as in multiple regression. Provided the moderator is an observed categorical variable, moderation can also be tested in SEM using a multiple-group analysis. In a multiple-group analysis , the SEM model is fit with the path between the predictor and the outcome variable constrained to be the same in all moderator groups, and then a second time with the path unconstrained, such that the effect is allowed to be different for each group. The overall fit of the two models (i.e., constrained vs. unconstrained) is then compared. If the unconstrained model does not fit significantly better than the constrained model, then the effect is the same for all of the groups and moderation is not present. If the unconstrained model fits significantly better than the constrained model, however, it is concluded that the effect is different for at least one of the groups and moderation is present.

When variables are not perfectly reliable, as routinely occurs in psychology, it is often preferable to create latent variables, which provide a mechanism for explicitly modeling measurement error. Latent moderator approaches are divided into partially latent variable approaches, where at least one variable is latent and at least one variable is observed, and fully latent variable approaches, where all variables are latent (Little, Bovaird, & Widaman, 2006; Marsh, Wen, & Hau, 2006). A multiple-group analysis with a latent predictor variable is a partially latent variable approach, since the moderator must be observed. Two other historical partially latent approaches are using factor scores in regression and a two-stage least-squares method (Bollen, 1995), although these methods are generally inferior to SEM approaches and therefore are not recommended. Fully latent approaches can also be implemented within the context of an SEM (e.g., creating a third latent variable to represent the interaction of two other latent variables), but some issues exist concerning the practicality and interpretation of a latent construct that represents the interaction between two other latent constructs. Several approaches have been proposed for modeling fully latent interactions (see Marsh et al., 2007, for a review), but most are based on the Kenny and Judd (1984) product indicator model.

One of the most common reasons for testing for moderation with latent variables in SEM is invariance testing (Mellenbergh, 1989 ; Meredith, 1993 ). Invariance testing is used to determine the degree to which a specific model fits the same in different groups or across time. Invariance is tested by imposing progressively stricter constraints across the groups and then comparing the model fit of the constrained model to a model with fewer constraints. Two types of invariance are discussed: factorial invariance and structural invariance.

Factorial invariance tests the factor structure, or measurement model, across groups or time. Five levels of factorial invariance are commonly tested. The first level, dimensional invariance, is used to test whether the number of latent factors is the same across groups; this level of invariance is more commonly assumed than tested. The next level, configural invariance, tests whether the general pattern of item loadings on the latent constructs is the same across groups. If the factor loadings are found not just to have the same general pattern but to be exactly equal across groups, the model has loading or weak invariance across groups, the third level of factorial invariance. Loading invariance is the minimal level of invariance needed as evidence that a construct has the same interpretation across groups or time. The next level is intercept or strong invariance, which occurs when, in addition to the item loadings, the item intercepts are also equal across groups. The final level of factorial invariance is strict or error invariance, in which the item loadings, intercepts, and residual error relations are all equal across groups. With strict factorial invariance, there is evidence that the measurement portion of the model is exactly the same across groups; in other words, any group differences in scores are not due to how the constructs were measured, but rather to differences in mean ability levels or in the relations between variables (Angoff, 1993; Millsap, 2011).

We can also test for differences between groups in their average level and variability on a latent construct. Factor (co)variance invariance constrains the factor variances and covariances to be equal across groups; if this constraint holds, the variance across groups is homogeneous. The highest level is latent mean invariance, in which the latent means are constrained to be equal across groups. This is equivalent to a latent t-test or ANOVA, for which homogeneity of variance is an assumption.

To test for group differences that are due to differences in the relations between variables, structural invariance is used, which assumes full factorial invariance and imposes additional constraints on the regression coefficients in the structural model across groups. This is what is generally tested within the multiple-group SEM analysis described previously, which tests whether the path coefficients are the same across observed groups. It is not necessary for group membership to be observed, however. When subgroups are hypothesized, latent class analysis (McCutcheon, 1987 ) is a method used to identify individuals’ memberships in latent groups (i.e., classes), based on responses to a set of observed categorical variables. The latent group membership can be extracted and included in SEMs as a latent moderating variable. Additionally, changes in class membership over time can be examined using latent transition analysis (Collins & Lanza, 2010 ).

A different context in which latent variable models are useful is when the moderator variables or the corresponding independent variables have missing data. Enders, Baraldi, and Cham (2014) showed that re-specifying manifest independent and moderator variables as latent variables with one indicator each, factor loadings of one, and residual errors of zero preserves the intended interpretations while handling the missing data through the multivariate normality assumptions of maximum likelihood estimation. Latent variables can easily be centered by constraining the latent means to zero, which provides meaningful and interpretable results without the need for transformations. Alternatively, multiple imputation has been shown to produce results similar to maximum likelihood, so the two methods are interchangeable for this purpose.

Conditional Process Models

Given that the structural model is often used to reflect causal relations between variables, another topic that can be discussed in the context of SEM is moderation of mediated effects. Conditional process models combine moderator and mediator variables in the same model (Hayes, 2013 ) with process standing for the causal process that is mediation and conditional representing the differential effects of moderation. Consider the Theory of Planned Behavior (TPB; Ajzen, 1991 ), which is an example of a conditional process model. In the TPB, changes in attitudes and subjective norms (antecedent variables) change intentions (mediator variable), which in turn change observed behaviors (outcome variable), but the relation between intention and behavior differs depending on the level of an individual’s perceived behavioral control (moderator variable). The minimum requirements for a conditional process model are a single mediator variable and a single moderator variable, but conditional process models can be much more complex with multiple mediator and moderator variables operating simultaneously. This is the main reason the general term conditional process model has begun to replace the rather confusing historical terms moderated mediation (e.g., Little, Card, Bovaird, Preacher, & Crandall, 2007 ) and mediated moderation (Baron & Kenny, 1986 ). Though these terms were meant to indicate whether the researcher was examining possible moderation of a significant mediated effect (i.e., moderated mediation) or investigating whether a variable mediated a significant moderation effect (i.e., mediated moderation), in practice these terms have been used interchangeably because they can be used to describe identical statistical models. Since both are just special cases of conditional process models, we suggest that psychologists are better off referring to all models that contain both moderators and mediators as conditional process models because this requires that the researcher explain in detail the specific model being estimated, which is clearer all around.

Numerous authors have described how to test conditional process model hypotheses using the multiple regression framework (e.g., Hayes, 2013). These methods work quite well, and significant interactions can be probed in much the same way as previously described for traditional regression models. When conditional process models become complex and at least one of the moderator variables is categorical, however, a better way to test for moderation is to use a multiple-group structural equation model. In the conditional process model case, a multiple-group SEM can be used to simultaneously test the mediation model across groups and directly test for differences in the mediation process between groups. For example, in Mplus (Muthén & Muthén, 2015), it is possible to formally test the difference between the mediated effects when the moderator variable is dichotomous. This direct testing of group differences makes the method superior to approaches that conduct the same analysis separately for each group (e.g., for males and then for females) and compare the results only informally.

Current Challenges

By definition, moderator variables illustrate the extent to which relations between variables depend on other factors, including characteristics of the person, environment, and context. Identifying moderation effects is particularly important for psychologists, not only to better understand how mental processes are related to behaviors, but also to ensure that, in the effort to help, harm is not accidentally caused to specific groups of individuals. Therefore, a comprehensive plan to examine theoretically relevant potential moderator variables should be an integral piece of any research study in psychology. Determining whether a variable moderates the relation between two other variables poses several challenges to researchers, however, including the need to identify when a treatment causes harm to specific individuals, ensuring adequate statistical power to detect a moderation effect, and the difficulty of probing and interpreting complex moderation effects correctly. In this section, these issues are discussed, along with potential strategies for limiting their impact.

Treatment Interactions

As discussed previously, one of the key reasons psychologists should be interested in moderating variables is that they provide information on how the effect of a treatment, such as a cognitive-behavioral therapy or a behavioral prevention intervention, may function differently for different groups of individuals. The effect of a treatment can vary depending on a number of different moderator variables, including demographic variables such as gender or ethnicity (Judd, McClelland, & Smith, 1996); a participant's aptitude, called an aptitude by treatment interaction (Cronbach & Snow, 1977); or a participant's pre-treatment level of an outcome or mediator variable, called a baseline by treatment interaction (Fritz et al., 2005). When present, these effects provide information that may then be used to tailor a treatment to be more effective for specific at-risk individuals. Even more important than improving the effectiveness of a treatment, however, is making sure there are no iatrogenic effects, which occur when a treatment causes an unplanned, harmful effect. For example, consider an intervention designed to prevent teenagers from using marijuana that actually increases marijuana use for some individuals. Iatrogenic effects are easily missed when they occur in only a small percentage of a sample, but ethically these effects need to be identified. Therefore, it is crucial that all theoretically relevant variables that may moderate the effect of a treatment be measured and tested.

Statistical Power

Theoretical moderating variables are not always supported by empirical research, however (e.g., Zedeck, 1971). When we fail to reject a null hypothesis of no moderating effect, there are two potential reasons: either the null hypothesis is true and the variable truly does not moderate the effect, or the null hypothesis is false but the statistical test failed to detect the effect (i.e., a Type II error occurred). To prevent incorrect conclusions about moderation effects, the probability of detecting a true effect, or statistical power, must be high. The single biggest issue with detecting moderation, other than ensuring that potential moderator variables are measured and tested in the first place, is that interaction effects tend to explain much less variance than main effects (McClelland & Judd, 1993). Hence, even studies that are adequately powered to find main effects are likely to be woefully underpowered when it comes to detecting moderator variables. Some of the factors that result in underpowered studies in psychology are beyond the researcher's control: when studying a rare disorder, it may be impossible to adequately power a study simply by increasing the sample size. But there are other ways to increase statistical power for detecting moderation effects. For example, McClelland (2000) discusses several methods for increasing the statistical power of a study without increasing the sample size, such as using more reliable measures, and McClelland and Judd (1993) show that oversampling extreme cases can increase the statistical power for tests of moderation.
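One practical design-stage tool is a simulation-based power estimate: assume an interaction effect size, simulate many data sets, and count how often the interaction test rejects. The sketch below uses invented effect sizes and is only a template.

```python
# Sketch: simulation-based power for detecting an interaction (invented sizes)
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)

def interaction_power(n, b3=0.2, reps=500, alpha=0.05):
    hits = 0
    for _ in range(reps):
        x = rng.normal(0, 1, n)
        m = rng.normal(0, 1, n)
        y = 0.5 * x + 0.3 * m + b3 * x * m + rng.normal(0, 1, n)
        df = pd.DataFrame({"x": x, "m": m, "y": y})
        p = smf.ols("y ~ x * m", data=df).fit().pvalues["x:m"]
        hits += p < alpha
    return hits / reps

for n in (100, 200, 400):
    print(f"n = {n}: estimated power = {interaction_power(n):.2f}")
```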

Part of the cause of these underpowered studies, however, is that psychological theories are rarely specific enough to include hypotheses about effect sizes for main effects, let alone interactions. A larger concern is the conflation of the size of an effect with the theoretical importance of an effect. Too many psychologists interpret Cohen’s ( 1988 ) small, medium, and large designations of effect sizes as being a measure of an effect’s theoretical importance. Cohen did not intend for large to mean important and small to mean unimportant. Instead, these categories were presented as examples of effect sizes found in a very specific area (abnormal social psychology) that needed to be recalibrated for each area of psychology and set of variables. Therefore, an effect that explains 9% of the variance in a variable (a medium effect using Cohen’s designations) may explain so little variance as to be completely disregarded by one area of psychology, yet so large as to be unobtainable in another area. Regardless of the cause, the consequences of under-powering studies to find moderation are the same: an inability to provide context for effects, resulting in a poorer understanding of the world.

Multicollinearity

Another issue that must be considered when testing interactions is multicollinearity between the variables and the interaction terms. Multicollinearity occurs when predictors in a multiple regression are highly correlated with one another, and it can cause excessively large standard errors, reducing the statistical power to detect an interaction even further. Since the interaction terms are just the product of the predictors, it is not surprising that the individual predictors and the interaction terms can be highly correlated. Aiken and West (1991) show that centering the predictors prior to creating an interaction term can decrease the correlation between the predictors and the interaction term by removing the nonessential multicollinearity, which is an artificial relation caused by the scaling of the predictors, while leaving the real relation, called essential multicollinearity. Others (e.g., Hayes, 2013) have questioned whether multicollinearity is an issue with interactions and whether centering actually addresses multicollinearity, because the highest-order term, in this case the interaction term, is unaffected by centering of the lower-order terms.
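The nonessential part is easy to see numerically. In the sketch below (invented data), a predictor with a nonzero mean correlates strongly with its own product term, and that correlation nearly vanishes once the components are centered first:

```python
# Sketch: centering removes nonessential multicollinearity (invented data)
import numpy as np

rng = np.random.default_rng(10)
x = rng.normal(10, 2, 1000)   # a predictor with a nonzero mean
m = rng.normal(5, 1, 1000)

print(np.corrcoef(x, x * m)[0, 1])        # raw: large correlation
xc, mc = x - x.mean(), m - m.mean()
print(np.corrcoef(xc, xc * mc)[0, 1])     # centered first: near zero
```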

Too Many Variables

When all theoretically hypothesized moderators are measured and we have adequate power to test the effect of each moderator, we run into a new problem: too many variables. It is easy to see how nearly every variable in a regression model could be moderated by every other variable in the model. But including too many interaction terms can result in an increased risk of making a Type I error, along with extremely large standard errors and potential computational difficulties. In addition, moderating relations can be difficult to disentangle from multicollinearity and from curvilinear relations among other variables (Ganzach, 1997). Multicollinearity between independent variables can produce a significant interaction term when no true interaction exists (Busemeyer & Jones, 1983; Lubinski & Humphreys, 1990) or may cause the interaction term to appear curvilinear when the true interaction is not. A moderating effect may also be erroneously found when there is a curvilinear relation between the dependent and independent variables but the model is mis-specified by excluding curvilinear terms. Lubinski and Humphreys (1990) illustrate the difficulty of distinguishing between an interaction model and a model with a curvilinear effect when two variables are highly correlated.

The problem of too many variables is compounded when we consider that the effect of a moderator variable on the relation between an independent and dependent variable may not just differ depending on values of a second moderator variable (i.e., a three-way interaction), but also on a fourth or fifth moderator variable. Returning to the Revelle et al. (1980) example, suppose that the moderation effect of time of day on the two-way interaction between caffeine and personality type was itself different for each gender (a four-way interaction). And suppose the four-way interaction between caffeine, personality type, time of day, and gender was moderated by whether the participant routinely drank highly caffeinated beverages such as coffee and soda (a five-way interaction). While four-way and higher interactions may be of interest to a researcher, an added complexity inherent to higher-order interactions is that, as described before, to properly specify a model with higher-order interactions, all lower-order interaction terms must be included in the model (Cohen, 1978; Cohen et al., 2003). For example, in an ANOVA with five factors, to correctly estimate the five-way interaction between all five factors, all possible four-way interactions (five with five factors), three-way interactions (ten with five factors), and two-way interactions (ten with five factors), as well as the main effects of the five factors, must be included, for a total of 31 effects!

A final concern is that interactions that involve more than three variables can become very difficult to interpret in any meaningful way. This is particularly problematic in ANOVA models with large numbers of factors since many software programs automatically include all possible interactions between the factors. While failing to include an interaction term in a model is equivalent to explicitly saying the interaction effect is exactly zero, taking a kitchen-sink approach and testing all possible interactions is generally a poor strategy. Instead, researchers should test all moderation effects hypothesized by the underlying theory being studied and use diagnostic tools such as plots of residuals to determine if specific unhypothesized interactions may exist in the data, making sure to note that these additional analyses are exploratory.

Conclusions

Moderation and moderator variables are among the most common analyses in the psychological, social, and behavioral sciences. Regardless of the phenomenon being studied, it is helpful to understand for whom and in what context an effect occurs. Moderator variables help researchers test hypotheses about how the strength and/or direction of the relation between two variables may differ between individuals. Though the basic methods for analyzing moderation effects have not changed dramatically in the past 25 years, new tools have been developed to aid researchers in probing and interpreting significant interactions. The challenge for psychologists today is to include moderator variables in their theories, then plan studies that not only measure these potential moderator variables, but also are adequately powered to find moderation effects.

Most of the interaction models and interaction-probing methods described here can be conducted using any general statistical software package. For psychology, popular general statistical software packages for examining moderation include:

SPSS, SAS, Stata, and R.

While many of these more general statistical programs can also be used to test for moderation in multilevel and SEM models, specialized software may be preferred. For multilevel models, HLM is often used. For SEM models, especially those that include latent variables, Mplus , LISREL , Amos , EQS , or R may be preferred. For power analyses, two excellent programs are G-Power and Optimal Design .

Acknowledgments

This research was supported in part by a grant from the National Institute on Drug Abuse (DA 009757).

Software Resources

  • Arbuckle, J. L. (2014). Amos (Version 23.0) [computer software]. Chicago, IL: IBM SPSS.
  • Bentler, P. M. (2014). EQS (Version 6.2) [computer software]. Los Angeles, CA: MVSoft, Inc.
  • Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2014). G*Power (Version 3.1.9.2) [computer software].
  • IBM. (2016). SPSS Statistics (Version 23.0) [computer software]. Armonk, NY: IBM Corp.
  • Jöreskog, K. G., & Sörbom, D. (2016). LISREL (Version 8.8) [computer software]. Skokie, IL: Scientific Software International, Inc.
  • Muthén, L. K., & Muthén, B. O. (2016). Mplus (Version 7.4) [computer software]. Los Angeles, CA: Muthén & Muthén.
  • R Core Team. (2016). R (Version 3.3) [computer software]. Vienna, Austria: R Foundation for Statistical Computing.
  • Raudenbush, S. W., Bryk, A. S., & Congdon, R. (2016). HLM (Version 7) [computer software]. Skokie, IL: Scientific Software International, Inc.
  • SAS Institute. (2016). SAS (Version 9.4) [computer software]. Cary, NC: SAS Institute Inc.
  • Spybrook, J., Bloom, H., Congdon, R., Hill, C., Martinez, A., & Raudenbush, S. (2011). Optimal Design [computer software].
  • StataCorp. (2015). Stata Statistical Software (Version 14) [computer software]. College Station, TX: StataCorp LP.

Further Reading

  • Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: SAGE.
  • Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
  • Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3d ed.). Mahwah, NJ: Lawrence Erlbaum.
  • Dawson, J. F., & Richter, A. W. (2006). Probing three-way interactions in moderated multiple regression: Development and application of a slope difference test. Journal of Applied Psychology, 91(4), 917–926.
  • Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. New York: Guilford Press.
  • Hoffman, L. (2015). Between-person analysis and interpretation of interactions. In L. Hoffman (Ed.), Longitudinal analysis: Modeling within-person fluctuation and change (pp. 29–78). New York: Routledge.
  • Jaccard, J. (1997). Interaction effects in factorial analysis of variance. Thousand Oaks, CA: SAGE.
  • Jaccard, J., & Turrisi, R. (2003). Interaction effects in multiple regression (2d ed.). Thousand Oaks, CA: SAGE.
  • Keppel, G., & Wickens, T. D. (2004). Design and analysis (4th ed.). Upper Saddle River, NJ: Pearson.
  • Preacher, K. J., Curran, P. J., & Bauer, D. J. (2006). Computational tools for probing interactions in multiple linear regression, multilevel modeling, and latent curve analysis. Journal of Educational and Behavioral Statistics, 31(4), 437–448.
References

  • Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50, 179–211.
  • Angoff, W. H. (1993). Perspectives on differential item functioning methodology. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 3–23). Hillsdale, NJ: Erlbaum.
  • Avolio, B. J., Mhatre, K., Norman, S. M., & Lester, P. (2009). The moderating effect of gender on leadership intervention impact. Journal of Leadership & Organizational Studies, 15, 325–341.
  • Blalock, H. M. (1965). Theory building and the statistical concept of interaction. American Sociological Review, 30(3), 374–380.
  • Bollen, K. A. (1995). Structural equation models that are nonlinear in latent variables: A least-squares estimator. Sociological Methodology, 25, 223–252.
  • Busemeyer, J. R., & Jones, L. E. (1983). Analysis of multiplicative combination rules when the causal variables are measured with error. Psychological Bulletin, 93, 549–562.
  • Cohen, J. (1968). Multiple regression as a general data-analytic system. Psychological Bulletin, 70, 426–443.
  • Cohen, J. (1978). Partialed products are interactions; partialed powers are curve components. Psychological Bulletin, 85, 858–866.
  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2d ed.). Mahwah, NJ: Lawrence Erlbaum.
  • Collins, L. M., & Lanza, S. T. (2010). Latent class and latent transition analysis: With applications in the social, behavioral, and health sciences. Hoboken, NJ: Wiley.
  • Cronbach, L., & Snow, R. (1977). Aptitudes and instructional methods: A handbook for research on interactions. New York: Irvington.
  • Davis, J. (1966). The campus as a frog pond: An application of the theory of relative deprivation to career decisions for college men. American Journal of Sociology, 72, 17–31.
  • Dawson, J. F., & Richter, A. W. (2006). Probing three-way interactions in moderated multiple regression: Development and application of a slope difference test. Journal of Applied Psychology, 91(4), 917–926.
  • Enders, C. K., Baraldi, A. N., & Cham, H. (2014). Estimating interaction effects with incomplete predictor variables. Psychological Methods, 19, 39–55.
  • Enders, C. K., & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods, 12(2), 121–138.
  • Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41, 1149–1160.
  • Friedrich, R. J. (1982). In defense of multiplicative terms in multiple regression equations. American Journal of Political Science, 26, 797–833.
  • Fritz, M. S., MacKinnon, D. P., Williams, J., Goldberg, L., Moe, E. L., & Elliot, D. (2005). Analysis of baseline by treatment interactions in a drug prevention and health promotion program for high school male athletes. Addictive Behaviors, 30, 1001–1005.
  • Ganzach, Y. (1997). Misleading interaction and curvilinear terms. Psychological Methods, 2, 235–247.
  • Judd, C. M., McClelland, G. H., & Smith, E. R. (1996). Testing treatment by covariate interactions when treatment varies within subjects. Psychological Methods, 1, 366–378.
  • Kaufman, N. K., Rohde, P., Seeley, J. R., Clarke, G. N., & Stice, E. (2005). Potential mediators of cognitive-behavioral therapy for adolescents with comorbid major depression and conduct disorder. Journal of Consulting and Clinical Psychology, 73, 38–46.
  • Kenny, D. A., & Judd, C. M. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin, 96, 201–210.
  • Kline, R. (2011). Principles and practice of structural equation modeling (3d ed.). New York: Guilford Press.
  • Le Roy, M. (2009). Research methods in political science: An introduction using MicroCase® (7th ed.). Boston, MA: Cengage Learning.
  • Little, T. D., Bovaird, J. A., & Widaman, K. F. (2006). Powered and product terms: Implications for modeling interactions among latent variables. Structural Equation Modeling, 13, 497–519.
  • Little, T. D., Card, N. A., Bovaird, J. A., Preacher, K. J., & Crandall, C. S. (2007). Structural equation modeling of mediation and moderation with contextual factors. In T. D. Little, J. A. Bovaird, & N. A. Card (Eds.), Modeling contextual effects in longitudinal studies (pp. 207–230). New York: Psychology Press.
  • Long, E., & Christian, M. (2015). Mindfulness buffers retaliatory responses to injustice: A regulatory approach. Journal of Applied Psychology, 100(5), 1409–1422.
  • Lubin, A. (1961). The interpretation of significant interaction. Educational and Psychological Measurement, 21, 807–817.
  • Lubinski, D., & Humphreys, L. G. (1990). Assessing spurious “moderator effects”: Illustrated substantively with the hypothesized (“synergistic”) relation between spatial and mathematical ability. Psychological Bulletin, 107, 385–393.
  • Marsh, H. W., & Parker, J. W. (1984). Determinants of student self-concept: Is it better to be a relatively large fish in a small pond even if you don’t learn to swim as well? Journal of Personality and Social Psychology, 47, 213–231.
  • Marsh, H. W., Wen, Z., & Hau, K. T. (2006). Structural equation models of latent interaction and quadratic effects. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (pp. 225–265). Charlotte, NC: Information Age.
  • Marsh, H. W., Wen, Z., Hau, K. T., Little, T. D., Bovaird, J. A., & Widaman, K. F. (2007). Unconstrained structural equation models of latent interactions: Contrasting residual- and mean-centered approaches. Structural Equation Modeling, 14, 570–580.
  • Maxwell, S. E., & Delaney, H. D. (1993). Bivariate median splits and spurious statistical significance. Psychological Bulletin, 113, 181–190.
  • Maxwell, S. E., & Delaney, H. D. (2004). Designing experiments and analyzing data (2d ed.). New York: Psychology Press.
  • McClelland, G. H. (2000). Increasing statistical power without increasing sample size. American Psychologist, 55, 963–964.
  • McClelland, G. H., & Judd, C. M. (1993). Statistical difficulties of detecting interactions and moderator effects. Psychological Bulletin, 114, 376–390.
  • McCutcheon, A. L. (1987). Latent class analysis. Newbury Park, CA: SAGE.
  • Mellenbergh, G. J. (1989). Item bias and item response theory. International Journal of Educational Research, 13, 127–143.
  • Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58, 525–543.
  • Millsap, R. E. (2011). Statistical approaches to measurement invariance. New York: Routledge.
  • Muthén, L. K., & Muthén, B. O. (2015). Mplus user’s guide (7th ed.). Los Angeles: Muthén & Muthén.
  • Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction (3d ed.). Fort Worth, TX: Wadsworth Publishing.
  • Potthoff, R. F. (1964). On the Johnson-Neyman technique and some extensions thereof. Psychometrika, 29, 241–256.
  • Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2d ed.). London: SAGE.
  • Revelle, W., Humphreys, M. S., Simon, L., & Gilliland, K. (1980). The interactive effect of personality, time of day, and caffeine: A test of the arousal model. Journal of Experimental Psychology: General, 109, 1–31.
  • Seaton, M., Marsh, H. W., & Craven, R. (2010). Big-fish-little-pond effect: Generalizability and moderation: Two sides of the same coin. American Educational Research Journal, 47, 390–433.
  • Snijders, T., & Bosker, R. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2d ed.). London: SAGE.
  • Sommet, N., Darnon, C., & Butera, F. (2015). To confirm or to conform? Performance goals as a regulator of conflict with more-competent others. Journal of Educational Psychology, 107, 580–598.
  • Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habit formation. Journal of Comparative Neurology and Psychology, 18, 459–482.
  • Zedeck, S. (1971). Problems with the use of “moderator” variables. Psychological Bulletin, 76, 295–310.

1. For illustrative purposes, we draw the details of the example from Figure 3 of Revelle et al. (1980), which combines results across multiple studies. Though the results presented here approximate those of Revelle et al., they are not based on the actual data, so the reader is encouraged to read Revelle et al.’s thoughtful and much more thorough discussion of the actual results.

2. As with the Revelle et al. (1980) example, only part of the overall Sommet et al. (2015) study is used for illustration, and the reader is encouraged to read the original paper for a complete discussion of the results.

3. Gender was not found to be a significant moderator in Long and Christian (2015); it is used here only for illustrative purposes.

4. In the original Seaton et al. (2010) paper, a third level (country) was included in the model but has been removed here for simplicity.
