ANCOVA -- Notes and R Code
This post contains my notes on ANCOVA methods in R, based on the book “Discovering Statistics using R (2012)” by Andy Field. Most of the code and text are copied directly from the book; all credit goes to him.
- 0a. What is ANCOVA?
- 0b. Assumptions
- 1. Enter data
- 2. Explore your data
- 3. Check that the covariate and any independent variables are independent
- 4. Do the ANCOVA
- 5. Compute contrasts or post hoc tests
- 6. Check for homogeneity of regression slopes
- Robust test (skipped)
- Effect size
- Conclusion
0a. What is ANCOVA?
ANCOVA extends ANOVA by including covariates in the analysis. Covariates are continuous variables that are not part of the main experimental manipulation but have an influence on the dependent variable.
There are two reasons for including covariates:
- To reduce within-group error variance: If we can explain some of the ‘unexplained’ variance ($ SS_R $) in terms of other variables (covariates), then we reduce the error variance, allowing us to more accurately assess the effect of the independent variable ($ SS_M $).
- Elimination of confounds: In any experiment, there may be unmeasured variables that confound the results (i.e., variables other than the experimental manipulation that affect the outcome variable). If any variables are known to influence the dependent variable being measured, then ANCOVA is ideally suited to remove the bias of these variables. Once a possible confounding variable has been identified, it can be measured and entered into the analysis as a covariate.
0b. Assumptions
This is also covered in my other post, Assumptions of statistics methods.
Independence of the covariate and treatment effect
For example, anxiety and depression are closely correlated (anxious people tend to be depressed) so if you wanted to compare an anxious group of people against a non-anxious group on some task, the chances are that the anxious group would also be more depressed than the non-anxious group. You might think that by adding depression as a covariate into the analysis you can look at the ‘pure’ effect of anxiety, but you can’t.
Homogeneity of regression slopes
For example, if there’s a positive relationship between the covariate and the outcome in one group, we assume that there is a positive relationship in all of the other groups too.
If you have violated the assumption of homogeneity of regression slopes, or if the variability in regression slopes is an interesting hypothesis in itself, then you can explicitly model this variation using multilevel linear models (see Chapter 19).
1. Enter data
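The book reads the data from its companion file. A minimal sketch, assuming the file is the tab-delimited ViagraCovariate.dat from the book’s companion website, with dose coded numerically as 1 to 3:

```r
# Read the Viagra data (tab-delimited file from the book's companion website)
viagraData <- read.delim("ViagraCovariate.dat", header = TRUE)

# dose is coded 1-3 in the file, so convert it to a factor with meaningful labels
viagraData$dose <- factor(viagraData$dose, levels = c(1:3),
                          labels = c("Placebo", "Low Dose", "High Dose"))
```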
2. Explore your data
Self-test 1
Use R to find out the means and standard deviations of both the participant’s libido and the partner’s libido in the three groups.
Use basic=F to remove some descriptives that don’t interest us.
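A sketch of one way to do this, using by() with stat.desc() from the pastecs package, which is the approach the book takes:

```r
library(pastecs)

# Descriptive statistics (mean, SD, etc.) of each libido variable, split by
# dose group; basic = FALSE suppresses the basic counts we don't need here
by(viagraData$libido, viagraData$dose, stat.desc, basic = FALSE)
by(viagraData$partnerLibido, viagraData$dose, stat.desc, basic = FALSE)
```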
Self-test 2
Use ggplot2 to produce boxplots for the Viagra data. Try to re-create Figure 11.4.
The data are currently in wide format, but we need them in long format, so we create a new dataframe called restructuredData that has the data in the correct format, using the melt() function from the reshape2 package.
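A sketch of the restructuring and the boxplots; the column name libido_type is my own label for the melted variable column, and the exact aesthetics of Figure 11.4 may differ:

```r
library(reshape2)
library(ggplot2)

# Reshape from wide to long: one row per score, labelled by whose libido it is
restructuredData <- melt(viagraData, id.vars = "dose",
                         measure.vars = c("libido", "partnerLibido"))
names(restructuredData) <- c("dose", "libido_type", "libido")

# Boxplots of libido by dose, with one panel per libido type
ggplot(restructuredData, aes(dose, libido)) +
  geom_boxplot() +
  facet_wrap(~ libido_type) +
  labs(x = "Dose", y = "Libido")
```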
Levels of libido seem to increase for participants as the dose of Viagra increases but the opposite is true for their partners. Also, the spread of scores is more variable for the participants than their partners.
Levene’s test
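A sketch using leveneTest() from the car package, as in earlier chapters of the book:

```r
library(car)

# Levene's test of homogeneity of variance for libido across the dose groups
leveneTest(viagraData$libido, viagraData$dose, center = median)
```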
The output shows that Levene’s test is very non-significant, F(2, 27) = 0.33, p = .72. This means that for these data the variances are very similar.
3. Check that the covariate and any independent variables are independent
Self-test 3
Conduct an ANOVA to test whether partner’s libido (our covariate) is independent of the dose of Viagra (our independent variable).
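A sketch of this check; checkIndependenceModel is just a descriptive name for the model object:

```r
# ANOVA with the covariate as the outcome and dose as the predictor: a
# non-significant dose effect means the covariate is roughly equal across groups
checkIndependenceModel <- aov(partnerLibido ~ dose, data = viagraData)
summary(checkIndependenceModel)
```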
The main effect of dose is not significant, F(2, 27) = 1.98, p = .16, which shows that the average level of partner’s libido was roughly the same in the three Viagra groups. In other words, the means for partner’s libido are not significantly different in the placebo, low- and high-dose groups. This result means that it is appropriate to use partner’s libido as a covariate in the analysis.
4. Do the ANCOVA
If we want Type I sums of squares, then we enter covariate(s) first, and the independent variable(s) second.
A wrong way, without setting contrasts
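For example, fitting the model and requesting Type III sums of squares with Anova() from the car package while the default (non-orthogonal) treatment contrasts are still attached to dose (a sketch):

```r
library(car)

# ANCOVA with the covariate entered first; default contrasts still in place
viagraModel <- aov(libido ~ partnerLibido + dose, data = viagraData)
Anova(viagraModel, type = "III")  # misleading with non-orthogonal contrasts
```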
A correct way, with setting orthogonal contrasts
To calculate Type III sums of squares properly we must specify orthogonal contrasts.
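A sketch, using the contrasts the book suggests (placebo vs. the two dose groups, then low vs. high dose):

```r
# Orthogonal contrasts: placebo vs. both dose groups, then low vs. high dose
contrasts(viagraData$dose) <- cbind(c(-2, 1, 1), c(0, -1, 1))

# Refit the ANCOVA with the orthogonal contrasts in place
viagraModel <- aov(libido ~ partnerLibido + dose, data = viagraData)
Anova(viagraModel, type = "III")
```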
From the output, we can see that the p-value of the intercept becomes much smaller after setting the orthogonal contrasts.
Adjusted means
Based on the output in self-test 1, we can see the low- and high-dose groups have very similar means, 4.88 and 4.85, whereas the placebo group mean is much lower at 3.22. However, we can’t interpret these group means directly, because they have not been adjusted for the effect of the covariate.
To get the adjusted means we need to use the effect() function from the effects package.
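A sketch, using the ANCOVA model fitted above:

```r
library(effects)

# Adjusted means of libido for each dose group, holding the covariate constant
adjustedMeans <- effect("dose", viagraModel, se = TRUE)
summary(adjustedMeans)   # adjusted means and confidence intervals
adjustedMeans$se         # standard errors of the adjusted means
```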
The output shows the adjusted means (with their confidence intervals) and also the standard errors. Unlike the means in self-test 1, the adjusted means for the low-dose and high-dose groups are fairly different. In other words, once the means are adjusted for the effect of the covariate, it looks very much as if libido increases as dose increases (from 2.93 in the placebo group, to 4.71 in the low-dose group and 5.15 in the high-dose group).
Self-test 4
Run a one-way ANOVA to see whether the three groups differ in their levels of libido.
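A sketch of the one-way ANOVA, dropping the covariate from the model:

```r
# One-way ANOVA ignoring the covariate, for comparison with the ANCOVA
anovaModel <- aov(libido ~ dose, data = viagraData)
summary(anovaModel)
```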
This output shows (for illustrative purposes) the ANOVA table for these data when the covariate is not included. It is clear from the significance value, which is greater than .05, that Viagra seems to have no significant effect on libido. Therefore, without taking account of the libido of the participants’ partners we would have concluded that Viagra had no significant effect on libido, yet it does.
5. Compute contrasts or post hoc tests
Interpret planned contrasts
The overall ANCOVA does not tell us which means differ, so to break down the overall effect of dose we need to look at the contrasts that we specified before we created the ANCOVA model.
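These contrast estimates can be obtained by looking at the model’s regression parameters, as the book does with summary.lm():

```r
# Parameter estimates and t-tests for the orthogonal contrasts set earlier
summary.lm(viagraModel)
```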
The first dummy variable (dose1) compares the placebo group with the low- and high-dose groups. The associated t-statistic is significant, indicating that the placebo group was significantly different from the combined mean of the Viagra groups.
The second dummy variable (dose2) compares the low- and high-dose groups. The associated t-statistic is not significant, indicating that the high-dose group did not produce a significantly higher libido than the low-dose group.
Post hoc tests
Because we want to test differences between the adjusted means, we can only use the glht() function; the pairwise.t.test() function will not test the adjusted means. As such, we are limited to Tukey or Dunnett’s post hoc tests.
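A sketch of the Tukey comparisons with glht() from the multcomp package:

```r
library(multcomp)

# Tukey post hoc comparisons between the adjusted group means
postHocs <- glht(viagraModel, linfct = mcp(dose = "Tukey"))
summary(postHocs)   # significance tests for each pairwise difference
confint(postHocs)   # confidence intervals for each difference
```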
This output suggests significant differences between the high-dose and placebo groups (t = 2.77, p < .05). The confidence intervals also confirm this conclusion because they do not cross zero for the comparison of the high dose and placebo groups.
Plots
The aov() function automatically generates some plots that we can use to check the assumptions. We can see these graphs by executing:
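```r
# Diagnostic plots for the fitted ANCOVA model
plot(viagraModel)
```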
You will actually see four graphs, but the first two are the most important. The first graph (on the left of the figure) can be used for testing homogeneity of variance. The plot we have does show funnelling (the spread of scores is wider at some points than at others), which implies that the residuals might be heteroscedastic (a bad thing). The second plot (on the right) is a Q-Q plot, which tells us about the normality of residuals in the model.
It looks like the diagonal line hasn’t washed for several weeks and the dots are running away from the smell: they stray far from the line, suggesting that the residuals are not normally distributed.
Again, this is not good news for the model. These plots suggest that a robust version of ANCOVA might be in order.
6. Check for homogeneity of regression slopes
Self-test 5
Use ggplot2 to re-create Figure 11.3.
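A sketch of one way to draw these panels (the exact aesthetics of Figure 11.3 may differ):

```r
library(ggplot2)

# Scatterplot of libido against partner's libido with a fitted regression
# line, one panel per dose group
ggplot(viagraData, aes(partnerLibido, libido)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~ dose) +
  labs(x = "Partner's Libido", y = "Participant's Libido")
```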
The output shows the scatterplots of the relationship between partnerLibido and libido in the three groups. Although this relationship is comparable in the low-dose and placebo groups, it appears different in the high-dose group.
Use Anova() to test homogeneity of regression slopes
To test the assumption of homogeneity of regression slopes we need to run the ANCOVA again, but include the interaction between the covariate and predictor variable.
The main thing we’re interested in is the interaction term, so look at the significance value of the covariate by independent variable interaction (partnerLibido:dose). If this effect is significant, then the assumption of homogeneity of regression slopes has been broken.
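A sketch, updating the earlier model to add the interaction and again requesting Type III sums of squares:

```r
library(car)

# Add the covariate-by-predictor interaction to the ANCOVA model
hoRS <- update(viagraModel, . ~ . + partnerLibido:dose)
Anova(hoRS, type = "III")
```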
Robust test (skipped)
Based on the previous output, a robust version of ANCOVA might be in order. However, since the book uses a different dataset, I’d like to skip this part.
Effect size
There are several ways to calculate effect sizes.
- Eta squared ($\eta^2$), see chapter 10. This effect size is just $r^2$ by another name and is calculated by dividing the effect of interest, $SS_M$, by the total amount of variance in the data, $SS_T$. As such, it is the proportion of total variance explained by an effect. Since we have more than one effect, we could calculate $\eta^2$ for each effect.
- Partial eta squared (partial $\eta^2$), see section 11.6. This differs from eta squared in that it looks not at the proportion of total variance that a variable explains, but at the proportion of variance that a variable explains that is not explained by other variables in the analysis.
- Omega squared ($\omega^2$), see section 10.7. This measure computes the overall effect size. It can be calculated only when we have equal numbers of participants in each group; if that is not the case, use rcontrast() below.
- rcontrast(t, df), see sections 10.7 and 11.6. This measure computes the effect size for more focused comparisons, such as planned contrasts.
Therefore, based on the output from Interpret planned contrasts in Step 5, the t-values of the three effects are 2.23, 2.79, and 0.54, and the degrees of freedom are 26.
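The book defines rcontrast() as a small helper function; a sketch, applied to the three t-values above:

```r
# Convert a t-statistic and its degrees of freedom into an effect size r
rcontrast <- function(t, df) {
  r <- sqrt(t^2 / (t^2 + df))
  print(paste("r = ", r))
}

rcontrast(2.23, 26)  # the covariate
rcontrast(2.79, 26)  # placebo vs. the combined dose groups
rcontrast(0.54, 26)  # low dose vs. high dose
```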
The output shows that the effect of the covariate (.400) and the difference between the combined dose groups and the placebo (.479) both represent medium to large effect sizes (they’re both between .4 and .5). The difference between the high- and low-dose groups (.106) was a fairly small effect.
- mes(), see section 11.6. This method calculates effect sizes between all combinations of groups. I’d like to skip this method.
Conclusion
I have only kept the R code and some very brief interpretations of the results. To see the rationale behind each method, or to read a fuller description of it, it is a good idea to read the corresponding book sections. For convenience, I have added section numbers for some of the methods.
Thanks for reading, and feel free to correct me if I have made any mistakes.