Plotting Games-Howell ANOVA post hoc differences in R

I am a regular reader of the Statistics by Jim blog and the motivation for my post stems from a comment I answered on one of his blog posts.

Background: Classic One-Way, Welch ANOVA, and post hoc tests

One of his recent posts compared Welch’s ANOVA to the ‘classic’ One-Way ANOVA. For those less familiar, the classic One-Way Analysis of Variance (ANOVA) lets us compare the means of three or more groups. It uses Fisher’s F-test to determine whether the mean of at least one group is statistically different from the others. Like the Student’s t-test, it assumes that each group has equal variance. Like Welch’s t-test, Welch’s ANOVA relaxes the equal-variance assumption. In his blog post, Jim goes into greater detail, with an example, on how the Welch ANOVA is less likely to produce Type I errors when variances are unequal.

Post hoc tests allow us a closer look at the pairwise differences between group means. There are many different post hoc tests, each with its own nuances, such as how conservatively it adjusts the pairwise p-values. Among the most popular post hoc tests for the classic One-Way ANOVA is Tukey’s HSD (honestly significant difference).

For the Welch ANOVA, Jim opted to use the Games-Howell post hoc test. It is similar to the Tukey HSD, but unlike Tukey HSD, it does not assume the groups have the same standard deviation. This relaxed assumption goes well with the Welch ANOVA.
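Concretely, for each pair of groups Games-Howell builds the standard error from each group’s own variance, corrects the degrees of freedom with the Welch-Satterthwaite formula, and takes its critical value from the studentized range distribution. Here is a minimal by-hand sketch in base R; the two sample vectors and the group count k are made up purely for illustration:

```r
# Two illustrative samples from a k-group design
g1 <- c(16.5, 17.0, 18.6, 19.4, 17.0)
g2 <- c(14.5, 15.8, 16.9, 17.0, 18.0, 18.9)
k  <- 3  # total number of groups in the design

m1 <- mean(g1); v1 <- var(g1); n1 <- length(g1)
m2 <- mean(g2); v2 <- var(g2); n2 <- length(g2)

se <- sqrt(v1/n1 + v2/n2)  # SE built from each group's own variance
df <- (v1/n1 + v2/n2)^2 /  # Welch-Satterthwaite corrected df
      ((v1/n1)^2/(n1 - 1) + (v2/n2)^2/(n2 - 1))

qcrit <- qtukey(0.95, nmeans = k, df = df)  # studentized range critical value
dmean <- m1 - m2
ci    <- dmean + c(-1, 1) * (qcrit/sqrt(2)) * se  # 95% Games-Howell CI
```

These are, as I understand it, the same quantities that posthocTGH stores under names like df.corrected, qcrit.corrected, gh.low, and gh.high, which come up later in this post.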

Reading post hoc results from raw software output is fine, but plotting the results can be much more visually appealing and helps you sift through them quickly. Jim used Minitab to create his Games-Howell post hoc plot and was unsure how to help someone who asked him how to create the plot in R. I didn’t know how either, but it fired up my curiosity. Here are my results.

Creating a Games-Howell post hoc plot in R

Plotting the post hoc results for a Tukey HSD is simple. Just save the results of your classic One-Way ANOVA as an object, pass it to the TukeyHSD function, and nest that inside a plot function, like so:

aov.classic <- aov(DV ~ IV, data = dataset)
plot(TukeyHSD(aov.classic))
# alternatively, save the Tukey results as an object and plot that
aov.classic.tukey <- TukeyHSD(aov.classic)
plot(aov.classic.tukey)

I wasn’t able to find a similar plot option for the Games-Howell post hoc results, but I’ll come back to that later.

First, I had to install the userfriendlyscience package to access the posthocTGH function, which reports the frequency, mean, and variance for each group. Saving the results of that function as an object (here, gh) provides the following attributes.

gh <- posthocTGH(y=dataset$DV, x=dataset$IV)
names(gh)
[1] "input"        "intermediate" "output"      
class(gh)
[1] "posthocTGH"

Digging a bit more into the attributes of each of those names yields some interesting data for later use.

names(gh$input)
[1] "y"            "x"           
[3] "method"       "conf.level"  
[5] "digits"       "p.adjust"    
[7] "formatPvalue"

names(gh$intermediate)
 [1] "x"               "y"              
 [3] "n"               "groups"         
 [5] "df"              "means"          
 [7] "variances"       "names"          
 [9] "pairNames"       "descriptives"   
[11] "errorVariance"   "se"             
[13] "dmeans"          "t"              
[15] "p.tukey"         "alpha"          
[17] "qcrit"           "tukey.low"      
[19] "tukey.high"      "df.corrected"   
[21] "se.corrected"    "t.corrected"    
[23] "qcrit.corrected" "gh.low"         
[25] "gh.high"         "p.gameshowell"  

names(gh$output)
[1] "tukey"        "games.howell"

Now that I have the Games-Howell results, I found and tweaked the code of a TukeyHSD plotting function to work with that post hoc. Note: the TukeyHSD function is in the stats package, which typically loads when R starts. The plotTukeyHSD function by Nathan Brouwer that I modified offers a very similar plot. I obtained the plotTukeyHSD code from: Below are the first few lines of code that needed modification.

#### The function STARTS here ####
plotTukeyHSD <- plotTukeysHSD <- function(tukey.out,
                                          x.axis.label = "Comparison",
                                          y.axis.label = "Effect Size",
                                          axis.adjust = 0,
                                          adjust.x.spacing = 5){
  tukey.out <-[[1]])  # convert the TukeyHSD matrix to a data frame
  means <- tukey.out$diff
  categories <- row.names(tukey.out)
  groups <- length(categories)
  ci.low <- tukey.out$lwr
  ci.up  <- tukey.out$upr

The code snippet above uses an object, tukey.out, which contains the stored results of a TukeyHSD call. I modified the snippet to instead use the saved posthocTGH results, stored as the object ght.out, and to extract the pair names, mean differences, and CI values from it.

#### The function STARTS here ####
plotGamesHowell <- function(ght.out,
                            x.axis.label = "Comparison",
                            y.axis.label = "Effect Size",
                            axis.adjust = 0,
                            adjust.x.spacing = 5){
  # ght.out <-[[1]])  # not used
  means <- ght.out$intermediate$dmeans
  categories <- ght.out$intermediate$pairNames
  groups <- length(categories)
  ci.low <- ght.out$intermediate$gh.low
  ci.up  <- ght.out$intermediate$gh.high

Here is a link to my modified code on GitHub:
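For completeness, here is a condensed sketch of the rest of the plotting body. This is not the verbatim GitHub version (which keeps more of the original’s axis-spacing logic); it is a minimal base-graphics rendering of the same idea: a point for each pairwise mean difference with whiskers for the Games-Howell confidence interval.

```r
# Condensed sketch; the full GitHub version keeps more of the
# original plotTukeyHSD's axis-adjustment options.
plotGamesHowell <- function(ght.out,
                            x.axis.label = "Comparison",
                            y.axis.label = "Effect Size"){
  means      <- ght.out$intermediate$dmeans
  categories <- ght.out$intermediate$pairNames
  ci.low     <- ght.out$intermediate$gh.low
  ci.up      <- ght.out$intermediate$gh.high
  n          <- length(categories)

  plot(seq_len(n), means, pch = 16, xaxt = "n",
       xlim = c(0.5, n + 0.5), ylim = range(c(ci.low, ci.up, 0)),
       xlab = x.axis.label, ylab = y.axis.label)
  axis(1, at = seq_len(n), labels = categories)
  arrows(seq_len(n), ci.low, seq_len(n), ci.up,
         angle = 90, code = 3, length = 0.05)  # CI whiskers
  abline(h = 0, lty = 2)  # zero-difference reference line
}

# usage: plotGamesHowell(gh), where gh holds saved posthocTGH results
```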

Visualizing a Games-Howell post hoc plot in R

Let’s use the trusty (and lovingly overused) mtcars dataset built into R to help with an example. Since this is an ANOVA, let’s test whether cars’ average quarter-mile time (qsec) differs by number of cylinders (cyl). Cylinders is stored numerically, so I will need to treat it as a factor in the ANOVA.

library(userfriendlyscience) # activate package
data(mtcars)  # load data
str(subset(mtcars, select=c(cyl, qsec)))  # examine variables
'data.frame':	32 obs. of  2 variables:
 $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...

As usual, I like to take a quick look at the summary stats before diving into an analysis. There is an uneven number of observations across the cylinder groups, but it’s not too awful since the smallest (7) is half as many as the largest (14).

freq(mtcars$cyl)  # freq() and descr() are in the summarytools package
descr(mtcars$qsec, transpose=TRUE, stats="common")
cbind(var=round(by(data=mtcars$qsec, INDICES=mtcars$cyl, FUN=var, na.rm=TRUE),2),
      sd=round(by(data=mtcars$qsec, INDICES=mtcars$cyl, FUN=sd, na.rm=TRUE),2))
Type: Numeric  

              Freq   % Valid   % Valid Cum.   % Total   % Total Cum.
----------- ------ --------- -------------- --------- --------------
          4     11     34.38          34.38     34.38          34.38
          6      7     21.88          56.25     21.88          56.25
          8     14     43.75         100.00     43.75         100.00
       <NA>      0                               0.00         100.00
      Total     32    100.00         100.00    100.00         100.00
Descriptive Statistics  
N: 32  

              Mean   Std.Dev     Min   Median     Max   N.Valid   Pct.Valid
---------- ------- --------- ------- -------- ------- --------- -----------
      qsec   17.85      1.79   14.50    17.71   22.90     32.00      100.00
   var   sd
4 2.83 1.68
6 2.91 1.71
8 1.43 1.20

Eyeballing the variances and standard deviations above, they look different, but Levene’s test shows they are not different enough to conclude the variances are unequal. So I could use the classic One-Way ANOVA and Tukey HSD here, but I’ll continue to showcase the Welch ANOVA and Games-Howell post hoc, and report both post hoc plots.

leveneTest(qsec ~ as.factor(cyl), data=mtcars, center=mean)  # leveneTest() is in the car package
Levene's Test for Homogeneity of Variance (center = mean)
      Df F value Pr(>F)
group  2  0.6759 0.5165
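As an aside, the Brown-Forsythe variant of this homogeneity test centers each group on its median rather than its mean, which makes it more robust to skewed data. With leveneTest() that is just a different center argument:

```r
library(car)  # leveneTest()

# Brown-Forsythe: Levene's test centered on the group medians
# (center = median is leveneTest()'s default)
leveneTest(qsec ~ as.factor(cyl), data = mtcars, center = median)
```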

First, the results from the classic One-Way ANOVA, and then the Welch ANOVA, both using functions from the stats package.

aov.classic <- aov(qsec ~ as.factor(cyl), data=mtcars)
summary(aov.classic)
               Df Sum Sq Mean Sq F value  Pr(>F)   
as.factor(cyl)  2  34.61   17.30   7.794 0.00196 **
Residuals      29  64.38    2.22                   
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
oneway.test(qsec ~ as.factor(cyl), data=mtcars, var.equal=FALSE)
	One-way analysis of means (not assuming equal variances)

data:  qsec and as.factor(cyl)
F = 7.7044, num df = 2.000, denom df = 14.047, p-value = 0.005512

As expected, the results are essentially the same, but let’s jump to the post hoc tests. Reminder, posthocTGH is from the userfriendlyscience package.

Tangent: Also part of the userfriendlyscience package is a oneway function that gives the ANOVA summary results, effect size, and a more detailed post hoc in a single function, rather than the usual pair of summary of aov and TukeyHSD calls. According to the help documentation, the oneway function allows these post hoc options: Tukey, Games-Howell, Holm, Hochberg, Hommel, Bonferroni, Benjamini-Hochberg (BH), Benjamini & Yekutieli (BY), False Discovery Rate (FDR), or none. The posthocTGH function will only do Games-Howell or Tukey.
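To illustrate that tangent, a quick sketch of the oneway call, with argument names as I read them from the package help page (the exact output layout will vary by package version):

```r
library(userfriendlyscience)

# One call: ANOVA summary, effect size, and Games-Howell post hoc
oneway(y = mtcars$qsec, x = as.factor(mtcars$cyl),
       posthoc = "games-howell")
```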

TukeyHSD(aov.classic)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = qsec ~ as.factor(cyl), data = mtcars)

        diff       lwr        upr     p adj
6-4 -1.16013 -2.939271  0.6190113 0.2574564
8-4 -2.36513 -3.847748 -0.8825122 0.0013300
8-6 -1.20500 -2.908398  0.4983980 0.2053353
gh.test <- posthocTGH(y=mtcars$qsec, x=as.factor(mtcars$cyl), method="games-howell")
gh.test
   n means variances
4 11    19       2.8
6  7    18       2.9
8 14    17       1.4

    diff ci.lo ci.hi   t   df    p
6-4 -1.2  -3.3  1.01 1.4 12.8  .36
8-4 -2.4  -3.9 -0.83 3.9 17.4 <.01
8-6 -1.2  -3.2  0.80 1.7  9.1  .27

Finally, visualizing these as post hoc plots:

plot(TukeyHSD(aov.classic), las=1) # rotate axis labels
plotGamesHowell(gh.test)
Tukey HSD post hoc plot
Games-Howell post hoc plot

And there we go: a new plot for the Games-Howell post hoc test results, derived by modifying a Tukey HSD plot. Thanks to those who have come before me and made their code available to everyone.