What effect does sex and species have on the mandible length of a kangaroo skull?
The data to be analyzed here are selected skull measurements on 148 kangaroos of known sex and species. There are 11 columns of data, corresponding to the following features. The measurements are in meters/10000 (mm/10).
column
Variable name
Description
1 *
sex
sex (1=M, 2=F)
2 *
species
species (0=M. giganteus, 1=M.f. melanops, 2=M.f. fuliginosus)
3
pow
post orbit width
4
rw
rostal width
5
sopd
supra-occipital - paroccipital depth
6
cw
crest width
7
ifl
incisive foramina length
8 *
ml
mandible length
9
mw
mandible width
10
md
mandible depth
11
arh
ascending ramus height
Some of the observations in the data set are missing (not available). These are represented by a period ., which in the read_csv() function is specified by the na = "." option.
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# First, download the data to your computer,# save in the same folder as this Rmd file.dat_kang <-read_csv("ADA2_CL_09_kang.csv" , na =c("", ".") ) %>%# subset only our columns of interestselect( sex, species, ml ) %>%# make dose a factor variable and label the levelsmutate(sex =factor(sex , labels =c("M","F")) , species =factor(species, labels =c("Mg", "Mfm", "Mff")) )
Rows: 148 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (11): sex, species, pow, rw, sopd, cw, ifl, ml, mw, md, arh
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Removed 148 - 136 = 12 observations with missing values.
# The first few observationshead(dat_kang)
# A tibble: 6 × 3
sex species ml
<fct> <fct> <dbl>
1 M Mg 1086
2 M Mg 1158
3 M Mg 1131
4 M Mg 1090
5 M Mg 1175
6 M Mg 901
(1 p) Interpret plots of the data, distributional centers and shapes
The side-by-side boxplots of the data compare the mandible lengths across the 6 combinations of sex and species. Comment on the distributional shapes and compare the typical mandible lengths across groups.
Solution
[answer]
(1 p) Do the plots above suggest there is an interaction?
Do the lines for each group seem to be very different from parallel?
Solution
[answer]
Fit the two-way interaction model
Solution
[answer]
(1 p) Check model assumptions for full model
Recall that we assume that the full model is correct before we perform model reduction by backward selection.
Solution
[answer]
(3 p) ANOVA table, test for interaction and main effects
Test for the presence of interaction between sex and species. Also test for the presence of main effects, effects due to the sex and species.
Solution
[answer]
(1 p) Reduce to final model, test assumptions
If the model can be simplified (because interaction is not significant), then refit the model with only the main effects. Test whether the main effects are significant, reduce further if sensible. Test model assumptions of your final model.
Solution
[answer]
(2 p) Summarize the differences
Summarize differences, if any, in sexes and species using relevant multiple comparisons. Give clear interpretations of any significant effects.
This code is here to get you started. Determine which comparisons you plan to make and modify the appropriate code. Make the code chunk active by moving the {R} to the end of the initial code chunk line.
{R}
library(emmeans)
# Contrasts to perform pairwise comparisons
cont_kang <- emmeans(lm_object, specs = "sex")
cont_kang <- emmeans(lm_object, specs = "species")
cont_kang <- emmeans(lm_object, specs = "sex", by = c("species"))
cont_kang <- emmeans(lm_object, specs = "species", by = c("sex"))
# Means and CIs
confint(cont_kang, adjust = "bonferroni")
# Pairwise comparisons
cont_kang %>% pairs(adjust = "bonf") # adjust = "tukey" is default
EMM plot interpretation
This EMM plot (Estimated Marginal Means, aka Least-Squares Means) is only available when conditioning on one variable. The blue bars are confidence intervals for the EMMs; don’t ever use confidence intervals for EMMs to perform comparisons – they can be very misleading. The red arrows are for the comparisons among means; the degree to which the “comparison arrows” overlap reflects as much as possible the significance of the comparison of the two estimates. If an arrow from one mean overlaps an arrow from another group, the difference is not significant, based on the adjust setting (which defaults to “tukey”).
{R, fig.height = 5, fig.width = 6}
# Plot means and contrasts
p <- plot(cont_kang, comparisons = TRUE, adjust = "bonf")
p <- p + labs(title = "Bonferroni-adjusted contrasts")
p <- p + theme_bw()
print(p)
Please refer to the Chapter 5 section named emmeans and Bonferroni corrections for how to appropriately calculate the Bonferroni p-values for a two-way interaction model.