# ADA1: Class 21, Mean inference and hypothesis testing

Advanced Data Analysis 1, Stat 427/527, Fall 2023, Prof. Erik Erhardt, UNM

Published: October 18, 2023

# Rubric

library(erikmisc)
library(tidyverse)
ggplot2::theme_set(ggplot2::theme_bw())  # set theme_bw for all plots

# Mechanics of a hypothesis test (review)

1. Set up the null and alternative hypotheses in words and notation.

• In words: “The population mean for [what is being studied] is different from [value of $$\mu_0$$].” (Note that the statement in words is in terms of the alternative hypothesis.)
• In notation: $$H_0: \mu=\mu_0$$ versus $$H_A: \mu \ne \mu_0$$ (where $$\mu_0$$ is specified by the context of the problem).
2. Choose the significance level of the test, such as $$\alpha=0.05$$.

3. Compute the test statistic, such as $$t_{s} = \frac{\bar{Y}-\mu_0}{SE_{\bar{Y}}}$$, where $$SE_{\bar{Y}}=s/\sqrt{n}$$ is the standard error.

4. Determine the tail(s) of the sampling distribution where the $$p$$-value from the test statistic will be calculated (for example, both tails, right tail, or left tail – in the direction of the alternative hypothesis).

5. State the conclusion in terms of the problem.

• Reject $$H_0$$ in favor of $$H_A$$ if $$p\textrm{-value} < \alpha$$.
• Fail to reject $$H_0$$ if $$p\textrm{-value} \ge \alpha$$. (Note: We DO NOT accept $$H_0$$.)
6. Check assumptions of the test.
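
The steps above can be traced numerically. The following is a minimal sketch on a simulated sample; the data, `mu0`, and the variable names are assumptions made for illustration and are not the class data. It computes the test statistic and two-sided $$p$$-value by hand and compares them with `t.test()`.

```r
# Hypothetical sample: simulated values, not the class data
set.seed(1)
y     <- rnorm(25, mean = 66, sd = 3)
mu0   <- 64      # hypothesized mean (step 1)
alpha <- 0.05    # significance level (step 2)

# Step 3: test statistic t_s = (ybar - mu0) / SE, with SE = s / sqrt(n)
n   <- length(y)
se  <- sd(y) / sqrt(n)
t_s <- (mean(y) - mu0) / se

# Step 4: two-sided p-value, area in both tails of the t distribution (df = n - 1)
p_two <- 2 * pt(abs(t_s), df = n - 1, lower.tail = FALSE)

# Step 5: decision rule
decision <- if (p_two < alpha) "Reject H0" else "Fail to reject H0"

# Cross-check the hand calculation against t.test()
fit <- t.test(y, mu = mu0, alternative = "two.sided")
c(by_hand = t_s,   t_test = unname(fit$statistic))
c(by_hand = p_two, t_test = fit$p.value)
decision
```

For a one-sided alternative, only one tail is used: `pt(t_s, df = n - 1, lower.tail = FALSE)` for the right tail, or `pt(t_s, df = n - 1)` for the left tail.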

# Height data for our class, test whether different from national average

Is the population mean height of UNM students eligible to take Stat 427/527 different from the US average for men (5 ft 9 1/2 in) or women (5 ft 4 in)?
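
The hypothesized US means are used in inches in the code that follows. A small sketch of the conversion, using a hypothetical helper `ft_in_to_in()` that is not part of the original code:

```r
# Hypothetical helper: convert feet and inches to total inches
ft_in_to_in <- function(feet, inches) 12 * feet + inches

mu0_M <- ft_in_to_in(5, 9.5)  # US average for men,   5 ft 9 1/2 in
mu0_F <- ft_in_to_in(5, 4)    # US average for women, 5 ft 4 in

c(M = mu0_M, F = mu0_F)
```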

# Height vs Hand Span
dat_hand <-
read_csv(...) |>  # placeholder: the data file path is not shown in the source
na.omit() |>
mutate(
Gender_M_F = Gender_M_F |> factor(levels = c("M", "F"))
, Year       = Year       |> factor()
)
Rows: 504 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): Year, Gender_M_F
dbl (4): Table, Person, Height_in, HandSpan_cm

ℹ Use spec() to retrieve the full column specification for this data.
ℹ Specify the column types or set show_col_types = FALSE to quiet this message.
str(dat_hand)
tibble [297 × 6] (S3: tbl_df/tbl/data.frame)
 $ Year       : Factor w/ 4 levels "F15","F16","F19",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Table      : num [1:297] 1 1 1 1 1 1 1 1 2 2 ...
 $ Person     : num [1:297] 1 2 3 4 5 6 7 8 1 2 ...
 $ Gender_M_F : Factor w/ 2 levels "M","F": 1 2 2 2 1 1 2 1 1 2 ...
 $ Height_in  : num [1:297] 69 66 65 62 67 67 65 70 67 63 ...
 $ HandSpan_cm: num [1:297] 21.5 20 20 18 19.8 23 22 21 21.2 16.5 ...
 - attr(*, "na.action")= 'omit' Named int [1:207] 9 13 14 15 16 17 18 22 23 24 ...
  ..- attr(*, "names")= chr [1:207] "9" "13" "14" "15" ...

Plot the estimated mean from our class sample versus the true US mean.

## If we create a summary data.frame with a similar structure as our data, then we
##   can annotate our plot with those summaries.

est_mean <-
dat_hand |>
group_by(
Gender_M_F
) |>
summarize(
Height_in = mean(Height_in)
, .groups = "drop_last"
) |>
ungroup() |>
mutate(
TrueEst = "Est"
)

true_mean <-
tribble(
~Gender_M_F, ~Height_in, ~TrueEst
,         "F",       64.0,   "True"
,         "M",       69.5,   "True"
)

trueest_mean <-
est_mean |>
bind_rows(
true_mean
)

trueest_mean
# A tibble: 4 × 3
Gender_M_F Height_in TrueEst
<chr>          <dbl> <chr>
1 M               70.4 Est
2 F               65.5 Est
3 F               64   True
4 M               69.5 True   

Here are two ways to plot our data, annotating the observed and hypothesized means.

library(ggplot2)
p1 <- ggplot(data = dat_hand, aes(x = Gender_M_F, y = Height_in))
p1 <- p1 + geom_boxplot(alpha = 1/4)
p1 <- p1 + geom_jitter(position = position_jitter(width = 0.1))
p1 <- p1 + geom_point(data = trueest_mean, aes(colour = TrueEst, shape = TrueEst), size = 4, alpha = 3/4)
p1 <- p1 + labs(title = "Boxplots")
#print(p1)

library(ggplot2)
p2 <- ggplot(data = dat_hand, aes(x = Height_in))
p2 <- p2 + geom_histogram(binwidth = 1)
p2 <- p2 + geom_vline(data = trueest_mean, aes(xintercept = Height_in, colour = TrueEst, linetype = TrueEst))
p2 <- p2 + facet_grid(Gender_M_F ~ .)
p2 <- p2 + labs(title = "Histograms")
#print(p2)

p_arranged <-
cowplot::plot_grid(
plotlist  = list(p1, p2)  # list of plots
, nrow      = 1             # number of rows for grid of plots
, ncol      = NULL          # number of columns, left unspecified
, labels    = "AUTO"        # A and B panel labels
, rel_widths = c(1, 2)      # let Plot 2 take twice as much horizontal space as Plot 1
)

print(p_arranged)

## Example: Test female height equal to US.

# look at help for t.test
# ?t.test
# defaults include: alternative = "two.sided", conf.level = 0.95

# test females
t_summary_F <-
t.test(
dat_hand |> filter(Gender_M_F == "F") |> pull(Height_in)
, mu = 64
, alternative = "two.sided"
)

t_summary_F

One Sample t-test

data:  pull(filter(dat_hand, Gender_M_F == "F"), Height_in)
t = 7.0299, df = 137, p-value = 8.902e-11
alternative hypothesis: true mean is not equal to 64
95 percent confidence interval:
65.07520 65.91683
sample estimates:
mean of x
65.49601 
names(t_summary_F)
 [1] "statistic"   "parameter"   "p.value"     "conf.int"    "estimate"
[6] "null.value"  "stderr"      "alternative" "method"      "data.name"  
e_plot_ttest_pval(t_summary_F)

# assess model assumptions with bootstrap
e_plot_bs_one_samp_dist(
dat = dat_hand |> filter(Gender_M_F == "F") |> pull(Height_in)
)
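
`e_plot_bs_one_samp_dist()` comes from the `erikmisc` package, and its internals are not shown here. As a rough sketch of the general idea only, assuming a nonparametric bootstrap of the sample mean, the base-R code below resamples a hypothetical skewed sample with replacement and compares the spread of the bootstrap means with the usual standard error. The data, `B`, and the variable names are assumptions for illustration.

```r
# Hypothetical right-skewed sample (simulated, not the class data)
set.seed(2)
y <- 60 + rexp(40, rate = 1 / 3)
n <- length(y)
B <- 2000  # number of bootstrap resamples

# Bootstrap: resample y with replacement and record the mean each time
bs_means <- replicate(B, mean(sample(y, size = n, replace = TRUE)))

# The bootstrap SD of the mean should be close to s / sqrt(n)
se_theory <- sd(y) / sqrt(n)
se_boot   <- sd(bs_means)
c(se_theory = se_theory, se_boot = se_boot)

# Optional visual check of approximate normality of the bootstrap means
# qqnorm(bs_means); qqline(bs_means)
```

If the bootstrap means look roughly normal even though the raw data are skewed, the normality assumption for the sampling distribution of the mean is not contradicted.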

Hypothesis test

1. “The population mean height for females at UNM eligible to take Stat 427/527 is different from the US population value of $$\mu_0=64$$ inches.”

• $$H_0: \mu = 64$$ versus $$H_A: \mu \ne 64$$
2. Let $$\alpha = 0.05$$, the significance level of the test and the Type-I error probability if the null hypothesis is true.

3. $$t_{s} = 7.03$$.

4. $$p = 8.9\times 10^{-11}$$, this is the observed significance of the test.

5. Because $$p = 8.9\times 10^{-11} < 0.05$$, we have sufficient evidence to reject $$H_0$$, concluding that the population mean height of females at UNM eligible to take Stat 427/527 is different from the US population mean of 64 inches.

6. Model assumptions are “not violated” (don’t say they were “met”) because the sampling distribution of the mean is approximately normal based on the bootstrap.

7. Type-1 or Type-2 error. I’ll let you answer this one.
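
The values in steps 3 and 4 can be cross-checked from the printed `t.test()` output alone. This sketch copies the rounded values from that output, so its results agree with the printout only up to rounding; the variable names are illustrative.

```r
# Values copied from the printed one-sample t-test output for females
ybar <- 65.49601   # sample mean
mu0  <- 64         # hypothesized mean
t_s  <- 7.0299     # reported t statistic
df   <- 137        # reported degrees of freedom

# Standard error implied by t_s = (ybar - mu0) / SE
se <- (ybar - mu0) / t_s

# Two-sided p-value: area in both tails of the t distribution
p_two <- 2 * pt(abs(t_s), df = df, lower.tail = FALSE)

# 95% confidence interval: ybar +/- t_crit * SE
ci <- ybar + c(-1, 1) * qt(0.975, df = df) * se

p_two
ci
```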

## (5 p) Your turn: Test male height greater than US.

As above, set up the hypothesis test for males, but this time test whether UNM males are taller on average than males in the US population.

## You'll need to modify the statement below to correspond
## to the hypothesis you wish to test

# test males
t_summary_M <-
t.test(
dat_hand |> filter(Gender_M_F == "M") |> pull(Height_in)
, mu = 0
, alternative = "two.sided"
)

t_summary_M

One Sample t-test

data:  pull(filter(dat_hand, Gender_M_F == "M"), Height_in)
t = 314.4, df = 158, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
69.96618 70.85080
sample estimates:
mean of x
70.40849 
e_plot_ttest_pval(t_summary_M)

# assess model assumptions with bootstrap
e_plot_bs_one_samp_dist(
dat = dat_hand |> filter(Gender_M_F == "M") |> pull(Height_in)
)

Hypothesis test

1. “”

• $$H_0:$$ versus $$H_A:$$
2. Let $$\alpha=0.05$$, the significance level of the test and the Type-I error probability if the null hypothesis is true.

3. $$t_{s} =$$.

4. $$p =$$, this is the observed significance of the test.

5. Because $$p =$$, …

6. Model assumptions are “not violated”/“violated” and why?

7. Given your conclusion, state whether you could have made a Type-1 or Type-2 error and why it is one but not the other.
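
The definitions of the two error types can be illustrated by simulation. The sketch below is a hypothetical setup, not the class data: it repeatedly draws normal samples and runs a two-sided one-sample $$t$$-test of $$H_0: \mu = 64$$. When the true mean is 64, every rejection is a Type-I error; when the true mean is 65, every failure to reject is a Type-II error. The sample size, standard deviation, and alternative mean are assumptions chosen for illustration.

```r
set.seed(3)
alpha  <- 0.05
n      <- 30     # hypothetical sample size
n_sims <- 4000   # number of simulated studies

# Returns TRUE when a two-sided test of H0: mu = 64 rejects at level alpha
rejects <- function(true_mean) {
  y <- rnorm(n, mean = true_mean, sd = 3)
  t.test(y, mu = 64)$p.value < alpha
}

# H0 is true: the rejection rate estimates the Type-I error probability
type1_rate <- mean(replicate(n_sims, rejects(true_mean = 64)))

# H0 is false: the fail-to-reject rate estimates the Type-II error probability
type2_rate <- mean(!replicate(n_sims, rejects(true_mean = 65)))

c(type1_rate = type1_rate, type2_rate = type2_rate)
```

The Type-I rate should land near $$\alpha$$, while the Type-II rate depends on the sample size, the variability, and how far the true mean is from $$\mu_0$$.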

# Length of gestation, confidence interval

Every year, the United States Department of Health and Human Services releases to the public a large dataset containing information on births recorded in the country. This dataset has been of interest to medical researchers who are studying the relation between habits and practices of expectant mothers and the birth of their children. In this exercise we work with a random sample of 1,000 cases from the dataset released in 2014. The length of pregnancy, measured in weeks, is commonly referred to as gestation. It is commonly reported that the average length of human gestation is 280 days, or 40 weeks, from the first day of the woman’s last menstrual period. Test whether these data support this claim.
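
Because this section pairs a confidence interval with a hypothesis test, it may help to see how the two relate: a two-sided level-$$\alpha$$ $$t$$-test rejects $$H_0: \mu = \mu_0$$ exactly when $$\mu_0$$ falls outside the $$(1-\alpha)$$ confidence interval. The sketch below checks this on a hypothetical simulated sample (not the births data); the grid of $$\mu_0$$ values and the variable names are assumptions for illustration.

```r
# Hypothetical sample (simulated, not the births data)
set.seed(4)
y     <- rnorm(50, mean = 10, sd = 2)
alpha <- 0.05

# Confidence interval by hand: ybar +/- t_crit * s / sqrt(n)
n       <- length(y)
se      <- sd(y) / sqrt(n)
t_crit  <- qt(1 - alpha / 2, df = n - 1)
ci_hand <- mean(y) + c(-1, 1) * t_crit * se

# Two-sided p-values for several candidate null values
mu0_grid <- c(9, 9.5, 10, 10.5, 11)
p_vals   <- sapply(mu0_grid, function(m) t.test(y, mu = m)$p.value)

# Rejecting at level alpha coincides with mu0 lying outside the interval
outside_ci <- mu0_grid < ci_hand[1] | mu0_grid > ci_hand[2]
data.frame(mu0 = mu0_grid, p_value = p_vals,
           reject = p_vals < alpha, outside_ci = outside_ci)
```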

library(openintro)
Loading required package: airports
Loading required package: cherryblossom
Loading required package: usdata
ggplot(births14, aes(x = weeks)) +
geom_histogram(binwidth = 1) +
labs(
x = "Gestation (weeks)",
y = "Count",
title = "Random sample of 1,000 births"
)

t_summary_births <-
t.test(
births14$weeks
, mu = 0
, alternative = "two.sided"
)

t_summary_births

One Sample t-test

data:  births14$weeks
t = 476.7, df = 999, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
38.50683 38.82517
sample estimates:
mean of x
38.666 
e_plot_ttest_pval(t_summary_births)

# assess model assumptions with bootstrap
e_plot_bs_one_samp_dist(births14$weeks)

## (5 p) Hypothesis test

1. “”

• $$H_0:$$ versus $$H_A:$$
2. Let $$\alpha=0.05$$, the significance level of the test and the Type-I error probability if the null hypothesis is true.

3. $$t_{s} =$$.

4. $$p =$$, this is the observed significance of the test.

5. Because $$p =$$, …

6. Given your conclusion, state whether you could have made a Type-1 or Type-2 error and why it is one but not the other.