```
library(erikmisc)
library(tidyverse)
ggplot2::theme_set(ggplot2::theme_bw())  # set theme_bw for all ggplot2 plots
```

# ADA1: Class 21, Mean inference and hypothesis testing

Advanced Data Analysis 1, Stat 427/527, Fall 2023, Prof. Erik Erhardt, UNM

# Rubric

# Mechanics of a hypothesis test (review)

1. Set up the **null and alternative hypotheses** in words and notation.
   - In words: “The population mean for [what is being studied] is different from [value of \(\mu_0\)].” (Note that the statement in words is in terms of the alternative hypothesis.)
   - In notation: \(H_0: \mu=\mu_0\) versus \(H_A: \mu \ne \mu_0\) (where \(\mu_0\) is specified by the context of the problem).
2. Choose the **significance level** of the test, such as \(\alpha=0.05\).
3. Compute the **test statistic**, such as \(t_{s} = \frac{\bar{Y}-\mu_0}{SE_{\bar{Y}}}\), where \(SE_{\bar{Y}}=s/\sqrt{n}\) is the standard error.
4. Determine the **tail(s)** of the sampling distribution where the **\(p\)-value** from the test statistic will be calculated (for example, both tails, right tail, or left tail, in the direction of the alternative hypothesis).
5. State the **conclusion** in terms of the problem.
   - Reject \(H_0\) in favor of \(H_A\) if \(p\textrm{-value} < \alpha\).
   - Fail to reject \(H_0\) if \(p\textrm{-value} \ge \alpha\). (Note: We DO NOT *accept* \(H_0\).)
6. **Check assumptions** of the test.
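The steps above can be sketched by hand in R and compared with `t.test()`. This is only an illustration: the data are the heights in R's built-in `women` dataset, used as a stand-in sample, and the values of `mu0` and `alpha` are assumptions chosen for the example.

```r
# Illustrative data: heights (inches) from R's built-in `women` dataset,
# used here only as a stand-in for a sample.
y     <- datasets::women$height
mu0   <- 64    # hypothesized mean (assumed for illustration)
alpha <- 0.05  # significance level

# Test statistic: t_s = (ybar - mu0) / (s / sqrt(n))
n   <- length(y)
se  <- sd(y) / sqrt(n)
t_s <- (mean(y) - mu0) / se
df  <- n - 1

# p-value from the tail(s) that match the alternative hypothesis
p_two_sided <- 2 * pt(abs(t_s), df = df, lower.tail = FALSE)  # H_A: mu != mu0
p_greater   <- pt(t_s, df = df, lower.tail = FALSE)           # H_A: mu >  mu0
p_less      <- pt(t_s, df = df, lower.tail = TRUE)            # H_A: mu <  mu0

# Conclusion for the two-sided test
decision <- if (p_two_sided < alpha) "Reject H0" else "Fail to reject H0"

# Compare the by-hand p-value with t.test()
fit <- t.test(y, mu = mu0, alternative = "two.sided")
c(by_hand = p_two_sided, t_test = fit$p.value)
decision
```

Note that the two one-sided p-values sum to 1, and the two-sided p-value is twice the smaller of them.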

# Height data for our class, test whether different from national average

Is the population mean height of UNM students eligible to take Stat 427/527 different from the US average for men (5 ft 9 1/2 in) or women (5 ft 4 in)?

```
# Height vs Hand Span
dat_hand <-
  read_csv("ADA1_CL_09_Data-CorrHandSpan.csv") |>
  na.omit() |>
  mutate(
    Gender_M_F = Gender_M_F |> factor(levels = c("M", "F"))
  , Year       = Year |> factor()
  )
```

```
Rows: 504 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): Year, Gender_M_F
dbl (4): Table, Person, Height_in, HandSpan_cm
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
```

`str(dat_hand)`

```
tibble [297 × 6] (S3: tbl_df/tbl/data.frame)
$ Year : Factor w/ 4 levels "F15","F16","F19",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Table : num [1:297] 1 1 1 1 1 1 1 1 2 2 ...
$ Person : num [1:297] 1 2 3 4 5 6 7 8 1 2 ...
$ Gender_M_F : Factor w/ 2 levels "M","F": 1 2 2 2 1 1 2 1 1 2 ...
$ Height_in : num [1:297] 69 66 65 62 67 67 65 70 67 63 ...
$ HandSpan_cm: num [1:297] 21.5 20 20 18 19.8 23 22 21 21.2 16.5 ...
- attr(*, "na.action")= 'omit' Named int [1:207] 9 13 14 15 16 17 18 22 23 24 ...
..- attr(*, "names")= chr [1:207] "9" "13" "14" "15" ...
```

Plot the estimated mean from our class sample versus the true US mean.

```
## If we create a summary data.frame with a similar structure as our data, then we
##   can annotate our plot with those summaries.
est_mean <-
  dat_hand |>
  group_by(
    Gender_M_F
  ) |>
  summarize(
    Height_in = mean(Height_in)
  , .groups = "drop_last"
  ) |>
  ungroup() |>
  mutate(
    TrueEst = "Est"
  )
true_mean <-
  tribble(
    ~Gender_M_F, ~Height_in, ~TrueEst
  , "F",         64.0,       "True"
  , "M",         69.5,       "True"
  )
trueest_mean <-
  est_mean |>
  bind_rows(
    true_mean
  )
trueest_mean
```

```
# A tibble: 4 × 3
Gender_M_F Height_in TrueEst
<chr> <dbl> <chr>
1 M 70.4 Est
2 F 65.5 Est
3 F 64 True
4 M 69.5 True
```

Here are two ways to plot our data, annotating the observed and hypothesized means.

```
library(ggplot2)
# Boxplots with jittered points and the estimated/true means overlaid
p1 <- ggplot(data = dat_hand, aes(x = Gender_M_F, y = Height_in))
p1 <- p1 + geom_boxplot(alpha = 1/4)
p1 <- p1 + geom_jitter(position = position_jitter(width = 0.1))
p1 <- p1 + geom_point(data = trueest_mean, aes(colour = TrueEst, shape = TrueEst), size = 4, alpha = 3/4)
p1 <- p1 + labs(title = "Boxplots")
p1 #print(p1)
# Histograms by gender with vertical lines at the estimated/true means
p2 <- ggplot(data = dat_hand, aes(x = Height_in))
p2 <- p2 + geom_histogram(binwidth = 1)
p2 <- p2 + geom_vline(data = trueest_mean, aes(xintercept = Height_in, colour = TrueEst, linetype = TrueEst))
p2 <- p2 + facet_grid(Gender_M_F ~ .)
p2 <- p2 + labs(title = "Histograms")
p2 #print(p2)
p_arranged <-
  cowplot::plot_grid(
    plotlist   = list(p1, p2)  # list of plots
  , nrow       = 1             # number of rows for grid of plots
  , ncol       = NULL          # number of columns, left unspecified
  , labels     = "AUTO"        # A and B panel labels
  , rel_widths = c(1, 2)       # let Plot 2 take twice as much horizontal space as Plot 1
  )
print(p_arranged)
```

## Example: Test female height equal to US.

```
# look at help for t.test
# ?t.test
# defaults include: alternative = "two.sided", conf.level = 0.95
# test females
t_summary_F <-
  t.test(
    dat_hand |> filter(Gender_M_F == "F") |> pull(Height_in)
  , mu          = 64
  , alternative = "two.sided"
  )
t_summary_F
```

```
One Sample t-test
data: pull(filter(dat_hand, Gender_M_F == "F"), Height_in)
t = 7.0299, df = 137, p-value = 8.902e-11
alternative hypothesis: true mean is not equal to 64
95 percent confidence interval:
65.07520 65.91683
sample estimates:
mean of x
65.49601
```

`names(t_summary_F)`

```
[1] "statistic" "parameter" "p.value" "conf.int" "estimate"
[6] "null.value" "stderr" "alternative" "method" "data.name"
```

`e_plot_ttest_pval(t_summary_F)`

```
# assess model assumptions with bootstrap
e_plot_bs_one_samp_dist(
dat = dat_hand |> filter(Gender_M_F == "F") |> pull(Height_in)
)
```
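The general idea behind a bootstrap check of the sampling distribution can be sketched in base R. This is not the implementation of `e_plot_bs_one_samp_dist()`; it is a rough illustration under stated assumptions: the built-in `women` heights stand in for the sample, and the seed and number of resamples are arbitrary.

```r
set.seed(427)  # arbitrary seed so the sketch is reproducible

y      <- datasets::women$height  # illustrative stand-in sample
n_boot <- 2000                    # number of bootstrap resamples (arbitrary)

# Resample with replacement and record the mean of each resample
boot_means <- replicate(n_boot, mean(sample(y, size = length(y), replace = TRUE)))

# Compare the bootstrap means with a normal distribution
hist(boot_means, breaks = 30, freq = FALSE,
     main = "Bootstrap sampling distribution of the mean",
     xlab = "Resampled mean")
curve(dnorm(x, mean = mean(boot_means), sd = sd(boot_means)), add = TRUE)

# A normal QQ-plot is another way to look for departures from normality
qqnorm(boot_means)
qqline(boot_means)
```

If the histogram tracks the normal curve and the QQ-plot points fall near the line, the normality assumption for the sampling distribution of the mean is not contradicted.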

**Hypothesis test**

“The population mean height for females at UNM eligible to take Stat 427/527 is different from the US population value of \(\mu_0=64\) inches.”

- \(H_0: \mu = 64\) versus \(H_A: \mu \ne 64\)

Let \(\alpha = 0.05\), the significance level of the test and the Type-I error probability if the null hypothesis is true.

\(t_{s} = 7.03\).

\(p = 8.9\times 10^{-11}\), this is the observed significance of the test.

Because \(p = 8.9\times 10^{-11} < 0.05\), we have sufficient evidence to reject \(H_0\), concluding that the population mean height of females at UNM eligible to take Stat 427/527 is different from the US population mean of 64 inches.

Model assumptions are “not violated” (don’t say they were “met”) because the sampling distribution of the mean is approximately normal based on the bootstrap.

Type-1 or Type-2 error. I’ll let you answer this one.
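The statement above that \(\alpha\) is the Type-I error probability when the null hypothesis is true can be illustrated with a small simulation. Everything here is an assumption for illustration: normal data with mean equal to `mu0`, a standard deviation of 3, samples of size 30, and an arbitrary seed.

```r
set.seed(527)  # arbitrary seed so the sketch is reproducible

alpha <- 0.05  # significance level
mu0   <- 64    # hypothesized mean; also the true mean in this simulation
n_sim <- 2000  # number of simulated samples (arbitrary)

# Simulate samples for which H0 is true and keep each test's p-value
p_values <- replicate(
  n_sim,
  t.test(rnorm(30, mean = mu0, sd = 3), mu = mu0)$p.value
)

# Proportion of simulated tests that reject H0 even though H0 is true
type1_rate <- mean(p_values < alpha)
type1_rate
```

The proportion should land close to `alpha`, since each rejection in this simulation is a Type-I error.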

## (5 p) Your turn: Test male height greater than US.

As above, set up the hypothesis test for males, but this time test whether UNM males are **taller** on average than males in the US population.

```
## You'll need to modify the statement below to correspond
##   to the hypothesis you wish to test
# test males
t_summary_M <-
  t.test(
    dat_hand |> filter(Gender_M_F == "M") |> pull(Height_in)
  , mu          = 0
  , alternative = "two.sided"
  )
t_summary_M
```

```
One Sample t-test
data: pull(filter(dat_hand, Gender_M_F == "M"), Height_in)
t = 314.4, df = 158, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
69.96618 70.85080
sample estimates:
mean of x
70.40849
```

`e_plot_ttest_pval(t_summary_M)`

```
# assess model assumptions with bootstrap
e_plot_bs_one_samp_dist(
dat = dat_hand |> filter(Gender_M_F == "M") |> pull(Height_in)
)
```

**Hypothesis test**

“”

- $H_0: $ versus $H_A: $

Let \(\alpha=0.05\), the significance level of the test and the Type-I error probability if the null hypothesis is true.

$t_{s} = $.

$p = $, this is the observed significance of the test.

Because $p = $, …

Model assumptions are “not violated”/“violated” and why?

Given your conclusion, state whether you could have made a Type-1 or Type-2 error and why it is one but not the other.

# Length of gestation, confidence interval

Every year, the United States Department of Health and Human Services releases to the public a large dataset containing information on births recorded in the country. This dataset has been of interest to medical researchers who are studying the relation between habits and practices of expectant mothers and the birth of their children. In this exercise we work with a random sample of 1,000 cases from the dataset released in 2014. The length of pregnancy, measured in weeks, is commonly referred to as gestation. It is commonly reported that the average length of human gestation is 280 days, or 40 weeks, from the first day of the woman’s last menstrual period. Test whether these data support this claim.

`library(openintro)`

`Loading required package: airports`

`Loading required package: cherryblossom`

`Loading required package: usdata`

```
ggplot(births14, aes(x = weeks)) +
geom_histogram(binwidth = 1) +
labs(
x = "Gestation (weeks)",
y = "Count",
title = "Random sample of 1,000 births"
)
```

```
t_summary_births <-
  t.test(
    births14$weeks
  , mu          = 0
  , alternative = "two.sided"
  )
t_summary_births
```

```
One Sample t-test
data: births14$weeks
t = 476.7, df = 999, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
38.50683 38.82517
sample estimates:
mean of x
38.666
```
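The confidence interval reported by `t.test()` can be reproduced by hand as \(\bar{Y} \pm t_{crit} \times SE_{\bar{Y}}\). The sketch below uses the built-in `women` heights as a stand-in sample and an assumed null value of 64, so the numbers are not those of the births data.

```r
y          <- datasets::women$height  # illustrative stand-in sample
conf_level <- 0.95

n      <- length(y)
se     <- sd(y) / sqrt(n)                             # standard error of the mean
t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)    # two-sided critical value

ci_by_hand <- mean(y) + c(-1, 1) * t_crit * se

fit <- t.test(y, mu = 64, conf.level = conf_level)
ci_by_hand
fit$conf.int

# For a two-sided test, mu0 lies inside the 95% interval
# exactly when the p-value is at least 0.05
mu0_inside <- unname(
  fit$null.value >= fit$conf.int[1] & fit$null.value <= fit$conf.int[2]
)
c(mu0_inside = mu0_inside, fail_to_reject = fit$p.value >= 1 - conf_level)
```

The interval itself does not depend on `mu`; only the test statistic and p-value do.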

`e_plot_ttest_pval(t_summary_births)`

```
# assess model assumptions with bootstrap
e_plot_bs_one_samp_dist(births14$weeks)
```

## (5 p) Hypothesis test

“”

- $H_0: $ versus $H_A: $

Let \(\alpha=0.05\), the significance level of the test and the Type-I error probability if the null hypothesis is true.

$t_{s} = $.

$p = $, this is the observed significance of the test.

Because $p = $, …

Given your conclusion, state whether you could have made a Type-1 or Type-2 error and why it is one but not the other.