Answer the questions in this document, compile to html, print to pdf, and submit to UNM Learn. **Do not** add this to your “ALL” `.Rmd` document.

How can we estimate the proportion of water on the globe using a beach ball?

- (0 p) What is a good sampling strategy to pick points at random from a sphere?

In previous classes, we brainstormed strategies while looking at a beach ball printed as a globe.

- Suggestions
- Some suggest sampling latitude and longitude uniformly, but the resulting points are not uniformly distributed over the earth's surface: the poles would be sampled more densely than the equator.
- Some suggest cutting the ball into pieces and measuring how much water is on each piece.
- Finally, I suggest that we toss the ball around the room and when you catch it, look at your right pointer finger and determine if it’s on water or land; tossing randomizes the orientation of the ball, and catching samples a point on the ball.
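The concern about latitude and longitude in the first suggestion can be checked with a short simulation. This is a minimal sketch, not part of the class activity: the seed, the sample size `n_pts`, and the 60-degree cutoff are arbitrary choices made for illustration. It compares latitudes drawn uniformly in degrees with latitudes drawn so that points are uniform over the sphere's surface (the sine of the latitude is uniform on \([-1, 1]\)).

```r
set.seed(1)      # arbitrary seed, only so the sketch is reproducible
n_pts <- 1e5     # arbitrary number of simulated points

# Naive approach: latitude uniform on [-90, 90] degrees
lat_naive <- runif(n_pts, min = -90, max = 90)

# Uniform on the sphere: sin(latitude) is uniform on [-1, 1],
# while longitude is uniform on [-180, 180] degrees
lat_unif <- asin(runif(n_pts, min = -1, max = 1)) * 180 / pi
lon_unif <- runif(n_pts, min = -180, max = 180)

# Fraction of points within 30 degrees of either pole (|latitude| > 60)
frac_naive <- mean(abs(lat_naive) > 60)
frac_unif  <- mean(abs(lat_unif) > 60)

# Fraction of the sphere's surface area that lies in those two polar caps
frac_area <- 1 - sin(60 * pi / 180)

c(naive = frac_naive, uniform = frac_unif, area = frac_area)
```

The naive draw should put roughly one third of the points in the polar caps, even though those caps cover only about 13% of the surface; the second draw should track the area fraction.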

- (3 p) How can this strategy be used to estimate the proportion of the globe covered by water?

Assuming we use Strategy 3, … **[answer here]**

- (0 p) Below are the data that we collected in class from a previous year. Compute the confidence interval for the true proportion of water on the ball.

Let \(n=\) the total number of observations and let \(x=\) the number of “successes” (number of water observations, land is a “failure”). These numbers are entered into the `prop.test()` and `binom.test()` functions below.

`library(tidyverse)`

`## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --`

```
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.3 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 2.0.0 v forcats 0.5.1
```

```
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
```

```
## notes for prop.test() and binom.test()
# x = number of "successes"
# n = total sample size
x = 21
n = 21 + 14
dat_globe <-
tribble(
~type , ~freq , ~prop
, "Water" , x , x / n
, "Land" , n - x , (n - x) / n
)
dat_globe
```

```
## # A tibble: 2 x 3
## type freq prop
## <chr> <dbl> <dbl>
## 1 Water 21 0.6
## 2 Land 14 0.4
```

```
# prop.test() is an asymptotic (approximate) test for a binomial random variable
p_summary <- prop.test(x = x, n = n, conf.level = 0.95)
p_summary
```

```
##
## 1-sample proportions test with continuity correction
##
## data: x out of n, null probability 0.5
## X-squared = 1.0286, df = 1, p-value = 0.3105
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
## 0.4220904 0.7564794
## sample estimates:
## p
## 0.6
```

```
# binom.test() is an exact test for a binomial random variable
b_summary <- binom.test(x = x, n = n, conf.level = 0.95)
b_summary
```

```
##
## Exact binomial test
##
## data: x and n
## number of successes = 21, number of trials = 35, p-value = 0.3105
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
## 0.4211177 0.7612919
## sample estimates:
## probability of success
## 0.6
```
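To see where some of these numbers come from, the sketch below recomputes two of them by hand with base R: the exact (Clopper-Pearson) interval reported by `binom.test()`, via beta quantiles, and the continuity-corrected chi-squared statistic and p-value reported by `prop.test()`. The names `alpha`, `p0`, `ci_exact`, `x_sq`, and `p_val` are introduced here for illustration only.

```r
x     <- 21      # number of water observations
n     <- 35      # total number of observations
alpha <- 0.05    # 1 - confidence level
p0    <- 0.5     # null proportion used by both tests by default

# Clopper-Pearson (exact) interval from quantiles of the beta distribution
ci_exact <- c(
  qbeta(alpha / 2,     x,     n - x + 1),
  qbeta(1 - alpha / 2, x + 1, n - x)
)
ci_exact

# Chi-squared statistic with continuity correction for H0: p = p0
x_sq  <- (abs(x - n * p0) - 0.5)^2 / (n * p0 * (1 - p0))
p_val <- pchisq(x_sq, df = 1, lower.tail = FALSE)
c(x_sq = x_sq, p_val = p_val)

# Compare with the values computed by the built-in functions
all.equal(ci_exact, as.numeric(binom.test(x, n)$conf.int))
all.equal(x_sq, unname(prop.test(x, n)$statistic))
```

The confidence interval printed by `prop.test()` uses a different, approximate formula and is not reproduced in this sketch.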

- (4 p) Interpret the confidence interval for the proportion of water.

**[answer here]**

- (3 p) Here’s a gimme! Label the plot: the title, \(x\)-axis, and \(y\)-axis.

Note how to add error bars using `geom_errorbar()`. First extract the CI bounds from the `binom.test()` result computed above, then use them as the lower and upper limits of the error bar.

**[answer in plot]**

```
# get names of objects in b_summary
names(b_summary)
```

```
## [1] "statistic" "parameter" "p.value" "conf.int" "estimate"
## [6] "null.value" "alternative" "method" "data.name"
```

```
# here are the confidence interval bounds (the attribute tells us this is a 95% interval)
b_summary$conf.int
```

```
## [1] 0.4211177 0.7612919
## attr(,"conf.level")
## [1] 0.95
```

`b_summary$conf.int[1]`

`## [1] 0.4211177`

`b_summary$conf.int[2]`

`## [1] 0.7612919`
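As an alternative to indexing `b_summary` each time the bounds are needed, the estimate and the two bounds can be collected in a one-row tibble. This is only a sketch: `dat_ci`, `ci_lower`, and `ci_upper` are hypothetical names that do not appear elsewhere in this document.

```r
library(tibble)

x <- 21
n <- 35
b_summary <- binom.test(x = x, n = n, conf.level = 0.95)

# One row holding the point estimate and its 95% interval bounds
# (dat_ci, ci_lower, and ci_upper are illustrative names)
dat_ci <- tibble(
  type     = "Water",
  prop     = unname(b_summary$estimate),   # drop the "probability of success" name
  ci_lower = b_summary$conf.int[1],
  ci_upper = b_summary$conf.int[2]
)
dat_ci
```

With the bounds stored as columns, they could be mapped in a plot with, for example, `geom_errorbar(data = dat_ci, aes(ymin = ci_lower, ymax = ci_upper))`.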

```
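library(ggplot2)
# plot only the "Water" row: one bar at the observed proportion
p <- ggplot(data = dat_globe %>% filter(type == "Water"), aes(x = type, y = prop))
# faint reference lines at the bounds of a proportion (0 and 1)
p <- p + geom_hline(yintercept = c(0, 1), alpha = 1/4)
p <- p + geom_bar(stat = "identity", fill = "gray60")
# error bar from the exact binomial CI; ymin/ymax are the aesthetics geom_errorbar() expects
p <- p + geom_errorbar(aes(ymin = b_summary$conf.int[1], ymax = b_summary$conf.int[2]), width = 0.25)
p <- p + scale_y_continuous(limits = c(0, 1))
print(p)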
```