Rubric

Answer the questions in this document, compile to HTML, print to PDF, and submit to UNM Learn. Do not add this to your “ALL” .Rmd document.


Sample the Globe example

How can we estimate the proportion of water on the globe using a beach ball?

Questions to answer

  1. (0 p) What is a good sampling strategy to pick points at random from a sphere?

In previous classes, we brainstormed strategies while looking at a beach ball printed with the globe.

  • Suggestions
    1. Some suggest sampling latitude and longitude uniformly, but the resulting points are not uniformly distributed over the earth's surface: the poles would be sampled more densely than the equator.
    2. Some suggest cutting the ball into pieces and measuring how much water is on each piece.
    3. Finally, I suggest that we toss the ball around the room and when you catch it, look at your right pointer finger and determine if it’s on water or land; tossing randomizes the orientation of the ball, and catching samples a point on the ball.
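
To illustrate Suggestion 1, here is a minimal simulation sketch. It is not part of the class activity, and the variable names and the 60-degree cutoff are illustrative assumptions. It compares latitudes drawn uniformly in degrees with latitudes drawn so that points are uniform over the sphere's surface, using the fact that the sine of the latitude is then uniform on [-1, 1].

```r
set.seed(1)
n_pts <- 1e5

# Naive approach: latitude drawn uniformly in degrees
lat_naive <- runif(n_pts, min = -90, max = 90)

# Uniform on the sphere: sin(latitude) is uniform on [-1, 1]
lat_unif <- asin(runif(n_pts, min = -1, max = 1)) * 180 / pi

# Fraction of sampled points within 30 degrees of either pole (|lat| > 60)
frac_naive <- mean(abs(lat_naive) > 60)
frac_unif  <- mean(abs(lat_unif)  > 60)

# The polar caps beyond 60 degrees cover 1 - sin(60 degrees), about 13.4%,
# of the sphere's surface, so frac_unif should be near that value,
# while frac_naive should be near 1/3.
c(naive = frac_naive, uniform = frac_unif)
```

Under the naive scheme roughly a third of the points land in the polar caps, far more than their share of the surface, which is the oversampling described in Suggestion 1.
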
  1. (3 p) How can this strategy be used to estimate the proportion of the globe covered by water?

Assuming we use Strategy 3, … [answer here]

  1. (0 p) Below are the data that we collected in class in a previous year. Compute the confidence interval for the true proportion of water on the ball.

Let \(n=\) the total number of observations and let \(x=\) the number of “successes” (number of water observations, land is a “failure”). These numbers are entered into the prop.test() and binom.test() functions below.

library(tidyverse)
## -- Attaching packages ----------------------------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.3     v dplyr   1.0.0
## v tidyr   1.1.0     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## -- Conflicts -------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
# notes for prop.test() and binom.test()
# x = number of "successes"
# n = total sample size

x = 21
n = 21 + 14

dat_globe <-
  tribble(
    ~type   , ~freq , ~prop
  , "Water" ,     x ,      x  / n
  , "Land"  , n - x , (n - x) / n
  )
dat_globe
## # A tibble: 2 x 3
##   type   freq  prop
##   <chr> <dbl> <dbl>
## 1 Water    21   0.6
## 2 Land     14   0.4
# prop.test() is an asymptotic (approximate) test for a binomial random variable
p_summary <- prop.test(x = x, n = n, conf.level = 0.95)
p_summary
## 
##  1-sample proportions test with continuity correction
## 
## data:  x out of n, null probability 0.5
## X-squared = 1.0286, df = 1, p-value = 0.3105
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
##  0.4220904 0.7564794
## sample estimates:
##   p 
## 0.6
# binom.test() is an exact test for a binomial random variable
b_summary <- binom.test(x = x, n = n, conf.level = 0.95)
b_summary
## 
##  Exact binomial test
## 
## data:  x and n
## number of successes = 21, number of trials = 35, p-value = 0.3105
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.4211177 0.7612919
## sample estimates:
## probability of success 
##                    0.6
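
For reference, the exact interval reported by binom.test() is the Clopper-Pearson interval, which can be reproduced from beta quantiles. The sketch below assumes the same x = 21 and n = 35 as above; the names alpha, ci_lower, and ci_upper are introduced here for illustration only.

```r
x <- 21      # number of water observations
n <- 35      # total number of observations
alpha <- 0.05

# Clopper-Pearson (exact) bounds from beta quantiles
# (valid as written when 0 < x < n)
ci_lower <- qbeta(alpha / 2,     x,     n - x + 1)
ci_upper <- qbeta(1 - alpha / 2, x + 1, n - x)

c(ci_lower, ci_upper)
```

These bounds should agree with the 95 percent confidence interval printed by binom.test() above.
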
  1. (4 p) Interpret the confidence interval for the proportion of water.

[answer here]

  1. (3 p) Here’s a gimme! Label the plot: the title, \(x\)-axis, and \(y\)-axis.

Note how to add error bars using geom_errorbar(). First extract the CI bounds from the earlier binom.test() result, then use them as the lower and upper limits of the error bar.

[answer in plot]

# get names of objects in b_summary
names(b_summary)
## [1] "statistic"   "parameter"   "p.value"     "conf.int"    "estimate"   
## [6] "null.value"  "alternative" "method"      "data.name"
# here are the confidence interval bounds (the attribute tells us this is a 95% interval)
b_summary$conf.int
## [1] 0.4211177 0.7612919
## attr(,"conf.level")
## [1] 0.95
b_summary$conf.int[1]
## [1] 0.4211177
b_summary$conf.int[2]
## [1] 0.7612919
library(ggplot2)
p <- ggplot(data = dat_globe %>% filter(type == "Water"), aes(x = type, y = prop))
p <- p + geom_hline(yintercept = c(0, 1), alpha = 1/4)
p <- p + geom_bar(stat = "identity", fill = "gray60")
# geom_errorbar() documents its bounds as the ymin and ymax aesthetics
p <- p + geom_errorbar(aes(ymin = b_summary$conf.int[1], ymax = b_summary$conf.int[2]), width = 0.25)
p <- p + scale_y_continuous(limits = c(0, 1))
print(p)
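
As an alternative to referencing b_summary inside aes(), the bounds can be stored as columns next to the estimate, so that the plot is driven by a single data frame. This is only a sketch of that design choice; the names dat_water, ci_lower, ci_upper, and p_alt are hypothetical, and the labels are deliberately left off because adding them is the exercise.

```r
library(ggplot2)

x <- 21
n <- 35
b_summary <- binom.test(x = x, n = n, conf.level = 0.95)

# One row holding the estimate and its interval bounds
dat_water <- data.frame(
  type     = "Water",
  prop     = x / n,
  ci_lower = b_summary$conf.int[1],
  ci_upper = b_summary$conf.int[2]
)

# Bar for the estimate, error bar mapped from the data columns
p_alt <- ggplot(dat_water, aes(x = type, y = prop)) +
  geom_col(fill = "gray60") +
  geom_errorbar(aes(ymin = ci_lower, ymax = ci_upper), width = 0.25) +
  scale_y_continuous(limits = c(0, 1))
print(p_alt)
```

Keeping the bounds in the data makes it easier to extend the plot later, for example by adding a second row for another category.
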