# Rubric

Answer the questions in this document, compile to html, print to pdf, and submit to UNM Learn. Do not add this to your “ALL” .Rmd document.

1. (2 p) Read and plot data, with $$x$$ = Avg_Mercury vs $$y$$ = Alkalinity.

2. (1 p) Describe the relationship you see.

3. (4 p) Determine an appropriate transformation of the $$x$$-variable, $$y$$-variable, or both in order to have a straight-line relationship.

• I recommend creating three more plots: $$(\log(x), y)$$, $$(x,\log(y))$$, and $$(\log(x), \log(y))$$. Choose the one that, in your view, is best described by a straight line.

• Describe in a sentence what makes this one the best choice.

4. (3 p) Interpret the slope on the transformed scale. For example, “For each unit increase in [$$x$$-variable], …”
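As a reminder of how a log transformation changes the reading of a fitted slope, here is a short sketch. It assumes a fitted line with intercept $$b_0$$ and slope $$b_1$$ (symbols introduced here for illustration, not part of the assignment) and base-10 logs, to match the `scale_?_log10()` axes in section 3. If you use R's `log()`, which is the natural log, replace 10 by $$e$$ throughout.

$$
\begin{aligned}
y = b_0 + b_1 \log_{10}(x)
  &\;\Rightarrow\; \text{multiplying } x \text{ by } 10 \text{ adds } b_1 \text{ to } y, \\
\log_{10}(y) = b_0 + b_1 x
  &\;\Rightarrow\; y = 10^{b_0}\, 10^{b_1 x}, \text{ so adding } 1 \text{ to } x \text{ multiplies } y \text{ by } 10^{b_1}, \\
\log_{10}(y) = b_0 + b_1 \log_{10}(x)
  &\;\Rightarrow\; y = 10^{b_0}\, x^{b_1}, \text{ so multiplying } x \text{ by } 10 \text{ multiplies } y \text{ by } 10^{b_1}.
\end{aligned}
$$

Only the case matching your chosen transformation applies to your answer, and the interpretation should be stated in terms of the actual variables.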

## 1. Read and plot data

Save the datafile from the website to your computer. Read the data.

library(tidyverse)
## -- Attaching packages ------------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.3     v dplyr   1.0.0
## v tidyr   1.1.0     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## -- Conflicts ---------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::lag()    masks stats::lag()
# the "skip = 25" ignores the first 25 lines of the text file (where I put descriptive text)
#   and starts reading at line 26.
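dat_fish <-
  read.csv(   # assumption: read.csv(), consistent with the 'data.frame' output below
    "..."     # path to the datafile you saved (not shown in the source)
  , skip = 25
  )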
str(dat_fish)
## 'data.frame':    53 obs. of  12 variables:
##  $ ID                    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Lake                  : chr  "Alligator" "Annie" "Apopka" "BlueCypress" ...
##  $ Alkalinity            : num  5.9 3.5 116 39.4 2.5 19.6 5.2 71.4 26.4 4.8 ...
##  $ pH                    : num  6.1 5.1 9.1 6.9 4.6 7.3 5.4 8.1 5.8 6.4 ...
##  $ Calcium               : num  3 1.9 44.1 16.4 2.9 4.5 2.8 55.2 9.2 4.6 ...
##  $ Chlorophyll           : num  0.7 3.2 128.3 3.5 1.8 ...
##  $ Avg_Mercury           : num  1.23 1.33 0.04 0.44 1.2 0.27 0.48 0.19 0.83 0.81 ...
##  $ No.samples            : int  5 7 6 12 12 14 10 12 24 12 ...
##  $ min                   : num  0.85 0.92 0.04 0.13 0.69 0.04 0.3 0.08 0.26 0.41 ...
##  $ max                   : num  1.43 1.9 0.06 0.84 1.5 0.48 0.72 0.38 1.4 1.47 ...
##  $ X3_yr_Standard_Mercury: num  1.53 1.33 0.04 0.44 1.33 0.25 0.45 0.16 0.72 0.81 ...
##  $ age_data              : int  1 0 0 0 1 1 1 1 1 1 ...

Plot $$x$$ = Avg_Mercury vs $$y$$ = Alkalinity on their natural (original) scales.
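One possible way to set up this scatterplot is sketched below. So that it runs on its own, it uses a small stand-in data frame, `dat_demo` (a name introduced here), holding only the first ten lakes copied from the `str()` output above. In your document you would use the full `dat_fish` instead.

```r
library(ggplot2)

# Stand-in data: first ten lakes only, copied from the str() output above.
# This is an illustration; replace dat_demo with dat_fish (53 rows) in your work.
dat_demo <- data.frame(
  Alkalinity  = c(5.9, 3.5, 116, 39.4, 2.5, 19.6, 5.2, 71.4, 26.4, 4.8)
, Avg_Mercury = c(1.23, 1.33, 0.04, 0.44, 1.2, 0.27, 0.48, 0.19, 0.83, 0.81)
)

# Scatterplot on the original scales: x = Avg_Mercury, y = Alkalinity
p <- ggplot(dat_demo, aes(x = Avg_Mercury, y = Alkalinity))
p <- p + geom_point()
p <- p + labs(title = "Alkalinity vs Avg_Mercury, original scales")
print(p)
```

Building the plot one layer at a time with `p <- p + ...` matches the style of the axis-scaling commands in section 3.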

## 3. Transform and plot

Note, there are two ways to plot the transformed data in ggplot(). Do either of these but not both.

1. Transform variables, plot transformed variables.
2. Plot original variable with rescaled axes.

(Note: Do not plot transformed variables on scaled axes, since that’s like transforming twice: $$\log(\log(x))$$.)
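The two approaches can be sketched side by side as follows. The example transforms only $$x$$, as one of the three candidate plots and not as a recommendation of which transformation to choose. It uses base-10 logs so that both versions match. The data frame `dat_demo` and the column `log10_Avg_Mercury` are names introduced here for illustration, with values taken from the first ten lakes in the `str()` output.

```r
library(ggplot2)

# Stand-in data: first ten lakes only (illustration; use dat_fish in your work).
dat_demo <- data.frame(
  Alkalinity  = c(5.9, 3.5, 116, 39.4, 2.5, 19.6, 5.2, 71.4, 26.4, 4.8)
, Avg_Mercury = c(1.23, 1.33, 0.04, 0.44, 1.2, 0.27, 0.48, 0.19, 0.83, 0.81)
)

# Way 1: transform the variable, then plot it on ordinary (linear) axes.
dat_demo$log10_Avg_Mercury <- log10(dat_demo$Avg_Mercury)
p1 <- ggplot(dat_demo, aes(x = log10_Avg_Mercury, y = Alkalinity))
p1 <- p1 + geom_point()

# Way 2: plot the original variable and let ggplot rescale the axis.
p2 <- ggplot(dat_demo, aes(x = Avg_Mercury, y = Alkalinity))
p2 <- p2 + geom_point()
p2 <- p2 + scale_x_log10()

print(p1)
print(p2)
```

Both plots show the same pattern of points. The difference is in the axis labels: Way 1 labels the axis in log units, while Way 2 labels it in the original units. Adding `scale_x_log10()` to `p1` would apply the log a second time, which is the mistake the note above warns against.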

# With ggplot() consider using these "scale_?_log10()" commands
#   to plot the original variables with scaled axes.
#   Compare to plotting the transformed variables directly.
p <- p + scale_x_log10()
p <- p + scale_y_log10()