---
title: "ADA1: Class 09, Linear Regression"
author: "Your Name Here"
date: "`r format(Sys.time(), '%B %d, %Y')`"
output:
  html_document:
    toc: true
---
# Rubric
Answer the questions in this document, compile to html, print to pdf, and submit to UNM Learn.
Do not add this to your "ALL" `.Rmd` document.
## Part 1, simple linear regression intuition-building exercise
(20-25 min)
Use this online app to play with the next four challenges.
* http://www.shodor.org/interactivate/activities/Regression/
Complete each of the steps and have a tablemate check your work for each.
Give each other a thumbs up when you've got it!
1. __Choose 3 points where the line of best fit is $y = 0 + 1 x$.__
    1. Click "Add Points" and click 3 times in the plot area.
    2. Click "Display line of best fit" to show the red "best fit" line.
    3. Click "Move Points" then move the points in the plot area.
    4. Try to get within $\pm 0.05$ of the target intercept of 0 and slope of 1. Note that the app displays the equation's slope before the intercept, as in $y = 1 x + 0$.
    * When done, deselect "Display line of best fit" and click "Reset".
2. __Fit your own line to 7 points.__
    1. Click "Add Points" and click 7 times in the plot area.
    2. Click "Fit your own line" to show the green "your own" line.
    3. Click "Move Your Fit Line" then use the green circles on the line to move it.
    4. Recall that the best line passes through the mean (center) of the data and minimizes the sum of squared errors (in the $y$ direction).
    5. Click "Display line of best fit" to see how close your line was to the red "best fit" line.
    6. Repeat a couple of times.
    * Another good app for comparing your own line to the best fit: https://www.geogebra.org/m/xC6zq7Zv
    * When done, deselect "Fit your own line" and "Display line of best fit" and click "Reset".
3. __Illustrate the concept of leverage.__
    * "Leverage" is a measure of how much a point is an outlier (extreme) in
      the $x$ direction. It's called leverage because points with high
      leverage potentially have a lot of influence on the regression line
      slope, pulling it up or down like a lever.
    1. Click "Display line of best fit" to show the red "best fit" line.
    2. Place 9 points in a "cluster" on one side of the plot.
    3. Place 1 "solo" point by itself on the other side of the plot.
    4. Move the "solo" point up and down and notice how the regression line responds.
    5. Move one of the "cluster" points up and down and notice how the regression line responds.
    6. Discuss this behavior with a tablemate.
    * When done, deselect "Display line of best fit" and click "Reset".
4. __Relationship between correlation and slope.__
    1. Click "Add Points" and click 7 times in the plot area.
    2. Click "Display line of best fit" to show the red "best fit" line.
    3. Click "Move Points" then move the points in the plot area.
    4. Try to make $r < 0$ while the best fit line has a positive slope.
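The ideas in these four challenges can also be checked numerically in R. The sketch below is a minimal illustration on made-up points (the data frame `toy` and all of its values are hypothetical, not taken from the app or the worksheet data): nine points in a cluster plus one solo point far away in $x$. It uses `hatvalues()` for leverage, and checks that the fitted line passes through $(\bar{x}, \bar{y})$ and that the slope equals $r \, s_y / s_x$.

```{R}
# Made-up data (hypothetical): 9 clustered points and 1 "solo" point far away in x
toy <- data.frame(
  x = c(1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 9.0),
  y = c(2.1, 1.8, 2.5, 2.2, 2.9, 2.4, 3.0, 2.7, 3.2, 8.0)
)
lm.toy <- lm(y ~ x, data = toy)

# leverage of each point; the solo point (row 10) is the most extreme in x
lev <- hatvalues(lm.toy)
round(lev, 2)

# the least-squares line passes through the point of means (x-bar, y-bar)
y.at.xbar <- predict(lm.toy, newdata = data.frame(x = mean(toy$x)))
all.equal(unname(y.at.xbar), mean(toy$y))

# the slope equals r * sd(y) / sd(x), so it always has the same sign as r
r     <- cor(toy$x, toy$y)
slope <- unname(coef(lm.toy)["x"])
all.equal(slope, r * sd(toy$y) / sd(toy$x))

# move the solo point down and refit to see how much the slope changes
toy.moved <- toy
toy.moved$y[10] <- 0
slope.moved <- unname(coef(lm(y ~ x, data = toy.moved))["x"])
c(original = slope, solo.moved = slope.moved)
```

Changing `toy.moved$y[10]` to other values, or moving one of the cluster points instead, mimics dragging points in the app.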
## Part 2, interpreting analysis
### Five questions to answer
Refer to the __data and output__ below these questions.
Answer the questions in this document, compile to html, print to pdf, and submit to UNM Learn.
1. (2 p) Write the regression equation.
2. (2 p) Interpret the slope.
3. (2 p) Interpret $R^2$.
4. (2 p) Complete this table of predictions.
    Replace the question marks with values. You can use R as a calculator.

    `agewks` | `shearpsi`
    ---------|-----------
    5        | ?
    20       | ?
    40       | ?

5. (2 p) Predictions: How comfortable do you feel ("good" or "bad") about predictions for each of these values, and why?
    * `agewks` = 5:
    * `agewks` = 20:
    * `agewks` = 40:
### Data and output
A rocket motor is manufactured by bonding an igniter propellant and a sustainer
propellant together inside a metal housing. The shear strength of the bond
between the two types of propellant is an important quality characteristic. It
is suspected that shear strength is related to the age in weeks of the batch of
sustainer propellant. Twenty observations on these two characteristics are
given below. The first column is shear strength in psi (`shearpsi`), and the second is the age of
the propellant in weeks (`agewks`).
```{R}
# Save this Rmd and the ADA1_WS_09_Data-RocketPropellant.dat data file to the same folder on your computer
# this file uses spaces as delimiters, so use read.table()
rocket <- read.table("ADA1_WS_09_Data-RocketPropellant.dat", header = TRUE)
str(rocket)
head(rocket)
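# scatterplot of shear strength versus age with the least-squares line
library(ggplot2)
p <- ggplot(rocket, aes(x = agewks, y = shearpsi))
p <- p + geom_point()
# fullrange = TRUE extends the fitted line across the whole x-axis range
p <- p + geom_smooth(method = lm, se = FALSE, fullrange = TRUE)
# start the x-axis at 0 so the line can be followed back to the intercept
p <- p + xlim(0, NA)
print(p)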
# fit the simple linear regression model
lm.shearpsi.agewks <- lm(shearpsi ~ agewks, data = rocket)
# use summary() to get parameter estimates (slope, intercept) and other summaries
summary(lm.shearpsi.agewks)
```
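To fill in a table of predictions from a fitted `lm` object, `predict()` with a `newdata` data frame is one option. The sketch below is self-contained and uses made-up data (the names `demo`, `lm.demo`, and `new.x` are hypothetical), so it shows the mechanics only and does not use the rocket data. It also compares the new $x$ values with the observed range of $x$, since predictions outside that range are extrapolations.

```{R}
# Made-up data (hypothetical), only to show the mechanics of predict()
demo <- data.frame(
  x = c(2, 4, 6, 8, 10, 12),
  y = c(20.5, 18.9, 17.2, 14.8, 13.1, 11.4)
)
lm.demo <- lm(y ~ x, data = demo)

# new x values to predict at; the column name must match the predictor in the model
new.x <- data.frame(x = c(1, 7, 20))

# predicted values from the fitted line: intercept + slope * x
pred <- predict(lm.demo, newdata = new.x)

# the same predictions computed "by hand" from the coefficients
b <- coef(lm.demo)
pred.hand <- b["(Intercept)"] + b["x"] * new.x$x

# flag which new x values fall inside the observed range of x
in.range <- new.x$x >= min(demo$x) & new.x$x <= max(demo$x)

data.frame(new.x, pred = pred, pred.hand = pred.hand, in.range = in.range)
```

For the worksheet's model, the analogous call would be `predict(lm.shearpsi.agewks, newdata = data.frame(agewks = c(5, 20, 40)))`, and `range(rocket$agewks)` shows the observed ages to compare against.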