Answer the questions in this document, compile to html, print to pdf, and submit to UNM Learn. Do not add this to your “ALL” `.Rmd` document.

(20-25 min)

Use this online app to play with the next four challenges.

Complete each of the steps and have a tablemate check your work for each. Give each other a thumbs up when you’ve got it!

**Choose 3 points where the line of best fit is \(y = 0 + 1 x\).**

- Click “Add Points” and click 3 times in the plot area.
- Click “Display line of best fit” to show the red “best fit” line.
- Click “Move Points” then move the points in the plot area.
- Try to get within \(\pm 0.05\) of the target intercept of 0 and slope of 1. Note that the app displays the equation’s slope before the intercept, as in \(y = 1 x + 0\).

- When done, deselect “Display line of best fit” and click “Reset”.

**Fit your own line to 7 points.**

- Click “Add Points” and click 7 times in the plot area.
- Click “Fit your own line” to show the green “your own” line.
- Click “Move Your Fit Line” then use the green circles on the line to move it.
- Recall that the best line passes through the mean (center) of the data and minimizes the sum of squared error (in the \(y\) direction).
- Click “Display line of best fit” to see how close your line was to the red “best fit” line.
- Repeat a couple times.
- Another good app for comparing your own line to the best fit: https://www.geogebra.org/m/xC6zq7Zv

- When done, deselect “Fit your own line” and “Display line of best fit” and click “Reset”.
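
The two properties recalled in the steps above (the best line passes through the mean of the data and minimizes the sum of squared error in the \(y\) direction) can be checked numerically. The following is a minimal sketch, assuming seven made-up points; the data and the helper `sse()` are hypothetical and are not part of the app.

```r
# Seven made-up points (hypothetical data, not from the app)
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- c(2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3)

# Least-squares ("best fit") line
fit <- lm(y ~ x)
b0 <- unname(coef(fit)[1])
b1 <- unname(coef(fit)[2])

# The fitted line evaluated at mean(x) equals mean(y)
at_mean <- b0 + b1 * mean(x)
print(c(at_mean = at_mean, mean_y = mean(y)))

# Sum of squared vertical errors for any candidate line (hypothetical helper)
sse <- function(intercept, slope) sum((y - (intercept + slope * x))^2)

sse_best <- sse(b0, b1)
sse_own  <- sse(b0 + 0.3, b1 - 0.05)  # a hand-placed "own" line, slightly off
print(c(best = sse_best, own = sse_own))
```

Any line other than the least-squares line gives a larger sum of squared errors, which is what the comparison between the green and red lines in the app shows visually.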

**Illustrate the concept of leverage.**

- “Leverage” is a measure of how much a point is an outlier (extreme) in the \(x\) direction. It’s called leverage because points with high leverage potentially have a lot of influence on the regression line slope, pulling it up or down like a lever.

- Click “Display line of best fit” to show the red “best fit” line.
- Place 9 points in a “cluster” on one side of the plot.
- Place 1 “solo” point by itself on the other side of the plot.
- Move the “solo” point up and down and notice how the regression line responds.
- Move one of the “cluster” points up and down and notice how the regression line responds.
- Discuss this behavior with a tablemate.

- When done, deselect “Display line of best fit” and click “Reset”.
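
The cluster-and-solo experiment above can also be reproduced in R. This is a rough sketch, assuming ten made-up points (nine clustered, one far away in \(x\)); the data and the helper `slope_after_shift()` are hypothetical. It uses `hatvalues()` to report each point's leverage and then compares how far the slope moves when the solo point or a cluster point is shifted up by the same amount.

```r
# Hypothetical data: 9 clustered points plus 1 solo point far away in x
x <- c(1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 9.0)
y <- c(2.0, 2.3, 1.9, 2.4, 2.1, 2.6, 2.2, 2.5, 2.3, 2.4)

fit <- lm(y ~ x)
h <- hatvalues(fit)       # leverage of each point
print(round(h, 3))

# Shift point i upward and return the refitted slope (hypothetical helper)
slope_after_shift <- function(i, shift = 3) {
  y_new <- y
  y_new[i] <- y_new[i] + shift
  unname(coef(lm(y_new ~ x))[2])
}

slope_orig    <- unname(coef(fit)[2])
slope_solo    <- slope_after_shift(10)  # move the solo point
slope_cluster <- slope_after_shift(5)   # move one cluster point
print(c(original = slope_orig, solo_moved = slope_solo, cluster_moved = slope_cluster))
```

With these points, the solo point has the largest leverage, and moving it changes the slope far more than moving a cluster point by the same amount.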

**Relationship between correlation and slope.**

- Click “Add Points” and click 7 times in the plot area.
- Click “Display line of best fit” to show the red “best fit” line.
- Click “Move Points” then move the points in the plot area.
- Make \(r < 0\) and a best fit line with positive slope.
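
The link between the two quantities in this challenge can be explored numerically as well. The sketch below, assuming seven made-up points, compares the least-squares slope with the correlation \(r\) rescaled by the ratio of standard deviations, \(r \, s_y / s_x\). The data are hypothetical.

```r
# Seven hypothetical points
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- c(6.5, 5.9, 5.2, 5.5, 4.1, 3.8, 3.0)

r  <- cor(x, y)
b1 <- unname(coef(lm(y ~ x))[2])

# Correlation rescaled by the ratio of standard deviations
b1_from_r <- r * sd(y) / sd(x)
print(c(r = r, slope = b1, slope_from_r = b1_from_r))
```

Try other values of `y` and consider what the match between `slope` and `slope_from_r` implies for the challenge, given that standard deviations are never negative.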

Refer to the **data and output** below these questions.

Answer the questions in this document, compile to html, print to pdf, and submit to UNM Learn.

(2 p) Write the regression equation.

(2 p) Interpret the slope.

(2 p) Interpret \(R^2\).

(2 p) Complete this table of predictions.

Replace the question marks with values. You can use R as a calculator.

`agewks` | `shearpsi` |
---|---|
5 | ? |
20 | ? |
40 | ? |
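
As an illustration of using R as a calculator, here is a minimal sketch that plugs an age into the fitted line. The coefficients are copied from the `summary()` output in this worksheet; the age of 10 weeks and the names `b0`, `b1`, `age_example`, and `pred_example` are chosen only for illustration, and 10 is deliberately not one of the table values.

```r
# Coefficients copied from the summary() output in this worksheet
b0 <- 2627.822   # intercept (psi)
b1 <- -37.154    # slope (psi per week)

# Hypothetical age, deliberately not one of the table values
age_example <- 10
pred_example <- b0 + b1 * age_example
pred_example

# With the fitted model object available, predict() gives the same kind of value:
# predict(lm.shearpsi.agewks, newdata = data.frame(agewks = age_example))
```

The hand calculation uses rounded coefficients, so it may differ slightly in the last digits from what `predict()` returns.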

(2 p) Predictions: How comfortable do you feel (“good” or “bad”) about predictions for each of these values, and why?

- `agewks` = 5:
- `agewks` = 20:
- `agewks` = 40:

A rocket motor is manufactured by bonding an igniter propellant and a sustainer propellant together inside a metal housing. The shear strength of the bond between the two types of propellant is an important quality characteristic. It is suspected that shear strength is related to the age in weeks of the batch of sustainer propellant. Twenty observations on these two characteristics are given below. The first column is shear strength in psi (`shearpsi`), the second is age of propellant in weeks (`agewks`).

```
## Save the .Rmd and the ADA1_WS_09_Data-RocketPropellant.dat data file to your computer
# this file uses spaces as delimiters, so use read.table()
rocket <- read.table("ADA1_WS_09_Data-RocketPropellant.dat", header = TRUE)
str(rocket)
```

```
## 'data.frame': 20 obs. of 2 variables:
## $ shearpsi: num 2159 1678 2316 2061 2208 ...
## $ agewks : num 15.5 23.8 8 17 5.5 ...
```

`head(rocket)`

```
## shearpsi agewks
## 1 2158.70 15.50
## 2 1678.15 23.75
## 3 2316.00 8.00
## 4 2061.30 17.00
## 5 2207.50 5.50
## 6 1708.30 19.00
```

```
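library(ggplot2)
# scatterplot of shear strength against propellant age
p <- ggplot(rocket, aes(x = agewks, y = shearpsi))
p <- p + geom_point()
# overlay the least-squares line, extended across the full x-axis range
p <- p + geom_smooth(method = "lm", se = FALSE, fullrange = TRUE)
# start the x-axis at 0 so the intercept is visible
p <- p + xlim(0, NA)
print(p)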

```
# fit the simple linear regression model
lm.shearpsi.agewks <- lm(shearpsi ~ agewks, data = rocket)
# use summary() to get parameter estimates (slope, intercept) and other summaries
summary(lm.shearpsi.agewks)
```

```
##
## Call:
## lm(formula = shearpsi ~ agewks, data = rocket)
##
## Residuals:
## Min 1Q Median 3Q Max
## -215.98 -50.68 28.74 66.61 106.76
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2627.822 44.184 59.48 < 2e-16 ***
## agewks -37.154 2.889 -12.86 1.64e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 96.11 on 18 degrees of freedom
## Multiple R-squared: 0.9018, Adjusted R-squared: 0.8964
## F-statistic: 165.4 on 1 and 18 DF, p-value: 1.643e-10
```
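
Several numbers in the output above are related to one another, and checking those relationships can help when reading the table. The sketch below works only from the printed (rounded) values, so the results should agree with the output up to rounding; the variable names are chosen for illustration.

```r
# Values copied from the summary() output above
est <- -37.154   # slope estimate
se  <- 2.889     # standard error of the slope
r2  <- 0.9018    # Multiple R-squared
n   <- 20        # observations
p   <- 1         # predictors

t_value <- est / se                                # compare with -12.86
f_value <- t_value^2                               # compare with 165.4
adj_r2  <- 1 - (1 - r2) * (n - 1) / (n - p - 1)    # compare with 0.8964
r       <- sign(est) * sqrt(r2)                    # correlation takes the slope's sign

print(round(c(t = t_value, F = f_value, adj_r2 = adj_r2, r = r), 4))
```

In simple linear regression with one predictor, the F-statistic is the square of the slope's t value, which is why both report the same p-value.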