Archived from Spring 2017 (Current year here.)

The Spring 2017 syllabus is below the tables. Spring 2017 schedule: Time: TR 1530-1645; Location: CTLB 300; Stat 428, CRN 33933; Stat 528, CRN 33935. Peer mentors via UNM Stat 495/595: Statistics Education Practicum (SEP), Stat 495.001 or Stat 595.001, CRN 30543 or 41683.

## Goal

Learn to produce beautiful (markdown) and reproducible (knitr) reports with informative plots (ggplot2) and tables (xtable) by writing code (R, RStudio) to answer questions using fundamental statistical methods (analysis of covariance, logistic regression, and multivariate methods), which you’ll be proud to present (poster).

## News

3/1/17 – Data resources for poster: kaggle, drivendata, 538, agridat package, wise data sources, statsci datasets, vanderbilt datasets.

2/16/17 – How to get pairwise comparison plots from lsmeans(): lsmeans returns an object with a complicated name (in the second line below, start typing `lsm$` and press TAB for the suggested object), for example: `lsm <- lsmeans( lm.ml.full, list( pairwise ~ species | sex ), adjust = "tukey" )` ``plot( lsm$`pairwise differences of contrast, sex | sex` )``

2/3/17 – scatter3d() plot (uses library(rgl)) for Mac users (Ch 02, class 03): Install XQuartz (X11), reboot, log out and log back in, then `install.packages("rgl")`.

1/22/17 – RStudio, disabling notebook “inline” results, prepared by TA Geoff Schultz.

Classroom computers: Please reboot classroom laptops at the end of class by request of the IT staff.

Saving data: If you’re using classroom computers, use flash drives or UNM’s OneDrive (available in LoboMail) for saving files. I recommend using a very systematic folder structure, such as a main folder called Stat428_ADA2, with subfolders called homework, in-class, reading, poster, etc.
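
The suggested folder structure can also be created from R. The following is only a minimal sketch: the helper name `make_course_folders` and its arguments are hypothetical (not part of the course materials), and the subfolder names are simply the ones listed above.

```r
# Hypothetical helper (an illustration, not course-provided code):
# create the suggested course folder layout under `root`.
make_course_folders <- function(root = ".",
                                main = "Stat428_ADA2",
                                subfolders = c("homework", "in-class",
                                               "reading", "poster")) {
  # file.path() is vectorized, so this yields one path per subfolder.
  paths <- file.path(root, main, subfolders)
  for (p in paths) {
    # recursive = TRUE also creates the main folder if needed;
    # showWarnings = FALSE keeps the call quiet if a folder already exists.
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Example: build the layout inside a temporary directory.
created <- make_course_folders(root = tempdir())
print(basename(created))
```

To use it on a flash drive or OneDrive folder, pass that location as `root` instead of `tempdir()`.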

# Course content

## Weekly structure (also see Assessment below)

1. Pre-class (Tuesday): Reading, Video, Quiz (due before class — solutions become available Tue 3:30, after the quiz is due)
2. In-class: Tuesday’s in-class activities are submitted to UNM Learn, with the completed assignment due Wed 5pm (evaluated by TA within 1 week). On Thursday we will start the homework in class to allow you to struggle but get questions answered before finishing on your own.
3. Post-class (Thursday): Homework is submitted to UNM Learn the following Thursday (evaluated by TA within 1 week). Assignments will be common for all students.
UNM Learn for content, YouTube Video playlist (try 1.5 speed, then pause/rewatch as you need). Video: Upgrading R on Windows.

## Course notes, code, data, and video lectures

| Ch | Chapter Title | Notes | R code | Datasets | Video lectures playlist |
|---|---|---|---|---|---|
| 01 | R statistical software and review | pdf | R | turkey.csv, rocket.dat | 01-1, 01-2 |
| 02 | Introduction to Multiple Linear Regression | pdf | R | indian.dat, gce.dat | 02-1, 02-2 |
| 03 | A Taste of Model Selection for Multiple Regression | pdf | R | ratliver.csv | 03-1, 03-2 |
| 04 | One Factor Designs and Extensions | pdf | R | none | 04 |
| 05 | Paired Experiments and Randomized Block Experiments | pdf | R | battery.dat, beetles.dat, itch.csv, ratinsulin.dat | 05-0, 05-1, 05-2, 05-3, 05-4, 05-5, 05-6, 05-7, 05-8, 05-9 |
| 06 | A Short Discussion of Observational Studies | pdf | R | sat.dat | 06 |
| 07 | Analysis of Covariance: Comparing Regression Lines | pdf | R | tools.dat, toolsfake.dat, twins.dat | 07-1, 07-2, 07-3, HW helper video |
| 08 | Polynomial Regression | pdf | R | cloudpoint.dat, mooney.dat | 08-1, 08-2 |
| 09 | Discussion of Response Models with Factors and Predictors | pdf | R | faculty.dat | 09-1, 09-2, 09-3 |
| 10 | Automated Model Selection for Multiple Regression | pdf | R | oxygen.dat | 10-1, 10-2, 10-3 |
| 11 | Logistic Regression | pdf | R | beetles.dat, leuk.dat, menarche.csv, shuttle.csv, trauma.dat | 11-1, 11-2, 11-3, 11-4 |
| 12 | An Introduction to Multivariate Methods | pdf | R | none | 12 |
| 13 | Principal Component Analysis | pdf | R | bgs.dat, shells.dat, sparrows.dat, temperature.dat | 13-1, 13-2, 13-3 |
| 14 | Cluster Analysis | pdf | R | birthdeath.dat, teeth.dat | 14-1, 14-2, 14-3 |
| 15 | Multivariate Analysis of Variance | pdf | R | shells_mf.dat | 15 |
| 16 | Discriminant Analysis | pdf | R | mower.dat | 16-1, 16-2 |
| 17 | Classification | pdf | R | business.dat | 17-1, 17-2, 17-3 |
| 18 | Data Cleaning | pdf | R | conversions.txt, dalton.txt, dirty_iris.csv, edits.txt, people.txt, unnamed.txt | |

lm_diag_plots.R: function for a large set of standard diagnostic plots.

(I reserve the right to continue to improve the materials throughout the semester.)

## Timetable

| Wk-Date | Cl | Topic | Reading, Video, Quiz | In-class Worksheet, Data | Homework | Due before class |
|---|---|---|---|---|---|---|
| 00-01/16 | 00 | Install software | See Step 0; video: 00 | | | |
| 01-01/17 | 01 | 01 R, Review | read: Ch 01; video: 01-1, 01-2 | | | Note: numbers refer to week numbers |
| 01-01/19 | 02 | In-class quiz | | In-class: 02 R Review Rmd html dat; Videos: 1, 2, 3 | No HW 01 | |
| 02-01/24 | 03 | 02 Introduction to Multiple Linear Regression | read: Ch 02; video: 02-1, 02-2; quiz: 02 | In-class: Rmd html dat. Submit pdf with solutions by Wed 5pm. | | Quiz 02 |
| 02-01/26 | 04 | | | | HW: 02 Mult LR Rmd html dat. Submit your pdf to UNM Learn. 2/02 Submit | |
| 03-01/31 | 05 | 03 A Taste of Model Selection for Multiple Linear Regression | read: Ch 03, 04; video: 03-1, 03-2, 04; quiz: 03 (2 parts) | In-class: Rmd html dat | | Quiz 03 |
| 03-02/02 | 06 | 04 Experimental Design: One and Two Factor Designs | | | HW: 03 Taste Model Sel Rmd html dat. 2/09 Submit | Turn in HW 02 |
| 04-02/07 | 07 | 05 Paired Experiments and Randomized Block Designs | read: Ch 05 (start – 5.2); video: 05-0, 05-1, 05-2, 05-3, 05-4, 05-5; quiz: 04 | In-class: Rmd html | | Quiz 04 |
| 04-02/09 | 08 | | | | HW: 04 Experiments 1 Rmd html. 2/16 Submit | Turn in HW 03 |
| 05-02/14 | 09 | | read: Ch 05 (5.3 – end); video: 05-6, 05-7, 05-8, 05-9; quiz: 05 | In-class: Rmd html dat | | Quiz 05 |
| 05-02/16 | 10 | | | | HW: 05 Experiments 2 Rmd html dat. 2/23 Submit | Turn in HW 04 |
| 06-02/21 | 11 | 06 Discussion of Observational Studies | read: Ch 06-07; video: 06, 07-1, 07-2, 07-3; quiz: 06 (2 parts) | In-class: html, turn in paper version | | Quiz 06 |
| 06-02/23 | 12 | 07 Analysis of Covariance: Comparing Regression Lines | | | HW: 06 ANCOVA 1 Rmd html dat. 3/02 Submit. Discuss Wald test matrix specification. | Turn in HW 05 |
| 07-02/28 | 13 | 08 Polynomial Regression | read: Ch 08-1 08-2 09-1 09-2 09-3 video: quiz: 07 (2 parts) | In-class: Rmd html dat | | Quiz 07 |
| 07-03/02 | 14 | 09 Response Models with Factors and Predictors | | | HW: 07 ANCOVA 2 Rmd html dat, Helper video. 3/09 Submit | Turn in HW 06 |
| 08-03/07 | 15 | 10 Model Selection for Multiple Regression | read: Ch 10; video: 10-1, 10-2, 10-3; quiz: 08 | HW 07 Continued in class | | Quiz 08 |
| 08-03/09 | 16 | | | HW 07 Continued in class, due by 5pm | | Turn in HW 07 |
| 09-03/14 | 17 | Spring Break | | | | |
| 09-03/16 | 18 | Spring Break | | | | |
| 10-03/21 | 19 | 11 Logistic Regression | read: Ch 11; video: 11-1, 11-2, 11-3, 11-4; quiz: 10 | In-class: Rmd html dat | Poster: Poster Planning Rmd html, Due 3/28. Choose/define poster project requiring a method from class: ANCOVA, Logistic multiple regression, PCA, etc. | Quiz 10 |
| 10-03/23 | 20 | | | | HW: 10 Logistic Regression Rmd html dat. 3/30 Submit | |
| 11-03/28 | 21 | 12 An Introduction to Multivariate Methods | read: Ch 12-13; video: 12, 13-1, 13-2, 13-3; quiz: 11 (2 parts) | In-class: Rmd html dat | | Quiz 11 |
| 11-03/30 | 22 | 13 Principal Components Analysis (PCA) | | | HW: 11 PCA Rmd html dat. 4/06 Submit | Turn in HW 10 |
| 12-04/04 | 23 | 14 Cluster Analysis | read: Ch 14-15; video: 14-1, 14-2, 14-3, 15; quiz: 12 (2 parts) | In-class: Clustering Rmd html dat | | Quiz 12 |
| 12-04/06 | 24 | 15 Multivariate Analysis of Variance (MANOVA) | | | HW: 12 MANOVA Rmd html dat. 4/13 Submit | Turn in HW 11 |
| 13-04/11 | 25 | 16 Discriminant Analysis, 17 Classification | read: Ch 16-17; video: 16-1, 16-2, 17-1, 17-2, 17-3; quiz: 13 (2 parts) | In-class: Discriminant analysis for classification Rmd html dat | | Quiz 13, Grade HW 11 |
| 13-04/13 | 26 | 13+11+17 PCA and logistic regression classification | | | HW: 13+11+17 PCA and logistic Classification Rmd html dat. 4/20 Submit | Turn in HW 12 |
| 14-04/18 | 27 | Posters begin | | | HW: Poster document 1 of 2: Analysis, Due Friday Rmd html | |
| 14-04/20 | 28 | | | | 4/21 Submit | Turn in HW 13, Turn in Poster Doc 1/2 Fri 4/21 |
| 15-04/25 | 29 | | | | HW: Poster document 2 of 2: Intro/Discuss/Bib, Due Friday Rmd html | |
| 15-04/27 | 30 | | | | 4/28 Submit | Turn in Poster Doc 2/2 Fri 4/28 |
| 16-05/02 | 31 | Survey | | Poster finalize. Poster template: pdf, Rnw, sty, bib, logo. Example poster: pdf, Rnw. Transition from Markdown to LaTeX: Video for poster transition. Poster printing: ARI Graphix \$9+tax poster printing, open Mon-Fri 7:30-5:30, 4716 McLeod Rd NE. Do not use their website! Email: plotting@abqrepro.com, Subject: ADA2 class poster, Text: indicate to print “in color on bond paper”. Attach: Poster pdf with your name in the filename, such as “FirstLast_ADA2_poster.pdf”. Try to send by Tuesday 5 PM for the poster to be ready by Thursday (earlier is better). Arrange to pick up the poster. Price is \$0.75/sq ft for Spring 2017. | | |
| 16-05/04 | 32 | POSTERS: Poster session in SMLC lobby 3:30-5:30pm | | | Poster: Submit poster pdf to UNM Learn Fri 5/9 5pm | Poster reviewing rubric |
| 17-05/09 | | FINALS WEEK (no final) | | | Surveys Due: submit receipt or confirmation page to UNM Learn (Learning Studio, EvalKit in Learn) | Surveys Due 5/11 5pm |

# Syllabus

Description: A continuation of 427/527 that focuses on methods for analyzing multivariate data and categorical data. Topics include MANOVA, principal components, discriminant analysis, classification, factor analysis, analysis of contingency tables including log-linear models for multidimensional tables and logistic regression.

Prerequisite: Stat 427 (ADA1)

Semesters offered: Spring

Lecture: Stat 428/528.001 (CRN 33933 or 33935), TR 1530-1645, CTLB 300, Video

Email: “Erik B. Erhardt” <erike@stat.unm.edu>, please include “ADA2” in the subject line

Textbook: Peter Dalgaard, “Introductory Statistics with R”, Second Edition, 2008, ISBN: 978-0-387-79053-4. The book is not required, but it will provide a backup for what you learn in class.

Office hours: SMLC 312, TR 1400-1500

Laptops running R: I encourage you to bring a laptop to class each day so you can try the R programming exercises in class. If you don’t have one, no problem: there are some laptops in class and teamwork is encouraged, so sit next to someone friendly who likes to share.

## Teaching Assistants and Peer Mentors

• Lindsey Pittington <lpittin@unm.edu>, SMLC 301, office hours Mon 12-2 PM, Wed 3-5 PM
• Yiming Yang <yiming@unm.edu>, SMLC 319, office hours Tue 10 AM – 12 PM, Thu 10 AM – 12 PM
• Geoffrey Dylan Schultz <gdschultz@unm.edu>, SMLC 345, office hours Wed 1-3 PM, Fri 1-3 PM
• Erik’s office hours are in SMLC 312, Tue 2-3 PM and Thu 2-3 PM

So many office hours! Mon 12-2, Tue 10-12, 2-3, Wed 1-5, Thu 10-12, 2-3, & Fri 1-3.

## Student learning outcomes

Similar to those in ADA1, but at a higher level.

## Assessment

• Quizzes will be due each Tuesday before class. Purpose: to assess reading and video comprehension and ensure you’re prepared to actively participate in class activities with minimal lecture. (About 12, 20% of final grade, the lowest few are dropped.) Most weeks plan for 1-3 hours reading and video, 30-60 minute quiz.
• In-class assignments are due the following day by 5pm, submitted to UNM Learn. Purpose: to struggle and find success in class with the concepts and skills. (About 12, includes class participation, 20% of final grade, the lowest few are dropped.) Plan to start and finish in class, sometimes 1-2 hours beyond class.
• Homework (HW) assignments are assigned each Thursday and due the following Thursday, submitted to UNM Learn. Purpose: to apply concepts and skills to your class poster project. (About 12, 40% of final grade, the lowest few are dropped.) Most weeks plan on 2-12 hours per assignment.
• Poster will be developed and completed in the last weeks of the semester, and the last week we’ll have poster presentations. Purpose: to have an overarching set of questions to answer using methods learned in the course, with a deliverable you can be proud of! (1 poster and presentation, 16% of final grade in total: 2% preparation, 10% poster, 2% presentation, and 2% evaluations of others.) In the last couple weeks, assembling this poster may take 3-5 hours, using a template provided to you.
• Course surveys are to collect information to help facilitate the class or to encourage participation in course evaluations. Purpose: to participate in national project-based learning projects and improve the course. (About 2, 4% of final grade [and a simple way to go from B+ to A].)
Final grade may include a small buffer at the discretion of the instructor. For example, the final grade could be the total points earned divided by the product of the total possible points and a factor of 0.95 for graduate students or 0.90 for undergraduate students. That is, for a graduate student, [Final Grade] = [Points Earned] / ([Points Possible] * 0.95), so that your grade is slightly higher than you earned.

Student attendance: If a student has more than 3 absences, I reserve the right to assign to that student a WF and drop mid-semester or assign an F at the end of the semester without warning. Students in this situation need to speak with Erik immediately.

Late assignments will not be accepted.

Rubrics guide assessment (and self-assessment) of homework, code, projects, exams, and presentations. Each assignment will have its own specific rubric.

Homework formatting example. All R code for the assignment should be included with the part of the problem it addresses (for code and output use a fixed-width font, such as Courier). Do NOT use your R code and output as your answer to the problem, but include them to show me how you arrived at your answer. Your prose solution (in a non-fixed-width font) should be provided in addition to R output.
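
To make the example final-grade buffer described above concrete, here is a minimal sketch in R. The function name `buffered_grade` and the point totals (850 of 1000) are made up for illustration; the factors 0.95 and 0.90 are the example values from the text, and any buffer actually applied is at the instructor's discretion.

```r
# Hypothetical helper (not part of the course materials): apply the
# example buffer to a points total and return a percentage.
buffered_grade <- function(points_earned, points_possible, buffer = 0.95) {
  # Dividing by a reduced denominator raises the percentage slightly.
  100 * points_earned / (points_possible * buffer)
}

# Made-up example: 850 points earned out of 1000 possible.
grad  <- buffered_grade(850, 1000, buffer = 0.95)  # graduate example factor
ugrad <- buffered_grade(850, 1000, buffer = 0.90)  # undergraduate example factor

print(round(c(raw = 85, graduate = grad, undergraduate = ugrad), 1))
```

With these made-up numbers, a raw 85% becomes roughly 89.5% under the 0.95 factor and roughly 94.4% under the 0.90 factor.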

### Collaboration and citation

For homeworks I encourage you to work together. Please discuss the data, code, and problems with one another, but do your own exploration and write up. We expect everyone to hand in substantially different homeworks, and we will enforce this under the honor code. The small benefit you might get from plagiarism is not worth the severe penalty (of lost trust, being reported to the dean, no points for the assignment, etc.). As in life, please use any resources available to you. Projects and some homeworks will explicitly encourage you to use resources on the internet, but showing extra initiative will always be appreciated. You may find R programming tough at first, so feel free to discuss your problems with other classmates or meet with or email questions to the TAs or me. I encourage you to use the ideas of others, but make them your own, giving credit. For projects have a formal bibliography, for homework cite casually, and for code simply copy the URL in as a comment (which is doubly helpful for finding the resource again).

## Statements

### Disability statement

If you have a documented disability that will impact your work in this class, please contact me to discuss your needs. You’ll also need to register with the Accessibility Resource Center in 2021 Mesa Vista Hall (building 56) across the courtyard east from the SUB.

### Title IX statement

In an effort to meet obligations under Title IX, UNM faculty, Teaching Assistants, and Graduate Assistants are considered “responsible employees” by the Department of Education (see pg 15). This designation requires that any report of gender discrimination, which includes sexual harassment, sexual misconduct, and sexual violence, made to a faculty member, TA, or GA must be reported to the Title IX Coordinator at the Office of Equal Opportunity. For more information, see the campus policy regarding sexual misconduct.

# Our Classroom

We’re doing this because:
• We want you to be empowered with statistics.
• We believe everyone should get out of this course with awesome skills.
• Real-time feedback promotes efficient learning.
“It encourages me to engage actively with the course material and take responsibility for my learning.”

## GAISE Connections

Our six recommendations include the following:
1. Emphasize statistical literacy and develop statistical thinking
2. Use real data
3. Stress conceptual understanding, rather than mere knowledge of procedures
4. Foster active learning in the classroom
5. Use technology for developing conceptual understanding and analyzing data
6. Use assessments to improve and evaluate student learning

Learning without thought is labor lost. What I hear, I forget. What I see, I remember. What I do, I understand. – Confucius

# Archive

Did you receive a registration error? Send me an email with the following answers:
1. What registration error did you get (copy/paste is best)?
2. What is your UNM ID?
3. What is your Math/Stat background (that is, do you have the pre-reqs)?

If you are waitlisted and qualified and we have enough seats, I will override you into the course. Don’t worry.

Step 0: Before our first class (Tue 1/17) please read through the following and install the required software on your computer. If you don’t have a computer, there are classroom computers, with limited availability when the room is open.
1. Install or upgrade R (Windows or Mac), then RStudio.
2. Install R packages, also update all packages within RStudio.
3. Install Mendeley.
4. Install LaTeX (for poster at end of semester).
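
For the package step in the list above, a minimal sketch in R follows. The package vector is an assumption built only from names mentioned elsewhere on this page (knitr, ggplot2, xtable, lsmeans, rgl); the official course list may differ, and the helper name `missing_packages` is hypothetical.

```r
# Packages named elsewhere on this page; an assumed list, not the
# official course list.
pkgs <- c("knitr", "ggplot2", "xtable", "lsmeans", "rgl")

# Hypothetical helper: return the subset of `pkgs` not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(pkgs)

# Only install when run interactively, so sourcing this script
# has no side effects.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}

# Update already-installed packages (uncomment to run; may take a while).
# update.packages(ask = FALSE)
```

Checking first and installing only what is missing avoids re-downloading packages that are already present; the same update step is also available within RStudio.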

### Passion Driven Statistics (PDS) data

Install PDS package.
• AddHealthW1: Sampling Design, Codebook, RData.
• AddHealthW4: Sampling Design, Codebook, RData.
• NESARC: Sampling Design, Codebook, RData.
• OutlookOnLife: Sampling Design, Codebook, RData.
• GapMinder: Sampling Design, Codebook, RData.

## Why stats now?

Important enough to have a US Chief Data Scientist (1) (2).