ADA2 S22

UNM Stat 428/528: Advanced Data Analysis II (ADA2)

Spring 2022 Syllabus is below tables
  • Spring 2022
  • Time: 9:30-10:45 AM
  • Location: CTLB 300
  • Stat 428.001, CRN 33933; Stat 528.001, CRN 33935

Asking smart questions

  • “Smart Questions” guide (note “hackers build things, crackers break them”)
  • Follow this Rubric when emailing a question:
    • Send a new email for each new question. Use “Reply” to continue a conversation on the same question (do not start a new email for a follow-up).
    • Include “ADA2” as the first word of the subject line in new emails (if replying, just use reply), with the rest of the subject indicating the assignment and type of problem.
    • Begin the email with a short question summary (that is, don’t bury your question in the middle of the third paragraph).  Then, begin the detail of your question in the second paragraph.
    • When possible, include commented code in the email body. Comments (starting with the # symbol) should indicate where the problem is, what the expected behavior is, and what steps are necessary to reproduce the problem.
    • Attach your Rmd file so that the instructor can reproduce the problem.   If attaching code, please include all the files necessary to run your code (data, etc.)
    • (Attaching code supersedes this.) Code should include a “minimum representative test case”.
    • Assume the best. Your instructors want to help and we will do our best. Do not abuse your helpers even if you feel frustrated.
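A minimal sketch of what commented code in an email body might look like, following the rubric above. The data and the question are made up for illustration; they are not from a course assignment.

```r
# ADA2 question summary: mean() returns NA for my numeric column.
# Steps to reproduce: run these lines in a fresh R session.
dat <- data.frame(x = c(1.2, 3.4, NA, 5.6))

# Problem is on the next line: the result is NA,
# but I expected the mean of the three non-missing values.
mean_with_na <- mean(dat$x)

# Expected behavior: a number near 3.4, which is what I get
# when I tell mean() to skip the missing value.
mean_without_na <- mean(dat$x, na.rm = TRUE)

print(mean_with_na)
print(mean_without_na)
```

A short example like this uses only a few lines of data and runs on its own, so a helper can reproduce the problem without opening any other files.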

Step 0

Before our first “class” (Mon 1/17/22) please read through the following actions and install the required software on your computer.
  1. Install R (Windows or Mac) or upgrade, then RStudio.
  2. Install R packages.
    1. Follow these instructions: R packages.  (Ignore warning about rtools or any packages unavailable.)
    2. In RStudio, open Packages tab, click on “Update”, Select All, Install Updates (“No” to restart, “No” to compile from source).
  3. Install (or update) erikmisc package.
    1. Run the two lines under “Installation”:
      1. install.packages("devtools")
      2. devtools::install_github("erikerhardt/erikmisc")
        1. If it asks to update packages (it should not ask this if you updated packages above), press 3 [Enter] for “None”.
        2. If it asks about the “make” command, click “Cancel”.
        3. If it asks about the “git” command, click “Cancel”.
    2. Make sure it works by printing the logo:
      1. library(erikmisc)
      2. erikmisc_logo()
  4. Set up your computer
    1. RStudio disable notebook
    2. Operating system to be more friendly to programming.
If you have a Chromebook or no laptop, consider using RStudio Cloud > Individuals,  and when installing packages remove the type="binary" option.
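After working through Step 0, one way to confirm that the packages named above are in place is sketched below. It only checks whether each package can be loaded; it does not install anything. The variable names are hypothetical, and this check is an assumption about what "installed correctly" means, not part of the official instructions.

```r
# Packages named in Step 0; erikmisc comes from GitHub rather than CRAN.
step0_packages <- c("devtools", "erikmisc")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
installed <- vapply(step0_packages, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still needs to be installed.
missing_packages <- names(installed)[!installed]
if (length(missing_packages) == 0) {
  message("Step 0 packages are installed.")
} else {
  message("Still missing: ", paste(missing_packages, collapse = ", "))
}
```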


This Is Statistics

Learn to produce beautiful (markdown) and reproducible (knitr) reports with informative plots (ggplot2) and tables (kable) by writing code (R, tidyverse, RStudio) to answer questions using fundamental statistical methods (multiple regression, analysis of covariance, logistic regression, and multivariate methods), which you’ll be proud to present (poster).


Course content

Weekly structure

(also see “Assessment” below)
  1. Preparation (Tuesday): Reading, Video, Quiz due Tue 9:30 AM (before class).
  2. Worksheet 1 (Tuesday): Assignment due by Fri 11:50 PM.
  3. Worksheet 2 (Thursday): Assignment due by Mon 11:50 PM.

Course notes, code, data, and video lectures

Notes from Spring 2020: ADA2_notes_S20.pdf includes all chapters in one document.
Lecture notes for Advanced Data Analysis 2 (ADA2) Stat 428/528 University of New Mexico is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Based on a work at
Ch | Chapter Title | Notes | R code | Datasets | Video lectures playlist
01 | R statistical software and review | pdf | R | turkey.csv, rocket.dat | (Videos based on S16 notes) 01-1, 01-2
02 | Introduction to Multiple Linear Regression | pdf | R | indian.dat, gce.dat | 02-1, 02-2
03 | A Taste of Model Selection for Multiple Regression | pdf | R | ratliver.csv | 03-1, 03-2
04 | One Factor Designs and Extensions | pdf | R | none | 04
05 | Paired Experiments and Randomized Block Experiments | pdf | R | battery.dat, beetles.dat, itch.csv, ratinsulin.dat | 05-0 05-1 05-2 05-3 05-4 05-5 05-6 05-7 05-8 05-9
06 | A Short Discussion of Observational Studies | pdf | R | sat.csv | 06
07 | Analysis of Covariance: Comparing Regression Lines | pdf | R | tools.dat, toolsfake.dat, twins.dat | 07-1 07-2 07-3 HW helper video
08 | Polynomial Regression | pdf | R | cloudpoint.dat, mooney.dat | 08-1 08-2
09 | Discussion of Response Models with Factors and Predictors | pdf | R | faculty.dat | 09-1 09-2 09-3
10 | Automated Model Selection for Multiple Regression | pdf | R | oxygen.dat | 10-1 10-2 10-3
11 | Logistic Regression | pdf | R | beetles.dat, leuk.dat, menarche.csv, shuttle.csv, trauma.dat | 11-1 11-2 11-3 11-4
12 | An Introduction to Multivariate Methods | pdf | R | none | 12
13 | Principal Component Analysis | pdf | R | bgs.dat, shells.dat, sparrows.dat, temperature.dat | 13-1 13-2 13-3
14 | Cluster Analysis | pdf | R | birthdeath.dat, teeth.dat | 14-1 14-2 14-3
15 | Multivariate Analysis of Variance | pdf | R | shells_mf.dat | 15
16 | Discriminant Analysis | pdf | R | mower.dat | 16-1 16-2
17 | Classification | pdf | R | business.dat | 17-1 17-2 17-3
18 | Data Cleaning | pdf | R | conversions.txt, dalton.txt, dirty_iris.csv, edits.txt, people.txt, unnamed.txt |

(I reserve the right to continue to improve the materials throughout the semester.)


Date Class Topic Reading, Video, Quiz class Worksheet, Data
01/10 00 Install software
  • See Step 0
  • video: S21 Intro (similar for S22, except we’re face-to-face in class and probably no assignment preview videos)
01/18 01 01 R statistical software and review
  • read: Ch 01
  • video: 01-1, 01-2
  • quiz: 01
  • Due M 01/24 (11:50 PM)
01/20 02
  • continued
01/25 03 02 Introduction to Multiple Linear Regression
  • read: Ch 02
  • video: 02-1, 02-2
  • quiz: 02
  • Due T 01/25 (9:30 AM for all remaining)
  • Rmd html dat
  • Class 03, Ch 02 Introduction to Multiple Linear Regression
  • video: CL03
  • Due F 01/28
01/27 04
  • Rmd html dat
  • Class 04, Ch 02 Introduction to Multiple Linear Regression
  • video: CL04
  • Due M 01/31
02/01 05 03 A Taste of Model Selection for Multiple Linear Regression
  • read: Ch 03, 04
  • video: 03-1, 03-2, 04
  • quiz: 03a, 03b
  • Due T 02/01
  • Rmd html dat
  • Class 05, Ch 03 A Taste of Model Selection for Multiple Regression
  • video:  CL05
  • Due F 02/04
02/03 06 04 Experimental Design: One and Two Factor Designs
  • Rmd html dat
  • Class 06, Ch 03 A Taste of Model Selection for Multiple Regression
  • video: CL06
  • Due M 02/07
02/08 07 05 Paired Experiments and Randomized Block Designs
  • Rmd html
  • Class 07, Ch 05a Paired Experiments and Randomized Block Experiments: Randomized complete block design (RCBD)
  • video: CL07
  • Due F 02/11
02/10 08
  • Rmd html dat
  • Class 08, Ch 05a Paired Experiments and Randomized Block Experiments
  • video: CL08
  • Due M 02/14
02/15 09
  • Rmd html dat
  • Class 09, Ch 05b Paired Experiments and Randomized Block Experiments: Two-way Factor design
  • video: CL09
  • Due F 02/18
02/17 10
  • Rmd html dat
  • Class 10, Ch 05b Paired Experiments and Randomized Block Experiments: Two-way Factor design
  • video: CL10
  • Due M 02/21
02/22 11 06 Discussion of Observational Studies
  • read: Ch 06-07
  • video:  06 07-1 07-2 07-3
  • quiz: 06a, 06b
  • Due T 02/22
  • Rmd html dat1 dat2
  • Class 11, Chs 05 and 07, writing and plotting model equations
  • video: CL11
  • Due F 02/25
02/24 12 07 Analysis of Covariance: Comparing Regression Lines
  • Rmd html dat
  • Class 12, Ch 07a, Analysis of Covariance: Comparing Regression Lines
  • video: CL12
  • Due M 02/28
03/01 13 08 Polynomial Regression
  • Rmd html dat
  • Class 13, Ch 08, polynomial regression
  • video: CL13
  • Due F 03/04
03/03 14 09 Response Models with Factors and Predictors
03/08 15 10 Model Selection for Multiple Regression
03/10 16
03/15 Spring Break
03/17 Spring Break
03/22 17 11 Logistic Regression
  • Rmd html dat
  • Class 17, Ch 11, Logistic Regression
  • video: CL17
  • Due F 03/25
03/24 18
  • Rmd html dat
  • Class 18, Ch 11, Logistic Regression
  • video: CL18
  • Due M 03/28
03/29 19 12 An Introduction to Multivariate Methods 13 Principal Components Analysis (PCA)
  • read: Ch 12, Ch 13
  • video: 12 13-1 13-2 13-3
  • quiz: 11a, 11b
  • Due T 03/29
  • Rmd html dat
  • Class 19, Ch 13, Principal Components Analysis (PCA)
  • video: CL19
  • Due M 04/04
03/31 20
04/05 21 PCA, continued
  • Rmd html dat
  • Class 21, Ch 13, Principal Components Analysis (PCA)
  • video: CL21
  • Due M 04/11
04/07 22
04/12 23 14 Cluster Analysis
04/14 24
04/19 25 15 Multivariate Analysis of Variance (MANOVA)
  • read: Ch 15
  • video: 15
  • quiz: 12b
  • Due T 04/19
  • Rmd html dat
  • Class 25, Ch 15, Multivariate Analysis of Variance (MANOVA)
  • video: CL25
  • Due M 04/25
04/21 26
04/26 27 16 Discriminant Analysis 17 Classification
  • Rmd html dat
  • Class 27, Chs 16+17, Discrimination for Classification
  • video: CL27
  • Due M 05/02
04/28 28
05/03 29 13+11+17 PCA and logistic regression classification
  • none
  • Rmd html dat
  • Class 29, Chs 13+11+17, PCA and Logistic Regression for Classification
  • video: CL29
  • Due M 05/09
05/05 30
05/10 FINALS WEEK (no final). Surveys due: submit receipt or confirmation page to UNM Learn.
  • Learning Studio
  • EvalKit in Learn


  • Description: A continuation of 427/527 that focuses on methods for analyzing multivariate data and categorical data. Topics include MANOVA, principal components, discriminant analysis, classification, factor analysis, analysis of contingency tables including log-linear models for multidimensional tables and logistic regression.
  • Prerequisite: Stat 427/527 (ADA1)
  • Semesters offered: Spring
  • Lecture: Stat 428/528.001 (CRN 33933 or 33935) (TR 0930-1045, CTLB 300 Video)
  • Email: Please include “ADA2” in the subject line of all emails.


  • Professor
    • Erik Erhardt, he/him
  • Teaching Assistants
    • Davis Dodson, he/him
    • Shuang Yang, he/him
  • Peer Learning Facilitators (PLF)
    • Valerie Fong, she/her
    • Ola Anifowoshe, he/him

Office hours

See email “ADA2, Stat 428/528, Announcements” from 1/14/22 for Zoom links and instructions.
Time Mon Tue Wed Thu Fri Sat Sun
8 AM
9 AM Class Class
10 AM EE Class Class EE
12 PM
1 PM
2 PM
7 PM
8 PM
9 PM
  • We are also all available by appointment by email if these many hours do not work for you.

Student learning outcomes

Similar to ADA1, but at a higher level.


  • Quizzes will be due each Tuesday before class. Purpose: to assess reading and video comprehension and ensure you’re prepared to actively participate in class activities with minimal lecture. (About 12, 20% of final grade.) Most weeks plan for 1-2 hours of reading and video and a 20-minute quiz. Quizzes are not timed, they can be taken twice, and the higher of the two scores is used for grade calculation.
    • Viewing quiz solutions after the due date in UNM Learn is not intuitive.  Click on the “Begin” button (this is the non-intuitive part since you are not actually beginning the quiz), then click “View All Attempts” to see the scores.  Finally, click the score in the “Calculated Grade” column to see the feedback for each question of the quiz.
  • Worksheet assignments.  Purpose: to struggle and find success in class with the concepts and skills. (About 24, includes class participation, 78% of final grade) Most weeks plan to finish in class.
  • Course surveys are due at the end of the course (EvalKit).  (About 1, 2% of final grade.)
  • The lowest 2-weeks’ worth of assignments are dropped, so your lowest 2 quizzes and 4 worksheet assignments are not included in the calculation of your grade.  This policy takes the place of bickering and groveling over late or flawed assignments in favor of dignity and ease for you and me.
Final grade may include a small buffer at the discretion of the instructor. For example, final grade could be the total points earned divided by the total possible points times 0.98 for graduate students and 0.95 for undergraduate students. That is [Final Grade] = [Points Earned]/[Points possible * 0.95] so that your grade is slightly higher than you earned.
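As a worked example of the arithmetic above, here is a minimal sketch in R. The quiz scores, the 10-point scale, and the helper names drop_lowest() and buffered_grade() are hypothetical; how the pieces are actually combined, and whether a buffer is applied at all, is at the discretion of the instructor.

```r
# Keep all but the n_drop lowest scores
# (the syllabus drops the lowest 2 quizzes and 4 worksheets).
drop_lowest <- function(scores, n_drop) {
  n_keep <- max(length(scores) - n_drop, 0)
  sort(scores, decreasing = TRUE)[seq_len(n_keep)]
}

# Buffer formula from the text: earned / (possible * buffer).
buffered_grade <- function(points_earned, points_possible, buffer = 0.95) {
  points_earned / (points_possible * buffer)
}

# Made-up scores for 12 quizzes, each out of 10 points.
quiz_scores <- c(10, 7, 9, 4, 8, 10, 6, 9, 10, 8, 9, 5)
quiz_kept <- drop_lowest(quiz_scores, n_drop = 2)

quiz_possible <- 10 * length(quiz_kept)
quiz_raw <- sum(quiz_kept) / quiz_possible
quiz_buffered <- buffered_grade(sum(quiz_kept), quiz_possible, buffer = 0.95)

print(round(c(raw = quiz_raw, buffered = quiz_buffered), 3))
```

With these made-up scores, the two lowest quizzes (4 and 5) are dropped, leaving 86 of 100 points. The buffer divides by 95 instead of 100, which is why the buffered proportion comes out slightly higher than the raw one.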


  • All assignments in this class are electronic, submitted to UNM Learn.  For all submissions:
    1. In RMarkdown, knit Rmd file to HTML,
    2. Open HTML file in your internet browser,
    3. Print HTML to pdf file, and
    4. Submit pdf to UNM Learn.
  • Browser choice: Chrome is the best browser choice.  On a Mac, Safari adds “.txt” to RMarkdown files when downloaded, and Firefox sometimes fails on upload of a pdf to UNM Learn.
  • Late assignments will not be accepted.
  • Rubrics guide assessment (and self-assessment) of homework, code, projects, exams, and presentations.  Each assignment will have its own specific rubric.
  • The use of R and RMarkdown is required for the course. Your submission will include all of the R code for the assignment, next to the part of the problem it addresses, in a fixed-width font with syntax highlighting. You will weave your code with prose narrations of your work and solutions.
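The knit step in the submission list above is usually done with the Knit button in RStudio, but it can also be run from the R console. The sketch below assumes the rmarkdown package and pandoc are available (RStudio ships with pandoc); the file name and its contents are hypothetical stand-ins for an assignment file. Opening the HTML in a browser, printing to pdf, and submitting to UNM Learn remain manual steps.

```r
# Build the code-chunk fence without typing three backticks in a row.
fence <- strrep("`", 3)

# Write a tiny stand-in Rmd file to a temporary folder (illustration only).
rmd_file <- file.path(tempdir(), "ADA2_example_worksheet.Rmd")
writeLines(c(
  "---",
  "title: \"ADA2 example worksheet\"",
  "output: html_document",
  "---",
  "",
  "Summary of the built-in cars data:",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_file)

# Step 1: knit the Rmd file to HTML, only if the tools are available.
html_file <- NA_character_
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, output_format = "html_document", quiet = TRUE)
}

# Step 2 would be opening html_file in a browser, for example with browseURL(html_file).
print(html_file)
```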

Collaboration and citation

  • For homework, I encourage you to work together. Please discuss the data, code, and problems with one another, but do your own exploration and write-up. We expect everyone to hand in substantially different homework, and we will enforce this under the honor code. The small benefit you might get from plagiarism is not worth the severe penalty (of lost trust, being reported to the dean, no points for the assignment, etc.).
  • As in life, please use any resources available to you. Projects and some homework will explicitly encourage you to use resources on the internet, but showing extra initiative will always be appreciated. You may find R programming tough at first, so feel free to discuss your problems with other classmates or meet with or email questions to the TAs or me.
  • I encourage you to use the ideas of others, but make them your own, giving credit. For projects have a formal bibliography, for homework cite casually, and for code simply copy the URL into your code as a comment (which is doubly helpful to you for finding the resource again).


UNM Administrative Mandate on Required Vaccinations

  • Bringing Back the Pack link
  • UNM requires COVID-19 vaccination and a booster for all students, faculty, and staff, or an approved exemption (see: UNM Administrative Mandate on Required Vaccinations).  Proof of vaccination and booster, or a medical, religious, or online remote exemption, must be uploaded to the UNM vaccination verification site. Failure to provide this proof may result in a registration hold and/or disenrollment for students and disciplinary action for UNM employees.
  • Booster Requirement: Individuals who received their second dose of a Pfizer or Moderna vaccine on or before June 15, 2021, or their single dose of a Johnson & Johnson vaccine on or before October 15, 2021, must provide documentation of receipt of a booster dose no later than January 17, 2022.
  • Individuals who received their second dose of a Pfizer or Moderna vaccine after June 15, 2021, or who received their single dose of Johnson & Johnson after November 15, 2021, must provide documentation of receipt of a booster within four weeks of eligibility, according to the criteria provided by the FDA (6 months after completing an initial two-dose Moderna vaccine, 5 months after completing the Pfizer sequence, and 2 months after receiving a one-dose Johnson and Johnson vaccine).
  • International students: Consult with the Global Education Office.
  • Exemptions: Individuals who cannot yet obtain a booster due to illness should request a medical, religious, or online remote exemption (which may have an end date) and upload this to the vaccination verification site.
  • Medical and religious exemptions validated in Fall 2021 (see your email confirmation) are also valid for Spring 2022 unless an end date was specified in the granting of a limited medical exemption. Students must apply for a remote online exemption every semester.
  • UNM Requirement on Masking in Indoor Spaces 
  • All students, staff, and instructors are required to wear face masks in indoor classes, labs, studios, and meetings on UNM campuses, see the masking requirement. Students who do not wear a mask indoors on UNM campuses can expect to be asked to leave the classroom and to be dropped from a class if failure to wear a mask occurs more than once in that class. Students and employees who do not wear a mask in classrooms and other indoor public spaces on UNM campuses are subject to disciplinary actions. Medical/health grade masks are the best protection against the omicron variant and these masks should be used, rather than cloth.
  • COVID-19 Symptoms and Positive Test Results:
  • Please do not come to a UNM campus if you are experiencing symptoms of illness, or have received a positive COVID-19 test (even if you have no symptoms). Contact your instructors and let them know that you should not come to class due to symptoms or diagnosis. Students who need support addressing a health or personal event or crisis can find it at the Lobo Respect Advocacy Center.


  • In accordance with University Policy 2310 and the Americans with Disabilities Act (ADA), academic accommodations may be made for any student who notifies the instructor of the need for an accommodation. It is imperative that you take the initiative to bring such needs to the instructor’s attention, as I am not legally permitted to inquire. Students who may require assistance in emergency evacuations should contact the instructor as to the most appropriate procedures to follow. Contact Accessibility Resource Center at 277-3506 for additional information.
  • UNM is committed to providing courses that are inclusive and accessible for all participants. As your instructor, it is my objective to facilitate an accessible classroom setting, in which students have full access and opportunity. If you are experiencing physical or academic barriers, or concerns related to mental health, physical health and/or COVID-19, please consult with me after class, via email/phone or during office hours. You are also encouraged to contact the Accessibility Resource Center by phone at 277-3506.


  • This is a three-credit-hour course. Class meets for two 75-minute sessions of direct instruction for fifteen weeks during the semester. Students are expected to complete a minimum of six hours of out-of-class work (or homework, study, assignment completion, and class preparation) each week.

Title IX statement

  • In an effort to meet obligations under Title IX, UNM faculty, Teaching Assistants, and Graduate Assistants are considered “responsible employees” by the Department of Education. This designation requires that any report of gender discrimination that includes sexual harassment, sexual misconduct, and sexual violence made to a faculty member, TA, or GA must be reported to the Title IX Coordinator at the Office of Equal Opportunity. For more information, see the campus policy regarding sexual misconduct.

Citizenship and/or Immigration Status

  • All students are welcome in this class regardless of citizenship, residency, or immigration status. Your professor will respect your privacy if you choose to disclose your status. As for all students in the class, family emergency-related absences are normally excused with reasonable notice to the professor, as noted in the attendance guidelines above. UNM as an institution has made a core commitment to the success of all our students, including members of our undocumented community. The Administration’s welcome is found on our website:

Support in Receiving Help and in Doing What is Right

  • I encourage students to be familiar with services and policies that can help them navigate UNM successfully. Many services exist to help you succeed academically and to find your place at UNM; ask me for information about the right resource center or person to contact. UNM has important policies to preserve and protect the academic community, especially policies on student grievances (Faculty Handbook D175 and D176), academic dishonesty (FH D100), and respectful campus (FH C09). These are in the Student Pathfinder and the Faculty Handbook. Please ask for help in understanding and avoiding plagiarism or academic dishonesty, which can both have very serious disciplinary consequences.

Land Acknowledgement

  • Founded in 1889, the University of New Mexico sits on the traditional homelands of the Pueblo of Sandia. The original peoples of New Mexico (Pueblo, Navajo, and Apache), since time immemorial, have deep connections to the land and have made significant contributions to the broader community statewide. We honor the land itself and those who remain stewards of this land throughout the generations and also acknowledge our committed relationship to Indigenous peoples. We gratefully recognize our history.

Our Classroom

We’re doing this because:
  • We want you to be empowered with statistics.
  • We believe everyone should get out of this course with awesome skills.
  • Real-time feedback promotes efficient learning.
“It encourages me to engage actively with the course material and take responsibility for my learning.”

GAISE Connections

Our six recommendations include the following:
  1. Emphasize statistical literacy and develop statistical thinking
  2. Use real data
  3. Stress conceptual understanding, rather than mere knowledge of procedures
  4. Foster active learning in the classroom
  5. Use technology for developing conceptual understanding and analyzing data
  6. Use assessments to improve and evaluate student learning

Learning without thought is labor lost. What I hear, I forget. What I see, I remember. What I do, I understand. – Confucius


Passion Driven Statistics (PDS) data

Old news

Step 0

Before our first class (Tue 1/21) please read through the following actions and install the required software on your computer and complete the brief survey. If you don’t have a computer, there are classroom computers which will be available only when the classroom is open. Video for this process (ignore the “crowdgrader” portion).
  1. Complete surveys
    1. A short Opinio pre-survey required for classroom assessment (1/20 – 2/1/2020).
  2. Install R (Windows or Mac) or upgrade, then RStudio.
  3. Install R packages.
    1. Run RStudio
    2. Run code in R packages.
    3. Update all packages, RStudio Packages tab, click “update”, click “select all”, and “Install Updates”. Say “Yes” to restart R, but if it asks a second time, say “No”.  Say “No” to “install from sources” if it asks.
  4. Set up your computer
    1. RStudio disable notebook
    2. Operating system to be more friendly to programming.
  5. (Postpone until later: Install LaTeX (for poster at end of the semester).)

RMarkdown and knitr issues

  • R errors, unresolved, and out of time: If you’re saying “An error while knitting keeps me from turning in the assignment…”, then set the code chunk option error = TRUE in the chunk header, for example {r, error = TRUE}, to ignore the error and continue knitting. This will allow you to turn in partial assignments with errors.
  • Unicode compile problems: If you knit to pdf you may get this error: “! Package inputenc Error: Unicode char”. ASCII is the small character set that we use to program in; Unicode is an extended character set that looks pretty (for example “straight quotes” become “curly quotes”) but causes code to break. You get unwanted Unicode when you copy/paste from a pdf or some other source into your code. To fix this, you have to find the Unicode and replace it with its ASCII equivalent. To do this: Ctrl-F to find, search for “[^\x00-\x7F]” (without quotes), select “Regex” for regular expressions, and find the “Next” one. As it finds instances, replace the characters manually until there are no more. These characters will typically be curly quotes or fancy dashes.
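The same find-and-replace idea can be sketched in R for lines of code held in a character vector. This is an illustrative alternative to the manual Ctrl-F approach described above, not a replacement for it. The helper names and the short replacement table are assumptions and cover only curly quotes and two dashes.

```r
# TRUE if a single string contains any character outside the ASCII range (0-127).
has_non_ascii <- function(line) any(utf8ToInt(line) > 127L)

# Common "pretty" characters and their ASCII equivalents (not a complete list).
from <- c("\u201c", "\u201d", "\u2018", "\u2019", "\u2013", "\u2014")
to   <- c("\"",     "\"",     "'",      "'",      "-",      "-")

# Replace each listed Unicode character with its ASCII equivalent.
replace_unicode <- function(lines) {
  for (i in seq_along(from)) {
    lines <- gsub(from[i], to[i], lines, fixed = TRUE)
  }
  lines
}

# Made-up lines, as if pasted from a pdf: curly quotes and an en dash.
code_lines <- c("x <- \u201chello\u201d", "y <- 'ok'", "z <- a \u2013 b")

flagged <- vapply(code_lines, has_non_ascii, logical(1), USE.NAMES = FALSE)
cleaned <- replace_unicode(code_lines)

print(flagged)
print(cleaned)
```

Lines that are still flagged after replacement contain a character that is not in the table, and those would need to be fixed by hand.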

Pre-course to-dos

Did you receive a registration error for Spring 2022? Send me an email with the following answers:
  1. What registration error did you get (copy/paste is best)?
  2. What is your UNM ID?
  3. What is your Math/Stat background (that is, do you have the prerequisites)?
If you are waitlisted, as long as there are seats available I will override you into the course. Don’t worry.
3/1/17 – Data resources for poster:

Citing and using notes, including previous editions

Citing lecture notes: Erhardt EB, Bedrick EJ, and Schrader RM. (2020) Lecture notes for Advanced Data Analysis 2. Retrieved Mar 1, 2020, from, 136–144.

Acumen in Statistics