Fall 2022 Syllabus is below the tables
• Fall 2022
• Tuesday/Thursday: 9:30-10:45 AM at CTLB 300 map
• Stat 427.001, CRN 59508; Stat 527.001, CRN 59509
Step 0 (moved to the bottom of page)

## Goal

Learn to produce beautiful (markdown) and reproducible (knitr/quarto) reports with informative plots (ggplot2) and tables (kable) by writing code (R, tidyverse, RStudio) to answer questions using fundamental statistical methods (all one- and two-variable methods), which you’ll be proud to present (poster).

# Course content

## Weekly structure

(also see “Assessment” below)
1. Preparation (Tuesday): Reading, Video, Quiz due Tue 11:50 PM.
2. Worksheet 1 (Tuesday): Assignment due by Fri 11:50 PM.
3. Worksheet 2 (Thursday): Assignment due by Mon 11:50 PM.

## Course notes, code, data, and video lectures

Ch Chapter Title Notes R code Video lectures playlist
00 Introduction to R, RStudio, and ggplot pdf
01 Summarizing and Displaying Data pdf
02 Estimation in One-Sample Problems pdf
03 Two-Sample Inferences pdf
04 Checking Assumptions pdf
05 One-Way Analysis of Variance pdf
• 05-1
(no videos recorded)
06 Nonparametric Methods pdf
07 Categorical Data Analysis pdf
08 Correlation and Regression pdf
09 Introduction to the Bootstrap pdf
• 09-1
10 Power and Sample size pdf
11 Data Cleaning (not used this semester, ADA2 Ch 11 refers to Ch 12 below) pdf
12 ADA2 Ch 11 Logistic Regression pdf

### Passion-Driven Statistics (PDS) data

• NESARC Sampling Design, Codebook, RData. Alcohol abuse and related conditions.
• Unique ID “IDNUM”.
• AGE is in the data but not in the codebook.

• UNM Canvas for taking quizzes (graded automatically) and submitting assignments (evaluated by TA within 1 week).
• After uploading a pdf assignment, verify with a preview of the file.
• Lectures: YouTube Video playlist (try 1.5 speed, then pause as needed).
• RStudio cheatsheets
• Erik’s example homework document: NESARC data, nicotine and depression.
1. Use these files as a model for your assignments: .qmd + .bib = .HTML. (Ignore for now)
2. These are the files that Erik develops in the assignment videos (similar to just above, though above has more details):
• `.qmd + .bib = .html`

## Timetable

Date Class Topic Reading, Video, Quiz class Worksheet, Data
08/22 00 Install software Step 0 (below) (Video numbers may be slightly different from the class number.)
08/23 Tue 01
• Quarto
• (Intro to using Quarto: qmd html)
• Data types and data organization
• Survey from Wesleyan University via email
• qmd html
• 01 Medical records
• Download qmd file to your computer, open in RStudio, edit it, print HTML to pdf, turn in assignment by Friday 11:50 to UNM Canvas.
• Class 01, Medical Records (separate)
• video: CL01 (Ignore “crowdgrader” in last minute)
• Due F 08/26
08/25 Thu 02 Codebook
• qmd html
• Class 02, Personal Codebook
• (Find this assignment contained in the Outline below)
• video: CL02
• Due M 08/29
08/30 Tue 03 R programming, data subset and numerical summaries
• qmd html
• Start using this qmd file.
• Most of your assignments will be written in this file.
• Read dataset in R, create subset of data, rename variables, numerical summaries.
• Class 03, Data subset and numerical summaries
• video: CL03a, CL03b, CL03c
• Due F 09/02
09/01 Thu 04 Plotting univariate
09/06 Tue 05 Plotting bivariate, numeric response, categorical response
• Class 05-1, Plotting bivariate, numeric response
• video: CL05-1a, CL05-1b
• Due F 09/09
• Class 05-2, Plotting bivariate, categorical response
• video: CL05-2a, CL05-2b
• Due F 09/09
09/08 Thu 06 Figure and text finer points (Related: MS Word cross-references, citations, and special topics video)
• Class 06, Figure arrangement, captions, cross-referencing
• video: CL06
• Due M 09/12
09/13 Tue 07 Simple linear regression, intro
• read: Ch 8.4, 8.2 R;
• video: 08-1 corr/log, 08-3 LS reg eq;
• quiz: quiz
• Quiz 07, Simple linear regression, Logarithm transformation
• Due T 09/13
• qmd html dat
• Build intuition using SLR App, interpret properties of linear regression fit.
• Class 07-1, Simple linear regression (separate)
• video: CL07-1
• Due F 09/16
• Class 07-2, Simple linear regression
• video: CL07-2
• Due M 09/16
09/15 Thu 08 Logarithm transformation
• qmd html dat
• Plot, transform, plot, and interpret.
• video: CL08-1
• Class 08-1, Logarithmic transformation, intro (separate)
• Due M 09/19
• Class 08-2, Logarithmic transformation
• video: CL08-2
• Due M 09/19
09/20 Tue 09 Correlation
• qmd html dat1 dat2
• Class 09, Correlation, intro (separate)
• video: CL09
• Due F 09/23
• Data collection in class: need rulers, UN survey
09/22 Thu 10 Categorical contingency tables
09/27 Tue 11 Parameter estimation (one-sample)
• video: see table above;
• quiz: quiz
• Quiz 11, Inference and Parameter estimation
• Due T 09/27
• Class 11, Correlation and Categorical contingency tables
• video: CL11a CL11b
• Due F 09/30
09/29 Thu 12
• qmd html
• Class 12-1, Parameter estimation (one-sample) (separate)
• video: CL12
• Due M 10/03
• Data collection in class: globe
• Class 12-2, Inference and Parameter estimation (one-sample)
• video: CL12-2a CL12-2b
• Due M 10/03
10/04 Tue 13 Hypothesis testing (two-sample)
• read: Ch 2.3-end R, Ch 3 R;
• video: see table above;
• quiz: quiz
• Quiz 13, Hypothesis testing
• Due T 10/04
10/06 Thu 14 Paired data, assumption assessment
• qmd html dat
• Class 14, Paired data, assumption assessment (separate)
• video: CL14
• Due M 10/10
10/11 Tue 15
• Class 15, Hypothesis testing (one- and two-sample)
• video: CL15
• Due F 11/04
10/13 Thu Fall Break Spurious Correlations BBC Radio 4: More or Less, “sampling”, 9 min audio
10/18 Tue 16 ANOVA, post-hoc comparisons
• read: Ch 2.2.1, Ch 3.4 & 3.6, Ch 4, Ch 5;
• video: see table above;
• quiz: quiz
• Quiz 16, ANOVA, Pairwise comparisons
• Due T 10/18
• qmd html dat
• Class 16, ANOVA, Pairwise comparisons (separate)
• video: CL16
• Due F 10/21
10/20 Thu 17
• Class 17, ANOVA and Assessing Assumptions
• video: CL17
• Due M 10/24
10/25 Tue 18 Nonparametric methods
• read: Ch 6, Ch 7.2-7.4, Ch 10;
• video: see table above;
• quiz: quiz
• Quiz 18, Nonparametric methods, Binomial and Multinomial tests
• Due T 10/25
• qmd html
• Class 18, Nonparametric methods (separate)
• video: CL18
• Due F 10/28
10/27 Thu 19 Binomial and multinomial proportion tests
• qmd html dat
• Class 19, Binomial and Multinomial tests (separate)
• video: CL19
• Due M 10/31
11/01 Tue 20 Two-way categorical tables, simple linear regression, inference
• read: Ch 7.8-end, Ch 8.5-8.7;
• video: see table above;
• quiz: quiz
• Quiz 20, Two-way categorical tables
• Due T 11/01
• qmd html dat
• Class 20-1, Two-way categorical tables (separate)
• video: CL20-1
• Due F 11/04
• qmd html dat
• Regression of height vs hand span using data from our class.
• Class 20-2, Simple linear regression (separate)
• video: CL20-2
• Due F 11/04
11/03 Thu 21
• Class 21, Two-way categorical and simple linear regression
• video: CL21a CL21b
• Due M 11/07
11/08 Tue 22 Logistic regression, intro
• video: see table above;
• quiz: quiz
• Quiz 22, Logistic regression
• Due T 11/08
• qmd html dat
• Class 22, Logistic regression (separate)
• video: CL22
• Due F 11/11
11/10 Thu 23 Summary of Methods we’ve covered
• Class 23, Logistic regression
• video: CL23
• Due M 11/14
11/15 Tue 24 Poster Preparation: research questions, data sources, analyses
• Class 24, Poster Prep
• video: CL24a CL24b
• Due M 11/21 (try to finish early!)
11/17 Thu 25 Poster Preparation: literature review, references, discussion, future work
11/22 Tue 26 Poster Preparation: complete content
11/24 Thu  Thanksgiving break
11/29 Tue 27 Poster Preparation: into poster template
• Inexpensive poster printing
• Minuteman Press, Eubank
• 1631 Eubank Boulevard NE, Suite D, Albuquerque, NM 87112
• (505)881-0164
• Open Mon-Fri 9a-5p, closed Thanksgiving, open Fri 11/25 10a-2p
• Submit poster to website
• Project name: “UNM ADA1 class poster”
• Due Date: 12/05/22 (at latest, try to finish a little early so you can print last week before presentations)
• Additional Details: “3’x4′ portrait poster on bond paper”
• File #1: Name the poster pdf with your name in the filename, such as “FirstLast_ADA1_poster.pdf”
• Arrange to pick up the poster.
12/01 Thu Poster Preparation: reviewed by an instructor
12/06 Tue 28 Poster Presentations: Graduate students
• Location: SMLC Atrium (large room, ground floor)
• Time: 9:30-10:45 (class time)
• Poster rubric
• Each person evaluates 3-4 posters.
• Must be present on both days.
12/08 Thu 29 Poster Presentations: Undergraduate students
1. EvalKit course evaluation: print a pdf of your email confirmation that you’ve completed the EvaluationKit Survey and upload that to UNM Canvas. (Due T 12/13)
2. PDS Wesleyan U Qualtrics survey (email), no receipt required
12/13 Tue Finals week (no final) Congratulations on a great semester!
(I reserve the right to continue to modify the schedule and improve the materials throughout the semester.)

# Syllabus

• Description: Statistical tools for scientific research, including parametric and non-parametric methods for ANOVA and group comparisons, simple linear and multiple linear regression, and basic ideas of experimental design and analysis. Emphasis placed on the use of statistical packages such as R. Course cannot be counted in the hours needed for graduate degrees in Mathematics and Statistics.
• Prerequisite: Math 1350 [Stat 145] (or other intro stats course)
• Semesters offered: Fall
• Lecture: Stat 427.001, CRN 59508; Stat 527.001, CRN 59509; TR 9:30-10:45 AM; Location: CTLB 300
• Email: Please include “ADA1” in the subject line of all emails, do not send messages via UNM Canvas.

## Instructors

• Professor
• Erik Erhardt <erike@stat.unm.edu>, he/him
• Teaching Assistants
• Mingyue Liu <mingyueliu@unm.edu>, she/her
• Peer Learning Facilitators (PLF)
• Arwyn Lewis, she/her
• Alexis P Amodio-Cardwell, she/her

### Office hours

See email “ADA1, Stat 427/527, Announcements” from 8/26/22 for Zoom links and instructions. UNM Authentication instructions.
 Time Mon Tue Wed Thu Fri Sat Sun 8 AM 9 AM BF Class BF Class 10 AM EE Class ML Class EE 11 AM EE EE ML EE EE 12 PM BF BF 1 PM 2 PM ML 3 PM ML 4 PM BF 5 PM ML BF 6 PM ML 7 PM 8 PM 9 PM
• We are also all available by appointment by email if these many hours do not work for you.

## Student learning outcomes

At the end of the course, you will be able to: (student results: R, all years, 2015, 2014, 2013, 2012)
General outcomes:
1. Organize knowledge in graphs, tables, and code to support concise, comprehensible, and scientifically defensible written interpretations to produce knowledge within a reproducible research environment.
2. Distinguish a testable scientific hypothesis or data-supported interpretation from an opinion.
3. Understand from a data story the goals of the study and apply the correct statistical procedure.
4. Explain the scientific aspects of a problem to nonscientists in a fashion that enhances understanding and decision making.
Topical outcomes:
1. Define parameters of interest and hypotheses in words and notation.
2. Summarize data visually, numerically, and descriptively and interpret the observed characteristics. Calculate and interpret numerical summaries such as mean, variance, five-number summary, confidence intervals, and p-values, and create visual summaries such as bar plots, scatter plots, and histograms. (Never pie charts!)
3. Distinguish between statistical significance and scientific relevance.
4. Use statistical software, such as R, to read and manage data, create informative plots, report numerical summaries, and apply statistical models, by recommended programming practice including abstraction and documentation.
5. Understand the differences and limitations of controlled experiments and observational studies. Design experiments to infer causal treatment effects. Analyze observational data to infer associations between measured variables.
6. Identify and explain the statistical methods, assumptions, and limitations used in reported studies in scientific literature or popular media.
7. Evaluate and criticize published studies, the work of peers, and your own work and assess what was done well, what could be done better, and examine whether their conclusions are supported using statistical principles.
8. Make evidence-based decisions by constructing and deciding between testable hypotheses using appropriate data and methods.
9. Discover relationships and make predictions through model development and selection.

## Meeting the learning outcomes

You will acquire new information in this class, but the emphasis is on comprehending, integrating, and applying information. Rote factual memorization is the lowest form of learning. Effective learning occurs by explaining, integrating, applying, and analyzing facts, hypotheses, and theories. Learning in this class occurs by:
1. Doing – completion of exercises that require analysis of data to answer questions and test hypotheses, or researching answers to reading assignments.
2. Discussion – interaction with classmates to assemble and synthesize information utilizing the collective skills and knowledge base of the group.
3. Listening, acting, and reflecting – activities during class time provide insights into information not available in readings and include review of difficult material to aid comprehension. Note-taking permits later reflection on lecture content. Listening to the professor lecture is the least effective learning tool for students, however, so you should plan on coming to every class prepared to participate in active and reflective learning opportunities.

## Assessment

• Quizzes will be due each Tuesday before class (for fully face-to-face semesters).  Purpose: to assess reading and video comprehension and ensure you’re prepared to actively participate in class activities with minimal lecture. (About 12, 12% of final grade.)  Most weeks plan for 1-2 hours of reading and video and a 20-minute quiz. Quizzes are not timed and can be taken twice; the higher of the two scores is used for grade calculation.
• Viewing quiz solutions after the due date in UNM Canvas is not intuitive.  Click on the “Begin” button (this is the non-intuitive part since you are not actually beginning the quiz), then click “View All Attempts” to see the scores.  Finally, click the score in the “Calculated Grade” column to see the feedback for each question of the quiz.
• Worksheet assignments.  Purpose: to struggle and find success in class with the concepts and skills. (About 31, includes class participation, 70% of final grade) Most weeks plan to finish in class.
• Poster will be developed throughout the semester (most assignments contribute to the poster); in the last couple of weeks we’ll complete them, and in the last week we’ll have poster presentations. Purpose: to have an overarching set of questions to answer using methods learned in the course, with a deliverable you can be proud of! (1 poster and presentation; of the final grade, 12% poster, 2% presentation, and 2% evaluations of others.)  In the last couple of weeks, assembling this poster may take 5-10 hours, using a template provided to you.
• Course surveys are due at the end of the course (EvalKit).  (About 2, 2% of final grade.)
• Roughly speaking, the lowest two weeks’ worth of assignments are dropped, so your lowest 2 quizzes and 4 worksheet assignments are not included in the calculation of your grade (this could include a worksheet assignment that spanned a full week).
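
To see how the weights in this list combine, here is a rough sketch in R. It assumes each category is scored as a proportion between 0 and 1 and that the dropped scores are simply the lowest ones. The helper names `mean_drop_lowest` and `final_grade` and all of the scores are hypothetical, and the actual gradebook calculation in UNM Canvas may differ.

```r
# Category weights (percent of final grade) as listed in the Assessment section.
weights <- c(quizzes = 12, worksheets = 70, poster = 12,
             presentation = 2, peer_evaluations = 2, surveys = 2)
stopifnot(sum(weights) == 100)

# Hypothetical helper: mean score after dropping the n_drop lowest scores.
mean_drop_lowest <- function(scores, n_drop) {
  stopifnot(length(scores) > n_drop)
  n_keep <- length(scores) - n_drop
  mean(sort(scores, decreasing = TRUE)[seq_len(n_keep)])
}

# Hypothetical helper: weighted total (0-100) from category proportions.
final_grade <- function(category_scores) {
  stopifnot(setequal(names(category_scores), names(weights)))
  sum(weights * category_scores[names(weights)])
}

# Made-up scores for illustration only (12 quizzes, 31 worksheets).
quiz_scores      <- c(1.0, 0.9, 0.0, 0.8, 1.0, 0.7, 1.0, 0.9, 0.5, 1.0, 1.0, 0.8)
worksheet_scores <- c(rep(1.0, 20), rep(0.9, 7), rep(0.0, 4))

example <- c(
  quizzes          = mean_drop_lowest(quiz_scores, 2),
  worksheets       = mean_drop_lowest(worksheet_scores, 4),
  poster           = 0.9,
  presentation     = 1.0,
  peer_evaluations = 1.0,
  surveys          = 1.0
)
print(round(final_grade(example), 1))
```

Changing the made-up scores shows how much a missed worksheet or quiz matters once the lowest scores are dropped.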

### Collaboration and citation

For homework, I encourage you to work together. Please discuss the data, code, and problems with one another, but do your own exploration and write up. We expect everyone to submit substantially different homework, and we will enforce this under the honor code. The small benefit you might get from plagiarism is not worth the severe penalty (of lost trust, being reported to the dean, no points for the assignment, etc.). As in life, please use any resources available to you. Projects and some homework will explicitly encourage you to use resources on the internet, but showing extra initiative will always be appreciated. You may find R programming tough at first, so feel free to discuss your problems with other classmates or meet with or email questions to me or the TAs. I encourage you to use the ideas of others, but make them your own, giving credit. For projects have a formal bibliography, for homework cite casually, and for code simply copy the URL in as a comment (which is doubly helpful for finding the resource again).  You won’t be the first person to do anything in this class, so give credit where it’s due.

## Statements

### COVID-19 Health and Awareness

UNM is a mask friendly, but not a mask required, community. To be registered or employed at UNM, students, faculty, and staff must all meet UNM’s Administrative Mandate on Required COVID-19 vaccination. If you are experiencing COVID-19 symptoms, please do not come to class. If you have a positive COVID-19 test, please stay home for five days and isolate yourself from others, per the Centers for Disease Control (CDC) guidelines. If you do need to stay home, please email me; I can work with you to provide alternatives for course participation and completion. UNM faculty and staff know that these are challenging times. Please let us know that you need support so that we can connect you to the right resources and please be aware that UNM will publish information on websites and email about any changes to our public health status and community response.

### Accommodations

UNM is committed to providing equitable access to learning opportunities for students with documented disabilities. As your instructor, it is my objective to facilitate an inclusive classroom setting, in which students have full access and opportunity to participate. To engage in a confidential conversation about the process for requesting reasonable accommodations for this class and/or program, please contact Accessibility Resource Center at arcsrvs@unm.edu or by phone at 505-277-3506.  Support: Contact me at my email or in office hours and contact Accessibility Resource Center (https://arc.unm.edu/) at arcsrvs@unm.edu (505) 277-3506.

### Credit-hours

This is a three credit-hour course. Class meets for two 75-minute sessions of direct instruction for fifteen weeks during the Fall 2022 semester. Please plan for a minimum of six hours of out-of-class work (or homework, study, assignment completion, and class preparation) each week.

### Title IX statement

In an effort to meet obligations under Title IX, UNM faculty, Teaching Assistants, and Graduate Assistants are considered “responsible employees.” This designation requires that any report of gender discrimination which includes sexual harassment, sexual misconduct and sexual violence made to a faculty member, TA, or GA must be reported to the Title IX Coordinator at the Office of Equal Opportunity (oeo.unm.edu). For more information on the campus policy regarding sexual misconduct, see: https://policy.unm.edu/university-policies/2000/2740.html

### Citizenship and/or Immigration Status

All students are welcome in this class regardless of citizenship, residency, or immigration status.  Your professor will respect your privacy if you choose to disclose your status. As for all students in the class, family emergency-related absences are normally excused with reasonable notice to the professor, as noted in the attendance guidelines above. UNM as an institution has made a core commitment to the success of all our students, including members of our undocumented community. The Administration’s welcome is found on our website: http://undocumented.unm.edu/.

### Land Acknowledgement

Founded in 1889, the University of New Mexico sits on the traditional homelands of the Pueblo of Sandia. The original peoples of New Mexico – Pueblo, Navajo, and Apache – since time immemorial, have deep connections to the land and have made significant contributions to the broader community statewide. We honor the land itself and those who remain stewards of this land throughout the generations and also acknowledge our committed relationship to Indigenous peoples. We gratefully recognize our history.

### Respectful and Responsible Learning

We all have shared responsibility for ensuring that learning occurs safely and equitably. UNM has important policies to preserve and protect the academic community, especially policies on student grievances (Faculty Handbook D175 and D176), academic dishonesty (FH D100), and respectful campus (FH C09). These are in the Student Pathfinder (https://pathfinder.unm.edu) and the Faculty Handbook (https://handbook.unm.edu). Please ask for help in understanding and avoiding plagiarism or academic dishonesty, which can both have very serious consequences.

### Connecting to Campus and Finding Support

UNM has many resources and centers to help you thrive, including opportunities to get involved, mental health resources, academic support including tutoring, resource centers for people like you, free food at Lobo Food Pantry, and jobs on campus. Your advisor, staff at the resource centers and Dean of Students, and I can help you find the right opportunities for you.

## Support in Receiving Help

Students who ask for help are successful students. I encourage students to be familiar with services and policies that can help them navigate UNM successfully. Many services exist to help you succeed academically, such as peer tutoring at CAPS and http://mentalhealth.unm.edu. There are plenty of ways to find your place and your pack at UNM: see the “student guide” tab on my.unm, students.unm.edu, or ask me for information about the right resource center or person to contact.

# Our Classroom

We’re doing this because:
• We want you to be empowered with statistics.
• We believe everyone should get out of this course with awesome skills
• Real-time feedback promotes efficient learning
“It encourages me to engage actively with the course material and to take responsibility for my learning.”

## GAISE Connections

Our six recommendations include the following:
1. Teach statistical thinking.
• Teach statistics as an investigative process of problem-solving and decision-making.
• Give students experience with multivariable thinking.
2. Focus on conceptual understanding.
3. Integrate real data with a context and purpose.
4. Foster active learning.
5. Use technology to explore concepts and analyze data.
6. Use assessments to improve and evaluate student learning.

Learning without thought is labor lost. What I hear, I forget. What I see, I remember. What I do, I understand. – Confucius

# Archive

### Pre-course to-dos

Did you receive a registration error for Fall 2020? Send me an email with the following answers:
1. What registration error did you get (copy/paste is best)?
2. What is your UNM ID?
3. What is your Math/Stat background (that is, do you have the prerequisites)?
If you are waitlisted, as long as there are seats available I will override you into the course. Don’t worry.

### Step 0

Before our first “class” (Mon 8/22) please read through the following actions and install the required software on your computer.
1. Install:
   1. R (windows or mac) or upgrade (Install R video (5 min)),
   2. RStudio, and
   3. Quarto.
2. Install R packages.
   1. Follow these instructions: R packages.  (Ignore warning about rtools or any packages unavailable.)
   2. In RStudio, open Packages tab, click on “Update”, Select All, Install Updates (“No” to restart, “No” to compile from source).
3. Install erikmisc package (also at the end of “Install R packages”, above).
   1. Submit these two lines to the R console:
      1. `install.packages("devtools")`
      2. `devtools::install_github("erikerhardt/erikmisc")`
         1. If it asks to update packages (it should not ask this if you updated packages above), press 3 [Enter] for “None”.
   2. Make sure it works by printing the logo:
      1. `library(erikmisc)`
      2. `erikmisc_logo()`
1. RStudio disable notebook
2. Operating system to be more friendly to programming.
If you have a Chromebook or no laptop, consider using RStudio Cloud > Individuals,  and when installing packages remove the `type="binary"` option.
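
After working through Step 0, a quick check along these lines can confirm which packages R can find. This is a minimal sketch, not part of the course materials: the helper name `missing_packages` is hypothetical, and the package vector is only an illustration assembled from packages named on this page, so adjust it to match the “R packages” instructions.

```r
# Hypothetical helper: return the names of packages that R cannot find.
# Uses only base R, so it runs even if nothing else is installed yet.
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!found]
}

# Illustrative list (an assumption; edit to match the course instructions).
course_pkgs <- c("devtools", "erikmisc", "ggplot2", "knitr")

still_missing <- missing_packages(course_pkgs)
if (length(still_missing) == 0) {
  message("All listed packages are installed.")
} else {
  message("Not yet installed: ", paste(still_missing, collapse = ", "))
}
```

Any package reported as not yet installed can then be installed by repeating the relevant step above.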

• “Smart Questions” guide (note “hackers build things, crackers break them”)
• Follow this Rubric when emailing a question:
• Send a new email for each new question.  Use “Reply” to continue a conversation on a question (do not start a new email, again).
• Include “ADA1” as the first word of the subject line in new emails (if replying, use reply), with the rest of the subject indicating the assignment and type of problem.
• Begin the email with a short question summary (that is, don’t bury your question in the middle of the third paragraph).  Then, begin the detail of your question in the second paragraph.
• When possible, include commented code in the email body — Comments (starting with # symbol) should indicate where the problem is, what the expected behavior is, and what steps are necessary to reproduce the problem.
• Attach your qmd file so that the instructor can reproduce the problem.   If attaching code, please include all the files necessary to run your code (data, etc.)
• [Attaching code supersedes this: Code should include a “Minimum representative test case” (http://www.catb.org/esr/faqs/smart-questions.html#code)]
• Assume the best. Your instructors want to help and we will do our best. Do not abuse your helpers even if you feel frustrated.
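
As an illustration of the rubric above, the commented code in an email body might look like the sketch below. The scenario (a mean that returns `NA`) and the object names are invented for illustration; it uses only base R so that anyone can paste it into a fresh session.

```r
# Question summary: mean() returns NA for my variable; I expected a number.

# Steps to reproduce: run these lines in a fresh R session.
height_in <- c(64, 70, NA, 68)   # small made-up vector, one missing value

# Problem is on the next line.
# Expected: the average of the three observed heights.
# Observed: NA
avg_height <- mean(height_in)
print(avg_height)
```

A small self-contained example like this often reveals the cause on its own (here, `mean()` needs `na.rm = TRUE` to skip missing values).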

## Course introduction materials

Problems installing PDS package?  Solution. If you had problems installing the PDS package, no problem; here’s how to get the data:
1. Download the “.RData” file above for your dataset.
2. Where I have “library(PDS)” in my code, change it to the two lines below.  You’ll need to update the “PATH_TO_FILE” below to the path on your computer’s hard drive, and “filename” needs to be changed to the name of the file. This will directly read the data file.

```
# library(PDS)
setwd("/PATH_TO_FILE")