S4R S19

UNM Stat 145 special: Statistics for Research (S4R)

The Spring 2019 syllabus is below the timetable. Spring 2019 schedule: TR 1530-1645, CTLB 330, Stat 145.014, CRN 30479

Goal

This Is Statistics

Our goal is to increase the number and diversity of students exposed to meaningful and empowering data analysis experiences and to inspire the pursuit of advanced data-driven experiences and opportunities for everyone! Learn to produce beautiful (markdown) and reproducible (knitr) reports with informative plots (ggplot2) and tables (kable) by writing code (R, tidyverse, RStudio) to answer questions using fundamental statistical methods (all one- and two-variable methods), which you’ll be proud to present (poster).


News

Information about the coming week will appear here if necessary; usually there won’t be any.

Course content

Course book and videos

  • Book: Passion Driven Statistics
  • Erik’s example assignment document (NESARC data, nicotine and depression): .Rmd + .bib = .html

Passion Driven Statistics (PDS) data

I encourage you to use one of the AddHealth datasets or NESARC.  Use AddHealth W1 if you want to understand adolescents when they were young and AddHealth W4 if you want to understand adult relationships.  NESARC is also interesting for substance abuse issues. The data are available in the PDS package, which you load with the command: library(PDS).
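Once the package is installed, a quick way to see which datasets it provides is to ask R directly. This is a minimal sketch: the helper name list_package_datasets is hypothetical, and it assumes only that the package is called PDS as stated above, not any particular dataset names.

```r
# Hypothetical helper: list the datasets bundled with an installed package.
# Returns an empty character vector if the package is not installed.
list_package_datasets <- function(pkg = "PDS") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))
  }
  # data(package = ...) returns an object whose $results matrix has one row
  # per dataset; the "Item" column holds the dataset names.
  as.character(data(package = pkg)$results[, "Item"])
}

pds_datasets <- list_package_datasets("PDS")
if (length(pds_datasets) == 0) {
  message("No datasets found; is the PDS package installed?")
} else {
  print(pds_datasets)
}
```

After library(PDS), functions such as dim() and names() give a first look at whichever dataset you choose.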

Weekly structure (also see Assessment below)

  1. Pre-class (Tuesday): Reading, Video, Quiz (due before class — two attempts, higher score used, and solutions become available Tue 3:30pm after the quiz is due)
  2. In-class: Activities in class Tuesday and Thursday.  Tuesday’s assignment is due by Thursday 3:30pm of the same week, submitted to UNM Learn (evaluated by TA within 1 week).  Often finished in class.
  3. Post-class (Thursday): Thursday’s assignment will be left to complete as Homework (due following Thursday by 3:30pm).  Occasionally, finished in class, usually not.
  • UNM Learn for quizzes and submitting in-class and homework assignments.

Office hours

  • Erik: M/T 1-2, and by appointment in SMLC 312
  • Kelli: M 11-12, W 10-12 in SMLC 306
  • Leah: M/W 3-4 in SMLC 319

Timetable

Wk-Date Cl Topic Reading, Video, Quiz Week’s Preparation In-class Worksheet, Homework, Data
00-01/15 00 Install software, survey Step 0 – software install Complete the Learning Studio Opinio pre-semester survey required for classroom assessment. Dierker Pre-survey (sent by email end of first week)
01-01/15 01 Intro, RStudio and RMarkdown, poster
  • Introduction
  • RStudio + RMarkdown
    • RStudio config:
    • Menu: Tools / Global options /
      • General / Save workspace: Never
      • Sweave / Weave Rnw: knitr
    • Disable notebooks
  • Datasets + Codebooks
  • Learn: 01 Intro to using RMarkdown: Rmd html
01-01/17 02 Rmd, codebook  
02-01/22 03 Datasets, Codebooks, Personal Codebook  
02-01/24 04 Citations
  • (CAPS visit scheduled)
I recommend starting next week’s assignments over the weekend since the Literature Review can take a long time.
(UNM Google Scholar)
03-01/29 05 Research Questions
03-01/31 06 Literature Review    
04-02/05 07 Working With Data, Data Management
  • Read: PDS Ch 6 Working with Data PDS Ch 7 Data Management
  • Video: PDS Video: 04. Working with Data PDS Video: 05. Data Management ADA1 Subset variables
  • Quiz: PDS Quiz 06 Working With Data PDS Quiz 07 Data Management
04-02/07 08 Coding missing and factor labels
05-02/12 09 continued continued
05-02/14 10 continued continued
06-02/19 11 Graphing Univariate
  • Read: PDS Ch 8 Graphing: One Variable at a Time
  • Video: PDS Video: 06. Graphing: One Variable at a Time
  • Quiz: PDS Quiz 08a Frequency Tables PDS Quiz 08b Graphing Variables
06-02/21 12 continued     continued  
07-02/26 13 Graphing Bivariate
  • Read: PDS Ch 9 Graphing Relationships
  • Video: PDS Video: 07. Graphing Relationships
  • Quiz: PDS Quiz 09 Graphing Relationships
07-02/28 14 continued continued
08-03/05 15 Sampling and Designing Studies
  • Read: PDS Ch 16 Sampling and Designing Studies
08-03/07 16
09-03/12 Spring Break
09-03/14 Spring Break
10-03/19 17 Hypothesis Testing
  • Read: PDS Ch 10 Hypothesis Testing
  • Video: PDS Video: 08. Hypothesis Testing
  • Quiz: PDS Quiz 10 Hypothesis Testing
10-03/21 18
11-03/26 19 ANOVA
  • Read: PDS Ch 11 Analysis of Variance
  • Video: PDS Video: 09. Analysis of Variance
  • Quiz: PDS Quiz 11 ANOVA
11-03/28 20
12-04/02 21 Contingency tables
  • Read: PDS Ch 12 Chi-Square Test of Independence
  • Video: PDS Video: 10. Chi-Square Test of Independence
  • Quiz: PDS Quiz 12 Chi Square
12-04/04 22
13-04/09 23 Correlation and Interactions
  • Read: PDS Ch 13 Correlation Coefficient PDS Ch 14 Moderation
  • Video: PDS Video: 11. Correlation PDS Video: 12. Moderation
  • Quiz: PDS Quiz 13 Correlation PDS Quiz 14 Exploring Moderation
13-04/11 24
14-04/16 25 Linear Regression
  • Read: PDS Ch 15 Linear Regression: Summarizing the Pattern of the Data with a Line PDS Ch 17 Confounding and Multivariate Models
  • Video: PDS Video: 13. The Question of Causation PDS Video: 14. Multivariate Models and Confounding
  • Quiz: PDS Quiz 15 Regression PDS Quiz 17 Confounding
14-04/18 26
15-04/23 27 Poster Presentation
  • Read: PDS Ch 18 Poster Presentation
  • Video: PDS Video: 15. Writing Your Poster Presentation
  • Quiz: none
Work on designing poster content at the bottom of your project document. Completed poster content in project document due before Class 29. Poster template pptx pdf Poster content examples
15-04/25 28 Have instructor or TA sign off on poster content in your project document.  Finish details and put into poster template.
16-04/30 29 Required surveys (4 final grade points), all due 5/3/19.
  1. Survey Dierker Post-course survey, qualtrics by email for course pilot evaluation.
  2. Complete the Learning Studio Opinio post-semester survey required for classroom assessment.
  3. UNM EvalKit survey, required for all students of all courses.
Finish work on poster and submit pdf to be printed.  Poster pdf due end of day today.  Send to the printer. Have a peer mentor approve your poster for printing and presentation. Congratulations!
  • Printer: ARI Graphix, 4716 McLeod NE, Albuquerque, NM 87109, 505-884-0862. Open Mon-Fri 7:30-5:00.
  • Cost: $10.80+tax poster printing (price less than $1/sq ft for Spring 2019).
  • Do not use their website! Email plotting@abqrepro.com:
    • Subject: UNM S4R class poster
    • Text: indicate to print “in color on bond paper”.
    • Attach: Poster pdf with your name in the filename, such as “FirstLast_S4R_poster.pdf”.
  • Try to send by Tuesday end of class for the poster to be ready to print on Wednesday.
  • Arrange to pick up the poster; you’ll need to pay there before they print. They can usually print right away, but it is possible you would need to give them a couple of hours.
16-05/02 30 POSTERS Poster sessions in SMLC Atrium
  • Poster presentation
  • Poster Schedule (be on time):
    • 3:30-3:40 Organization
    • 3:40-4:20 Group 1
    • 4:30-5:10 Group 2
  • Poster rubric
Congratulations on a great semester!
17-05/06 Finals week No final!
 
 
  (I reserve the right to continue to improve the materials throughout the semester.)

Syllabus

  • Description: Techniques for the visual presentation of numerical data, descriptive statistics, introduction to probability and basic probability models used in statistics, introduction to sampling and statistical inference, illustrated by examples from a variety of fields.  In this special Statistics for Research (S4R) version, we will emphasize the skills of data analysis, visualization, and communication for undergraduate research.
  • Prerequisite: See UNM catalog
  • Semesters offered: Spring 2019
  • Lecture: Spring 2019 schedule, TR 1530-1645, CTLB 330, Stat 145.014, CRN 30479
  • Location: CTLB 330 (building 55, northeast of Zimmerman) Video
  • Office hours: Mon/Tue 13:00-14:00, and by appointment in SMLC 312
  • email: “Erik B. Erhardt” <erike@stat.unm.edu>, please include “S4R” with a descriptive subject line, such as “S4R Homework 02 plot”
  • Textbook: Required custom book is available for free on this webpage: Passion Driven Statistics.
  • Laptops running R: I encourage you to bring a laptop to class each day so you can try the R programming exercises in class. If you don’t have one, no problem, there are some laptops in class and teamwork is encouraged — sit next to someone friendly who likes to share.
  • Saving data: If you’re using classroom computers, use flash drives or UNM’s OneDrive (available in LoboMail) for saving files.  The CTLB computers do not connect to your standard UNM drive space.

Teaching Assistants and Peer Mentors

Stat grad student TAs

  • Kelli Kasper <kkasper@unm.edu>, office hours M 11-12, W 10-12 in SMLC 306.

Peer Mentors, SEP

  • Leah Puglisi <lhpuglisi@unm.edu>, former student, office hours M W 3-4 in SMLC 319.

Student learning outcomes

  1. Students will learn to use a reproducible research workflow.
  2. Students will improve their technology expertise.
  3. Students will learn to work with large data sets.
  4. Students will learn to create and present graphs for both univariate and multivariate data.
  5. Students will learn how to construct and test hypotheses.

Meeting the learning outcomes

You will acquire new information in this class, but the emphasis is on comprehending, integrating, and applying information. Rote factual memorization is the lowest form of learning. Effective learning takes place by explaining, integrating, applying, and analyzing facts, hypotheses, and theories. Learning in this class occurs by:
  1. Doing – completion of exercises that require analysis of data to answer questions and test hypotheses, or researching answers to reading assignments.
  2. Discussion – interaction with classmates to assemble and synthesize information, utilizing the collective skills and knowledge base of the group.
  3. Listening, acting, and reflecting – activities during class time provide insights into information not available in readings and include review of difficult material to aid comprehension. Note taking permits later reflection on lecture content. However, listening to the professor lecture is the least effective learning tool for students, so you should plan on coming to every class prepared to participate in active and reflective learning opportunities.

Assessment

  • Quizzes are due each Tuesday before class.  Purpose: to assess reading and video comprehension and ensure you’re prepared to actively participate in class activities with minimal lecture. (There are 17, 20% of final grade, the lowest 2 are dropped.)  Most weeks plan for 1 hour of reading and video with a 20 minute quiz. Quizzes are not timed and can be taken twice; the higher of the two scores is used for grade calculation.
    • Viewing quiz solutions after the due date in UNM Learn is not intuitive.  Click on the “Begin” button (this is the non-intuitive part, since you are not actually beginning the quiz), then click “View All Attempts” to see the scores.  Finally, click “Calculated Grade” to see the feedback for each question of the quiz.
  • In-class assignments are assigned each Tuesday and due before the Thursday class at 3:30pm, submitted to UNM Learn.  Purpose: to struggle and find success in class with the concepts and skills. (About 12, includes class participation, 20% of final grade, the lowest 2 are dropped.) Most weeks plan to finish in class.
  • Homework (HW) assignments are assigned each Thursday and due the following Thursday, submitted to UNM Learn. Purpose: to apply concepts and skills to your class poster project. (About 12, 40% of final grade, the lowest few are dropped.) Most weeks plan on 1-4 hours per assignment with a substantial start in class.
  • Poster will be developed through the semester (most HW assignments contribute to the poster); in the last couple of weeks we’ll complete them, and in the last week we’ll have poster presentations. Purpose: to have an overarching set of questions to answer using methods learned in the course, with a deliverable you can be proud of! (1 poster and presentation: 12% poster, 2% presentation, and 2% evaluations of others, for 16% of final grade.)  In the last couple of weeks, assembling this poster may take 5-10 hours using a template provided to you.
  • Course surveys are due at the beginning and end of the course. Purpose: to participate in national project-based learning projects and improve the course.  (About 4, 4% of final grade.)
All assignments in this class are electronic and submitted to UNM Learn for grading. Late assignments will not be accepted.
  • All R code for the assignment should be included with the part of the problem it addresses (for code and output use a fixed-width font, such as Courier); this will happen automatically by using RMarkdown.
  • Do NOT use your R code and output as your answer to the problem, but include them to show me how you arrived at your answer. Your prose solution should be provided to interpret the output.  Output without explanation will not be given credit.
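The grading rules above (two quiz attempts with the higher score kept, lowest scores dropped, then a weighted total) can be sketched in R. This is an illustration only, not the official gradebook calculation: the function names and the example scores are made up, and the number of dropped homework scores is left as a parameter because the syllabus says only that "the lowest few" are dropped.

```r
# Weights stated in the Assessment section; they sum to 100%.
weights <- c(quiz = 0.20, inclass = 0.20, homework = 0.40,
             poster = 0.12, presentation = 0.02, peer_eval = 0.02,
             surveys = 0.04)

# Each quiz allows two attempts; the higher score counts.
best_attempt <- function(attempt1, attempt2) pmax(attempt1, attempt2)

# Mean of the scores after dropping the n_drop lowest (always keeps >= 1).
mean_drop_lowest <- function(scores, n_drop) {
  n_keep <- max(length(scores) - n_drop, 1)
  mean(sort(scores, decreasing = TRUE)[seq_len(n_keep)])
}

# Weighted total from one mean score (0-100) per graded component.
final_grade <- function(component_means, w = weights) {
  stopifnot(setequal(names(component_means), names(w)))
  sum(component_means[names(w)] * w)
}

# Made-up scores, for illustration only.
quiz_scores <- best_attempt(c(60, 80, 100, 0), c(70, 75, 90, 50))
example <- c(quiz = mean_drop_lowest(quiz_scores, n_drop = 2),
             inclass = 90, homework = 85, poster = 95,
             presentation = 100, peer_eval = 100, surveys = 100)
print(final_grade(example))
```

Keeping the weights in one named vector makes it easy to confirm that they add up to 1 before trusting any total.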

Collaboration and citation

For homeworks I encourage you to work together. Please discuss the data, code, and problems with one another, but do your own exploration and write up. We expect everyone to hand in substantially different homeworks, and we will enforce this under the honor code. The small benefit you might get from plagiarism is not worth the severe penalty (of lost trust, being reported to the dean, no points for the assignment, etc.). As in life, please use any resources available to you. Projects and some homeworks will explicitly encourage you to use resources on the internet, but showing extra initiative will always be appreciated. You may find R programming tough at first, so feel free to discuss your problems with other classmates or meet with or email questions to the TAs or me.  Meeting in person is often much more productive than questions by email.  If emailing, include your Rmd file and any required files (such as your .bib file) and a description of what you’re trying to do and where your error or trouble is. I encourage you to use the ideas of others, but make them your own, giving credit. For projects have a formal bibliography, for homework cite casually, and for code simply copy the URL in as a comment (which is doubly helpful to you for finding the resource again).

Absences policy

I will follow the UNM absences policy, allowing two unexcused absences.  This means I can drop you from the class if you have a third absence; this paragraph serves as your warning.  No one wants that, but I have found that I need to take attendance in freshman courses; otherwise this policy is abused. If we all respect ourselves and each other then we won’t need attendance sheets and you’ll all achieve more.

Statements

Disability statement

If you have a documented disability that will impact your work in this class, please contact me to discuss your needs. You’ll also need to register with the Accessibility Resource Center in 2021 Mesa Vista Hall (building 56) across the courtyard east from the SUB.

Title IX statement

In an effort to meet obligations under Title IX, UNM faculty, Teaching Assistants, and Graduate Assistants are considered “responsible employees” by the Department of Education (see pg 15). This designation requires that any report of gender discrimination, which includes sexual harassment, sexual misconduct, and sexual violence, made to a faculty member, TA, or GA must be reported to the Title IX Coordinator at the Office of Equal Opportunity. For more information, see the campus policy regarding sexual misconduct.

Citizenship and/or Immigration Status

All students are welcome in this class regardless of citizenship, residency, or immigration status. Your professor will respect your privacy if you choose to disclose your status. As for all students in the class, family emergency-related absences are normally excused with reasonable notice to the professor, as noted in the attendance guidelines above. UNM as an institution has made a core commitment to the success of all our students, including members of our undocumented community. The Administration’s welcome is found on the UNM website: http://undocumented.unm.edu/.

Our Classroom

We’re doing this because:
  • We want you to be empowered with statistics.
  • We believe everyone should get out of this course with awesome skills.
  • Real-time feedback promotes efficient learning.
“It encourages me to engage actively with the course material and take responsibility for my learning.”

GAISE Connections

Our six recommendations include the following:
  1. Teach statistical thinking.
    • Teach statistics as an investigative process of problem-solving and decision-making.
    • Give students experience with multivariable thinking.
  2. Focus on conceptual understanding.
  3. Integrate real data with a context and purpose.
  4. Foster active learning.
  5. Use technology to explore concepts and analyze data.
  6. Use assessments to improve and evaluate student learning.

Learning without thought is labor lost. What I hear, I forget. What I see, I remember. What I do, I understand. – Confucius
 

Archive

Pre-course to-dos

Did you receive a registration error for Spring 2019? Send me an email with the following answers:
  1. What registration error did you get (copy/paste is best)?
  2. What is your UNM ID?
  3. What is your Math/Stat background (that is, do you have the pre-reqs)?
If you are waitlisted, as long as there are seats available I will override you into the course. Don’t worry.

Step 0: Before our first class (Tue 1/15) please read through the following actions and install the required software on your computer and complete the brief surveys. If you don’t have a computer, there are classroom computers which will be available only when the classroom is open.
  1. Install R and RStudio:
    1. R for programming
      1. Windows (Download R 3.5.x for Windows link) or
      2. Mac (R-3.5.x.pkg link); or
      3. upgrade if you already have R
    2. RStudio Desktop for a better R experience
      1. Installers at bottom, choose Windows or Mac OSX.
    3. Videos that may be helpful for installation:
      1. Install R on Mac (2 min).
      2. Install R for Windows (3 min).
      3. Install R and RStudio on Windows (5 min).
  2. Install R packages (copy/paste CODE into console and press [Enter]; this may take 20-30 minutes), also update all packages within RStudio.
  3. Install Zotero or Mendeley (recommended for your own laptop) for bibliography management.
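After the install step, it can help to confirm which packages actually made it onto your machine. The sketch below is not the course's install CODE (that is the linked script); it checks only the packages named in the course goal, and the helper name missing_packages is hypothetical.

```r
# Packages named in the course goal; the authoritative list is the linked CODE.
course_pkgs <- c("rmarkdown", "knitr", "ggplot2", "tidyverse")

# Return the packages in pkgs that are not yet installed.
missing_packages <- function(pkgs) {
  pkgs[!(pkgs %in% rownames(installed.packages()))]
}

to_install <- missing_packages(course_pkgs)
if (length(to_install) > 0) {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
  # Uncomment to install (needs an internet connection):
  # install.packages(to_install)
} else {
  message("All listed packages are installed.")
}

# Uncomment to update every installed package without prompting:
# update.packages(ask = FALSE)
```

The install and update calls are left commented out so that pasting the block only reports what is missing.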
Saving data: If you’re using classroom computers, use flash drives or UNM’s OneDrive (available in LoboMail) for saving files.  The CTLB computers do not connect to your standard UNM drive space. I recommend using a very systematic folder structure, such as S4R/HW, S4R/Class, S4R/Reading, S4R/Poster, etc.  Do not just work on files in your downloads folder or your desktop; respect your data and code!
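The folder structure suggested above can be created from R in one step. This is a minimal sketch: the helper name make_course_folders is hypothetical, and the demonstration writes into a temporary directory so that running it does not touch your real files.

```r
# Hypothetical helper: create the course folders under a root directory.
make_course_folders <- function(root = "S4R",
                                subfolders = c("HW", "Class", "Reading", "Poster")) {
  paths <- file.path(root, subfolders)
  for (p in paths) {
    # recursive = TRUE also creates the root folder if it does not exist yet.
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Demonstration in a temporary directory.
demo_root <- file.path(tempdir(), "S4R")
make_course_folders(demo_root)
print(list.files(demo_root))
```

For real use, pass a root on your flash drive or OneDrive folder instead of the temporary directory.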
Unicode compile problems:  If you knit to pdf you may get this error: “! Package inputenc Error: Unicode char”.  ASCII is the small character set that we use to program in; Unicode is an extended character set that looks pretty (for example, straight quotes become “curly quotes”) but causes code to break.  You get unwanted Unicode when you copy/paste from a pdf or some other source into your code.  To fix this, you have to find the Unicode and replace it with its ASCII equivalent.  To do this: Ctrl-F to find, search for “[^\x00-\x7F]” (without quotes), select “Regex” for regular expressions, and find the “Next” one.  As it finds instances, replace the characters manually until there are no more.  These characters will typically be curly quotes or fancy dashes.
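The same search can be run from R, which is handy for long documents. This is a sketch under stated assumptions: the function names are hypothetical, the replacement list covers only curly quotes and en/em dashes, and the example works on an in-memory vector of lines rather than a real file.

```r
# Report which lines contain any character outside the ASCII range,
# using the same regular expression as the Ctrl-F search.
find_non_ascii <- function(lines) {
  hits <- grep("[^\\x00-\\x7F]", lines, perl = TRUE)
  data.frame(line = hits, text = lines[hits], stringsAsFactors = FALSE)
}

# Replace the most common offenders with ASCII equivalents.
fix_common_unicode <- function(lines) {
  from <- c("\u201C", "\u201D", "\u2018", "\u2019", "\u2013", "\u2014")
  to   <- c("\"",     "\"",     "'",      "'",      "-",      "-")
  for (i in seq_along(from)) {
    lines <- gsub(from[i], to[i], lines, fixed = TRUE)
  }
  lines
}

# Example lines; with a real file you might use
# readLines("myfile.Rmd", encoding = "UTF-8", warn = FALSE).
rmd_lines <- c("plot(x)", "title <- \u201Chello\u201D", "a \u2013 b")
print(find_non_ascii(rmd_lines)$line)
cleaned <- fix_common_unicode(rmd_lines)
print(nrow(find_non_ascii(cleaned)))
```

Any character not in the replacement list still shows up in find_non_ascii() afterwards, so you can fix those by hand. Base R also provides tools::showNonASCIIfile() for a quick check of a file.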

Acumen in Statistics