ADA2

May 15th, 2013

UNM Stat 428/528: Advanced Data Analysis II (ADA2)

The Spring 2014 syllabus is below the timetable.

Spring 2014 schedule:
Time: TR ????-????
Location: ????
Stat 428, CRN ????
Stat 528, CRN ????

Did you receive a registration error for Spring 2014? Send me an email with the following answers:
1. What registration error did you get (copy/paste is best)?
2. What is your UNM ID?
3. What is your Math/Stat background (that is, do you have the pre-reqs)?

News:
5/16 The notes have been updated for Spring 2014.


Tentative Timetable

Wk-Date  Ch  Topic  Slides  Code  Data pts  HW sol Data  Read  HW Due  Plot
01-01/21  Two-hour delay, cancellation
01-01/23  01  R, Review  Ch 01  R  d1 d2  HW01 sol dat  01/29
02-01/28
02-01/30  02  Introduction to Multiple Linear Regression  Ch 02  R  d1 d2  HW02 sol dat  02/05
03-02/04
03-02/06  03  A Taste of Model Selection for Multiple Linear Regression  Ch 03  R  d1  HW03 sol dat  02/12
04-02/11  04  Experimental Design: One and Two Factor Designs  Ch 04
04-02/13  05  Paired Experiments and Randomized Block Designs  Ch 05  Coef R  d1 d2 d3 d4  HW05 sol dat  02/21
05-02/18
05-02/20
06-02/25  06  Discussion of Observational Studies  Ch 06  R  d1
06-02/27  07  Analysis of Covariance: Comparing Regression Lines  Ch 07  R  d1 d2 d3  HW07 sol dat  03/07
07-03/04
07-03/06
08-03/11  08  Polynomial Regression  Ch 08  R  d1 d2
08-03/13
09-03/18  Spring Break
09-03/20  Spring Break
10-03/25  09  Response Models with Factors and Predictors  Ch 09  R  d1  HW09 sol (dat = HW05)  04/02
10-03/27
11-04/01  10  Model Selection for Multiple Regression  Ch 10  R  d1
11-04/03  11  Logistic Regression  Ch 11  R  d1 d2 d3 d4  HW11 sol dat  04/16
12-04/08
12-04/10
13-04/15  12  An Introduction to Multivariate Methods  Ch 12  R
13-04/17  13  Principal Components Analysis (PCA)  Ch 13  R  d1 d2 d3 d4  HW13 sol dat  04/25
14-04/22
14-04/24  14  Cluster Analysis  Ch 14  R  d1 d2
15-04/29  15  Multivariate Analysis of Variance (MANOVA)  Ch 15  R  d1
15-05/01  16  Discriminant Analysis  Ch 16  R  d1  wcloud images
16-05/06  17  Classification  Ch 17  R  HW17 sol R dat  Tue 05/07 by 3pm slid under my office door Math&Stat (SMLC 312)
16-05/08  Discussion of HW17 solutions
17-05/13  Finals Week

I recommend printing (two to a page) only the upcoming chapter the day before class because future chapters are subject to edits.

Notes from Spring 2014 using R: ADA2_notes.pdf includes all chapters in one document.
Lecture notes for Advanced Data Analysis 2 (ADA2) Stat 428/528 University of New Mexico is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Based on a work at http://statacumen.com/teach/ADA2/ADA2_notes.pdf.

Citing lecture notes, example: Bedrick EJ, Schrader RM, and Erhardt EB. (2013) Lecture notes for Advanced Data Analysis 2. Retrieved Mar 1, 2013, from statacumen.com/teach/ADA2/ADA2_notes.pdf, 136–144.

Notes from Spring 2013 using R: ADA2_notes_S13.pdf includes all chapters in one document.
Lecture notes for Advanced Data Analysis 2 (ADA2) Stat 428/528 University of New Mexico is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Based on a work at http://statacumen.com/teach/ADA2/ADA2_notes_S13.pdf.

Notes from Spring 2012 using SAS: ADA2_notes_S12.pdf includes all chapters in one document.
Lecture notes for Advanced Data Analysis 2 (ADA2) Stat 428/528 University of New Mexico is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Based on a work at http://statacumen.com/teach/ADA2/ADA2_notes_S12.pdf.


Syllabus

Description: A continuation of 427/527 that focuses on methods for analyzing multivariate data and categorical data. Topics include MANOVA, principal components, discriminant analysis, classification, factor analysis, analysis of contingency tables including log-linear models for multidimensional tables and logistic regression.
Prerequisite: Stat 427 (ADA1)
Semesters offered: Spring
Lecture: Stat 428/528.001 (CRN 25445 or 25449), TR 9:30–10:45, Hibben 105 (LoboWeb says we’re in GSM 230, but we’re not)
Spring 2013 office hours: SMLC 312, Tue 11:00-12:00 and Thu 14:00-15:00
email: “Erik B. Erhardt” <erike@stat.unm.edu>, please include “ADA2” in the subject line
Textbook: Peter Dalgaard, “Introductory Statistics with R”, Second Edition, 2008, ISBN: 978-0-387-79053-4. The book is not required, but it will provide a backup for what you learn in class.

Get started before class:
Step 0: Set up R with RStudio
(1) Download R for Windows or Mac, (2) install RStudio, and (3) install a package we’ll use with the following R command: install.packages("ggplot2").
Style matters. There is a lot of online help on R, such as at UCLA. Try searching for “R [mytopic]” and you’ll usually get lots of results.
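
To confirm the setup worked, something like the following can be run in the RStudio console. It is only a sketch: it assumes the built-in mtcars data set as a stand-in for course data, and it reports a message rather than installing anything if ggplot2 is missing.

```r
# Check whether ggplot2 is available without attaching it yet.
installed <- requireNamespace("ggplot2", quietly = TRUE)

if (!installed) {
  # Nothing is installed automatically; run the command from Step 0 first.
  message('ggplot2 not found: run install.packages("ggplot2")')
} else {
  library(ggplot2)
  # mtcars ships with R; it is used here only as example data (an assumption).
  p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
  print(p)  # a scatterplot appears in the Plots pane if everything works
}
```

If a scatterplot of weight against miles per gallon shows up, R, RStudio, and ggplot2 are all working.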

R tutorials: TryR (gentle), Kelly Black
Cookbook for R for helpful examples, visualization tutorials, diagrams.

Teaching Assistants

Not sure who this is, yet.  Might only be a grader.

Assessment

Rubrics guide self-assessment of homework and code.

Homework is designed to encourage you to review the material we’ve learned, synthesize new information from the R help pages or the web, and apply (and learn!) your new skills. Expect to spend 4-5 hours a week (outside of class!) to do well, and maybe double that to do outstandingly well. Start working on the homework when it is assigned, not the weekend before it’s due.

Homework is due 1 week (or 2 classes, whichever is shorter) after we complete each chapter. The homework grade is based on the points earned, with a 10% bonus for excellent code (three 5s on the rubric).

Header for homework assignments:

First Last
ADA2 Stat 428 (or 528)
HW ##
MM/DD/YYYY

All R code for the assignment should be included in an appendix at the end of the document.

Please hand in a physical version of your homework – a grader will write comments on it and give it back to you. An electronic version will be accepted only under exceptional circumstances (almost never).

Late assignments will be penalized 20% if handed in (or slid under my office door) by 5pm the following day, and will not be accepted after that.

The final grade is the proportion of HW points earned, possibly with a safety cushion built in (such as by reducing the denominator).
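
The grading arithmetic above can be sketched in R. This is one hedged reading of the rules, not the official calculation: the function names hw_score and final_grade, the cushion argument, and the choice to apply both the 10% bonus and the 20% late penalty to the points earned are all assumptions made for illustration.

```r
# Score for a single assignment (hypothetical helper).
# points:      points earned on the assignment
# code_rubric: the three rubric scores for code; three 5s earn the bonus
# late:        TRUE if handed in by 5pm the following day
hw_score <- function(points, code_rubric = c(0, 0, 0), late = FALSE) {
  bonus   <- if (all(code_rubric == 5)) 0.10 * points else 0  # assumed: 10% of earned points
  penalty <- if (late) 0.20 * points else 0                   # assumed: 20% of earned points
  points + bonus - penalty
}

# Final grade as a proportion of HW points (hypothetical helper).
# cushion: points removed from the denominator as a safety cushion (assumed form)
final_grade <- function(earned, possible, cushion = 0) {
  sum(earned) / (sum(possible) - cushion)
}

# Three made-up assignments worth 20 points each.
earned <- c(
  hw_score(18, code_rubric = c(5, 5, 5)),  # excellent code earns the bonus
  hw_score(20, late = TRUE),               # one day late
  hw_score(15, code_rubric = c(5, 4, 5))   # no bonus
)
final_grade(earned, possible = c(20, 20, 20), cushion = 5)
```

Reducing the denominator can only raise the proportion, which is why it acts as a cushion.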

Disability statement

If you have a documented disability that will impact your work in this class, please contact me to discuss your needs. You’ll also need to register with the Accessibility Resource Center in 2021 Mesa Vista Hall (building 56) across the courtyard east from the SUB.

Random stuff:
UNM R programming group, organized and taught by Christian Gunning, meeting at 12:00pm on Friday in the PIBBS space in Castetter Hall.

UNM has a license for free online access to the definitive books for the Lattice and ggplot2 graphing platforms. Note you must be on campus or logged in through the UNM proxy to access these.

ggplot2 plotting cookbook.

R reference card by Jonathan Baron.

Translate between MATLAB and R.

Figure checklist.  Choosing the right chart.

Raster vs vector graphics.

Statistics pre-req refresher from Khan Academy.




Table of selected statistical methods

The data and design determine which method you use: original or UCLA.

Here’s a table of methods with the applicable semester of ADA and Chapter.

Number of Dependent Variables  Number of Independent Variables  Type of Dependent Variable(s)  Type of Independent Variable(s)  Measure  Test(s)  ADA-Ch
1  0 (1 population)  continuous normal  not applicable (none)  mean  one-sample t-test  1-02
continuous non-normal  median  one-sample median  1-06
categorical  proportions  Chi Square goodness-of-fit, binomial test  1-07
1 (2 independent populations)  normal  2 categories  mean  2 independent sample t-test  1-03
non-normal  medians  Mann Whitney, Wilcoxon rank sum test  1-06
categorical  proportions  Chi square test, Fisher’s Exact test  1-07
0 (1 population measured twice) or 1 (2 matched populations)  normal  not applicable/categorical  means  paired t-test  1-02
non-normal  medians  Wilcoxon signed ranks test  1-06
categorical  proportions  McNemar, Chi-square test  1-07
1 (3 or more populations)  normal  categorical  means  one-way ANOVA  1-05
non-normal  medians  Kruskal Wallis  1-06
categorical  proportions  Chi square test  1-07
2 or more (e.g., 2-way ANOVA)  normal  categorical  means  Factorial ANOVA  2-05
non-normal  medians  Friedman test  not
categorical  proportions  log-linear, logistic regression  2-11
0 (1 population measured 3 or more times)  normal  not applicable  means  Repeated measures ANOVA  not
1  normal  continuous  correlation, simple linear regression  1-08
non-normal  non-parametric correlation  1-08
categorical  categorical or continuous  logistic regression  2-11
continuous  discriminant analysis  2-16
2 or more  normal  continuous  multiple linear regression  2-02
non-normal
categorical  logistic regression  2-11
normal  mixed categorical and continuous  Analysis of Covariance, General Linear Models (regression)  2-09
non-normal
categorical  logistic regression  2-11
2  2 or more  normal  categorical  MANOVA  2-15
2 or more  2 or more  normal  continuous  multivariate multiple linear regression  not
2 sets of 2 or more  0  normal  not applicable  canonical correlation  not
2 or more  0  normal  not applicable  factor analysis  not
0 or more  mixed categorical and continuous  principal component analysis (w/multiple regression)  2-13
categorical  cluster analysis  2-13
discriminant analysis  2-16
classification  2-17
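
As a rough companion to the table, several of the listed tests map onto functions in base R's stats package. The sketch below uses simulated data and made-up counts purely for illustration; the variable names and the particular pairing of rows to functions are assumptions, not taken from the course notes.

```r
set.seed(428)  # arbitrary seed so the simulated data are reproducible

# Simulated data (illustrative only).
y  <- rnorm(30, mean = 5)                           # continuous response
x  <- rnorm(30)                                     # continuous predictor
g2 <- factor(rep(c("a", "b"), each = 15))           # 2 categories
g3 <- factor(rep(c("a", "b", "c"), each = 10))      # 3 categories
counts <- matrix(c(12, 8, 5, 15), nrow = 2)         # made-up 2x2 table of counts

tests <- list(
  one_sample_t      = t.test(y, mu = 5),                           # 1 population, normal
  two_sample_t      = t.test(y ~ g2),                              # 2 independent populations, normal
  wilcoxon_rank_sum = wilcox.test(y ~ g2),                         # 2 independent populations, non-normal
  paired_t          = t.test(y[1:15], y[16:30], paired = TRUE),    # matched pairs, normal
  kruskal_wallis    = kruskal.test(y ~ g3),                        # 3 or more populations, non-normal
  chi_square        = chisq.test(counts),                          # categorical, proportions
  fisher_exact      = fisher.test(counts)                          # categorical, small counts
)

# Collect the p-values in one named vector.
pvals <- sapply(tests, function(t) t$p.value)
round(pvals, 3)

# Model-based rows of the table.
fit_anova    <- aov(y ~ g3)                       # one-way ANOVA
fit_lm       <- lm(y ~ x)                         # simple linear regression
fit_logistic <- glm(g2 ~ x, family = binomial)    # logistic regression
summary(fit_lm)$coefficients
```

Because the data are simulated noise, the p-values themselves mean nothing here; the point is only which function goes with which row.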