ADA2: Class 08, Ch 05a Paired Experiments and Randomized Block Experiments

Advanced Data Analysis 2, Stat 428/528, Spring 2023, Prof. Erik Erhardt, UNM

Author

Your Name

Published

December 17, 2022

Randomized Complete Block Design (RCBD)

Following the in-class assignment this week, perform a complete RCBD analysis.

  1. (2 p) Reshape and plot the data; describe how Sales relate to Items and Restaurants
  2. (0 p) Fit model
  3. (3 p) Assess model assumptions
  4. (2 p) State and interpret the hypothesis test for difference in Item mean sales
  5. (2 p) If appropriate, perform pairwise comparisons with Tukey HSD correction
  6. (1 p) What is your recommendation to the Franchise?

Data

library(erikmisc)
library(tidyverse)
# read the data
dat_food <- read.table(text="
Restaurant Item1 Item2 Item3
A          31    27    24
B          31    28    31
C          45    29    46
D          21    18    48
E          42    36    46
F          32    17    40
", header = TRUE) %>%
  as_tibble()

(2 p) Reshape and plot the data; describe how Sales relate to Items and Restaurants

The code below will get you started with reshaping the data. The rest is up to you!

dat_food_long <-
  dat_food %>%
  pivot_longer(
    cols      = starts_with("Item")
  , names_to  = "Item"
  , values_to = "Sales"
  ) %>%
  mutate(
    Item       = factor(Item)
  , Restaurant = factor(Restaurant)
  )

str(dat_food_long)
tibble [18 × 3] (S3: tbl_df/tbl/data.frame)
 $ Restaurant: Factor w/ 6 levels "A","B","C","D",..: 1 1 1 2 2 2 3 3 3 4 ...
 $ Item      : Factor w/ 3 levels "Item1","Item2",..: 1 2 3 1 2 3 1 2 3 1 ...
 $ Sales     : int [1:18] 31 27 24 31 28 31 45 29 46 21 ...
# Item (treatment) means
m_dat_b <-
  dat_food_long %>%
  group_by(Item) %>%
  summarize(
    m = mean(Sales)
  )
m_dat_b
# A tibble: 3 × 2
  Item      m
  <fct> <dbl>
1 Item1  33.7
2 Item2  25.8
3 Item3  39.2
# Restaurant (block) means
m_dat_c <-
  dat_food_long %>%
  group_by(Restaurant) %>%
  summarize(
    m = mean(Sales)
  )
m_dat_c
# A tibble: 6 × 2
  Restaurant     m
  <fct>      <dbl>
1 A           27.3
2 B           30  
3 C           40  
4 D           29  
5 E           41.3
6 F           29.7
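
One possible way to plot the reshaped data is an interaction-style plot with one line per restaurant. This is only a sketch: the object name `p` and the axis labels are illustrative choices, and the block rebuilds the same 18 observations if `dat_food_long` is not already in the session. Roughly parallel lines would suggest that the additive block-plus-treatment model is reasonable.

```r
library(ggplot2)

# Assumption: reuse dat_food_long from the reshaping step if it exists;
# otherwise rebuild the same 18 observations so this block runs on its own.
if (!exists("dat_food_long")) {
  dat_food_long <- data.frame(
    Restaurant = factor(rep(LETTERS[1:6], each = 3)),
    Item       = factor(rep(paste0("Item", 1:3), times = 6)),
    Sales      = c(31, 27, 24,  31, 28, 31,  45, 29, 46,
                   21, 18, 48,  42, 36, 46,  32, 17, 40)
  )
}

# One line per restaurant (block) across the three items (treatments)
p <- ggplot(
    dat_food_long
  , aes(x = Item, y = Sales, colour = Restaurant, group = Restaurant)
  ) +
  geom_line() +
  geom_point() +
  labs(
    title = "Sales by Item, one line per Restaurant"
  , x     = "Item"
  , y     = "Sales"
  )
print(p)
```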

(0 p) Fit model
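
A minimal sketch of the additive RCBD fit, treating Restaurant as the block and Item as the treatment. The name `lm_food` is an assumption made for illustration, and base R `lm()` is used here; the in-class assignment may use a different function or workflow.

```r
# Assumption: reuse dat_food_long from the reshaping step if it exists;
# otherwise rebuild the same 18 observations so this block runs on its own.
if (!exists("dat_food_long")) {
  dat_food_long <- data.frame(
    Restaurant = factor(rep(LETTERS[1:6], each = 3)),
    Item       = factor(rep(paste0("Item", 1:3), times = 6)),
    Sales      = c(31, 27, 24,  31, 28, 31,  45, 29, 46,
                   21, 18, 48,  42, 36, 46,  32, 17, 40)
  )
}

# Additive model: block effect (Restaurant) plus treatment effect (Item),
# with no interaction term because each cell has a single observation
lm_food <- lm(Sales ~ Restaurant + Item, data = dat_food_long)

# Estimated coefficients (intercept, 5 restaurant contrasts, 2 item contrasts)
coef(lm_food)
```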

(3 p) Assess model assumptions
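
The following sketch shows one way to look at the usual residual-based assumptions (normality and roughly constant variance) for the additive model. The names `lm_food`, `res`, and `sw` are illustrative, and the choice of the Shapiro-Wilk test is an assumption; other normality checks would serve equally well. With only 18 observations, the plots deserve at least as much weight as the test.

```r
# Assumption: reuse dat_food_long from the reshaping step if it exists;
# otherwise rebuild the same 18 observations so this block runs on its own.
if (!exists("dat_food_long")) {
  dat_food_long <- data.frame(
    Restaurant = factor(rep(LETTERS[1:6], each = 3)),
    Item       = factor(rep(paste0("Item", 1:3), times = 6)),
    Sales      = c(31, 27, 24,  31, 28, 31,  45, 29, 46,
                   21, 18, 48,  42, 36, 46,  32, 17, 40)
  )
}

lm_food <- lm(Sales ~ Restaurant + Item, data = dat_food_long)
res     <- residuals(lm_food)

# Normality of residuals: QQ-plot and Shapiro-Wilk test
qqnorm(res)
qqline(res)
sw <- shapiro.test(res)
sw

# Constant variance: residuals against fitted values should show no pattern
plot(fitted(lm_food), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```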

(2 p) State and interpret the hypothesis test for difference in Item mean sales
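
The test of interest has the null hypothesis that the three Item mean sales are equal after accounting for Restaurant, against the alternative that at least one Item mean differs. With 3 items and 6 restaurants, the F-statistic for Item has 2 numerator and 10 denominator degrees of freedom. A sketch of extracting that test from the ANOVA table follows; `lm_food`, `tab`, and `p_item` are illustrative names, and the design is balanced, so the sequential sums of squares from `anova()` do not depend on term order.

```r
# Assumption: reuse dat_food_long from the reshaping step if it exists;
# otherwise rebuild the same 18 observations so this block runs on its own.
if (!exists("dat_food_long")) {
  dat_food_long <- data.frame(
    Restaurant = factor(rep(LETTERS[1:6], each = 3)),
    Item       = factor(rep(paste0("Item", 1:3), times = 6)),
    Sales      = c(31, 27, 24,  31, 28, 31,  45, 29, 46,
                   21, 18, 48,  42, 36, 46,  32, 17, 40)
  )
}

lm_food <- lm(Sales ~ Restaurant + Item, data = dat_food_long)

# ANOVA table: the Item row tests H0 that all Item means are equal
tab <- anova(lm_food)
tab

# p-value for the Item effect, to compare against the chosen significance level
p_item <- tab["Item", "Pr(>F)"]
p_item
```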

(2 p) If appropriate, perform pairwise comparisons with Tukey HSD correction
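
If the Item effect is judged significant, pairwise comparisons could be sketched with base R's `TukeyHSD()`, which requires an `aov` fit. The names `aov_food` and `tukey_item` are illustrative, and the 95% family-wise confidence level is an assumption; the in-class assignment may use a different package for the same comparisons.

```r
# Assumption: reuse dat_food_long from the reshaping step if it exists;
# otherwise rebuild the same 18 observations so this block runs on its own.
if (!exists("dat_food_long")) {
  dat_food_long <- data.frame(
    Restaurant = factor(rep(LETTERS[1:6], each = 3)),
    Item       = factor(rep(paste0("Item", 1:3), times = 6)),
    Sales      = c(31, 27, 24,  31, 28, 31,  45, 29, 46,
                   21, 18, 48,  42, 36, 46,  32, 17, 40)
  )
}

# Same additive model, fit with aov() so TukeyHSD() can be applied
aov_food <- aov(Sales ~ Restaurant + Item, data = dat_food_long)

# Tukey HSD intervals for the three pairwise Item differences only
tukey_item <- TukeyHSD(aov_food, which = "Item", conf.level = 0.95)
tukey_item
```

Intervals that exclude zero indicate pairs of items whose mean sales differ at the chosen family-wise level.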

(1 p) What is your recommendation to the Franchise?