1 Tire wear experiment

A fleet manager wishes to compare the wearability of 4 brands of tire: A, B, C, and D. Four cars are available for the experiment and 4 tires of each brand are available, for a total of 16 tires. The idea is to mount 4 tires on each of the 4 cars, ask the driver of each of the cars to drive his/her car for 20,000 miles and then to measure tread loss. We will measure tread loss in mils (0.001 inches). We will designate the 4 cars as cars c1, c2, c3, and c4.

We consider 3 experimental designs.

2 Design 1: Brands are randomized to Cars

Naive design.

       Car
       c1 c2 c3 c4
       -----------
Brand   A  B  C  D
        A  B  C  D
        A  B  C  D
        A  B  C  D
       -----------

2.1 (1 p) What is the obvious flaw in this design?

2.1.1 Solution

[answer, 1 sentence describing the problem.]

3 Design 2: Completely randomized design (CRD)

Another design that negates the confounding is to use a completely randomized design (CRD). This entails numbering the 16 tires, drawing at random the numbers and assigning tires to cars in a completely random manner. The following table illustrates a possible result of doing this.

       Car
       c1 c2 c3 c4    c1 c2 c3 c4
       -----------    -----------
Brand   C  A  C  A    12 14 10 13
        A  A  D  D    17 13 11  9
        D  B  B  B    13 14 14  8
        D  C  B  C    11 12 13  9
       -----------    -----------

Bring this data in R.

# Plan:
# Read the data values and the treatment labels,
# reshape both into long format and combine.

d2.dat <- read.table(text="
c1 c2 c3 c4
12 14 10 13
17 13 11  9
13 14 14  8
11 12 13  9
", header = TRUE)

d2.trt <- read.table(text="
c1 c2 c3 c4
C  A  C  A
A  A  D  D
D  B  B  B
D  C  B  C
", header = TRUE, as.is = TRUE)

library(reshape2)
d2.dat.long <- melt(d2.dat, variable.name = "Car", value.name = "Wear")
d2.trt.long <- melt(d2.trt, id.var = NULL, value.name = "Brand")

d2.all <- cbind(d2.dat.long, Brand = d2.trt.long$Brand)
str(d2.all)
'data.frame':   16 obs. of  3 variables:
 $ Car  : Factor w/ 4 levels "c1","c2","c3",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ Wear : int  12 17 13 11 14 13 14 12 10 11 ...
 $ Brand: Factor w/ 4 levels "A","B","C","D": 3 1 4 4 1 1 2 3 3 4 ...

The appropriate analysis for this experiment is the one-way ANOVA.

# Group means
library(plyr)
m.d2 <- ddply(d2.all, "Brand", summarize, m = mean(Wear))
m.d2
  Brand     m
1     A 14.25
2     B 12.25
3     C 10.75
4     D 11.00
# Plot the data using ggplot
library(ggplot2)
p <- ggplot(d2.all, aes(x = Brand, y = Wear))
# plot a reference line for the global mean (assuming no groups)
p <- p + geom_hline(yintercept = mean(d2.all$Wear),
                    colour = "black", linetype = "dashed", size = 0.3, alpha = 0.5)
# boxplot, size=.75 to stand out behind CI
#p <- p + geom_boxplot(size = 0.75, alpha = 0.5)
# points for observed data
p <- p + geom_point(aes(shape = Car, colour = Car), position = position_jitter(w = 0.2, h = 0), alpha = 1, size = 2)
# diamond at mean for each group
p <- p + stat_summary(fun.y = mean, geom = "point", shape = 18, size = 6,
                      colour = "red", alpha = 0.8)
# confidence limits based on normal distribution
p <- p + stat_summary(fun.data = "mean_cl_normal", geom = "errorbar",
                      width = .2, colour = "red", alpha = 0.8)
p <- p + labs(title = "Design 2: Tire Wear") + ylab("Wear (mil)")
print(p)

3.1 Fit model

fit.d2 <- lm(Wear ~ Brand, data = d2.all)

3.2 (1 p) Are the assumptions for the one-way ANOVA met?

15  2  3 
 1 16 15 

    Anderson-Darling normality test

data:  fit.d2$residuals
A = 0.25561, p-value = 0.6783

3.2.1 Solution

[answer]

3.3 (1 p) Can we infer a difference in mean wear levels between the 4 Brands?

library(car)
Anova(fit.d2, type=3)
Anova Table (Type III tests)

Response: Wear
            Sum Sq Df  F value    Pr(>F)    
(Intercept) 812.25  1 193.9701 9.051e-09 ***
Brand        30.69  3   2.4428    0.1145    
Residuals    50.25 12                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.3.1 Solution

[answer]

4 Design 3: Randomized Complete Block Design (RCBD)

In this case, each car tests all four brands. Thus one tire from each brand is selected at random and randomly allocated to the 4 wheels of car c1. Then one tire from each brand is selected and the four are randomly allocated to car c2, and so forth. Here are the results of that design.

       Car
       c1 c2 c3 c4    c1 c2 c3 c4
       -----------    -----------
Brand   B  D  A  C    14 11 13  9
        C  C  B  D    12 12 13  9
        A  B  D  B    17 14 11  8
        D  A  C  A    13 14 10 13
       -----------    -----------

Read in the data.

d3.all <- read.table(text="
Car Wear Brand
c1  14   B
c1  12   C
c1  17   A
c1  13   D
c2  11   D
c2  12   C
c2  14   B
c2  14   A
c3  13   A
c3  13   B
c3  11   D
c3  10   C
c4   9   C
c4   9   D
c4   8   B
c4  13   A
", header = TRUE)

Means and plots by Brand and by Car.

# Group means
library(plyr)
m.d3.b <- ddply(d3.all, "Brand", summarize, m = mean(Wear))
m.d3.b
  Brand     m
1     A 14.25
2     B 12.25
3     C 10.75
4     D 11.00
m.d3.c <- ddply(d3.all, "Car", summarize, m = mean(Wear))
m.d3.c
  Car     m
1  c1 14.00
2  c2 12.75
3  c3 11.75
4  c4  9.75
par(mfrow=c(1,2))
boxplot(split(d3.all$Wear,d3.all$Brand))
boxplot(split(d3.all$Wear,d3.all$Car))

par(mfrow=c(1,1))
# Plot the data using ggplot
library(ggplot2)
p <- ggplot(d3.all, aes(x = Brand, y = Wear))
# plot a reference line for the global mean (assuming no groups)
p <- p + geom_hline(yintercept = mean(d3.all$Wear),
                    colour = "black", linetype = "dashed", size = 0.3, alpha = 0.5)
# boxplot, size=.75 to stand out behind CI
#p <- p + geom_boxplot(size = 0.75, alpha = 0.5)
# points for observed data
p <- p + geom_point(aes(shape = Car, colour = Car), position = position_jitter(w = 0.2, h = 0), alpha = 1, size = 2)
# colored line for each Care
p <- p + geom_line(aes(group = Car, colour = Car), alpha = 0.5)
# diamond at mean for each group
p <- p + stat_summary(fun.y = mean, geom = "point", shape = 18, size = 6,
                      colour = "red", alpha = 0.8)
# confidence limits based on normal distribution
p <- p + stat_summary(fun.data = "mean_cl_normal", geom = "errorbar",
                      width = .2, colour = "red", alpha = 0.8)
p <- p + labs(title = "Design 3: Tire Wear") + ylab("Wear (mil)")
print(p)

4.1 (2 p) Briefly, what relationships are there between Wear and Brand or Car?

Refer to the numerical and graphical summaries above.

4.1.1 Solution

[answer]

4.2 Fit model

fit.d3 <- lm(Wear ~ Brand + Car, data = d3.all)

4.3 (1 p) Are the assumptions for the RCBD met?

15 10 16 
 1 15 16 

    Anderson-Darling normality test

data:  fit.d3$residuals
A = 0.35174, p-value = 0.4226

4.3.1 Solution

[answer]

4.4 (1 p) Can we infer a difference in mean wear levels between the 4 brands?

It is appropriate to test whether there are differences between Brands controlling for the effect of Car. This is the additive model. Note that because the Car blocks are part of the experimental design, they should remain in the model regardless of whether the block is significant or not.

library(car)
Anova(fit.d3, type=3)
Anova Table (Type III tests)

Response: Wear
            Sum Sq Df  F value    Pr(>F)    
(Intercept) 598.94  1 466.2000 4.617e-09 ***
Brand        30.69  3   7.9622  0.006685 ** 
Car          38.69  3  10.0378  0.003133 ** 
Residuals    11.56  9                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

4.4.1 Solution

[answer]

4.5 (2 p) Perform the pairwise comparisons and summarize numerically and graphically.

# multcomp has functions for multiple comparisons
library(multcomp)
# Use the ANOVA object and run a "General Linear Hypothesis Test"
# specifying a linfct (linear function) to be tested.
# The mpc (multiple comparison) specifies the factor and method.
# Here: correcting over Treatment using Tukey contrast corrections.
glht.d3.b <- glht(aov(fit.d3), linfct = mcp(Brand = "Tukey"))
summary(glht.d3.b)

     Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: aov(formula = fit.d3)

Linear Hypotheses:
           Estimate Std. Error t value Pr(>|t|)   
B - A == 0  -2.0000     0.8015  -2.495  0.12754   
C - A == 0  -3.5000     0.8015  -4.367  0.00831 **
D - A == 0  -3.2500     0.8015  -4.055  0.01264 * 
C - B == 0  -1.5000     0.8015  -1.872  0.30411   
D - B == 0  -1.2500     0.8015  -1.560  0.44503   
D - C == 0   0.2500     0.8015   0.312  0.98878   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
# plot the summary
plot(summary(glht.d3.b), sub="Tukey-adjusted Treatment contrasts")

Summarize the results in a table like this, where the effect of the Brands are sorted and the bars indicate pairs that are not statistically different. Then summarize in words.

Example:
Brand:     Z   Y   W   X
          -------
                  -------

4.5.1 Solution

[answer]

5 Design 4: Your idea!

5.1 (1 p) How can this experiment be further improved by design?

There are further factors that we haven’t yet considered inherent in this experiment.

Bonus if you can name this experimental design.

5.1.1 Solution

[answer]