Homework #6
Due: April 3, 2015
myDataFolder <-
1. One-Way ANOVA. Use the Fantasy Baseball data.
FantasyBaseball <- read.table(paste(myDataFolder, 'FantasyBaseball.txt', sep=''), header=TRUE, sep="\t")
a) Produce side-by-side boxplots to compare the selection times for each participant. Comment on these. Also, calculate the average selection time for each participant using the aggregate() function.
boxplot(Time~Person, data=FantasyBaseball)
According to the boxplot, JW has the highest selection times and TS the lowest among the participants. Note that a boxplot shows each participant's median, not the mean; the means are computed next.
To calculate the average selection time for each participant:
aggregate(Time~Person, data=FantasyBaseball, mean)
## Person Time
## 1 AR 68.29167
## 2 BK 47.95833
## 3 DJ 69.62500
## 4 DR 80.12500
## 5 JW 163.87500
## 6 MF 63.83333
## 7 RL 67.12500
## 8 TS 19.33333
b) Check the model assumptions for a one-way ANOVA.
FantasyBaseballModel <- aov(Time~Person, data=FantasyBaseball)
par(mfrow=c(2,2))
plot(FantasyBaseballModel)
In the Residuals vs. Fitted plot there are no obvious patterns, so the zero-mean and constant-variance assumptions appear to hold.
In the Normal Q-Q plot, some of the points do not follow the diagonal line, so the normality assumption appears to be violated.
We assume that independence and random sampling hold.
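The visual checks above can be supplemented with numeric ones. The following is a minimal sketch, not part of the original solution: it uses a small simulated data frame (`toy`, entirely hypothetical, with made-up participants P1 to P3) so that it runs on its own. It compares the group standard deviations and runs a Shapiro-Wilk test on the residuals. A common rule of thumb treats a largest-to-smallest SD ratio above about 2 as a warning sign for the constant-variance assumption.

```r
# HYPOTHETICAL toy data, NOT the homework data:
# three made-up participants with 24 right-skewed times each.
set.seed(1)
toy <- data.frame(
  Person = factor(rep(c("P1", "P2", "P3"), each = 24)),
  Time   = c(rexp(24, rate = 1/20), rexp(24, rate = 1/60), rexp(24, rate = 1/150))
)
fit <- aov(Time ~ Person, data = toy)

# Constant variance: compare the standard deviations across groups.
group_sd <- aggregate(Time ~ Person, data = toy, FUN = sd)
sd_ratio <- max(group_sd$Time) / min(group_sd$Time)

# Normality: Shapiro-Wilk test on the model residuals
# (a small p-value suggests the residuals are not normal).
sw <- shapiro.test(residuals(fit))

print(group_sd)
print(sd_ratio)
print(sw$p.value)
```

Replacing `toy` with `FantasyBaseball` would apply the same checks to the homework data, assuming it has been loaded as shown earlier.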
c) Conduct a one-way ANOVA analysis to assess whether the data provide evidence that averages as far apart as these would be unlikely to occur by chance alone if there really were no differences among the participants in terms of their selection times. Report the ANOVA table, test statistic, and p-value. Also, summarize your conclusion in a sentence.
summary(FantasyBaseballModel)
## Df Sum Sq Mean Sq F value Pr(>F)
## Person 7 287196 41028 10.89 1.79e-11 ***
## Residuals 184 693126 3767
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The test statistic is F = 10.89 on 7 and 184 degrees of freedom. As the p-value (1.79e-11) is very close to 0, we reject the null hypothesis and conclude that there are significant differences in the mean selection times of the participants.
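As a check on the ANOVA table, the F statistic and p-value can be reproduced from the sums of squares and degrees of freedom. This is a minimal sketch that uses only the rounded values printed in the table, so it agrees with the summary() output only up to rounding. The variable names are illustrative.

```r
# Values copied from the summary() output above (rounded as printed).
ss_person <- 287196; df_person <- 7
ss_resid  <- 693126; df_resid  <- 184

# Mean squares are sums of squares divided by their degrees of freedom.
ms_person <- ss_person / df_person
ms_resid  <- ss_resid / df_resid

# F statistic: between-group mean square over residual mean square.
f_stat <- ms_person / ms_resid

# p-value: upper-tail probability of the F(7, 184) distribution.
p_value <- pf(f_stat, df_person, df_resid, lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
```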
d) Use Tukey's HSD to assess which participants' average selection times differ significantly from which others, setting the family-wise significance level at 0.10. Use an underline diagram to show the differences.
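For a balanced design, Tukey's HSD reduces to a single cutoff: two means differ significantly when their difference exceeds q * sqrt(MSE / n), where q is the studentized range quantile. The sketch below computes that cutoff from the ANOVA table and compares it with the group means from part a). It assumes 24 observations per participant, which is inferred from the 192 total observations implied by the degrees of freedom (7 + 184 + 1) and the 8 participants, and is not stated in the source.

```r
# Values copied from the ANOVA table and the aggregate() output above.
mse      <- 693126 / 184   # residual mean square
df_resid <- 184
k        <- 8              # number of participants
n        <- 24             # ASSUMPTION: balanced design, 192 / 8 per participant

# Minimum difference between two means that is significant
# at the 0.10 family-wise level.
hsd <- qtukey(0.90, nmeans = k, df = df_resid) * sqrt(mse / n)

means <- c(AR = 68.29167, BK = 47.95833, DJ = 69.62500, DR = 80.12500,
           JW = 163.87500, MF = 63.83333, RL = 67.12500, TS = 19.33333)

# TRUE where a pair of participants differs by more than the cutoff.
differs <- abs(outer(means, means, "-")) > hsd

print(hsd)
print(differs)
```

The cutoff should match the half-width of every interval in the TukeyHSD() output, since all intervals have the same width in a balanced design.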
TukeyHSD(FantasyBaseballModel, conf.level=0.9, ordered=TRUE)
## Tukey multiple comparisons of means
## 90% family-wise confidence level
## factor levels have been ordered
##
## Fit: aov(formula = Time ~ Person, data = FantasyBaseball)
##
## $Person
## diff lwr upr p adj
## BK-TS 28.625000 -21.0778517 78.32785 0.7403766
## MF-TS 44.500000 -5.2028517 94.20285 0.1971424
## RL-TS 47.791667 -1.9111850 97.49452 0.1299887
## AR-TS 48.958333 -0.7445184 98.66119 0.1109508
## DJ-TS 50.291667 0.5888150 99.99452 0.0919671
## DR-TS 60.791667 11.0888150 110.49452 0.0166497
## JW-TS 144.541667 94.8388150 194.24452 0.0000000
## MF-BK 15.875000 -33.8278517 65.57785 0.9861251
## RL-BK 19.166667 -30.5361850 68.86952 0.9598959
## AR-BK 20.333333 -29.3695184 70.03619 0.9451692
## DJ-BK 21.666667 -28.0361850 71.36952 0.9242180
## DR-BK 32.166667 -17.5361850 81.86952 0.6103848
## JW-BK 115.916667 66.2138150 165.61952 0.0000000
## RL-MF 3.291667 -46.4111850 52.99452 0.9999996
## AR-MF 4.458333 -45.2445184 54.16119 0.9999967
## DJ-MF 5.791667 -43.9111850 55.49452 0.9999802
## DR-MF 16.291667 -33.4111850 65.99452 0.9838628
## JW-MF 100.041667 50.3388150 149.74452 0.0000017
## AR-RL 1.166667 -48.5361850 50.86952 1.0000000
## DJ-RL 2.500000 -47.2028517 52.20285 0.9999999
## DR-RL 13.000000 -36.7028517 62.70285 0.9958565
## JW-RL 96.750000 47.0471483 146.45285 0.0000042
## DJ-AR 1.333333 -48.3695184 51.03619 1.0000000
## DR-AR 11.833333 -37.8695184 61.53619 0.9977053
## JW-AR 95.583333 45.8804816 145.28619 0.0000057
## DR-DJ 10.500000