Testing Agreement for Multi-Item Scales With the Indices r_WG(J) and AD_M(J)

The most popular index of agreement has been r_WG(J); more recently, the AD_M(J) index has also been used. This study addresses two problems: first, how to test the statistical significance of r_WG(J) and AD_M(J) and, second, how to infer, from the indices computed for each group, the agreement of the ensemble of groups. The authors extend inference based on either r_WG(J) or AD_M(J) by focusing on multiple-item scales and on the whole ensemble of groups. Their method is based on simulations, following Dunlap, Burke, and Smith-Crowe (2003) and Cohen, Doveh, and Eick (2001). The tests are illustrated with the data of Bliese, Halverson, and Schriesheim (2002), which pertain to a sample of 2,042 U.S. Army soldiers in 49 U.S. Army companies. Software for the procedures is available both as SAS code and in the Multilevel Modeling in R package (Bliese, 2006).
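
The idea of a simulation-based significance test for a single group can be sketched as follows. This is a minimal illustration, not the authors' SAS or R code. It assumes the standard definitions of the two indices, a uniform null distribution over the response options, and statistically independent items. The function names, the truncation of the first index to 0 when the observed variance reaches the null variance, and the number of replications are illustrative choices, not taken from the study.

```python
import random
import statistics


def rwg_j(ratings, n_options):
    """Multi-item r_WG(J) for one group.

    ratings: list of respondents, each a list of J item scores.
    n_options: number of response options (uniform-null variance (A^2 - 1) / 12).
    """
    n_items = len(ratings[0])
    sigma2_null = (n_options ** 2 - 1) / 12.0
    item_vars = [statistics.variance([r[j] for r in ratings])
                 for j in range(n_items)]
    ratio = statistics.mean(item_vars) / sigma2_null
    if ratio >= 1.0:
        # Assumed convention: report no agreement when the observed
        # variance is at least as large as the null variance.
        return 0.0
    num = n_items * (1.0 - ratio)
    return num / (num + ratio)


def ad_m_j(ratings):
    """AD_M(J): average over items of the mean absolute deviation from the item mean."""
    n_items = len(ratings[0])
    item_ads = []
    for j in range(n_items):
        col = [r[j] for r in ratings]
        m = statistics.mean(col)
        item_ads.append(statistics.mean([abs(x - m) for x in col]))
    return statistics.mean(item_ads)


def simulate_null(index_fn, group_size, n_items, n_options, n_sims=2000, seed=0):
    """Sorted index values for groups answering uniformly at random.

    Items are drawn independently here, which is a simplifying assumption.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_sims):
        group = [[rng.randint(1, n_options) for _ in range(n_items)]
                 for _ in range(group_size)]
        values.append(index_fn(group))
    return sorted(values)


def empirical_quantile(sorted_values, q):
    """Simple empirical quantile of an already sorted list."""
    idx = min(len(sorted_values) - 1, max(0, int(q * len(sorted_values))))
    return sorted_values[idx]


# Critical values for a hypothetical group of 10 raters, 5 items, 5 options.
null_rwg = simulate_null(lambda g: rwg_j(g, 5), 10, 5, 5)
null_ad = simulate_null(ad_m_j, 10, 5, 5)
rwg_crit = empirical_quantile(null_rwg, 0.95)  # large r_WG(J) indicates agreement
ad_crit = empirical_quantile(null_ad, 0.05)    # small AD_M(J) indicates agreement
```

Under these assumptions, an observed r_WG(J) above rwg_crit, or an observed AD_M(J) below ad_crit, would be judged significant at roughly the 5% level for a group of that size. The second problem the study addresses, inference about the ensemble of groups, is not covered by this sketch.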