Comparability of paper-and-pencil and computer-based cognitive and non-cognitive measures in a low-stakes testing environment
DISSERTATION

Rowan, B.E., James Madison University, United States

James Madison University. Awarded

Abstract

Computerized versions of paper-and-pencil tests (PPT) have emerged over the past few decades, and some practitioners are using both formats concurrently. However, computerizing a PPT may not yield equivalent scores across the two administration modes. Comparability studies are required to determine whether the scores are equivalent before treating them as such. These studies ensure fairer testing and more valid interpretations, regardless of the administration mode used. The purpose of this study was to examine whether scores from paper-based and computer-based versions of a cognitive and a non-cognitive measure were equivalent and could be used interchangeably. Previous research on test score comparability used simple methodology that provided insufficient evidence of score equivalence. This study, however, demonstrated a set of methodological best practices, providing a more thorough and accurate analysis of the degree of measurement invariance that exists across groups. The computer-based test (CBT) and PPT contained identical content and varied only in administration mode. Participants took the tests in only one format, and the administration was under low-stakes conditions. Confirmatory factor analyses were conducted to confirm the established factor structure for both the cognitive and the non-cognitive measures, and reliability and mean differences were checked for each subscale. Configural, metric, and scalar invariance were tested across groups for both measures. Because differential item functioning (DIF) can affect measurement invariance, the cognitive measure was tested for DIF; items exhibiting DIF were removed from the data set, and measurement invariance across test modes was re-evaluated.

Results indicate that both the cognitive and the non-cognitive measures were metric invariant (essentially tau-equivalent) across groups, and that the DIF items did not affect the degree of measurement invariance found for the cognitive measure. The same construct was therefore measured to the same degree in both modes, but scores are not equivalent without rescaling. Because measurement invariance is instrument-specific, comparability must be evaluated for each instrument; practitioners cannot assume that scores obtained from the PPT and the CBT will be equivalent. How the test scores are used determines what changes must be made for tests that show less than strict measurement invariance.

Citation

Rowan, B.E. Comparability of paper-and-pencil and computer-based cognitive and non-cognitive measures in a low-stakes testing environment. Ph.D. thesis, James Madison University. Retrieved March 28, 2024.

This record was imported from ProQuest on October 23, 2013.

Citation reproduced with permission of ProQuest LLC.

For copies of dissertations and theses: (800) 521-0600/(734) 761-4700 or https://dissexpress.umi.com
