Publication Date:
Author(s): Eric Loken, Zita Oravecz, Conrad Tucker, Fridolin Linder
Publication Type: Conference Proceeding
Journal Title: ASEE Annual Conference and Exposition, Conference Proceedings
Volume: 122
Issue: 122
Abstract:

Undergraduate STEM programs face the daunting challenge of managing instruction and assessment for classes that enroll thousands of students per year, and the bulk of student assessment is often determined by multiple-choice tests. Instructors monitor reliability metrics and diagnostics for item quality, but rarely is there a more formal evaluation of the psychometric properties of these assessments. College assessment strategies seem to be dominated by a common-sense view of testing that is generally unconcerned with precision of measurement. We see an opportunity to improve undergraduate science instruction by incorporating more rigorous measurement models for testing and using them to support instructional goals and assessment. We apply item response theory to analyze tests from two undergraduate STEM classes: a resident-instruction physics class and a Massive Open Online Course (MOOC) in geography. We evaluate whether the tests are equally informative across levels of student proficiency, and we demonstrate how precision could be improved with adaptive testing. We find that the measurement precision of multiple-choice tests appears to be greatest in the lower half of the class distribution, a property that has consequences for assessing mastery and for evaluating testing interventions.
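
For context on the abstract's references to test information and measurement precision: the abstract does not state which item response theory model the authors fit, but a minimal sketch under the common two-parameter logistic (2PL) model (an assumption; theta, a_i, and b_i are conventional notation for proficiency, item discrimination, and item difficulty) shows how precision can vary across proficiency levels:

\[
P_i(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}}, \qquad
I_i(\theta) = a_i^2\, P_i(\theta)\,\bigl[1 - P_i(\theta)\bigr]
\]
\[
I(\theta) = \sum_i I_i(\theta), \qquad
\mathrm{SE}(\hat{\theta}) \approx \frac{1}{\sqrt{I(\theta)}}
\]

In this framing, a test is "more informative" wherever the test information I(theta) is larger, so the finding that precision is greatest in the lower half of the class distribution corresponds to I(theta) peaking below the median proficiency; adaptive testing improves precision by selecting items whose information is concentrated near each examinee's current proficiency estimate.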