Assessment Instrument Project

Department of Physics
North Carolina State University, Raleigh, NC 27695-8202

This research was supported, in part, by the National Science Foundation. Opinions expressed are those of the authors and not necessarily those of the Foundation.

The primary goal of the Assessment Instrument Project is to create a series of valid, reliable tests that can be used both in pre/post research designs and by classroom teachers.

Our research group has been involved in the rigorous development and evaluation of instruments for uncovering student misconceptions in kinematics graph interpretation (TUG-K), direct current circuits (DIRECT), and ray optics. We plan to continue this effort by developing instruments for additional topics from introductory physics, including thermodynamics, electrostatics, waves, and measurement/error analysis. Although these assessment tools will be useful to researchers, ourselves included, their real value will be in the classrooms of teachers who are trying new instructional methods and want to evaluate their students’ understanding.

We will continue our methodology of surveying teachers and researchers, analyzing interview transcripts, and administering large-scale multiple-choice tests. Interviews allow close examination of students’ thought processes, providing high resolution for discerning how students think about physics concepts. Multiple-choice tests, on the other hand, are easily administered and quickly graded. Furthermore, testing large numbers of students allows statistical analysis and generalization of findings. Combining these research techniques exploits the strengths of both. A large group of high school and college teachers and physics education specialists has been assisting with the development and field testing of materials.

Statistical results to date:



Statistic                   Possible values  Desired value  Results to date         What it measures
--                          0 to 100         --             40.5 ± 0.9; 48.0 ± 0.5  Average score*
--                          0 to 1           ≥ 0.70         --                      Reliability of whole test
Point-Biserial Coefficient  -1 to +1         ≥ 0.20         --                      Reliability of individual items (averaged across the whole test)
Ferguson’s Delta            0 to 1           ≥ 0.90         --                      Ranking ability of the test
Item Discrimination Index   -1 to +1         ≥ 0.30         --                      Ability of an item to differentiate between high- and low-scoring students


*A mean of 50% maximizes the possible spread of scores, which is desirable for a research instrument but probably not for a regular classroom test.
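As an illustration of how the statistics listed above might be computed, here is a minimal Python sketch that works from a matrix of 0/1 item scores (one row per student, one column per item). Several things in it are assumptions, not taken from this page: the function names, the small `responses` data set (made up purely for the example), the use of KR-20 as the whole-test reliability measure, the 27% upper and lower groups for the discrimination index, and the use of population variances throughout. It is not the project's own analysis code.

```python
from math import sqrt

# Hypothetical data: 6 students x 4 items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
]

def total_scores(responses):
    """Total number of correct answers for each student."""
    return [sum(row) for row in responses]

def _pvariance(values):
    """Population variance (divides by N, not N - 1)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def mean_percent(responses):
    """Average score on a 0 to 100 scale."""
    n_items = len(responses[0])
    totals = total_scores(responses)
    return 100.0 * sum(totals) / (len(totals) * n_items)

def kr20(responses):
    """Whole-test reliability via KR-20 (assumed here; range 0 to 1)."""
    n, k = len(responses), len(responses[0])
    pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n  # fraction correct on item i
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / _pvariance(total_scores(responses)))

def point_biserial(responses, item):
    """Correlation between one item's 0/1 score and the total score.

    The total includes the item itself (no correction for overlap).
    """
    x = [row[item] for row in responses]
    t = total_scores(responses)
    n = len(x)
    mx, mt = sum(x) / n, sum(t) / n
    cov = sum((a - mx) * (b - mt) for a, b in zip(x, t)) / n
    return cov / sqrt(_pvariance(x) * _pvariance(t))

def fergusons_delta(responses):
    """How broadly total scores spread over the possible range (0 to 1)."""
    n, k = len(responses), len(responses[0])
    totals = total_scores(responses)
    freq = [totals.count(score) for score in range(k + 1)]
    return (n * n - sum(f * f for f in freq)) / (n * n - n * n / (k + 1))

def discrimination_index(responses, item, group_fraction=0.27):
    """Fraction correct in the top group minus that in the bottom group."""
    ranked = sorted(responses, key=sum)  # lowest total score first
    g = max(1, round(group_fraction * len(ranked)))
    low, high = ranked[:g], ranked[-g:]
    return (sum(r[item] for r in high) - sum(r[item] for r in low)) / g

if __name__ == "__main__":
    print("mean (%):", mean_percent(responses))
    print("KR-20:", kr20(responses))
    print("Ferguson's delta:", fergusons_delta(responses))
    for i in range(len(responses[0])):
        print("item", i, point_biserial(responses, i),
              discrimination_index(responses, i))
```

Each result can then be compared with the desired values listed above (for example, a point-biserial coefficient of at least 0.20 averaged over the items, or a Ferguson's delta of at least 0.90). The functions divide by the score variance, so they would fail on degenerate data in which every student earns the same total; a real analysis would need to guard against that and would use far more students than this toy example.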

For details on how to obtain our tests as well as links to many other research-based assessment instruments, visit our test info page.

