Award-Winning Research May Help Boost Reliability of Standardized Tests

U.S. school teachers and their students seem caught in an endless loop of standardized tests.
Teachers spend countless hours preparing, administering and grading student assessments, while students are locked in a seemingly never-ending cycle of gearing up for and taking these tests. Assessments include not only college entrance exams but also the numerous tests required by the state of Florida.
Yet, despite all this time and attention, the assessments provide little information beyond a single score; they yield scant details about the specific areas where a student needs the most help.
UF College of Education’s Ren Liu and Anne Corinne Huggins-Manley are among the scholars trying to change that.
They recently published an award-winning research paper that shows how to retrofit current assessments to give teachers and students more meaningful and actionable information. Huggins-Manley is an assistant professor in the college’s Research and Evaluation Methodology (REM) program and serves as adviser to Liu, a doctoral candidate in REM.

Ren Liu and Anne Corinne Huggins-Manley

Their article recently won the annual best paper award from the Florida Educational Research Association (FERA). The award recognized the paper’s importance to the field of education, the soundness of the research and the quality of Liu’s presentation at the organization’s annual conference, said Robert Dedrick, professor of education at the University of South Florida and chair of the FERA award committee.
The paper, “Retrofitting Diagnostic Classification Models to Responses from IRT (Item Response Theory)-Based Assessment Forms,” was also recently published in the academic journal Educational and Psychological Measurement.
The scholars’ research provides a step-by-step method to revamp assessment tests using various tools and formulas of psychometrics, the science of measuring mental capacities and processes. Their 27-page paper is filled with the work of modern educational researchers: statistical models, tables, formulas and scatter plots.
“Developing and scoring assessments using new diagnostic measurement approaches is an increasingly important area in education research,” Liu said. “It can provide insights that allow teachers and their students to better understand specific areas where a student may need help.”
Consider this hypothetical example. Jane, a high school junior, scores a 1,200 on her SAT college-admission test. Her score of 700 on the math portion of the test was pretty good. But the results fail to show that while she has mastered geometry, she struggles in algebra. By offering more precise measurements, retrofitting the test can show the student and teacher where to take action to improve Jane’s content knowledge, Liu said.
“Retrofitting also can inform teachers about areas to improve curriculum design and material preparation,” Liu said.
But why retrofit tests at all? Why not develop them from the ground up to yield such information?
“The reason we need to retrofit tests is that current tests are not developed under the diagnostic measurement framework,” Liu said. “The diagnostic measurement framework is a new psychometric tool that has developed a lot in research but is not yet ingrained in practice.”
In the future, more tests will be developed and scored under the diagnostic measurement framework, he said. But for now, because only a limited number of assessments are developed this way, scholars have to retrofit the scores of current assessments to obtain more precise information.

Source: Ren Liu
Writer: Charles Boisseau, 352-273-4449
