"The writer's [teacher's] job is not to judge, but to understand."
Last week, I returned to writing about the Teaching-Learning Cycle (T-LC) by considering whether we can reclaim assessment from high-stakes tests. Recall that evaluation follows assessment in the T-LC and asks three questions: What can learners do? What are they trying to do? What comes next? Staying with the theme of standardized tests, I want to explore the idea of using multiple-choice items to evaluate learning.
My favorite example of this comes from the 1983 NAEP math survey. I had recently started teaching middle school math when the results for the item shown to the right came out. Only 24% of the national sample of thirteen-year-olds answered this item correctly. Nearly as many of the learners answered (c) 31.33. This is the first time I remember examining the other possibilities and wondering, “What were the kids who selected these other responses thinking?” [This question continues to fascinate me and resulted in this article that I wrote with Dr. Pam Wells.]
Using the evaluation framework, I assume that most of those answering 12 can find the remainder of the division problem and are trying to use it to answer the question. The kids answering 31 might be trying to find the number of buses by estimating the quotient or rounding it to the nearest whole number. And the 31.33 answers suggest that these thirteen-year-olds can do the division correctly but lost sight of the context. In fact, a focus on making sense of the situation seems to be what comes next for all of those who chose (a), (b), or (c).
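To make the arithmetic behind those responses concrete, here is a small sketch. The item itself is not reproduced in the text, so the numbers below (1,128 soldiers, 36 per bus) are an assumption based on the commonly cited version of this NAEP problem; they are at least consistent with the answers 12, 31, and 31.33 discussed above. The variable names are mine, chosen only for illustration.

```python
# Assumed values: the original item is not shown in the post.
soldiers = 1128      # assumption: total soldiers to transport
bus_capacity = 36    # assumption: soldiers per bus

# Whole-number division gives the full buses and the soldiers left over.
full_buses, remainder = divmod(soldiers, bus_capacity)

# The exact quotient, ignoring the fact that buses come in whole numbers.
exact_quotient = soldiers / bus_capacity

# Keeping the context in mind: leftover soldiers still need one more bus.
buses_needed = full_buses + (1 if remainder > 0 else 0)

print("remainder:", remainder)
print("full buses:", full_buses)
print("exact quotient:", round(exact_quotient, 2))
print("buses needed:", buses_needed)
```

Under these assumed numbers, each distractor corresponds to a real intermediate result of the computation. Only the last step, interpreting the leftover soldiers in context, separates the incorrect responses from the correct one.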
You will notice that I did a fair amount of equivocating in my evaluation of the different responses in the previous paragraph. This is the problem with multiple-choice items: we cannot be sure why the test-taker selected one response over the others. Was it based on understanding? Or was it a guess? Or was it an intentional effort to undermine the assessment process? Only by looking at a group of responses can we begin to reduce such errors and be more confident in our evaluation of learners’ thinking. That is why I do not like using multiple-choice items as summative assessments but do use them to track the progress of a class as a whole on content we are exploring.
Still, I believe I can improve the question so that I can more accurately evaluate where learners are fluent and where they are approximating. One possible way is to replace 12 (which I consider the most unreasonable answer) with “I’m not sure, so any choice would be a guess.” This does not add to my evaluation burden. And because the item is formative, learners are more likely to be honest if they know I will use the information to inform instruction and improve their likelihood of later success.
For even more data, do as Karen Bailey suggests (in this PowerPoint) and simply add “Explain” to the end of the item. Some possible rationales are shown to the left. These responses are fascinating and can certainly improve our ability to evaluate learners’ thinking and tailor future instruction to support their continued growth.
It will require more evaluation effort to look through and make sense of open-ended responses, and teachers need to decide whether that effort is justified. I once heard Grant Wiggins ask of assessments, “Is the juice worth the squeeze?” But this question should apply to the test-takers as well as the evaluators. Why should we subject kids to an assessment if the task does not truly measure what was intended?
This is one of the first posts that truly gets at my reason for starting this blog. I began this post thinking I would defend multiple-choice items as a reasonable way to evaluate groups of learners from a formative perspective. While I still believe evaluation can be done using a selected-response format, I’m not sure identifying what learners can do and are trying to do is as easy as I thought. What do you think?