In the November 2015 issue of Language Arts, Rachael Gabriel examines problems with how teachers are evaluated. Her research team reviewed the Measures of Effective Teaching (MET) project, an extensive study of the techniques of 3,000 teachers that sought to determine which techniques correlated with high value-added measures (VAMs). While the project identified many useful activities, Gabriel argues it has been used to support teacher evaluation rubrics that err by measuring quantity, not quality. She writes:
The major challenge of performance assessment via observation is that indicators are counted as if their presence or absence indicates quality. For example, one feature of classroom discourse that is often included in commercially available rubrics for observations is the use of open-ended and/or higher-order questions. Though the presence of higher-order questions . . . has been associated with increased engagement and achievement, its absence does not indicate lack of quality. . . . When analyzing MET project videos, we found higher-order questions in low-performing classrooms on every measure of the MET study, and high-scoring classrooms that had no evidence of higher-order questions.
Other examples of this abound:
When it comes to opportunities to develop literacy, it isn’t the fact of allotted time for independent reading or writing, but rather the nature and use of that time that determines its value as a practice.
For example, several videos of MET project classrooms included time spent writing for five minutes or more, but the writing tasks often involved filling in blanks of a formulaic paragraph structure or copying notes from the board into a graphic organizer. Neither of these tasks involves a robust opportunity to develop literacy because students are not generating original language, employing a writing strategy, writing for a purpose, or writing to an audience. However, in observation, especially brief observation, it may appear that students are all engaged in writing, and this instrumental engagement may be viewed as evidence of effectiveness because students are quietly complying with a writing-based activity.
Why does a rubric of activities fail to indicate quality?
It could be that every observable feature or “best practice” involves a compromise and thus cannot be viewed in isolation as evidence of effectiveness or not. For example, calling on an equal number of boys and girls may extend the length of discussion and limit time for independent practice. Similarly, pursuing a back-and-forth discussion to support a student’s understanding might limit other students’ participation. A teacher could invest in one indicator of effectiveness at the expense of another. Thus, effective teaching may be about managing the dynamic balance of certain features of instruction rather than simply displaying such features.
At best, rubrics are filled with actions that are sometimes associated with effectiveness, not foolproof indicators of effectiveness. This leaves evaluators in the unenviable position of attempting to come up with feedback on a teacher’s performance based on a set of indicators that may not indicate anything. Given the importance of some features, the assumption may be that more is better, thus teachers are encouraged to ask more open-ended questions, engage students in more meaningful conversations, or encourage more participation. The inclusion of such indicators to mark the highest levels of proficiency on a rubric may inspire instrumental compliance rather than thoughtful integration. Unfortunately, encouraging participation for participation’s sake may not deepen or extend learning opportunities. But, considering how participation could contribute to the goal of the lesson (how is this effective?) or how participation has been attempted (how does the teacher encourage participation?) is likely to generate useful feedback aimed at improving or expanding effective practices.
Read Rachael Gabriel’s complete article, “Not Whether, but How: Asking the Right Questions in Teacher Performance Assessment.”