Hello EdNews readers. I’ll be checking in occasionally with blog entries here focused on new and worthwhile research. For this first blog, I want to point you to a research brief published on Sunday by the Economic Policy Institute.
The piece, with the dry but informative title of “Problems with the Use of Student Test Scores to Evaluate Teachers,” is authored by an extremely impressive collection of accomplished researchers. If you read nothing else about education this week, please read the three-page executive summary (then continue on and read the rest!).
Before discussing this research brief, I want to re-introduce myself. I teach school policy and law at the CU Boulder School of Education, where I also direct the Education and the Public Interest Center. In my blog entries here, I will try to point readers to useful resources on the EPIC website in addition to resources – like the new Economic Policy Institute brief – from other places.
The main point of the EPI research brief is straightforward: while value-added modeling (VAM) is a technical advancement that highlights student growth, the numbers generated are nevertheless too inaccurate to be used as a primary factor in making high-stakes decisions about teachers. That is, if someone tells you that a teacher is good or bad based on a VAM calculation, you are wise to take the judgment with a sizeable grain of salt. This is the same warning that I — with far less impressive credentials — issued a couple years ago, as did the National Academy of Sciences earlier this year.
The full EPI research brief does a great job explaining how and why high-stakes VAM policies cannot be supported by VAM itself. But there’s one quote and one illustration/study that I want to pull out of the brief, to hopefully entice you to read the entire thing.
First the quote:
“There is simply no shortcut to the identification and removal of ineffective teachers” (p. 20).
As I write these blog entries throughout the year, I could probably begin each one with, “There is simply no shortcut to…” In part, this reflects the complex nature of schooling, but it also reflects the sad state of policymaking, where politicians and others are so easily enticed by the quick fix.
The replacement of ineffective teachers with effective ones is unquestionably a worthwhile policy goal, but it’s much easier said than done. A policy intended to accomplish this goal would have to reliably (a) identify the ineffective teachers (without wrongly targeting the effective ones), and (b) identify and recruit effective replacement teachers. Also, the policy should accomplish this in a more cost-effective way than alternative possibilities (but given the problems with the first part of this puzzle, we’re not yet at the point where we should worry about such comparisons).
Now for the illustration. I’ll quote from page 2 of the executive summary:
One study found that across five large urban districts, among teachers who were ranked in the top 20% of effectiveness in the first year, fewer than a third were in that top group the next year, and another third moved all the way down to the bottom 40%. Another found that teachers’ effectiveness ratings in one year could only predict from 4% to 16% of the variation in such ratings in the following year. Thus, a teacher who appears to be very ineffective in one year might have a dramatically different result the following year. The same dramatic fluctuations were found for teachers ranked at the bottom in the first year of analysis. This runs counter to most people’s notions that the true quality of a teacher is likely to change very little over time ….
This is scary stuff, but only if used unwisely – only if policymakers give too much credence to the scores. VAM approaches do tell us something; a teacher (or school) whose VAM scores are consistently at the extreme high end of the distribution is very likely of higher quality than one whose scores are consistently at the extreme low end. So here’s my alternative proposal: use VAM approaches as a first-stage, cost-effective tool that will help inform a more in-depth, second-stage quality analysis. A teacher or school at the bottom (e.g., the bottom 5 percent) in a state or district should be identified for classroom observations, principal evaluation, and other hands-on information-gathering that can lead to a determination of professional development needs or removal/turnover. Similarly, a teacher or school at the top might be identified for further study that might help us learn from successes. This approach has three major advantages:
- The ultimate evaluations of teachers and schools will not be made based on the test scores; they will be more thorough and reliable.
- The use of VAM here is supportable, since it is not being used to make fine-grained distinctions among teachers; instead, it serves as a cost-effective screen that targets where hands-on evaluation should be directed.
- A teacher will not feel extreme pressure to teach to the test, since his or her career would ultimately be determined by hands-on observations and other information, not by students’ test scores.
Sadly, the approaches being considered and implemented in Colorado and elsewhere rely far too much on test scores and VAM approaches. We are rushing toward a system of teacher evaluation that is sure to wrongly identify teachers as good or bad, and it will likely be years before policymakers realize and correct the mistake.