A New York judge ruled Wednesday that the use of student test scores to evaluate a Long Island teacher was “indisputably arbitrary and capricious” — but the teacher’s lawyer now says they aren’t fully satisfied with the decision and may appeal further to ensure the measure isn’t used in the future to judge other teachers in the state.
In the 2013–14 school year, Sheri Lederman, a highly regarded veteran elementary school teacher in Great Neck, was judged “ineffective” on a specific portion of her annual evaluation, based on a statistical growth model measuring how much her students improved on standardized state tests. (Growth scores are often referred to as “value-added models” or VAMs, though New York technically uses a slightly different approach called “student growth percentiles.”)
Though her overall rating was “effective,” Lederman said the lower growth score had a demoralizing effect on her. With the help of her husband, Bruce, a lawyer, she sued the state.
In an interview with The 74 the morning after the ruling, Bruce Lederman hailed the decision as a victory, which he expects will have national implications: “We know that the VAM system is a failure from our experience … and we hope this decision on a national level brings that home to people.”
The judge “vacated” Sheri Lederman’s test score rating, but also said that beyond her specific case the ruling was moot, since New York recently imposed a four-year moratorium on using state (though not local) tests to make high-stakes evaluation decisions.
Bruce Lederman disagrees with that narrow view. He said Thursday that the ruling doesn’t go far enough: “I am still considering whether to appeal the portion of the decision where the judge refused to enjoin the use of [a growth model] because it’s being studied for the next four years.”
Doug Harris, a Tulane University researcher who has written a book about value-added, said he wasn’t surprised by the lawsuit, considering that scores can be unreliable from year to year, and that New York had used a single year’s data rather than a multiple-year average, as most experts recommend.
He was, however, surprised that the case moved forward even though Lederman’s overall rating was effective, meaning she wouldn’t face any negative employment consequences as a result of the growth score. “My question is: What is the harm done here?” Harris wondered.
Pushed by the federal government and driven by concerns that traditional evaluation systems were inadequate, the vast majority of states now partially judge teachers based on student achievement. But a spate of lawsuits has cropped up in response, with mixed results. Judges in Florida and Tennessee have ruled against teachers arguing their evaluations were unfair, but a court in New Mexico recently blocked penalties based on the state’s evaluation system, pending a trial.
Meanwhile, New York’s evaluation system has faced fierce political backlash, contributing to a large opt-out movement among parents who see too much class time dedicated to testing. This in turn led policymakers to hit pause on the state testing component of the evaluation system.
This has occurred even as few teachers have gotten low marks under New York’s system: just one percent scored ineffective in the 2013–14 school year. Previously, The 74 reported that as of late 2015, only one tenured teacher had been fired through the evaluation system and dismissal process.
Bruce Lederman said that in his view the true harm of test-based evaluations is not the direct dismissal of teachers but the broader, systemic demoralization of educators. “The system is so insulting and dispiriting to teachers,” he said. “Sheri was within days of turning in her resignation because of this.” A recent survey found that teachers evaluated partially by student test scores viewed the process less positively than those not judged by assessment results.
The state teachers union, which was not directly involved in the case, praised the decision in a statement: “New York teachers statewide have been unfairly labeled by the state’s untrustworthy and mysterious mathematical algorithm that took the focus away from what matters most — teaching and learning.”
Tulane University’s Harris said that the use of test score models has resulted from widespread dissatisfaction among policymakers with the country’s prevailing evaluation systems. But he is quick to note that new evaluation methods have failed to garner much buy-in from teachers: “What we have learned is that there’s just not a lot of trust in these measures, and you can’t have a very successful evaluation system if the people you’re evaluating don’t believe in it.”
In the New York ruling, the judge emphasized several key arguments. He pointed out that a teacher’s score can fluctuate dramatically from year to year, including in Lederman’s case: the year after being deemed ineffective, she scored effective based on the same statistical model. The court’s decision also said that growth models can penalize teachers of particularly low- or high-achieving students. Finally, the judge criticized the fact that this approach to evaluations creates a bell curve, ensuring that some teachers will be marked below average regardless of overall results.
The Ledermans’ suit was supported by a handful of prominent academics who filed affidavits criticizing New York’s use of a growth model to evaluate teachers. But other researchers have argued that the approach has an important role to play in evaluation, and some research studies suggest that comprehensive teacher evaluation systems that use value-added can improve teacher quality or student achievement.
Research has also found that other measures of teacher evaluation, like classroom observations, may be plagued by some of the same problems, such as bias, as growth models.
It’s not clear whether the state plans to appeal the decision. A spokesperson for the State Education Department said she couldn’t comment on pending litigation.