It was 2009 and Arne Duncan was riding high. His basketball buddy had just been elected president, and he had just been appointed Secretary of Education. At his confirmation hearing, Republican Senator Lamar Alexander declared: “President-elect Obama has made several distinguished cabinet appointments, but in my view of it all, I think you are the best.”
Glowing from bipartisan praise, Duncan set to work on strengthening the standards and accountability movement enshrined in No Child Left Behind.
But Duncan overplayed his hand by creating a policy that so expanded standardized testing, it helped lead to a politically disastrous opt-out movement, a radicalization of teachers’ unions, and ultimately a loss of political support for federal education policy.
Duncan’s determination to rely partly on student test scores to evaluate all teachers in all grades and subjects was a strategic misstep that helped wed otherwise opposing forces.
The worst of Duncan’s foes now want to do away with the U.S. Department of Education altogether and the friendliest would take away much of its power. Meanwhile Duncan’s rhetoric has dramatically downplayed his department’s role in incentivizing the creation of more tests.
“I’ve never seen both Democrats and Republicans want to curb the authority of the federal Department of Education the way they want to now,” said American Federation of Teachers president Randi Weingarten.
Give the states money and they will test
It started in 2009 with the realization that cash-strapped states were desperate for money in the midst of a struggling economy. Knowing this, Duncan designed Race to the Top, an ingenious program that gave states the chance to dip into a $4.35 billion pot of federal money if they adopted certain accountability and school choice policies. One of those policies was the evaluation of all teachers based partly on student growth on tests.1 And the administration meant all teachers, explaining that for educators in non-tested grades and subjects — think social studies and first grade — “alternative measures of student learning and performance” needed to be used.
Supporters of a Colorado law to base 50 percent of all teachers’ evaluations on test scores cited the $175 million that the state stood to gain from Race to the Top. (The law passed, but Colorado did not initially win the federal funds, with reformers in the state blaming a lack of union buy-in. The state eventually won $17.9 million in a later round.) Similarly, New York put in place a far-reaching law that based 40 percent of teachers’ evaluations on test scores2 with the explicit aim of competing for $700 million in Race to the Top dollars, which the state subsequently won.
A recent study confirmed that Race to the Top drove significant changes in state policy. The National Council on Teacher Quality found that in 2009 just 15 states used “objective” measures of student achievement — test scores — to evaluate teachers; by 2013, 41 states did.
Duncan declined to be interviewed. Department spokeswoman Dorie Nolt said in a statement that Race to the Top accomplished what Duncan set out to do — “advance a new way of making education change — not based on prescriptive mandates from Washington, but instead through the powerful ideas of educators and leaders in states.”
She said the more than $4 billion was an unprecedented federal investment in reform that “sparked innovation even in places that didn’t receive money” and spurred states to collaborate with one another to create more opportunities for students, especially the neediest ones, to reach their full potential.
“Today, the innovations unleashed by Race to the Top are touching nearly half the nation’s students and 1.5 million teachers in schools across the country,” Nolt said.
Not interested in Race to the Top? Too bad.
There was no requirement that states compete in Race to the Top, and 10 didn’t in the first phase, including Washington state. But Duncan’s Department of Education pulled Washington’s waiver from the federal No Child Left Behind law (NCLB) because the state was not evaluating teachers and principals based on test scores. In order to receive a waiver, states must develop teacher and principal evaluation systems that “use multiple valid measures in determining performance levels, including as a significant factor data on student growth for all students.” That’s education-ese for “use student test scores to evaluate teachers.”
Without a waiver, Washington is subject to an NCLB requirement that 100 percent of students score proficient on state exams. Schools that fell short of that number — as 88% of the state’s schools did in 2014-15 — were labeled failing and faced sanctions, including paying for outside tutoring. The threat of losing their waivers likely caused some states to pass new evaluation laws, and may have stopped others from reconsidering their already-adopted Race to the Top policies.
Meanwhile, states that had fallen in line with Race to the Top and the NCLB waivers faced a hurdle that most districts were entirely unprepared for: Measuring student growth for teachers in non-tested grades and subjects.
Much of the evaluation debate has focused on reading and math teachers in grades 4–8, because they can be evaluated on federally mandated state tests; yet teachers in non-tested grades and subjects make up the majority of the teaching force. Few state laws made separate provisions for these two categories of educators.
Duncan’s Department of Education laid out three options for addressing the problem: Use new teacher-created assessments, known as student learning objectives; create new standardized tests tailored to traditionally non-tested subjects; or use group measures of performance.
Essentially the feds left states to pick their poison: Either hastily create a proliferation of new tests or unfairly evaluate teachers based on tests over which they have little-to-no influence.
States ended up doing a combination of both.
Many districts opted to use group scores, or “measures of collective performance,” as the Department of Education calls them. What that means is that teachers are evaluated based on students they don’t teach, subjects they don’t teach, or both. It means an art teacher will be evaluated on English test scores; it means a kindergarten teacher will be judged on third-graders’ scores. Stories have poured in from across the country — including in New Mexico, Florida, Tennessee, and New York — of educators outraged that their evaluation ratings are based on scores they couldn’t control.
Lily Eskelsen Garcia, the president of the National Education Association, said she asked Duncan’s department to stipulate that no teacher should be evaluated based on the scores of students they’ve never met. The department refused, she said.3 Nolt, the department spokeswoman, did not respond to a request for comment.
Many districts have tried to avoid the unfairness of group measures by creating new tests to evaluate teachers. New York City, for example, uses a series of standardized “performance assessments” — which sometimes include both pre- and post-tests — as part of the city’s teacher evaluation system. Colorado is working to rapidly expand its assessment options, and at least one district4 there has created standardized tests in art, music, and physical education. Chicago and New Mexico are reportedly doing the same.
The Commissioner of Education in Kentucky estimated at a Senate hearing that 40% of his state’s tests were due to the federally incentivized teacher evaluation system.5
When it comes to time spent on testing, hard numbers are difficult to track down, and it’s often unclear what, exactly, is leading to all these assessments. But in Ohio, there is strong evidence that teacher evaluation has created a large share of the tests that many are complaining about. A report by Ohio’s Education Department found that most testing was required by a combination of the statewide teacher evaluation system and NCLB’s requirements. That includes testing in grades K–2, most of which occurred due to the evaluation law.
None of this should be surprising: In Ohio’s Race to the Top application the state promised Duncan that it would introduce a daunting battery of “supplemental tests, end-of-course exams, and performance-based assessments” for its evaluation system.
It’s this proliferation of testing that may have spurred the opt-out movement — not the NCLB-mandated 3–8 testing that long predated it. It seems like every state has had its own anti-testing backlash. New York state had massive opt-out numbers. Connecticut, Colorado, and Minnesota have all passed bills to scale back the number of assessments. In fact, a recent report found that a remarkable 39 states are attempting to reduce standardized testing.
Duncan blames the locals
The anti-testing sentiment has clearly made an impression on Duncan. In an August 2014 blog post and press conference, Duncan acknowledged and agreed with the concerns about over-testing, writing, “There’s plenty of responsibility to share on these challenges [regarding testing], and a fair chunk of that sits with me and my department.”
He also announced that he would award NCLB waivers to states that did not factor in test scores for teacher evaluation during a grace period. Such a move allowed states to temporarily reduce the stakes attached to tests, but did little to address over-testing or the concerns of teachers unfairly evaluated.
Duncan has continued to talk about the issue, but by early spring 2015, his rhetoric had subtly shifted. Duncan began to lay the blame at the feet of states and districts.
“This is really for the state to grapple with and figure out,” Duncan said in West Palm Beach, Fla., referring to concerns about too much testing. In Chicago, Duncan suggested that over-testing was a result of local policies: “I've been clear that there are some places that overtest.”
In Kentucky, when confronted by a protester who complained about testing in early grades, Duncan responded that the federal government doesn’t require testing until third grade. In remarks to state school superintendents, Duncan claimed that districts mistakenly added new tests without taking away old ones. Yet largely unacknowledged in Duncan’s remarks is his department’s role in incentivizing the creation and use of more tests — a policy that remains in place through NCLB waivers.
U.S. Education Department becomes the enemy
Teachers unions and conservatives say that Duncan’s actions show the problem with federal overreach. But that’s not quite right — the problem is with one major decision Duncan made, not with the perch from which he made it.
If Duncan had simply incentivized states to implement evaluations that included test scores only for teachers who already had them, the pushback we now see might be much smaller. There wouldn’t have been the same proliferation of new tests, and there wouldn’t have been educators unfairly evaluated.
None of this is to say that, as a policy matter, increased testing is a bad idea. Perhaps there’s value in more standardized assessment in the early grades, in arts, in science. For all the gripes about over-testing in Ohio, for example, the state report found that students spent somewhere between one and three percent of the school year on mandated tests. Include time spent on practice tests and that number goes up by just 1.4 percent.
But regardless of the wisdom from a policy perspective, the increase in testing has contributed to a political backlash that threatens the whole enterprise. And the self-evident folly of evaluating educators on subjects they don’t teach is bad on both counts.
Now Congress is working to reauthorize NCLB, and the most likely result will be a gutting of the federal role in education. The bipartisan Senate draft bill prohibits the current policy of federal involvement in state teacher evaluation systems, and would significantly limit federal oversight in school accountability systems.
Duncan, unsurprisingly, has decried aspects of the rewrite, but his actions have created potent adversaries (and strange bedfellows) in Republicans and teachers’ unions, both of whom want to roll back federal involvement. Even Democratic accountability hawks have essentially conceded that the federal role in education will be reduced.
The irony of Duncan’s overreach is that while it’s sparked widespread backlash, it’s also put the White House in a strong negotiating position. The power to grant or deny waivers under the current version of NCLB means Obama does not have much motivation to sign a bill he doesn’t like. Yet Democrats may regret it if a rewrite doesn’t get done and suddenly a Republican administration has free rein to implement its own educational priorities.
Duncan has always emphasized the urgency of reform. This urgency may have harmed his own cause, leading him to overplay his hand and ignore political realities. Whether reformers across the country will learn the lesson from Duncan’s cautionary tale remains to be seen. Perhaps the better question is whether they’ll have another chance.
1. Specifically, states were incentivized to “Design and implement rigorous, transparent, and fair evaluation systems for teachers and principals that (a) differentiate effectiveness using multiple rating categories that take into account data on student growth (as defined in this notice) as a significant factor, and (b) are designed and developed with teacher and principal involvement.”
2. New York has since altered the law so that essentially half of teachers’ evaluation will be based on test scores.
3. See 31:45 in this video. Eskelsen Garcia says, “We actually asked, at NEA, we asked a to-be-nameless person who works with the Department of Education, and we said, ‘Can we at least get you guys to agree that no teacher should ever be evaluated on the test scores of students they’ve never met? Can we just get that?’ They would not agree!”
4. Disclosure: I previously taught middle school language arts in this district.
5. See 1:57:19 in this video. Commissioner Terry Holliday, in response to a question about what percentage of state testing the federal government can reduce, stated, “If we eliminate the teacher evaluation component, which added about 40% testing, that would be about 40% right there. And if we were able to address accountability at the state level rather than the federal level we might be able to reduce another 20% because most of the tests are local and school district tests to add to the teacher evaluation and tied to the federal accountability.”