Osborne: Test Scores Give Only a Partial Picture of How a School Is Doing. School Quality Reviews Can Help Fill the Gap
Standardized testing has become controversial in a way few predicted a decade ago. As I wrote in the first piece in this series, test scores give us important information about the quality of schools, but they leave out a lot of other important information.
Consider, for instance, the school that suddenly had to take in 60 new students midyear, because a nearby school closed. The newcomers’ scores on tests given two months later would not tell us much about the quality of that school.
Or how about schools that were closed for a month because of a hurricane and flooding? Wouldn’t their scores misrepresent their quality?
And what about specialized schools, like those that focus on dual-language immersion or the performing arts? Would reading and math scores really tell us what we need to know about their performance, if we don’t also rate them on how well kids are learning their second language or their singing, dancing or acting?
Outstanding schools do many things that test scores don’t measure, such as engaging families, motivating students, regularly assessing their progress, offering remedial help for those who are behind and paying attention to social-emotional learning. Science tells us that these are all important practices. Wouldn’t it be nice if state accountability systems encouraged schools to use them?
Finally, test scores don’t say much about school culture or aspects of deeper learning, such as critical thought, problem solving, research and speaking skills.
In sum, we need a more holistic way to judge school quality, but one that we can trust to be fairly objective. Fortunately, a few other countries have figured this out. They use rigorous school quality reviews in their accountability systems.
In England, for example, small teams of experts, many of them former school leaders and teachers, visit each school every two to five years, usually with one day’s notice, and spend two days gauging its quality. Schools that have been rated “outstanding” get a longer time between evaluations; those that have received weaker ratings get more frequent evaluations.
The reviewers sit in on classes, examine student work, talk with groups of students, staff and members of governing boards, look over documents, records and test scores, review student, staff and parent surveys, solicit written input from parents and often meet with families. Then they publish reports — distributed to all parents — full of qualitative judgments.
Using formal evaluation rubrics, they rate schools on “overall effectiveness” and four more specific areas: “quality of education,” “personal development” (i.e., social-emotional learning), “behavior and attitudes” and “effectiveness of leadership and management.” There are four possible ratings in each category: outstanding, good, requires improvement and inadequate. The work in England is overseen by the Office for Standards in Education, Children’s Services and Skills, an independent government agency created in 1992.
Research suggests that the British system is effective in identifying low-performing schools. It also differentiates effectively among outstanding schools, good schools, those that require improvement and those that are inadequate. Schools that fall into the lowest rating have responded by raising student achievement.
New York City has adopted a pared-down version of this model, with only one reviewer visiting each school. Charter school authorizers in Massachusetts, Indianapolis and elsewhere use visits based on the British model in reviewing their charter schools, and large charter networks such as KIPP have done the same. Several states, including Massachusetts, Kentucky and Connecticut, use a similar approach to review their low-performing schools.
With all this demand, multiple companies have emerged to do these reviews, on contract with state agencies or charter authorizers.
Review teams can give schools points for using proven practices, such as engaging parents or regularly assessing student progress. They can also put all the information they gather in context, because they can see the overall nature of the school. If it has a high percentage of students with learning disabilities, or if it caters to a particular kind of student, they can factor that into their judgments.
Obviously, assessments such as this are more expensive than standardized tests, though they don’t have to be done every year. If we want a balanced set of quality measures that reflect the whole child’s experience, however, they are indispensable.
Public schools already spend money getting accredited every six to 10 years. There are four regional accrediting agencies around the country. (Some states have their own “accreditation” programs in addition, but these are often quite different, more like a snapshot of school quality focused mostly on test scores.) Accreditation by a regional agency is mandatory only in a minority of states, but most schools in the U.S. choose to go through the process — at their own expense — to earn an external seal of approval. School staff go through a self-evaluation process and develop improvement plans, and the accrediting agency’s team of volunteer school staff from other places makes recommendations.
Even when done well, however, plans made during the self-evaluation are often shelved for lack of funding or will to implement. Similarly, implementation of the accrediting agency’s recommendations depends entirely on the school. Since accrediting agencies rarely refuse a public school accreditation, and there are few other consequences, the process often has little long-term impact.
Ted Sizer, the late headmaster of the elite Phillips Academy, author of Horace’s Compromise, founder of the Coalition of Essential Schools and chair of the Harvard and Brown University education departments, told me that the assessment done when he ran a charter school at the end of his career was far more valuable than any accreditation process he had ever been through.
What we spend on accrediting K-12 public schools would be far more productively spent on a British-style assessment of each school every three years. Perhaps regional accreditation agencies could be brought into the new system and reoriented, to perform more rigorous, British-style evaluations. If it costs more than accreditation every six years, the investment will still be worthwhile. We cannot afford to be penny wise and pound foolish about accountability.
In large states, the scale of these qualitative assessments might require that they be phased in over multiple years. But organizations already exist that know how to do them, and there are plenty of retired teachers and administrators who would be happy for the part-time work. With 53 million people, England is far larger than any state, and its government has managed its system for three decades.
School quality reviews should not supplant test scores in our accountability systems; they should complement them. As I argued in the first piece in this series, test scores should be given approximately half the weight in state ratings of schools. Quality reviews should account for about a quarter of the weight, and other factors should be given the rest. These should include student engagement (measured by parental surveys) and, for high schools, outcomes such as graduation rates, college-going and persistence rates and employment rates for those not going to college.
By combining different views in this way, we can balance test score performance with other factors that are just as important to the success of a school and its students.
David Osborne is author of Reinventing America’s Schools: Creating a 21st Century Education System, which includes a more in-depth discussion of how to measure school quality and hold schools accountable. He leads the K-12 education work of the Progressive Policy Institute.