Tyre: This Pandemic Pause Is a Chance to Rethink How We Test Students. The International Baccalaureate Exam Program Is Worth a Look

This essay originally appeared on the FutureEd blog.

When schools were shuttered around the country three months ago, the pandemic did what nearly a decade of activist parents and testing skeptics could not do — put a systemwide pause on statewide standardized testing. It wasn’t because the tests were too long or poorly aligned to classroom learning, or because benchmark exams and test prep were robbing students of a deep and meaningful curriculum — charges that testing critics had been articulating for years.

Rather, classroom learning was shifting to distance learning and the federal government, whose mandates motor much of the statewide testing, offered waivers to the states. And just like that, the $1.7 billion testing system got a hole as big as Texas blown through it.

In this pandemic pause, there’s an opportunity to look at some new ideas and to build a vision for a new generation of assessment. The PARCC and Smarter Balanced consortium tests were a giant step forward for most of the states that adopted them. Even those states that balked were forced to up their game.

But the promise of a next generation of assessments remains unfulfilled. How can assessments be better aligned to curriculum? How can tests be used to promote deeper learning? How can we test what is important rather than determine what is important by how easy it is to test?

The assessment system devised by the International Baccalaureate program is worth scrutinizing for answers to these questions. IB schools were established in 1968 as a way to provide children of peripatetic diplomats and international businesspeople with a consistent, internationally recognized high school curriculum and assessment system that would be acceptable to top universities around the world.

The 5,000 or so IB schools now use an interdisciplinary approach to education for 1 million students around the globe. Four IB programs enroll students from ages 3 to 19. Although the program started in Geneva, the United States has the largest number of IB programs (2,010 out of 5,586), offered in both private and public schools, some of which serve middle- and low-income communities.

Although formative assessments are given throughout the year, the program has an annual summative assessment at the end of the school year, though not all IB students take it. Prepping, which is called Reading Period, isn’t about test-taking tricks or learning to fill in bubbles, but rather about reading deeply into material covered in the structured curriculum. Middle schoolers take a multimedia online test and submit both a project and a portfolio of work around art and design. High schoolers submit classwork and take tests in a myriad of subjects, which usually consist of writing essays, conducting multistep calculations and giving short answers. (This year, the IB program assessed students on classroom work alone.)

These tests force students to actually show their ability to think and integrate ideas on the spot. And they aren’t easy. A high school student sitting down to take her history exam might be asked to write essays on sample questions like these: “How successful was either Lenin (1917-1924) or Mussolini (1922-1943) in solving the problems he faced?” Or, “To what extent do you agree with the view that war accelerates social change?”

A student might culminate his study of chemistry by taking a test with this question: “The effect of some drugs used to treat cancer depends on geometrical isomerism. One successful anti-cancer drug is cisplatin, whose formula is PtCl2(NH3)2. Describe the structure of cisplatin by referring to the following:

… the meaning of the term ‘geometrical isomerism’ as applied to cisplatin

… diagrams to show the structure of cisplatin and its geometrical isomer

… the types of bonding in cisplatin.”

While these tests are challenging, they succeed in assessing higher-order thinking skills on a wide range of students who hail from different social and geographical contexts, and they hold those students to the same standard in a transparent way. They are graded by teams of teachers and exam monitors who are trained and overseen by chief examiners. Each grader uses a “weak criterion referencing” system — that is, setting the standard according to a description of what to look for in candidate performance with an eye to how top IB students scored in years past.

This kind of assessment costs a lot of money: $119 per student per subject. By comparison, states that use PARCC and Smarter Balanced pay about $22 per student per test, but some pay as little as $9 per student per year for assessments.

Almost every state is girding for severe cutbacks in educational spending. But education spending is cyclical. It might be useful to employ this crisis as an opportunity to plan for the future.

As they do that, states should ask: How accurately does our current testing system measure the things that count? What better ways of measurement are available to us? And, most crucially, how much are we willing to pay for a better test that assesses student learning rather than simple recall, logic and test-taking ability, which are the cheapest and easiest things to measure?

Peg Tyre is director of strategy at the Edwin Gould Foundation in New York City and a FutureEd senior fellow.
