The most common exam required to become a principal in the U.S. is not related to performance on the job, and candidates of color who take it are three times as likely to fail as white candidates, according to a recent study in Tennessee.
This combination — a weak connection to job performance and higher failure rate among some racial groups — creates concerns about whether the exam is permissible under federal civil rights laws, according to an employment lawyer and U.S. equal employment opportunity guidelines.
“This raises serious questions about the legitimacy of the exam, and they are questions that need to be answered,” said Gillian Thomas, an attorney with the American Civil Liberties Union. “It doesn’t mean there’s not a legitimate justification to be found.”
Paul Fleming, an assistant commissioner at the Tennessee Department of Education, defended the use of the exam, known as the School Leaders Licensure Assessment, or SLLA.
“The SLLA serves as just one measure that we ask for candidates to provide when they apply for an instructional leader license,” he said. “The role of licensure … is to serve as a check on whether educators have demonstrated that they have the right knowledge base and skill set to be an effective educator.”
Tom Ewing, a spokesman for the Princeton, New Jersey–based Educational Testing Service, which designs the test, said it was developed “through a rigorous process to assess the skills, knowledge, and abilities that licensing authorities have determined school leaders need.”
“Licensing exams like the SLLA are designed to protect the public from practitioners who lack basic competence, not to predict job success,” Ewing said.
The test is used in 18 states, according to the study. The researchers analyzed data for 10 years of Tennessee test-takers, between 2003 and 2013. They studied performance evaluations, student achievement, and teacher leadership ratings for those hired as principals. The study compared principals who scored high marks on the exam with those who barely passed. Researchers said they found little evidence that the test scores predict how a principal will perform on the job.
“We were surprised,” lead author Jason Grissom told The 74. “I was really expecting that we would find that the test would predict future outcomes.”
Grissom, an associate professor of policy and education at Vanderbilt University, thinks the reason the test was not good at predicting school leadership success in Tennessee “may overlap with why test-takers of color are less likely to score highly.”
“The SLLA is trying to measure whether principals have the leadership knowledge to be ready to lead a school. But it may not measure the right kinds of knowledge, or that knowledge may not translate into the behaviors needed to be an effective principal, which are really difficult,” he said in a statement. “Leaders of color may bring a different kind of knowledge and experiences to the table that are just as useful for school leadership but that aren’t being measured by the test.”
The study comes amid an increasing focus on diversifying the educator workforce, driven by research suggesting that students of color stand to benefit. Research has directly linked principal diversity to teacher diversity.
About 80 percent of principals in the country are white, 10 percent are black, and 7 percent are Hispanic. The numbers are very similar for teachers: About 82 percent are white, 7 percent are black, and 8 percent are Hispanic. That compares to roughly 46 percent of students who are white, 15 percent who are black, and 29 percent who are Hispanic.
A spate of recent research has suggested that students of color stand to benefit from teachers of color. Those students score (slightly) higher on standardized tests, are less likely to be suspended, are held to higher expectations, and are more likely to be identified for gifted and talented programs. Teachers of color, on average, receive higher ratings from students, regardless of students’ race.
There is less empirical evidence on the value of principal diversity, but Grissom’s research on principals lines up with studies of teachers. He shows that black students are better represented in gifted programs when attending a school led by a black principal.
“We have better evidence about the importance of principal diversity for [creating] teacher diversity,” said Grissom. “Since we know that teacher diversity in turn matters for student [outcomes], we can see the connection, even if we don’t have as conclusive of evidence linking principals to student outcomes.”
Non-whites do systematically worse
Grissom and his co-authors’ study, “Principal Licensure Exams and Future Job Performance,” was published this January in the peer-reviewed journal Educational Evaluation and Policy Analysis.
First, the researchers show that non-white candidates were three times as likely to fail the exam as white candidates. Including those who took the test multiple times, 15 percent of minority test-takers did not pass, compared with just 5 percent of white candidates. Put simply, “non-White candidates perform systematically worse than their White counterparts on the SLLA,” the study states.
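The "three times as likely" figure follows directly from the two failure rates quoted above. As a minimal sketch, assuming only the rounded percentages reported in this article (the function and variable names are illustrative and do not come from the study), the arithmetic looks like this:

```python
# Illustrative only: these are the rounded rates quoted in the article,
# not the study's underlying data.
fail_rate_nonwhite = 0.15  # 15 percent of non-white test-takers did not pass
fail_rate_white = 0.05     # 5 percent of white test-takers did not pass


def rate_ratio(rate_a, rate_b):
    """Return how many times larger rate_a is than rate_b."""
    if rate_b <= 0:
        raise ValueError("rate_b must be positive")
    return rate_a / rate_b


# Ratio of failure rates: the "three times as likely to fail" comparison.
failure_ratio = rate_ratio(fail_rate_nonwhite, fail_rate_white)

# The same gap expressed as pass rates (85 percent versus 95 percent).
pass_ratio = rate_ratio(1 - fail_rate_nonwhite, 1 - fail_rate_white)

print(f"Failure-rate ratio: {failure_ratio:.1f}")  # prints 3.0
print(f"Pass-rate ratio: {pass_ratio:.2f}")        # prints 0.89
```

The same gap reads differently when expressed as pass rates rather than failure rates, so it helps to state which rate is being compared.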
The test is not the only qualification to become a principal in Tennessee, said Fleming, the assistant commissioner. Candidates must have at least three years of acceptable experience, complete a state-approved instructional leader preparation program, and be recommended by the state-approved educator prep program.
Anyone who doesn’t pass the test, however, can’t be hired, and even among those who do, the researchers found that candidates with higher scores were more likely to become principals.
Giving the test such a significant role might make sense, the authors say, if the assessment did a good job of predicting who will be effective in that role, but the data do not bear that out. They say they could find no consistent evidence that principals who score better on the licensure test perform better on the job.
“Across a variety of job outcomes, numerous specifications, and different samples of principals by level of experience, neither surpassing the SLLA cut score nor obtaining a higher score on the exam serves as a useful predictor of future principal job performance,” the researchers write.
The one exception seems to be for assistant principals, whose scores appear to relate to overall evaluation ratings. It’s not clear why the results are more positive for them.
This appears to be the first study directly linking performance on the principals test to on-the-job performance, according to the authors. Previous research in California also found that black and Hispanic candidates were much more likely to fail the exam than white test-takers.
Teacher certification exams have also come under scrutiny for relatively low passing rates among minority candidates, though research suggests the test scores are often positively, if very weakly, associated with classroom performance. New York recently eliminated a test for prospective teachers meant to measure reading and writing ability amid concerns that it screened out too many potential candidates of color.
Not everyone agreed that abolishing the test was better for students.
“While New York’s teacher shortage and the need to recruit and retain quality teaching candidates must be a priority, watering down literacy standards is the wrong way to solve the problem,” said a statement issued by High Achievement New York, a group that supports rigorous standards. “If the Academic Literacy Skills test is flawed, work to improve it — and ensure it is fair to every prospective teacher who takes it — but don’t end it.”
Disparate impact and job-relatedness
The federal Civil Rights Act of 1964, Title VII, prohibits discrimination in employment based on “race, color, religion, sex, or national origin.”
Guidance from the Equal Employment Opportunity Commission specifies how employers can and cannot use exams as a hiring screen: “Title VII ... prohibits employers from using neutral tests or selection procedures that have the effect of disproportionately excluding persons based on race, color, religion, sex, or national origin, where the tests or selection procedures are not ‘job-related and consistent with business necessity.’ ”
To bring a suit under Title VII, said Thomas, of the ACLU, a plaintiff would first have to show what is called “disparate impact,” meaning that the test disproportionately screened out a protected class, which the research in Tennessee shows for prospective black and Hispanic principals.
To defend the hiring screen, the employer — here, Tennessee and its school districts — would have to show that it is job-related.
“The legal framework is, once that disparate impact has been shown … then the burden is on the districts or the states that are using that exam to prove why it’s necessary to their business, why it’s necessary to do selection of the best possible principals,” Thomas said.
Even if that could be shown, it wouldn’t necessarily end the case if a better, less discriminatory option were available, according to the EEOC.
“If the employer shows that the selection procedure is job-related and consistent with business necessity, can the person challenging the selection procedure demonstrate that there is a less discriminatory alternative available?” says a commission fact sheet.
Thomas said that the Tennessee study alone can’t show definitively whether the principals test would survive a court challenge, but its findings do raise questions.
“As an attorney, if someone came to me and said, ‘We’re finding a failure rate that’s this high and at least one group of experts have said it doesn’t predict any better job [performance],’ ” said Thomas, “I would be interested and want to learn more about how the test was devised, what kinds of validation it has gone through.”
Fleming said Tennessee doesn’t look to the test to predict who will perform well as a principal and uses other tools to support effectiveness once candidates are chosen, such as team evaluations and school leader academies.
Ewing, of the Educational Testing Service, said that the principals exam went through significant vetting to ensure it is valid and not discriminatory.
“The committees that developed the SLLA were made up of a diverse group of high school and college educators, and committee members reviewed every test item for possible unfairness to any group of test-takers,” he said.
Ewing also pointed to “an array of free test preparation materials” to help ensure equal access to the exam.
He did not address why black and Hispanic candidates might still fail the test at three times the rate of white candidates.
Study has its limits
Like all research, the Tennessee study comes with a number of important caveats.
Perhaps the most important is that it could not look at whether candidates who did not pass the test turned out to be effective or ineffective principals. In theory, the results might be different if it were possible to observe the ability of those who failed but still went on to become school leaders.
Of all the places that use the test, Tennessee has one of the lowest passing scores, which is set by the state. Because of that, researchers could simulate varying cut scores used by other states — including the score recommended by the test maker — to see if those places seem to be screening out less-effective principals. The research “finds little evidence that the SLLA serves as an effective performance screen at any of the cut scores.”
Grissom also notes that states with higher cut scores likely have even larger disparate impacts on potential principals of color.
“If you moved it up to the Mississippi cut score [and] if their distribution of scores by race looks like Tennessee’s, they would be weeding out about two-thirds of their non-white test-takers,” Grissom said. “And that is shockingly high.”
In that sense, the study could provide stronger evidence against the exam in states that have higher passing scores; however, the data are limited to the on-the-job performance of Tennessee principals, so the results might differ elsewhere.
Another limitation lies in the measures of what it means to be a good principal. Although the research uses a variety of metrics, there is not widespread agreement on how to judge principal quality.
“One thing you have to take seriously is that we just might not have very good measures of principal job performance,” said Grissom, who points to his and others’ research showing that it’s difficult to isolate a principal’s effect on students’ test scores.
There isn’t much evidence on whether supervisors’ ratings are strong indicators of principal quality, though some research suggests teachers’ perceptions of principal leadership are predictive of teacher turnover and student achievement.
“Defining principal effectiveness is a tough nut to crack,” agreed Peter Goff, an assistant professor at the University of Wisconsin–Madison, who says that perhaps the best way to judge principals is to look at how they affect their teachers. “In the same way that the quality of a teacher is defined by their work with students, the quality of a principal is defined by their work with teachers.”