Through the Looking Glass on Educational Assessment and What I Found There

One test says you’re larger
And one test says you’re small
And the tests that the state gives you
Don’t say anything at all.

Many will read the modified lyrics above and think that they perfectly capture the futility of large-scale educational assessment. Others will read them and lament in frustration, insisting that it doesn’t have to be that way with well-designed systems of assessment that are balanced, cohesive, and fair. Then there are those like me, grinning like a Cheshire cat, who read those same lines and think “that makes perfect sense, that’s the way that educational assessment is supposed to work.” (More on that later.)

Therein lies one of the biggest barriers to improving the utility of educational assessment: the lack of agreement on what it does and what it’s supposed to do. Further, most attempts to better define large-scale assessment and its role in the educational ecosystem seem to have a “through the looking glass” feel to them, where nothing seems quite as it should be.

It’s a world where we’re rushing to get state assessment results into the hands of teachers without any clear sense of what they should do with those results, while at the same time we are perfectly content to wait until this week for the release of the 2023 TIMSS results and another month for the release of the 2024 state NAEP results – results that might actually inform policy.

It’s a world where we would rather quibble over achievement level labels than address the factors affecting school and student performance behind those labels.

It’s a world where we expect all students to achieve the same academic standards but seem willing to devote more energy and resources to explaining rather than eliminating gaps in achievement; for example, spending two years producing “demographically adjusted” NAEP results and resurrecting decades-old ideas to fine-tune our understanding of persistent achievement gaps.

It’s a world where we have created a mythical menace, high-stakes standardized testing, our Jabberwock that must be slain to restore order to the educational world, bringing liberty and justice for all. But a world in which we want accountability and comparability and to be able to monitor individual student progress and growth over time.

It is most definitely a world that thrives on jabberwocky and the creation of assessment and educational neologisms that take on a life of their own, often devoid of the burden of actual meaning, theoretical or empirical underpinning, or shared understanding. More often than should be the case, essays, posts, and articles about the future of assessment are built on phrases that sound nice when strung together, but in the words of Lewis Carroll’s Alice,

“It seems very pretty,” she said when she had finished it, “but it’s rather hard to understand!” (You see she didn’t like to confess, even to herself, that she couldn’t make it out at all.) “Somehow it seems to fill my head with ideas—only I don’t exactly know what they are! However, somebody killed something: that’s clear, at any rate.”

What’s been killed in our case, however, is not the monster, but rather clear communication. Gone are clear, concise statements of the purpose of large-scale assessment, what it is we want it to do; the need to explain ourselves and our actions has been replaced by “it’s a federal requirement” – the laziest of all justifications. Also severely wounded, if not outright killed, has been our capacity to publicly acknowledge the limitations of large-scale assessment, clearly stating what it can and cannot do.

It is only by acknowledging those limitations that we can have a chance to emerge from the hallucinogenic assessment fog that we have been living under throughout this century.  

What are those limitations?

When the Men on the Chessboard Get Up and Tell You Where To Go… And Your Mind is Moving Low

For far too long, we have been on the wrong side of the looking glass.  We have contributed to, if not created, an environment in which instruction mirrors assessment rather than assessment mirroring instruction as it should. By limiting large-scale assessment to content and skills that can be easily assessed, we have at best inadvertently limited what is taught.

As has been said for a long time now, and as a teacher friend reminded me last weekend, the reality is that “what gets tested, gets taught.” For a time, we even adopted that mantra as our battle cry for innovative assessment, but alas, we failed to live up to our end of the bargain.

When Logic and Proportion Have Fallen Sloppy Dead

Large-scale assessment, which should assume its rightful position at the ass end of any educational reform or initiative, has managed to ascend to a position of dominance in education policy.

When designing an instructional intervention or crafting an education policy, the first questions that should be asked are what we want to accomplish, what are the conditions and/or behaviors we hope to change, and what is necessary for this program to produce those changes and outcomes. Only when those questions have been asked and answered is it time to ask how to best use assessment (large-scale and other types of assessment) to evaluate and inform progress and success.

Our current practice of “assess first, ask the important questions later” has always been a nonstarter.

To be fair, we didn’t start out with an “assess first” mindset, but one of the unintended consequences of mandated assessment and accountability requirements is that over time they always flip the script, increasing the focus on assessment and test scores while eroding attention to programs and underlying conditions.

Before leaving this “logic and proportion fallen dead” section, I would be remiss if I didn’t acknowledge that always lingering near the surface in the assessment wonderland we find ourselves in is the belief that if only large-scale assessment were removed or improved then all of our educational problems would suddenly disappear. If only large-scale assessment really was the root of the problem. If only it were that easy.

If You Go Chasing Rabbits, You Know You’re Going to Fall

Finally, we need to stop chasing the “actionable” and “instructionally useful” information rabbits. Large-scale assessment programs with on-demand tests administered at the end of the year, or even a few times per year, will never provide that type of information – at least not as those terms are currently used and understood in the field.

That is not to say that the mechanics behind large-scale assessment cannot be used in other ways to support more dynamic and informative forms of testing, but that’s a different rabbit hole for another day.

Feed Your Head Feed Your Head

To end on a positive note, let’s swing back to our opening lyrics.

One test says you’re larger
And one test says you’re small
And the tests that the state gives you
Don’t say anything at all.

To my mind, those lyrics describe close to an ideal state, one that perhaps could only be made better if all three statements were the product of a single test.

There is nothing inconsistent or incoherent about reporting that a student is larger but that the student is still small – that is, that the student has grown since the previous test administration, but their achievement still falls short of meeting the grade-level standard.

As for the last lines, I have long argued that the best we can hope for is that the state tests don’t provide any information that we didn’t already have. In an ideal educational world where everything is functioning correctly, that might mean that the state test confirms that all students are meeting the standard. However, even in the real world, what we hope for is that educators who are with students day after day for the entire school year already know whether students are proficient – and a lot more than that. If asked, through student information systems they could provide that information to state and federal policymakers as easily as they do daily attendance data. However, because we live in the real world, the state test is a necessary audit. Nothing more. Nothing less. And the desired result of an audit is that it confirms information that we already have.

Only when we all understand that will large-scale assessment and state testing be able to do what it’s supposed to do, be what it’s supposed to be. For in the words of Lewis Carroll’s Duchess,

“And the moral of that is—‘Be what you would seem to be’—or, if you’d like it put more simply—‘Never imagine yourself not to be otherwise than what it might appear to others that what you were or might have been was not otherwise than what you had been would have appeared to them to be otherwise.’”

Or perhaps more succinctly in the words of the King,

“If there’s no meaning in it,” said the King, “that saves a world of trouble, you know, as we needn’t try to find any.”


Published by Charlie DePascale

Charlie DePascale is an educational consultant specializing in the area of large-scale educational assessment. When absolutely necessary, he is a psychometrician. The ideas expressed in these posts are his (at least at the time they were written), and are not intended to reflect the views of any organizations with which he is affiliated personally or professionally.