I emptied our spare change jar the other day. The actual “jar” itself has at times been an empty ricotta container or peanut butter jar. But for years it has sat in the same spot collecting the loose change from our pockets at the end of the day. It’s right there at the entrance to the kitchen, sitting on top of the bookshelf full of cookbooks and folders full of 30 years’ worth of recipes clipped from newspapers; and directly below the wall-mounted landline phone with the cord long enough to reach all of the appliances in the kitchen and a chair at the dining room table.
Sharing the space on top of the small, homemade bookshelf are a notepad, a separate smaller jar to hold some pens and pencils, an address book containing important phone numbers and contact information, and the inverted cardboard box top that my daughter colorfully decorated and labeled “Dad’s Stuff” to hold my wallet, keys, sunglasses, etc. Nestled next to the bookshelf is the now overflowing box containing replacement notepads brought home from countless conferences and meetings across the country (but mostly from Rhode Island).
From a Bygone Day
We never answer and rarely use the landline phone, so the notepad and pens see little use. The answering machine (in another room) collects information from automated systems calling to confirm medical appointments, offer daily last-chance reminders to extend our car warranty, and give us the happy news that we have won one grand prize or another (who knew we were so lucky?). There are a few messages from legitimate fundraisers and lots of hang-ups.
Sadly, most of the contact information contained in the address book belonged to people who are no longer with us. We can report power outages online now, so the card with the 1-800 number from the power company that we often pulled out in the middle of the night is no longer needed.
We do still use a few of the cookbooks, on occasion, but like everything else, many of our books are online now, not to mention helpful videos that offer so many more than 101 ways to prepare chicken or broccoli.
And, of course, we no longer use coins or accumulate spare change.
Even before the pandemic we had joined the crowd and switched to using credit cards even for grocery shopping, and I used my precious gold card at Starbucks (gotta collect those stars) before I started simply using my phone. We’re not total Luddites.
I used cash primarily at the McDonald’s drive-through and gas station convenience stores. And we would dig quarters for parking meters out of the change jar for trips to the dentist’s office and parking at the Amtrak station.
Now it’s an app or credit card for the parking meters and no more McDonald’s.
All of those items on my bookshelf served us well. They did their job. We grew very comfortable using them.
And they are still capable of doing the jobs that they did so well.
But there are other tools now that do those same jobs more efficiently and effectively – that is, those other tools do the job better.
Which brings me to state tests.
Once I Built a Testing Program and Made it Run…
Have end-of-year, on-demand, standardized, state summative tests met the same fate as landline phones, cookbooks, and my spare change jar?
The tests still do what they were designed to do. You can even make a solid argument that they do that thing they do better and more efficiently than they did it 10, 20, or 30 years ago.
But it’s only fair to everyone that we examine state tests in the same light as the other tools I’ve mentioned, asking what their job is and whether there are other tools or approaches that could do it better.
State Tests – When You Try to Be Everything to Everyone…
Before we can determine whether there are better tools to do the job of state tests, we have to address one of the great unsolved mysteries of our time: What is the purpose of state tests?
Many answers to that question have been offered by both friends and foes of state testing. Let’s quickly run through some of the perennial favorites:
- Inform instruction – Where do I begin to list the flaws in the argument that the purpose of state tests is to inform instruction? Timing and unintended consequences would be a very good place to start.
- Model assessment best practices for the classroom – This holdover from the “if they’re going to teach to the test, we’ll give them a test worth teaching to” era re-emerges every decade or so, but I think that there is pretty widespread agreement at this point that there are more effective and efficient ways to model assessment practices for the classroom than a state testing program.
- Instantiate the state’s content and achievement standards – Although rarely a stated purpose of the state test, as my wise mentor and friend George Madaus often noted, historically, local educators have relied heavily on the state tests as a guide to interpret the state standards and as a signal of what’s considered important; “Is this going to be on the test?” The use of state tests as the primary tool to clarify content and achievement standards was not good then and is certainly not good now.
- District and School Accountability – Without a doubt, a primary use of state test results is as an input to school accountability systems, which would suggest that the purpose of state testing is to provide, produce, procure, etc. the data and/or information needed to support school accountability.
State Tests – One Test, One Use, One Purpose
“One Ring to rule them all, One Ring to find them,
One Ring to bring them all and in the darkness bind them.”
If supplying the information needed to support school accountability is a primary use of state tests and state testing, we are still left with what I will refer to as the mother of all issues regarding the purpose of state testing.
Because the state test is, in fact, a test – a measurement instrument – we have been conditioned to believe that the purpose of state testing is to measure student proficiency.
However, as I have argued so much that I sound like a broken record, the actual purpose of K-12 external testing, such as state testing, going all the way back to Horace Mann has been to collect data about student proficiency.
The fact that at one time, and for a long time, a common, standardized, end-of-year test arguably was the most effective and efficient method to collect the desired data about student proficiency did nothing to narrow the underlying problem from a broad data collection problem to a problem for which there was only a single measurement solution.
Our exclusive reliance on tests to solve that data collection problem, with all of the constraints and baggage associated with them, in the darkness binds us.
[The “broken record” reference to vinyl records, which have made a surprising comeback as a niche, chic item, does raise an interesting question: can something that became an anachronism stop being an anachronism?]
This is Only a Test – Measuring Proficiency, Let’s Play Along
For the sake of argument, let’s play along and play out the idea that the primary purpose of state tests is to measure student proficiency.
Even if that were the case, there is nothing sacred about the end-of-year standardized test. In fact, we are making a giant inferential leap based on relatively little evidence when we rely on a single, on-demand test to make generalized claims about individual student proficiency.
Take a few seconds to think about that.
Recently, there has been increased movement toward the use of through-year assessment designs to measure student proficiency – an idea that first emerged as an alternative during the Race to the Top era.
Several people, including some of my former co-workers at the Center for Assessment, have pointed out that the inferences and claims supported about student proficiency are different when generated from an end-of-year test versus from data collected throughout the year. That argument certainly is solid with regard to the direct interpretation of the single test score (end-of-year) or composite test score (through-year).
Over time, however, I have become less certain of the applicability of that argument to inferences about student proficiency. What is so special about that single test score generated at a single point in time? We’ve been seeking a better way for decades.
Is the inference (i.e., generalization) about student proficiency based on a single, on-demand test administered late in the year any “better” than an inference based on repeated measures of student performance collected across multiple points in time and perhaps across different settings and contexts?
It seems to me that question should be answered empirically – which of course leads us directly to the assumption, fact, or inconvenient truth that student proficiency must and does exist outside of the state test, which means that there must exist other ways to collect data about it.
Even if you want to cling to the measurement paradigm, it’s time to look beyond the end-of-year state test.
Can You Hear Me Now?
Viewing the student proficiency problem from a different perspective (i.e., data collection) opens up a whole new world of possibilities.
If we still viewed “enabling people to talk to each other” as the primary problem telephones were intended to solve rather than the broader communication problem, we would still be asking “Can you hear me now?” on a regular basis. Frankly, when I used my cell phone last weekend to make a birthday call to my uncle in Florida from my house in Maine, neither of us more than six miles off of I-95, the quality of the voice connection wasn’t much better than it was 20 years ago. But when I texted my cousin who was there with him and we switched to FaceTime, it made all the difference in the world.
We can continue to build better state summative tests to measure student proficiency, but that investment will be the equivalent of building a more comfortable carriage and breeding stronger, faster horses when the world has moved on to automobiles, trains, and planes – and Zoom.
Will there still be issues like comparability, security, and fairness to solve when we view the problem as a data collection problem? Of course. But we haven’t solved those problems yet within the limits of the measurement/assessment paradigm. I, for one, cannot wait to see how those issues are solved with a broader array of tools than are available in our assessment toolbox.
Am I suggesting that we cancel the Spring 2023 state tests? Of course not. We’re not there yet. But we will never get there if we don’t realize that there are viable alternative solutions.
When we get there, state tests, like me, can retire happily as an anachronism. I just pre-ordered both a CD and vinyl edition of Taylor Swift’s new Midnights album – available October 21.
Image by Miguel Á. Padriñán from Pixabay