In my last post, I outlined many of the ways that state testing has changed since the shift from norm-referenced to criterion-referenced, standards-based tests in the 1990s, changes that include:
- who is tested (accessibility and inclusion),
- what is tested (alignment to state standards, higher-order skills),
- when we test (testing windows and timing of sessions),
- how we test (computer-based vs. paper-and-pencil, constructed-response items), and
- why we test (formative program evaluation vs. accountability and individual scores).
Despite significant changes in these and other areas, most would agree that there has been relatively little innovation in state testing over that same time period.
Yes, educational assessment has benefited from, and taken advantage of, innovation in other areas, particularly advances in technology.
Yes, the processes for developing and analyzing state tests have been tightened, sharpened, focused, and in general, improved by the application of such innovations.
But innovation in state testing itself, not so much. We’re all familiar with the timeline.
Innovation Interruptus
The absence of innovation is not for a lack of trying.
The 1990s bubble that was a Golden Age of Innovation in state testing burst when RAND, among others, stuck a pin through the heart of it.
The heralded giant leap Beyond the Bubble Test to the next generation of state tests began with a bang in 2010 and ended with a whimper.
And as for the Innovative Assessment Demonstration Authority (IADA), well, that program was always more about finding innovative (i.e., more efficient) ways to arrive at the same “proficiency determination” than it was about what most people regard as true innovation in education and educational assessment. Let’s put a pin in IADA for now, and we’ll circle back a bit later.
I’m not claiming that there has been no innovation in educational assessment. For example, DLM alternate assessments for students with significant cognitive disabilities feel innovative, as do the Navvy Classroom Formative Assessment system and some of the tests designed to assess the Next Generation Science Standards (the NGSS themselves representing something of an innovation in state standards). There is also innovation potential in other areas of assessment, but it’s still a little too soon to say with regard to measures and instruments being developed in the name of competency-based assessment and way too early to weigh in on culturally relevant and sustaining assessment.
The bottom line is that innovation is not a word I can use to describe state testing in English language arts and mathematics at grades 3-8 and high school.
Why?
Barriers to Innovation in State Testing
For my money, there are two obvious barriers to innovation in state testing: Us and Them.
“Them,” of course, is the federal government. Federal assessment requirements by their very nature suck the life out of innovation. First, the sheer volume and frequency of required testing leave little room for anything more than maintaining the status quo. Second, those assessment requirements are highly prescriptive – again leaving little room for innovation. The upshot of those two factors, as Watershed Advisors noted in a recent issue of their newsletter, The Delta, is that state education agencies are primarily focused on and built for compliance. That was not always the case with regard to assessment and state testing. State agencies and state testing once were breeding grounds for assessment innovation (see the 1990s). As I wrote in a 2021 post, however, when the focus of a field is on compliance, those with an innovative spirit and mindset move elsewhere.
“Us” in this context refers to the cadre of education reformers to which assessment specialists belong and its counterintuitive (and, IMHO, truly warped) view of innovation as it relates to standards-based education policy, instruction, and assessment. In almost all other areas of interest, innovation makes an experience easier or more efficient for the end user in some definitive and desirable way; that is, innovation removes barriers and frustrations (that blinking 12:00 notwithstanding).
In standards-based education, however, the term innovation invariably is used in reference to implementing more challenging standards, setting higher expectations for student achievement, and holding schools and stakeholders (i.e., educators and students) accountable for accomplishing more in the same amount of time under largely unchanged conditions. I’ll take that blinking 12:00 over that type of innovation any day of the week and twice on testing day.
Nothing in the previous paragraph should be construed as suggesting that the aforementioned reforms (higher standards, improved instruction, more rigorous assessment, accountability) are not needed – they surely and sorely are. They cannot and should not, however, be marketed as “innovative” or conflated with “innovations” unless they are accompanied by supports and changes to conditions that at least in some small way meet the basic requirement of making life and the educational experience easier (not just better) for end users.
Circling Back to Innovation and IADA
Without going into specifics or naming names, I think that each of the handful of state attempts at innovative assessment that we have seen under IADA, as well as other alternatives proposed under RTTT and ESSA before it, can be placed neatly into one of two boxes.
The first box contains assessment initiatives geared toward the traditional definition of innovation. Those states proposed programs designed to increase efficiency by making use of external assessment instruments that schools were already using or by attempting to link state assessment directly to the curriculum and tests that would naturally be administered during the course of instruction.
The second box (somewhat larger than the first) contains all of those assessment programs that fit the education reform definition of innovation. Some of these efforts could be described as bottom-up, emerging organically from a few dedicated early-adopter districts, while others were assessment visions dropped top-down from the state. What they shared was asking schools, students, and teachers to do something that they were not already doing in the name of making things better.
The first box was taped up and set out on the curb because the “true” reformers viewed those assessments as ratchet. The second box joined it because those programs tried to ratchet things up too quickly, without gaining the necessary buy-in or making the prerequisite improvements to infrastructure.
There is a third group of potentially innovative assessments that don’t belong in either of those two boxes. These are the assessments that were boxed in by convention. They started out as grand visions of innovation but, after being subjected to a few rounds of state testification during the design process, ultimately looked like generic state tests.
Innovating Outside of the Box
How then do we break out of these boxes and innovate in state testing?
The correct answer to the question “How do we innovate in state testing?” is the same today as it was 20, 30, 40 years ago and the same as it will be 20, 30, 40 years from now:
We don’t.
And the reason that the answer is “we don’t” is because it’s a trick question, one that continues to trick us over and over and over again.
Innovative assessment – in the education reform sense of the term – is the classic case of the assessment tail wagging the dog, and if there is one lesson that we need to take from the past 40 years, it is this: That dog won’t hunt.
Can we improve assessment? Yes.
Can we make assessment more efficient? Sure.
Innovation in assessment, however, can only come after innovation in the things that we are asked to assess.
I am even a bit uncomfortable suggesting that assessment should be at or on the table when initial decisions are being made about instruction and curriculum, because of the risk that those decisions will be too heavily influenced by notions of what can be assessed.
Let others better suited to the task make critical decisions about curriculum, instruction, and what information they need to gather via assessment. Then we’ll work with them to figure out ways to provide them with the information they need. That’s our role.
It may not be perfect information at the start, but it will be adequate; and adequate information about the right thing is always better than perfect information about the wrong thing. Over time, feedback will result in iterations and refinements to what is assessed and how it is assessed – and that’s the way this science is supposed to work. Our task is to ensure that those refinements are driven by the construct (content, skills) and not construct-irrelevant assessment factors.
Before closing, I also want to be careful to note an important distinction between innovation and invention. They say that necessity is the mother of invention. Faced with the necessity of accomplishing tasks like testing seven instead of three grade levels, assessing all students (including students with disabilities and English learners), or testing all students with a single computer lab, we will invent ways to accomplish those tasks.
Innovation is a level beyond that – of dreaming of things that never were and asking why not.
Right now, there are people working hard each day, dreaming of innovative ways to better convey 21st century skills to 21st century students.
Where they lead, we must follow.
Image by Mediamodifier from Pixabay