Mark Schneider and Kumar Garg’s recent call for a “SpaceX for assessment,” and the responses to it by Chester Finn and Anne Wicks, among others, highlight the complexities of large-scale testing and the challenges associated with trying to improve K-12 assessment. Sadly, even if our fractured field were somehow able to address the many technological, bureaucratic, and human issues raised in those thoughtful posts, I am not convinced that we would be any closer to addressing, let alone solving, the underlying problems that are masked by a focus on the real or perceived deficiencies in K-12 large-scale tests, assessment programs, or assessment policies.
As I wrote ten years ago in a 2011 EdWeek commentary, “developing an assessment system, even a next-generation one, … is not rocket science,” but “large-scale assessments … are simply supporting pieces” in an effort “to improve instruction and student learning.”
I am all in on the need to continue the ongoing improvement of large-scale state testing and the evolution of our thinking about balanced assessment systems and the role of assessment in K-12 education. On some days I might even agree with those calling for an assessment revolution.
But putting too much effort into improving large-scale testing before we have a better understanding of how we want to use it, and before we address the underlying factors affecting student performance and school quality, seems to me to be a fool’s errand.
Since the passage of ESEA in 1965, we have more than 50 years of test results showing consistent gaps in achievement. We have state-level NAEP data since 1992 showing the same thing. Thanks to NCLB, we have more state test data than we can handle, disaggregated in more ways than anyone can possibly understand.
I am at least 15 years beyond the point of believing that simply building a better large-scale test is going to make that much of a difference.
Wag The Dog
There are, of course, grains of truth in the adage “what gets tested, gets taught,” and in the belief that curriculum and instruction are influenced more by the items that appear on the state test than by the contents of the state’s rigorous college-and-career ready standards.
I suspect that the influence of the state test on curriculum and instruction is directly proportional to the consequences associated with the test for individual districts; and therefore, is inversely proportional to the capacity, or ability, of the district to meet the state’s performance expectations.
That is, the state test will have a greater influence on behaviors in low-performing districts than in high-performing districts. To a certain point, that outcome is what we are seeking by implementing a state assessment program. In general, we want people to pay more attention to any measure when it is in the “red zone” than when it tells us that everything is fine.
Problems arise, of course, when local educators and state policymakers drift beyond that certain point and instead of a desirable outcome we get undesirable behaviors – in particular, the classic case of the tail wagging the dog.
Actually, the case of large-scale state testing fits both negative connotations of the term wag the dog.
Something important being controlled by something much less important
At the local level, we see curriculum and instruction (which are very important) being controlled by a test score (which is much less important). Efforts to improve test scores overwhelm efforts to improve instruction and learning. And why would we expect anything different?
We state glibly that if schools improve curriculum and instruction, test scores will follow (with about the same chance of it happening as NCLB’s directive that states test students in mathematics in their native language “to the extent practicable”). Without the infrastructure, resources, and capacity to improve curriculum and instruction, the best option available to districts may be to focus on improving test scores by any means necessary.
Focusing on one thing to distract attention from a more serious issue or problem
Large-scale state test results are big news and large-scale state tests are both an easy solution and an easy target.
I don’t know whether the theory of action for NCLB-mandated, test-based accountability ever really was that simply administering annual tests would make things better. The disproportionate focus, year after year, on test scores (i.e., outcomes) rather than on how effectively an enormous amount of state and federal money is being spent to produce the achievement reflected in those scores (i.e., inputs) does make one pause and wonder why.
More confusing to me than the state-level obsession with test scores has been the willingness of local educators (i.e., teachers’ unions) to focus so much of their limited resources on attacking state assessment. Sure, there is room for improvement in large-scale state tests and even more room for improvement in test-based accountability. But knee-jerk complaints that K-12 large-scale tests are too expensive, too long, too irrelevant, too focused on low-level cognitive skills, and return results that arrive too late and are too coarse to serve any useful purpose in improving instruction do little to focus attention on the underlying issues.
We must not continue to allow ourselves to be distracted by large-scale testing. Instead, use the test scores as a weapon to make the case for high-quality curriculum, adequate resources to support instruction, and the infrastructure to support student learning. Gain some traction on those issues and test scores will follow. There it is, just the right amount of glibness.