What Do We Want From State Testing?

What is it that we want from state testing?

The responses are as varied as the responders.

Some want everything – A, B, C, D, plus all of the above. And our field has been more than willing to promise them the moon and stars – everything and just a little bit more. It’s nice to feel wanted.

Some want nothing, nothing at all. Lying in wait for the day when state testing finally meets its inevitable fate, the sun shines again, a healthy climate is restored, and our long national nightmare is over.

Others who want nothing at all spend their time actively lying about state testing in an attempt to hasten its demise.

In between those who want everything and those who want nothing are those who want something – preferably something useful. [You won’t get observations like that just anywhere.]

Some want information to inform policy.

Some want data to evaluate, advocate, or criticize a particular program.

Some want to monitor progress.

Some hope that their state tests provide confirmation, affirmation, or validation – and the peace of mind that comes with it.

Fine. To each their own.

Let’s try this question:

How are we going to use the data and information that we glean from state testing to improve educational inputs and outcomes for individual students, schools, districts, and the state?

Take your time. I’ll wait.

As the song says, “I’ve never heard silence quite this loud.”

Nearly a quarter century into the NCLB era, that question should not be difficult to answer. At this point, it should be crystal clear exactly what state testing – in its current form – can and cannot provide, and what it can and cannot tell us to help answer that nagging question about whether our children are learning.

There should be countless peer-reviewed articles published in highly regarded journals, and also in the stuff that educators and policymakers read or watch online, detailing how state testing has been used to improve the lot of this group of students and that group of students.

With information from state tests telling us whether Johnny can read, we should not only be able to answer the age-old question, Why Can’t Johnny Read? but at a minimum be well on our way to ensuring that every Johnny, Jane, and [fill in your own assortment of ethnically balanced and androgynous names here] can read, write, and do math (or at least understand when math is being done to them).

Sadly, that’s not where we are. Not even close.

So, let’s try one more question:

What is it that we need, truly need, from state testing?

First and foremost, we need to know whether students are able to do what we expect them to be able to do at a particular point in time.

  • The phrase knowing “whether students are able to do what we expect them to be able to do” is a generic way of saying we need to know whether students have met a specified standard: that is, met grade-level expectations, are proficient, on track, or college-and-career ready. The particular label used matters a little more than we thought it did, but not nearly as much as we think it does. What matters is the definition of the content, skills, behaviors and the achievement standard.
  • That point in time might be at the end of every grade level. It might be at the end of grades 4, 8, and 10 or 3, 7, and 9, or 4, 6, and 8. It might be at the end of every three months from kindergarten through twelfth grade – probably not but it might be. There are solid arguments to be made for the need for information at various points in time. The most important points in time might very well vary by content area.

PLEASE NOTE: The statement above is very different from stating that we need to know what a student can do – or what they cannot do.

Simply put, the state has no need to know what every individual student can or cannot do at a particular point in time. More importantly, a state test (under even the broadest and most innovative definition of the term) cannot provide that level of detailed information. No debate. No tradeoffs to be considered. Those are just the facts, ma’am or sir or [fill in your own assortment of ethnically balanced and androgynous personal titles here].

Second, we need to be able to tell parents, teachers, administrators, perhaps policymakers, and students themselves whether students have made sufficient progress – again, during a specified period of time.

Note the difference in phrasing between “we need to know” with regard to status and “we need to be able to tell” used here for progress. I am not sure how much the state needs to know about progress, but I am certain that the key stakeholders listed above want that information. Status alone (e.g., Proficient/Not Proficient) is not enough information to sell state testing.

  • Progress goes by many names and has many interpretations, some appropriate, some wildly inappropriate, but that discussion is beyond the scope of this post. The takeaway here is that people want an indication of whether students are making sufficient progress toward some goal, relative to other students, and in one content area relative to another (see Betebenner, 2009).

In an ideal world, progress might be reported by simply noting that all students have met the expectations mentioned above at each specified point in time.

We do not, however, live in an ideal world.

Our world is full of variation – which you may or may not consider ideal.

Students learn at different rates – which you may or may not consider ideal.

Students are not assigned to grade levels on the basis of the likelihood that they will be able to do what we expect students to be able to do at the end of that grade level – which you may or may not consider ideal.

So, we need to come up with creative ways to describe student progress – which you may or may not consider ideal, but it is what it is.

There you have it. We need, truly need, state tests to be able to provide data we can use to determine whether students are meeting expectations and making sufficient progress.

A tale as old as time.

Oh, and we need state tests to provide that data efficiently.

That’s all.

It makes you wonder how we have managed to make this so complicated.

You can’t always get what you want

Clearly, I have left out a lot of things that people say they want from state testing, including some things that we routinely provide on our own or in response to federal requirements. I know that.

A partial list of the things that state tests cannot and should not provide:

  • Actionable information (whatever that means) to teachers to inform the instruction of their current individual students.
  • Information on student mastery of individual standards, competencies, etc.
  • Subscores
  • The aforementioned detailed information about what a student can and cannot do.
  • A student’s precise location on a scale – as the term “scale” is used by everyone in the world outside of educational testing.
    • Don’t even mention so-called vertical scales.

On the surface, it may seem contradictory to state that we can use the state test to tell people whether a student has made sufficient progress from Time A to Time B but cannot tell them what a student can do (and cannot do) at Time A and Time B. It may seem contradictory, but it’s not. Trust me on this, I’m a doctor.

It may also seem that I am being unduly positive about the use of performance classifications (i.e., achievement levels) and overly negative toward the value of scales and scaled scores. Perhaps.

Achievement levels and performance classifications have their drawbacks, particularly when reported with “certainty” for individual students and when used as they have been for school accountability – as Andrew Ho and others have eloquently pointed out many times over.

But achievement level classifications do not have as many drawbacks as the unidimensional scales which we have oversold – oh, how we have oversold those meaningless, arbitrary, capricious, and misleading scales and scaled scores.

When we have figured out a closer connection between learning progressions (i.e., content-based markers of student development and progress) and scales, we can talk scaled scores.

Until then, report distributions of scaled scores on a single grade-level test at the aggregate level (school, district, subgroup, etc.) if you must – no averaging scaled scores across tests, please, and enough with ineffable effect sizes.

But treat scale scores at the individual student level as nature and our founding forebears intended – as raw data to be transformed into statistics that with careful thought and planning can convey useful information to test users; that is, to people who know what to do with it – DIKW – IYKYK.

Image by Pete Linforth from Pixabay

Published by Charlie DePascale

Charlie DePascale is an educational consultant specializing in the area of large-scale educational assessment. When absolutely necessary, he is a psychometrician. The ideas expressed in these posts are his (at least at the time they were written), and are not intended to reflect the views of any organizations with which he is affiliated personally or professionally.
