Assessment by Any Other Name, Please

Edy’s Pie, Ben’s Original Rice
The Chicks, Lady A
The Princeton School of Public and International Affairs
The Washington Football Team
The Altria Group, American Outdoor Brands Corp.
WW, Dunkin’

One of the legacies of 2020 is a spate of name changes, most for the same underlying reason. As demonstrated by the final rows of the partial list of prominent name changes above, however, changing names did not begin in 2020 and names are not changed solely in pursuit of advancing social justice. Names of things, places, and even people are changed for any number of reasons – Ruth Bader Ginsburg was not named Ruth, Martin Luther King was born Michael King, Jr. If social media last week is to be believed, even Anne Hathaway doesn’t like to be called Anne.

Names are often changed in an effort to disassociate from the past or to honor the past. However, some hope that a name change will enable them to expand a market or to become more inclusive, to create a more modern or appealing image, or to refocus and recenter. Some name changes are intended to obfuscate and others to increase clarity. In 2018, I called for the rebranding of educational measurement for a combination of the reasons cited above. (I expect the tiny ripple caused by that post to become a wave any moment now.)

This year, I set my sights on our overuse of the word assessment. Fancying ourselves assessment specialists (an ill-advised oxymoron to begin with), we are happy to include virtually anything and everything under the umbrella of the term assessment. Instead of being selective about what we consider assessment, we devote considerable effort to describing differences between various types of assessment by assigning the appropriate modifier: performance, authentic, comprehensive, balanced, large-scale, classroom, benchmark, diagnostic, high-stakes, low-stakes, and of course, formative, interim, and summative.

Ah yes, formative and summative, terms we appropriated from our background in program evaluation.  And interim, the term coined by my colleagues at the Center for Assessment to protect formative assessment by corralling the host of commercial tools falling between formative and summative assessment.  Perie, Marion, and Gong (2009) also developed a schema to distinguish among those three tiers of assessment “based on the intended purposes, audience, and use of the information.”   Their contribution to the gallery of assessment triangles further distinguished formative, interim, and summative assessment by scope and duration of cycle and frequency of administration.

The use of formative, interim, and summative labels has been somewhat successful. However, as we begin 2021, I have become convinced that many of our current problems stem from the use of the term assessment to describe all three.  There are several key differences and only a few superficial similarities (e.g., the use of tests) between the purpose and use of formative, interim, and summative assessment. Those differences affect everything, including the type and quantity of evidence collected, how it is collected, when and from whom it is collected, the results that are reported and how they are reported.

From my perspective, we gain very little by calling all of this assessment. On the other hand, the continued use of the term assessment creates confusion and frustration by creating the false expectation that all will provide actionable information to inform instruction, an expectation that simply cannot – and should not – be met. During an NCME webinar last week, Allison Timberlake of the Georgia Department of Education reported with dismay that the vast majority of Georgia educators surveyed felt that the state assessment served no purpose or provided no benefit to them. I propose that a new set of labels is a necessary first step toward helping those educators understand the purpose and use of state tests, and ultimately helping them accept the role that state tests are designed to play in K-12 education.

New Names for a New Day    

Current Name                        New Name
Formative Assessment                Formative Assessment
Interim Assessment                  Progress Monitoring
Large-scale Summative Assessment    Census of Academic Achievement

Formative Assessment  →  Formative Assessment

Although this may seem heretical to my large-scale, standardized assessment sisters and brothers, I propose that we relinquish all claims to the term assessment and leave it with teachers where it belongs.

We deal with tests. Teachers deal with assessment. Despite our longstanding and continued practice of using the terms interchangeably or as synonyms, they are in fact different. A test, plain and simple, is a measuring instrument, a device for collecting data. Assessment, as described by Cizek, “is best defined as the collection of many samples of information – that is, many tests – toward a specific purpose,” and “the planned process of gathering and synthesizing information relevant to the purposes of: discovering and documenting students’ strengths and weaknesses, planning and enhancing instruction; or evaluating and making decisions about students” (Cizek, 2020, 1997).

Cizek concludes, “In every case, assessment [inside and outside of education] involves collecting and summarizing information in order to develop a course of action uniquely tailored to an individual’s needs.” Within the context of K-12 education, that is the role of teachers and that is the purpose of formative assessment.

Interim Assessment  →  Progress Monitoring

As Perie, Marion, and Gong (2009) explained and D’Brot and Landl (2019) further elucidated, the various tools that we classify as interim assessment serve a variety of purposes (predictive, evaluative, instructional) and “the key distinction between different interim assessments is how the results are intended to be used in service to [informing teaching and learning].” This time with apologies to my extended family from the University of Minnesota, from my perspective virtually all of those tools and purposes that we include under the heading interim assessment are captured nicely by the concept of Progress Monitoring.

Progress Monitoring is a term already in use and familiar to many teachers and local administrators. Additionally, with delineated tiers of screening and intervention, it encompasses the different purposes, types, and frequency of testing commonly associated with interim assessment; and progress monitoring also ties the results of those tests much more directly to further instruction and testing than we have ever done with our attempts to define interim assessment. Finally, in response to the argument that co-opting the term Progress Monitoring will only cause more confusion, I would argue that rather than co-opting we are, in fact, repatriating these tests and this process to their rightful home. Tracing the historical roots of most of what we now classify as interim assessment will lead directly back to progress monitoring.

Large-Scale Summative Assessment (State Tests)  →  Census of Academic Achievement

It has long been my contention that state testing is a data collection activity rather than a measurement activity. The proposed name, Census of Academic Achievement, is intended to reflect that belief. My hope is that the term census will better convey that state tests provide information to inform policy and the distribution of resources, and that it will lower expectations of receiving feedback from them to directly inform instructional decisions.

The term census should also convey the idea that the information collected (via a test or other data collection tool) is going to be analyzed, combined with other data, placed in context, and converted to information that is reported in a manner designed to inform policy. I should note that prior to NCLB, it was much more commonplace for state “assessment” programs to collect a wealth of additional data via questionnaires (student, teacher, administrator) or other means, and to incorporate that data into the reporting of state test results.  NAEP and college admissions tests still engage in this practice. At the state level, however, much of that data collection and interpretation activity has now been redirected to state accountability systems.  One could argue that state accountability systems are a much better match to the general definition of assessment than state tests. Accountability systems, however, lack the direct connection to informing instruction that is central to the definition of assessment in the context of K-12 education.

I also considered Annual Census of Academic Achievement as a new name, but I am not certain whether an annual census is necessary or whether a census every two or three years might be more beneficial for informing education policy and supporting school improvement.

A Summative Coda

The name change proposed for summative assessment applies only to large-scale summative assessment. Obviously, there is additional testing under the heading of summative assessment that occurs at the school and district level that is not addressed here.  Without going into detail, I suggest that the bulk of summative “classroom testing” using various tools (e.g., tests, quizzes, projects, portfolios, presentations) can be covered under the umbrella of grading.

