The Butterfly Effect

This is the second in what unexpectedly became a series of posts on validity, validation, state tests, and state assessment programs. In these first two posts, I am focusing on the primary purpose of state tests and interpretation of student test scores as a measure of student performance on the state’s content standards; that is, state tests as a measure of student proficiency.  In subsequent posts I will expand the discussion to address the implications and consequences of other interpretations and uses of those test scores. 

As I mentioned in my previous post, historically, the primary interpretation of scores from state assessments was related to student proficiency on state standards.  This limited claim created a comfortable cocoon within which we could craft a content-based validity argument.

With the introduction in 2006 of annual testing at grades 3 through 8 under No Child Left Behind, our validity argument started to push on the cocoon just a bit. States began to expand their interpretation of student test performance to explicitly suggest that student proficiency at one grade level signaled a level of preparedness for instruction in the following grade. That forward-looking interpretation was not stressed to the same extent when state assessments were administered once in elementary, middle, and high school and there was a greater focus on schools than students. (Yes, there were cases where student performance on the state test was a gatekeeper to advancement to the next level of schooling or to high school graduation, but I am treating those more as test uses than as primary interpretations of test scores and will address them in subsequent posts.)

Before states and the measurement community could process that subtle change, however, the cocoon was burst wide open with the development and adoption of the Common Core State Standards (CCSS) and the emergence of the concepts of college-and-career readiness at the high school level and its grade 3 through 8 counterpart, “on track” to college-and-career readiness. The explicit focus on college-and-career readiness that accompanied the CCSS had implications for curriculum, instruction, and assessment. Another effect of the external focus of college-and-career readiness, of course, was that it introduced an additional interpretation of scores from state tests that must be accounted for in the validity argument for state tests developed to measure student proficiency on the CCSS.

As practice catches up with this change and this charge, we are beginning to see increased pressure on states to provide evidence supporting the interpretations of college-and-career readiness and “on track” to college-and-career readiness. Although those two interpretations may seem similar in that both address college-and-career readiness, there are key differences between them with regard to validity and the validity evidence that states should be considering.

College-and-Career Readiness

In the pre-Messick, trinitarian view of validity, interpretations related to college-and-career readiness would be a textbook example of criterion-related validity. Even within the current validity framework, evidence based on relations to other variables (e.g., freshman year course grades or GPA) is likely to be the primary evidence needed to support interpretations of college-and-career readiness for K-12 state assessments, particularly as interpreted by USED, with college-readiness equivalent to college-and-career readiness for all intents and purposes.

The use of college admissions tests in state assessment programs as indicators of college-readiness raises additional validity challenges. The consequences of the use of tests such as the ACT and SAT for college admissions are an issue that must be considered by states even though that might not be the intended interpretation and use of the tests as part of K-12 state assessment programs.

The greater challenge facing K-12 assessment programs, however, is providing evidence to support two distinct interpretations of test performance. The example provided in the Standards might have been written directly to address the use of the ACT and SAT in high school state assessment programs:

 “As the discussion in the prior section emphasizes, each type of evidence presented below is not required in all settings. Rather support is needed for each proposition that underlies a proposed test interpretation for a specified use. A proposition that a test is predictive of a given criterion can be supported without evidence that the test samples a particular content domain. In contrast, a proposition that a test covers a representative sample of a particular curriculum may be supported without evidence that the test predicts a given criterion.” (AERA, NCME, APA, 2014, p. 14)

A state’s argument that the test supports both interpretations, of course, must be supported by both types of evidence. To the main point of my last post, states committed by law to the latter proposition while committed in spirit to the former will do what is necessary to meet validity criteria for their selected test, which may be quite different from gathering evidence to determine whether the two propositions are actually supported by the use of their selected test.

The introduction of college-and-career readiness into state standards via the CCSS over a decade ago and its subsequent effect on validity requirements have exposed the longstanding disconnect between high school and state assessment. As I wrote in a post and paper earlier this year, the purpose and design of the American high school do not fit well with the concept of state assessment as a measure of all students against the same standards. The challenge now is to facilitate a shift from force-fitting assessments and flawed propositions to a conversation that leads to the development of claims, propositions, and ultimately assessment programs that are a better match to our vision of high school and varied student pathways to postsecondary readiness and success.

On-Track to College-and-Career Readiness

Similar to college-and-career readiness, an interpretation that a student is on track to college-and-career readiness has an external focus that is above and beyond proficiency on the content standards at the student’s current grade level.  The relation between the test score and the ultimate criterion of college-readiness becomes less direct, however, as we move further away from the high school test and toward the grade 3 test.

Although the “on track” interpretation can (and should) be framed as a validity issue, it is less clear than with college-readiness what is being validated and how it should be validated.  Within the USED peer review requirements, the issue of whether students who demonstrate proficiency across grade levels are, in fact, on track to college-and-career readiness is raised under Standard 6.3 – Challenging and Aligned Academic Achievement Standards rather than under the Validity standards.  Requested evidence is focused primarily on demonstrating the relation between the achievement standard setting process and the content standards. Implicit in those requirements is the assumption that proficiency on the content standards will lead to college-and-career readiness – if only the appropriate threshold for proficiency is determined.

This assessment-centric approach to validating, evaluating, or confirming the “on track” classification is not unusual, particularly when connected to the approval of an assessment program. The approach, however, is based on the assumption that a relationship between the content standards and the criterion (i.e., a path to college-readiness) has been established. Without evidence, outside of the test, that some level of mastery of the content standards is associated with being on track to college-readiness, the exercise of providing evidence related to the test and standard setting is meaningless. Note that meaningless is different from unsuccessful. As discussed in the last post, it will be quite easy to show an association between performance on the test and just about any other cognitive measure. Asking states to provide test-related “evidence that students who score at the proficient or above level … are on track to succeed in college and the workforce by the time they graduate from high school” is simply a fool’s errand that undoubtedly will result in unintended consequences.

At this time, more than the test score or achievement standard, it is the untested assumptions related to the interpretation of the content standards that must be validated, or evaluated, if you prefer. In creating the CCSS, on which most state content standards are based, a combination of expert judgment and research was used to back-map grade-level content standards from statements of the knowledge and skills needed to be college-ready at the end of high school. A decade has passed since those content standards were written, and it may still be too soon to evaluate fully whether a student progressing from grade 3 through high school in a program that is implementing the CCSS with fidelity will, in fact, emerge college-and-career ready. It is not too early, however, to begin the evaluation process; and if we are truly committed to improving instruction, that evaluation will seek more than an association between coarse outcome measures.

Published by Charlie DePascale

Charlie DePascale is an educational consultant specializing in the area of large-scale educational assessment. When absolutely necessary, he is a psychometrician. The ideas expressed in these posts are his (at least at the time they were written), and are not intended to reflect the views of any organizations with which he is affiliated personally or professionally.