assessment, accountability, and other important stuff

Archive for March, 2014

Standard Setting v. Setting Standards

Charles DePascale, Center for Assessment (NCIEA)

January 2014

 

Over the next two years, Smarter Balanced (summer 2014) and PARCC (summer 2015) will engage in standard setting activities to determine the scores on their tests that classify student performance into particular achievement levels. By all accounts, those standard setting efforts will be massive and complex, and will apply the most current and best thinking on standard setting that our field has to offer. Over the last two years, however, the two consortia have been engaged in the activity of setting standards, and therein lies the problem.

 

Standard setting, as described above, refers to the process in which expert judgments and/or empirical data are used to relate performance on a test to established achievement standards; that is, to determine the passing scores or cut scores on a test that distinguish between performance at two or more achievement levels. Setting standards, on the other hand, is the process of defining those achievement levels, or standards, in terms of the knowledge, skills, and abilities that a person who meets the achievement standard must possess and demonstrate. In the case of the new common core assessments and the Common Core State Standards (CCSS), setting achievement standards means defining what it means to be college- and career-ready in high school or “on track” to college- and career-readiness at earlier grade levels.

Through history and happenstance, in K-12 education we have melded the tasks of setting standards and standard setting into a single process closely tied to, and in many ways defined and constrained by, a large-scale standardized test. When the notion of reporting results on achievement tests in terms of achievement standards as an alternative to norm-referenced scores such as percentile ranks or grade equivalent scores began to take hold, there were few, if any, established content standards or achievement standards. With reporting scores on a test as the catalyst for setting achievement standards, it is logical that early efforts combined the processes of setting standards and standard setting and that those achievement standards were associated with a particular assessment.

As the adoption of state content and achievement standards became the norm following the passage of IASA in 1994 and NCLB in 2002, there was an opportunity to separate achievement standards from an assessment and tie them more closely to content standards. In practice, however, content standards require interpretation, and the state assessment was used as the vehicle to clarify and define the state’s content standards as well as to define its achievement standards. There are serious drawbacks to this approach, of course, such as the danger that the definition of achievement standards will be limited to those knowledge, skills, and abilities that can be easily measured via an on-demand, large-scale standardized assessment. In the context of state assessment, however, there are also clear advantages to a close alignment between a state’s content standards, achievement standards, and its assessment. What is most important in the context of state assessments, however, is that there is one set of content standards, one set of achievement standards, one assessment, and one agency (and often one group of people) responsible for all three. For better or worse, a definitive interpretation of the state’s content standards is expressed through its state assessment and the achievement standards developed for that assessment.

In the case of the CCSS and the assessments being developed by PARCC and Smarter Balanced, the circumstances are quite different from our familiar state context. We began with one set of content standards, the CCSS, which, like their predecessors, require interpretation to be translated into curricula and instructional materials and applied in the classroom. However, we now will have two assessments and two sets of achievement standards aligned to those content standards rather than one. Again, what is most important is that we have not one agency, but NO agency responsible for the interpretation of those content standards. The agencies that coordinated and championed the development of common content standards clapped their hands in celebration of the release of the CCSS in 2010 and then washed their hands of any responsibility for the interpretation of those standards, the development of achievement standards aligned to those standards, and the manner in which those standards are assessed.

The development and publication of the CCSS was a major accomplishment, but an incomplete accomplishment and an opportunity lost. In 1992, the National Council on Education Standards and Testing, in its report to Congress, recommended the development and adoption of national standards for students that “include the specification of the content – what students should know and be able to do – and the level of performance that students are expected to attain – how good is good enough.” The Council first ties content standards (what students should know and be able to do) to achievement standards (how good is good enough) and only then links those combined content and achievement standards to assessments: “Since tests tend to influence what is taught, assessments should be developed that embody the new high standards.” In this era of test-based accountability, the influence of tests on what is taught is likely stronger today than it was in 1992. By leaving the question of “how good is good enough” in the hands of individual states, groups of states, and assessment developers, we risk losing the “common” in the common core.

Although PARCC and Smarter Balanced have garnered most of our attention, there are already state assessments in place in Kentucky and New York that define achievement standards based on the CCSS, and several more state and commercial assessments will become operational in the next two years. By 2016, it is likely that there will be at least a dozen interpretations of “how good is good enough” on the CCSS. Having missed the opportunity to build achievement standards into the CCSS, where do we go from here?

One approach is to recognize that the development of content standards is an ongoing process with a built-in five- to seven-year review and revision cycle, to acknowledge that the development of the CCSS is incomplete, and to use this opportunity to continue the national conversation about common standards. As achievement standards are being set by the consortia and individual states, we can foster ongoing cross-state discussions of what it means to be college- and career-ready or “on track” to college- and career-readiness. After those achievement standards have been set, we can foster serious, in-depth analysis of what those standards have in common, how they differ, and how well they embody the CCSS. To be clear, that analysis must go well beyond simply mapping the results of the various assessments to each other or to a common measure such as NAEP and declaring one to be correct. It also must go well beyond relying on one set of achievement standards to emerge from the marketplace. Then, when the next version of the CCSS is released in 2017 or so, we will have a solid foundation for incorporating how good is good enough into the description of what students should know and be able to do.

Taylor Swift, Headbanging, Summative Assessment & Actionable Information

Charlie DePascale, January 2014

Immediately following Taylor Swift’s performance during the Grammy Awards on Sunday night, Twitter exploded with reaction to Taylor’s emotional ‘headbanging’ while performing her ballad All Too Well. The hullabaloo made its way through internet news sites on Monday and should reach NBC News sometime next week or whenever Atlanta thaws out, whichever comes first. Although the seemingly convulsive movements were a surprise to millions, even the casual Swift observer knows that the violent head flip that sends her long hair flying is one of her go-to moves.

Washington DC, 2010

Greensboro NC, 2013

Mansfield MA, 2008

Any true Swiftie who followed the RED Tour last year knew what to expect Sunday night during All Too Well and knows that full-body movements at the piano have been a signature moment of any Taylor Swift concert since the Fearless Tour in 2009-2010.

Fearless Tour

Speak Now Tour

How does this relate to summative assessments and actionable information?
Like the Grammy Awards, summative assessments may provide new and actionable information to policymakers not familiar with an individual student or school. However, for anyone with deeper knowledge of the individual student, for example, a teacher, the summative assessment, like the Grammy Awards, should confirm what we already knew and expected to see.