We have all asked the question, “Where did the time go?”
As troubling as that question can be, more recently I find myself pondering an even more vexing question: “Where did time go?”
Every day, it seems as though time has been removed as a dimension or component of some part of our lives in which it was always essential.
Television, of course, is a prime example. I grew up with “same bat time, same bat channel,” and Sunday nights at 8 with the family in front of the television (could I stay awake long enough to see Topo Gigio?). Later there was 11:30 on Saturday nights and “must see TV” on Thursday. Appointment television!
Now, I can watch a show whenever, wherever, and however I want – on demand. I can still watch any of those shows referenced above as easily as a show that aired last night. And not just whole shows. I can pull up a clip of my favorite moments, like Sheldon erasing time as he makes a basic mistake while explaining the time parameter to Penny on The Big Bang Theory.
Not only can we pull up television or movie clips; clips of our own lives are now also neatly stored and readily available on demand. We are supposed to forget certain things over time and to be able to process, shape, and reshape our memories. However, as Taylor Swift wrote recently,
“This is the first generation that will be able to look back on their entire life story documented in pictures on the internet, and together we will all discover the after-effects of that.”
Will it become more difficult for time to heal all wounds if we remove the passing of time; if every day, or at any time, moments in our lives are replayed for us in full color, with video and even sound?
Our Brief History of Time
Educational measurement, of course, has not been immune to this loss of time. In previous posts, I have discussed our loss of the time needed to design, develop, and evaluate assessment programs before making them operational. There is also the apparent lack of any understanding or consideration of time, and of the foundational formula D = RT, when setting accountability goals for individual students, schools, or states. The loss of time that I want to discuss today, however, is more fundamental to educational assessment.
Not so long ago, time was central to the design and administration of tests and also to the reporting and interpretation of test scores. In the heyday of norm-referenced testing, test scores were based directly on the interpretation of a student’s performance at a particular point in time. Grade Equivalent scores described student performance in terms of what was typical (or expected) at a given point in time within a school year. Those scores, as well as percentile ranks and stanines, were based on the particular point in time at which the test was administered, with separate norm tables developed for each week within a defined test administration window. As we moved to the NCLB era and more criterion-referenced achievement levels, student performance was still evaluated and interpreted in comparison to expectations at a fixed point in time (i.e., at the end of a particular grade level). Time was still in play as recently as 2010 with the advent of the Common Core State Standards, when we spoke of student proficiency in grades 3-8 in terms of being “on track for college-and-career readiness” by the end of high school. Referring again to our old friend, D = RT, the use of the term ‘on track’ implies that we have a fairly thorough understanding of distance, rate, and, of course, time.
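To make concrete just what an “on track” claim demands of us, here is a toy sketch of the D = RT logic. Every number in it is hypothetical – the scores, the target, and the growth rates are invented for illustration and are not drawn from any real vertical scale.

```python
# Toy illustration of the D = RT logic behind "on track" claims.
# All scores and rates are hypothetical; real vertical scales are far messier.

def required_rate(current_score, target_score, years_remaining):
    """Rate (scale-score points per year) needed to cover the remaining distance."""
    distance = target_score - current_score   # D
    return distance / years_remaining         # R = D / T

def on_track(current_score, target_score, years_remaining, observed_rate):
    """A student is 'on track' only if observed growth meets the required rate."""
    return observed_rate >= required_rate(current_score, target_score, years_remaining)

# A hypothetical grade 8 student, 4 years from a "college-ready" target of 560:
print(required_rate(440, 560, 4))     # 30.0 points per year needed
print(on_track(440, 560, 4, 35.0))    # True: growing faster than required
print(on_track(440, 560, 4, 25.0))    # False: current rate falls short
```

The point of the sketch is how much it presumes: a known target (distance), a trustworthy growth trajectory (rate), and a fixed horizon (time). Remove any one of the three and “on track” loses its meaning.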
Losing Track of Time
Somewhere over the last five years, however, the assessment/measurement community lost track of time. Ironically, our loss of appreciation for time can be attributed, in part, to pressures directly related to time – too much testing time, too long to report results, and the well-intentioned yet poorly conceived backlash against “seat time” in favor of competencies to be defined later.
But those reasons can only partially explain our complete abandonment of time. Perhaps we simply have succumbed to the pressures of an on-demand world. Perhaps we started to believe our own rhetoric about vertical scales, invariance, and the wonders of IRT. Perhaps the assessment industry is simply trying to adapt to technology and the “lean startup” concept – get the product in the hands of the customer faster.
With almost reckless faith in psychometric theory, we are willing to boldly go where no assessment person has gone before. We will administer items anytime, anywhere, in any combination, and apply item parameters generated across wide swaths of time (it all averages out in the end) to produce a theta estimate for a student.
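For readers outside psychometrics, the mechanics look roughly like this: under a 2PL IRT model, we treat item parameters as fixed and known – the “it all averages out” assumption – and find the theta that best explains a student’s response pattern. A minimal grid-search sketch, with wholly invented item parameters:

```python
import math

def p_correct(theta, a, b):
    """2PL item response function: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def estimate_theta(responses, items):
    """Maximum-likelihood theta via brute-force grid search over [-4, 4].

    responses: list of 0/1 item scores; items: list of (a, b) parameter
    pairs, treated as fixed and known regardless of when they were estimated.
    """
    best_theta, best_ll = 0.0, -math.inf
    for step in range(-400, 401):
        theta = step / 100.0
        ll = 0.0
        for r, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            ll += math.log(p) if r else math.log(1.0 - p)
        if ll > best_ll:
            best_ll, best_theta = ll, theta
    return best_theta

# Three hypothetical items of increasing difficulty (a = 1.0; b = -1, 0, 1);
# the student answers the two easier items correctly and misses the hardest.
print(estimate_theta([1, 1, 0], [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]))  # about 0.8
```

Note what the sketch quietly assumes away: the item parameters carry no timestamp. Whether they were calibrated last month or five years ago, the estimation machinery treats them identically.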
And what do we do with that theta estimate? That’s where things get tricky. Our “time-based” tools for reporting and interpreting test scores have not caught up with this new “time-free” approach to assessment. We convert the theta estimate to a scale score – even a vertical scale score. And then …
Time Is All We Have
And then we are face-to-face with the reality that educational assessment cannot exist without time. Without slipping into the philosophical argument over whether any type of psychological measurement, including educational measurement, is “real measurement,” we have to acknowledge that virtually all of our IRT-based assessment lacks the underpinning of a theory-based scale. At our best, we assemble an agreed-upon collection of items and collect data on student performance on those items at a particular point in time. We cannot interpret student performance on our large-scale assessments without a consideration of time and both the expected and relative performance of students at that point in time. We can make awkward attempts to couch test scores in criterion-referenced terms, but as the quote often attributed to Bob Linn says, “scratch a criterion and you’ll find a norm.”
But if we have the serenity to accept the ways in which we cannot change our dependence on time, perhaps we will have the courage to change the things that we can change, and the wisdom to know the difference.
At this time, we are embarking on one of our field’s greatest adventures and challenges – the development of assessments to measure attainment of the Next Generation Science Standards. It is a task that challenges everything we know and hold dear about alignment, item construction, test construction, scoring, reporting, reliability, and, of course, validity. With nothing more than a meager notion of a construct, we are developing and implementing NGSS assessments. Perhaps these NGSS assessments will be an example of the old principles of test construction meeting the new principles of the lean startup strategy – iterating with the client to understand the construct and build the product that is needed. The NGSS assessments and construct will form and re-form each other over time. If that’s the mindset of the assessment developers, clients, and policy makers, that’s not necessarily a bad approach.
Only time will tell.