assessment, accountability, and other important stuff


Three Little Words


Charlie DePascale

Life is full of three-word phrases.

Some tend to have profound and lasting consequences that extend far beyond what may have been intended when they were uttered. Phrases such as I Love You, That Looks Safe, and for those among us wavering on New Year’s resolutions, Just One Bite might fall into this category.

Other ubiquitous three-word phrases like While Supplies Last, Limited Time Offer, Exclusions May Apply, and Void Where Prohibited function exactly as intended, even if we are usually not happy to see them. Often hidden in the fine print, their sole purpose is to put constraints on an offer or claim that is being made.

In the last couple of years, a three-word phrase has begun to make its way into the assessment lexicon – on this test.  At first glance, the phrase, or a close variation of it, seems neither new nor threatening when used to describe student or group performance. Charlie spelled 23 words correctly on this week’s spelling test. Karla met the college readiness benchmark on the SAT. In Vermont, 42% of grade 5 students performed at the Proficient level or higher on the Smarter Balanced mathematics test. Taken at face value, the phrase is used simply to identify the test that was taken.

Recent use of this common phrase, however, is intended to do much more than identify the source of performance.  Its purpose is to limit interpretation of student or school performance; to make it clear that the performance should be interpreted within the specific framework of the test or testing program.

Again, at first glance, we might regard this use of the phrase as innocuous or perhaps even a step forward in test use and interpretation.  Identifying the source of a test score seems quite consistent with many of our Standards for Educational and Psychological Testing, beginning with Standard 1.0:

Clear articulation of each intended test score interpretation for a specified use should be set forth, and appropriate validity evidence in support of each intended interpretation should be provided.

When considered as part of a larger effort to marginalize and vilify large-scale assessment, however, the connotation of the phrase on this test changes dramatically. It is the second punch in a one-two combination intended to knock out large-scale assessment.  The left jab that has weakened the credibility of large-scale assessment is the argument “Scores on large-scale assessments are ______” – fill in the blank with your favorite criticism: not valid, unfair, inaccurate, not representative, unstable, insufficient, not authentic, etc.  Now with the right cross of on this test, critics of large-scale assessment (or its uses) seek to nullify test scores by limiting their interpretation to that already weakened large-scale assessment.

Even the most well-designed assessment program can sustain only so many of these blows before collapsing in a heap to the canvas.


My first encounters with the on this test crowd occurred while working with two states setting achievement standards on their new college-and-career-readiness tests.  A vocal minority in both states were adamant that the phrase on this test should be added to each achievement level description. Their stated intent was to convey that students’ performance on the state assessment was not representative of their overall level of achievement.

My most recent encounter came late last year in a Stephen Sawchuk post in Edweek about the decision to add the modifier NAEP in front of the achievement level classifications on the National Assessment of Educational Progress; as in NAEP Basic, NAEP Proficient, NAEP Advanced. As stated in the post, “[t]he rewording may seem awfully minor to the uninitiated. But there’s a deeper subtext behind the changes, and that’s why this is worth noting.”

For their part, the National Assessment Governing Board (NAGB) makes the argument that the addition of the NAEP modifier is intended to clarify that the NAEP Proficient level, “is not intended to reflect ‘grade level’ performance expectations, which are typically defined normatively and can vary widely by state and over time. NAEP Proficient may convey a different meaning from other uses of the term ‘proficient’ in common terminology or in reference to other assessments.” Forgoing for now a discussion of whether the NAEP achievement levels are defined any more or less normatively than any other achievement levels, nobody can deny that achievement standards can and do vary widely by state and over time. Consequently, there is confusion when the same label Proficient is used across states and assessments to describe those varying standards.  From that perspective, the label NAEP Proficient serves the purpose of clearly identifying the set of achievement standards against which student performance is being judged.

For long-time critics of the NAEP achievement standards, however, the modifier is another weapon in their fight to marginalize NAEP results. The achievement level results no longer represent what proficient fourth or eighth grade students across the United States should know and be able to do; rather, they simply reflect NAEP Proficient – a mythical concept that is not tied to any state’s grade level standards and expectations.

As assessment/measurement specialists, our professional values and standards have made us unwitting accomplices in the effort to undermine large-scale assessment.  We agree with and/or can be quoted making statements such as

  • test scores should not be considered in isolation,
  • a student’s score on a given day or test might not reflect her/his true performance,
  • multiple measures should be used to evaluate student achievement, or
  • a test score reflects student performance on this test.

In the past, we mastered the art of expanding on those statements via PowerPoint bullets and charts to defend large-scale assessment with winning arguments before policy makers and the courts.  In this era of soundbites, tweets, and memes, however, we may never get that far.

With hubris, we attach a great deal of importance to our work and our high-quality assessments.  Remember, however, that without the ability to generalize student or school performance beyond a particular test we have nothing.  The task before us is clear; and if we envision a future in which large-scale assessment makes a valuable contribution to improving student learning, we must not fail on this test.

Look What You Made Me Do

A 2018 Blog Year in Review

Charlie DePascale


We have reached the end of 2018 and another year of posts on Embrace the Absurd. When I look back at the ten essays posted this year, I think that the phrase that best sums up this year of posts is look what you made me do – and not simply for the obligatory Taylor Swift reference.

A primary theme that ran across my posts this year is that we, as a field, may be just a tiny bit out of control; reactive rather than proactive; allowing ourselves to be defined by others; or perhaps overwhelmed by the moment.

I began 2018 with the post, Implausible Values, discussing the stress and strain being put on the field and our equating infrastructure by demands for shorter tests, alternate tests and adaptive forms, less standardization and more flexibility, more accuracy and precision, and immediate results.  I also wrote of the paradox of taking at least six months to produce results for a few NAEP tests and no more than six days to complete equating for a dozen state assessments.

NAEP returned as a topic in April with, If I Did It, a satirical treatment of the efforts to control mode effect and preserve the trend line in the reporting of the 2017 NAEP Reading and Mathematics state results; an effort which could serve as the poster child for our 2018 theme. I have little doubt that 2017 NAEP results will serve as a cautionary tale in educational policy and measurement courses for generations to come.

Across the year, a trio of posts addressed the broad issues of time, validity, and the essence of educational measurement.  In It’s About Time we address not only the lack of time mentioned above, but also the extent to which our measurements and interpretations are dependent upon and bound by time, and the growing need to incorporate time into our measurement models.  Bring Back Valid Tests addresses our ongoing struggle to develop an operational definition of validity.

In my 2016 NERA presidential address, Living in a Post-Validity World: Cleaning Up Our Messick, I argue that in the nearly 30 years since Messick’s 1989 chapter, we have wandered the desert searching for the Promised Land of a unified theory of validity.  As with many of the constructs that we attempt to measure, we still lack a clear understanding of validity; yet one of our guiding principles is that you have to understand and clearly define something before you can measure it.

This led to my call for Rebranding Educational Measurement, with the argument that the field will be better served by not only acknowledging but also embracing the uncertainty in what we do. That post included this reminder from the 1951 first edition of Educational Measurement: “[t]he primary concern of measurement, however, should be for an understanding of the entire field of knowledge rather than with statistical or mathematical manipulations upon observations.”

A Year of Professional and Personal Journeys

2018 was also a year of personal and professional journeys. In Ten Years of Taylor I describe literal and figurative journeys with my daughter across ten years of Taylor Swift concerts from 2008 – 2018.  In My Miss Brooks I describe the 4th and 5th grade class that set me on this assessment/measurement journey nearly fifty years ago; and with the benefit of hindsight reflect on the high-stakes test that awaited at the end of those two years that may not have been as high-stakes as we thought at the time.

And throughout 2018, there were other journeys not noted directly in the blog, including the Red Sox’ eight-month, 119-win journey from Opening Day to their fourth World Series championship since 2004. And now we have solid empirical evidence (n=2) that the Red Sox own the first 18 years of the century.

After 25 years of organizing regional conferences in small venues in places like Rocky Hill (CT), Springfield (MA), Buffalo, and Pittsburgh, in April 2018 I finally made it to the big time – a national conference on Broadway. Serving as 2018 NCME co-chairs, long-time friend and colleague April Zenisky and I were able to bring together past, present, and future leaders of our field to reflect on the past, present, and future of the field.

Outside of the conference, New York City brought feelings of awe when standing in the middle of Times Square at night or earlier that day sitting in front of a Renoir painting at the Metropolitan Museum of Art. That feeling was matched, if not surpassed, a month later driving through the mountains of Northern Utah and Southern Idaho on a Sunday afternoon with Maren Morris’ My Church on repeat on iTunes. And then on my first trip outside of North America, there was the incomparable and simply indescribable feeling standing in the middle of Anne Frank’s room in Amsterdam.

A New Year and New Beginnings

Today we look forward to a new year with new journeys, and new beginnings.  For the second time in my career it feels like we are on the cusp of a new era in K-12 assessment and educational measurement. Technology, personalized learning, big data, more complex and higher-order content standards, and a renewed interest in assessment in the classroom have created a perfect storm of challenges and opportunities for assessment and educational measurement. NCME has begun work on the fifth edition of Educational Measurement, which brings with it the opportunity to take the time needed to reflect on where the field is now, how it got here, and the directions it might, could, and dare I suggest, should go in the future.

So, as we begin 2019, let’s renew our commitment to keep the faith, fight the good fight, and as always, embrace the absurd.

A Letter to Santa


Dear Santa,

I am the next generation of large-scale assessment and I am 4 1/2 years old.  I have been very good this year. At least I have tried very hard to be good.  I have been reliable and fair. I think that I have been valid, but Uncle Steve says that’s not for me to decide. I have tried not to do things that I really shouldn’t do like evaluating teachers and promoting little kids from third grade to fourth grade.

Some of the bigger kids try to get me to play in their accountability games.  They like to do all sorts of strange things to my scores before they report them.  I am not even sure that what’s reported are even my scores anymore.  I tell them why can’t you just use percent proficient – everybody understands that.  Andy from across the street just laughs at me, “Ho, Ho, Ho”, and says looking at those percentages is like “viewing progress through a funhouse mirror.” My best friend Joey is even meaner.  He just runs around yelling, “Liar, Liar, Hair on Fire!” I don’t even know what that means.

Santa, it seems like people are always trying to change me. They want me to be shorter, but they want five performance levels and subscores. They want me to cost less, but they want to use authentic texts and measure high-level skills. They want me to tell them if kids are on track to be college- and career-ready and they don’t even know what that means. I try to adapt, but it’s really hard. You know, real people used to take such care in putting me together; now it seems algorithms just grab items off of a shelf like a shopper on Christmas Eve and like magic, Happy Birthday, a test is born ready to administer! You know Santa, sometimes I don’t even feel like I am the same test when they put me on a computer.


I have to tell you Santa, I am a little worried about 2019.  Can you believe that in a couple of months I have to test NAEP Reading and Mathematics again?  It seems like they just reported results from my 2017 tests.  I hope that goes more smoothly this time around.

And then there are all of the things they are asking me to do to assess the Next Generation Science Standards. There are just so many changes and things that have never been tried before. Everyone tells me I look phenomenal, but I am not so sure.

Does anyone really understand what the performance expectations mean?

Has anyone tried to define proficient performance on different combinations of performance expectations?

Has anyone even thought about what proficient performance across a whole science test is supposed to look like?

I am afraid that we might be putting the sleigh before the reindeer here, Santa.

I mean, what’s the rush? I would really hate for this to be the 1990s all over again – the last time they tried to introduce next generation assessments before they were ready.  A whole lot of promising young assessments were cut down before they reached their prime in that purge.

Santa, I can’t take another heartbreak. Lately it feels like everything I do turns into a disaster. I guess I really don’t know what large-scale testing is all about. Santa, isn’t there anyone who knows what large-scale testing is all about?

So Santa, if you can bring me only one gift this year it would be to help people remember the true meaning and purpose of large-scale assessment.  Help them understand where I fit within a coherent and balanced system of assessments.

I know that’s a lot to ask; but I believe, Santa.  I believe.

 


How Arne Works

Charlie DePascale

During my August trip to Minnesota I was able to check two books off of my summer reading list: Relativity – The Special and the General Theory by Albert Einstein and How Schools Work by Arne Duncan.  As the old joke goes, one was a book that asked me to rethink basic concepts and ideas long-held as fundamental truths, and the other was a book by Einstein.

I will attempt to reconcile Relativity and large-scale assessment in a later post.  Today’s post is devoted to my five takeaways from Arne Duncan and How Schools Work.


1. Lies and Incentives

“Education runs on lies.”  This is the first sentence of the first chapter titled Lies, Lies Everywhere.

The in-your-face focus on lies no longer has the same shock value that it did when most of us were introduced to Arne in 2009; no, not after eight years of life in the honesty gap that rolled into the current era of fake news and alternative facts.

What was surprising, however, was how freely he uses the word lie. In some circles, the word lie implies more than a simple departure from the “truth” or reality. To say that a person has lied or is a liar suggests an intent to deceive or mislead. Arne, however, uses the word lie to describe a broad array of statements and actions that one might refer to as myths, misconceptions, misinterpretations, untested beliefs, or defense mechanisms. In one example involving a Chicago principal, Arne begins the section stating, “One such principal told this lie directly to Mrs. Daley and me, and I’ll never forget it.” He ends the same story about the same principal stating, “I loved Chester’s honesty throughout – first when he challenged Mrs. Daley and then when he told me he’d been mistaken about his kids.”

In the end, perhaps actions based on lies or misperceptions cause the same problems and have the same negative impact on children. If your role is to solve those problems, however, understanding whether you are dealing with a lie or a misperception should influence your approach to a solution. And if you are counting on current teachers, administrators, and policy makers to be part of the solution, starting off by calling them liars might not be the best approach.

Incentive is another special word in the Arne lexicon. Arne rightly notes how important it is for school improvement efforts to include incentives as well as the sticks associated with NCLB. One example he offers of an incentive, however, is firing Chicago teachers caught cheating on a standardized test. I believe his argument is that when the district ensures that bad behavior is not rewarded, it creates an incentive for good behavior among all of the other teachers. A second incentive he discusses is related to the teacher evaluation requirements associated with Race to the Top and the administration’s NCLB waivers. I don’t know many teachers who viewed state-designed educator evaluation systems as an incentive.

You can only show me a stick and tell me it’s a carrot for so long before I figure out that’s a lie.

2. Story Driven

After reading How Schools Work, it is clear to me that Arne is story-driven.  By story-driven, I am not referring to the many stories that drive the narrative in How Schools Work.  Rather, I am referring to the concept of story-driven described by Bernadette Jiwa in her 2018 book Story Driven – you don’t need to compete when you know who you are.  Story-driven individuals and the organizations they lead have a “clear sense of purpose and identity” that defines and drives them.

Jiwa’s story-driven framework is defined by five words – Backstory, Values, Purpose, Vision, and Strategy. The backstory is our journey to now, which creates our values (guiding beliefs) and purpose (reason to exist). In a story-driven organization, those are the forces that drive the organization’s vision (aspiration for the future) and strategy (align opportunities, plans, and behavior).

The backstory that defines Arne’s identity, values, and purpose is his experience growing up in Chicago with his mother’s inner-city after-school program. As he describes it, the Chicago that he saw with her program was just two miles but a world away from the section of Chicago where he lived. That’s not a bad backstory for a U.S. Secretary of Education.

3. No place for states

Virtually my entire career has been spent closely connected to state departments of education, as an assessment contractor, an employee, and for the last 16 years as a consultant. It appears, however, that state departments play, at best, a minor supporting role in Arne’s world.  At worst, they are another one of the liars, a barrier to improving schools.

There are three direct references to state departments of education that stand out in the book. The first is a reference to low achievement standards set on the Illinois state assessment; offered as a direct instance of the lies told to students and parents in Chicago and as a general example of the so-called race to the bottom by states across the country as they prepared for NCLB accountability requirements. In the second reference, a DOE official in New Jersey is simply a pawn in a story detailing how the arrogance/incompetence of the Christie administration led to the state not being awarded millions of dollars of Race to the Top funding. The third is a reference to speaking with Connecticut’s “chief education officer” on the day of the Sandy Hook shooting in the emotional and powerful chapter on guns in schools and society.

I guess this should not be a surprise.  Arne made his mark at the district level and it is clear that his vocation is in schools.  He does acknowledge the role that strong (and weak) governors can play in improving education, but like many in education does not seem to have a handle on the role that a state department of education can and should play.

Can the department be more than simply an agent implementing the policies of the federal government, governor or state chief? Can a state department of education be a change agent on its own? It behooves those of us who have centered our careers at the state level to be proactive in answering that question.

4. Time Travel

Reading How Schools Work, I felt that I had traveled back in time. It is the same feeling that I get when I read remarks from former President Obama; and I am sure I would feel the same if I spent the $500 for an Intimate Conversation with Michelle Obama. It is the sense of hope and change that had me sitting in a storefront office in Portsmouth, New Hampshire in the summer and fall of 2007 updating databases and making phone calls for an upstart candidate for president.

Then I remember that it is 2018. This group had their eight years in office. Yes, they made some improvements, but they fell far short of achieving their vision. I understand the obstacles in their way. What I have not yet determined for myself is how hard they tried to overcome those obstacles. And the frightening thought: if they did do their absolute best, then what will it take, and how long will it take, to truly make a difference?

5. The Public School Model

It may be confirmation bias, but after reading How Schools Work I am convinced now more than ever that our public school model is not only broken but is outdated and is not something that we should try to repair.

To be clear, the ideal and concept of public education (i.e., the right to access for all to a high-quality education) is as important as it ever was, arguably more important.

Also, there are fine schools and educators in suburbs, rural towns, and cities across the country where children are receiving a world-class education.

Our general model of K-12 public education, however, is broken at its core.  The funding model is not sustainable. We are well beyond the point where it is possible to fit the student-centered policies of the last 50 years into an educator-centered system.  We have burst through the age-based boundaries of the K-12 system at both ends and we long ago passed the point where the internal markers of grade levels have any meaning.

Everything in Arne’s book, from his mother’s after-school program to the foundation(s) he founded to his experiences in Chicago and USED to his plans for the future, tells us that we need a new model for public education.

Arne and many of the rest of us have spent our lives trying to improve education from within the current system.  Arne’s mother worked outside of the system – although not necessarily by choice. I think that it is time to abandon a K-12 system clinging to a past that no longer exists for a new system that reflects the present and anticipates the future.

My Miss Brooks

Charlie DePascale

Our Miss Brooks was a highly successful comedy series on radio and early television that followed the life and career of a fictional high school English teacher, Connie Brooks. My Miss Brooks, Ann Brooks, was a highly successful teacher of the fifth and sixth grade Advanced Work Class at the Mather School in Dorchester, Massachusetts when I entered her class in the fall of 1969.

The Advanced Work Class (AWC) was, and still is, a program within the Boston Public Schools “that provides an accelerated academic curriculum for highly motivated and academically capable students. Coursework is challenging, and performance standards are high.”  According to BPS and borne out by data, a major benefit of the program is “[s]tudents who successfully complete AWC are well prepared to compete for admission to the three BPS exam schools or to other accelerated programs.”

In my 1969 instantiation of the AWC, 20 students from elementary schools throughout Dorchester (the largest “neighborhood” in Boston) were selected to spend 5th and 6th grade in Miss Brooks’ class at the Mather School.  There may have been some testing involved in the selection process, perhaps including IQ testing, but I was unaware of that.

The class included 10 girls and 10 boys and we were diverse by Boston/Dorchester standards of the time; that is, there were students from Irish and Italian backgrounds (along with a few other ethnic groups) and the class was 90% white.  We were from a mix of blue- and white-collar middle class families. Almost all of the original 20 students completed the two years, but there were a couple of replacements along the way.

From the beginning of the fifth grade, the openly acknowledged goal was that at the end of the two-year program all of us would pass the entrance exam to one of the city’s Latin schools: Boston Latin School (aka Boys Latin) for the boys and Girls Latin for the girls.  (The two single-sex grade 7-12 Latin schools became coed as we were entering the eighth grade and remain separate coeducational schools today.)

Although passing the standardized, multiple-choice test administered in the spring of sixth grade was the goal, as I think back there is nothing that I recall from those two years that now would be considered test prep. I am certain that I am forgetting some things through the fog of 50 years.  Surely, we must have had some basic English and mathematics lessons.  There were quizzes, tests, grades, and lots of homework. Those things, however, were not what defined the class, and they are not what I remember from this pivotal time in my K-12 school career.

It was clear that this was going to be a different experience the moment we walked through the door of Room 8 at the Mather School. For the first time since kindergarten, this was not a classroom with rows of wooden desks bolted to the floor.  This room contained shiny modern desks that were arranged around the room in four u-shaped clusters of five, but could be easily rearranged or cleared away, when necessary – and there were plenty of times when it was necessary.  And then there was the first class activity.

A boy and girl were selected to stand at the front of the class, introduce themselves to each other and have a conversation.  Looking back, we could have gone with where did you go to school last year or what did you do this summer; and sure, we were just weeks removed from minor events like the first moon landing, Woodstock, and Chappaquiddick.  But, standing at the front of that class we had nothing but fidgeting and uncomfortable silence. I tried unsuccessfully for two years to talk to that little red-haired girl…  No wait, I was the one with red hair and that’s a different Charlie’s story.

Anyway, those first awkward conversations were just the beginning of two years of constant interacting, collaborating, performing, and celebrating with each other. The biggest event was the annual Christmas play our class performed; rehearsals throughout the fall culminating in two school-wide performances – for grades 1-3 and 4-6. These were full performances with hand-painted, wood-frame sets, costumes, and props.  Our 5th grade performance of Charles Dickens’ A Christmas Carol was followed in the 6th grade with the heart-wrenching The Birds’ Christmas Carol by Kate Douglas Wiggin. (It was at the class Christmas party following the 6th grade performance that I learned that Jeremiah was a bullfrog.)

In addition to the Christmas plays, other examples of special activities included:

  • Our class newspaper complete with school and local news, sports, entertainment, and comic sections. Mimeographed copies were widely distributed.
  • The Greek festival at the end of our unit on Ancient Greece where we made presentations, displayed the results of our efforts working with wet clay, and most of us had our first taste of feta cheese and baklava.
  • Our performance of Raindrops Keep Fallin’ On My Head, in costume, and in French at the annual schoolwide Mother and Daughter night.
  • Keeping with the French theme, the end-of-the-year French festival where we tried our hand at making various French dishes and produced a mimeographed collection of recipes. The French custard recipe became a Father’s Day tradition at our house.

All of those activities supplemented the discussions, collaborative projects, and presentations that were a regular part of our daily routine. And we constantly rearranged those desks into various small groups where we pushed, challenged, and supported each other.

In the spring of sixth grade we all passed the standardized entrance exam and were admitted into our respective Latin schools. And six (or seven) years later, most of us graduated from either Boston Latin School or the newly named Boston Latin Academy. We were prepared for the school and not simply for the test.

As I look back on it now, preparing us to succeed at the school was much more important than preparing us for the test because, in reality, the entrance exam was not a high-stakes or high-risk test for us. Yes, the Latin schools were selective and admission was competitive. In the early 1970s, I estimate that there were about 10,000 sixth graders in the Boston Public Schools. If equally divided between boys and girls, that would be 5,000 boys competing for the approximately 500 seventh grade seats at Boston Latin.

There was probably little doubt, however, that our carefully selected group of 10 boys would perform in the top 10% of BPS students on the entrance exam. What we didn’t understand at the time was that the hard part was staying in the school and graduating.  Although approximately 500 students entered in the seventh grade in 1971 and an additional batch of students entered our class in the ninth grade, our graduating class in 1977 had just over 200 students. During 7th grade orientation we received the Latin school version of look at the boy on your left and the boy on your right, two of you won’t be here by 12th grade.

In 1971, the entrance exam was a broad net, collecting three times as many students as would ultimately graduate.  Additional filtering was done at the school.   There was a large tolerance for selection error on the test.
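As a rough sketch of the admissions arithmetic above, the snippet below uses only the estimates given in the text (about 10,000 sixth graders, about 500 seats, just over 200 graduates). The variable names are illustrative, the even boy/girl split is the assumption already stated in the text, and the ninth-grade entrants are left out because their number is not given.

```python
# Back-of-the-envelope admissions math; every figure is an estimate from the text.
sixth_graders = 10_000      # estimated BPS sixth graders in the early 1970s
boys = sixth_graders // 2   # assumes an even split between boys and girls
seats = 500                 # approximate seventh-grade seats at Boston Latin
graduates = 200             # "just over 200" in the graduating class of 1977

# Share of sixth-grade boys who could be admitted.
admit_rate = seats / boys

# Seventh-grade entrants per eventual graduate. This is a floor, because the
# students who joined the class in ninth grade are not counted here.
entrants_per_graduate = seats / graduates

print(f"admit rate: {admit_rate:.0%}")
print(f"seventh-grade entrants per graduate: {entrants_per_graduate:.1f}")
```

With these figures the seventh-grade entrants alone outnumber graduates by about 2.5 to 1; adding the ninth-grade entrants pushes the ratio toward the "three times" described above.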

The admissions math changed, of course, the very next year when the school became coed, potentially doubling the pool of applicants.  It changed again as enrollment in the Boston Public Schools dwindled and a much greater portion of the Latin School class came from private elementary schools. And at some point in the intervening years, the admissions philosophy changed.  The goal was to do what was necessary to ensure that all admitted students had the opportunity to make it to graduation.  Last year, Boston Latin had 417 seventh grade students and 412 twelfth grade students.

All of the changes described above raise the stakes associated with the entrance exam.  I wonder what the impact has been on the Advanced Work Class.

Recipe and Play Script

Ten Years of Taylor

Charlie DePascale

Ten years ago, August 22, 2008, my daughter and I attended our first Taylor Swift concert in Hartford, Connecticut. The original plan was to attend the concert a few weeks later in Massachusetts – a bit closer to our home in Maine.  About to start high school, however, she was a bit worried (i.e., panicked): what if I have too much homework …  So, we made the late August trip to Hartford.  (The September concert was our second concert.)

This summer, my wife and I visited our daughter in Maryland.  On a very hot and humid night at FedExField, the three of us attended my 30th Taylor Swift concert.

There are so many memories across 10 years and 30 shows – from that 8-song set opening for Rascal Flatts to the 2-hour reputation Stadium Tour show.

With my daughter –

  • Finding a public library with wifi close to my meeting in Rhode Island, waiting for Fearless tickets to go on sale and then sending her the two-word e-mail, We’re In!
    • Later that summer driving home from that concert in Connecticut through a tropical storm.
  • Eating lunch in the Wesleyan dining hall before driving home for Christmas and deciding, sure we can make the drive to Philadelphia for a concert during spring break; then buying tickets when they went on sale at noon.
  • Foxboro in the rain (more on that later)
  • Stopping in DC for a concert on the way to college visits in North Carolina and Virginia
  • Driving to Tanglewood on the rumor that Taylor would perform at the James Taylor concert. (she did)
  • Capping off a family trip to Colorado, Wyoming, and Mt. Rushmore with a concert in Denver
  • Walking around Georgetown last November taking pictures with ‘reputation’ UPS trucks.

And on my own –

  • Trips driving around North Carolina for back-to-back shows in the Raleigh – Greensboro – Charlotte triangle (extended to Charlottesville/UVA and back-to-back-to-back shows in 2013)
  • The 20-hour visit to NYC via Amtrak during Thanksgiving week for a concert at Madison Square Garden – with surprise guest James Taylor
  • Combining concerts in the Twin Cities with visits to old friends from the University of Minnesota

There are two memories, however, that will always stand out above the rest.

Taylor in Maine – aka Hello, Boats!  (August 27, 2010)

Hello, Boats!

Taylor was in Maine for the premiere of the video for Mine, to be broadcast that evening on CMT. The radio was full of Taylor Sightings and rumors about the location of the secret broadcast. I was spending one of the last Fridays of the summer with the family.  At the end of lunch, I decided to turn on my laptop to check my e-mail – no smartphone at that time.

There in my Inbox was an e-mail from Taylor Nation with the subject line “You’re Invited: Taylor’s CMT Taping in Maine”.  The e-mail contained instructions about how to dress/behave for a live television show, and directions to a school in Kennebunkport where a bus would take guests to/from the still unnamed location of the taping.

This had to be some sort of prank.  Sure, I lived in Maine, had attended a few concerts, and had already purchased way too much merchandise, but still…   But what if it’s real?  All we had to lose was the time for a short drive to Kennebunkport.

The e-mail didn’t mention anything about bringing a guest.  If it’s real, will it work for two people?  We made a plan and my daughter and I drove to the pickup location.

The good news, there was a bus and a small group of people.  The bad news, all of the other people seemed to have had some involvement in the taping of the video earlier that summer – as extras, volunteers, or staff in places Taylor visited.  There was a local person with a check-in list.  Of course, we weren’t on her list and she knew nothing about the e-mail invitation. Thankfully, after reading the invitation and looking at my daughter she told us to get on the bus!

While the bus was on its way, the secret location was announced on the radio.  When we arrived, there was a crowd lining the street and driveway leading up to the seaside property where the show would take place.  With orange wristbands secure, we were led through the crowd to our designated area on the lawn.  We were told what to expect and how to react during the run through without Taylor, the rehearsal with Taylor, the live show, and that there would be a small concert following the show.  We were told that some of George H.W. Bush’s grandchildren were there and that the former President would arrive soon.

I stayed off to the side and watched Taylor arrive, and then a bit later, President Bush.  My daughter, being better able to fit in small places, worked her way up toward the front of the crowd.  It all went perfectly.  Taylor came out for the interview, turned and waved hello to the crowd of boats off shore, and the video premiered.  After a short break Taylor returned with her band and performed a short concert.  What a night!

Foxboro in the Rain (June 25, 2011)

I’m not a fan of stadium shows, but I have attended five of Taylor’s ten concerts at Gillette Stadium (only half, clearly not obsessed). Each Gillette show has been a memorable experience, but none will ever match Foxboro in the Rain.

My daughter and I didn’t sit together.  I had a seat on the floor near the stage. She didn’t want to sit on the floor, so she had a seat near the top of the lower level – in what fortuitously was one of the few covered rows at Gillette Stadium.

For me, the two highlights of the night were Mean and the rain.  Coming into the show, I didn’t understand how strongly young fans related to Mean.  Many of Taylor’s songs were autobiographical, but Mean was the first that was not a generic experience. It was explicitly about her and the claim that she couldn’t sing – “drunk and grumbling on about how I can’t sing.”  I thought that the direct reference made the song less relatable. Apparently, however, the magic of Mean is that direct connection to Taylor’s experience.

Walking around before the show we saw a sea of ‘why you gotta be so mean’ t-shirts and small groups singing the song all around the stadium.  And then the concert; sitting in front of the stage and hearing 50,000 young voices singing out strong

But someday I’ll be living in a big old city
And all you’re ever gonna be is mean, yeah
Someday I’ll be big enough so you can’t hit me
And all you’re ever gonna be is mean
Why you gotta be so mean?

It was an overwhelming experience.

And then there was the rain.  It started out as a light shower over part of the stadium.  My daughter thought it was part of the show – Oh, they even have fake rain. There had already been fake snow earlier in the show (Back to December). It all fit together.  No, the rain was real.  It rained hard and it kept raining.  Taylor kept singing, dancing, and lying in puddles of water on the stage when the choreography called for it.  And fans kept singing, dancing, and taking pictures with their iPhones. This was so different than the time I was soaked to the skin in my wool band uniform at a Harvard-Cornell football game.

(I was amazed that the phones were not damaged by the rain.  Reading the online forums the next day, I learned that many of them were.)

As it turns out, it’s not a good idea to lie in puddles of water singing in the rain. Within a week, Taylor was sick and had to postpone shows.  I was there for her first show back in Montreal in mid-July, but that’s another story…

It Was The Start of a Decade…

When my daughter and I drove to Hartford in August 2008 I had no idea it was the start of a decade of 30 shows in eleven states plus Washington, DC plus Montreal.   The first time I listened to her first album, ‘Taylor Swift’, I knew these were great stories and this was a great storyteller. When I heard Fifteen those were words I wanted to tell my daughter.  And when I heard The Best Day those were words I hoped to hear from my daughter someday.  And when I heard All Too Well, well ….

So, next weekend I will sit in U.S. Bank Stadium in Minneapolis, sobbing through Long Live one more time.  And if fate steps in and there are no more Taylor Swift tours or concerts for me after this summer, I know that for as long as I live, and I hope as long as my daughter lives, these last ten years will be remembered.

Rebranding Educational Measurement

Charlie DePascale

When I think about educational measurement the first thing that comes to mind is a high-fructose corn syrup commercial from about 10 years ago.

On one side there is the man who holds, but cannot articulate, the widespread, but ill-defined, perception (misperception?) that high-fructose corn syrup is inherently bad.  On the other side is the woman with the tempting treat who provides a couple of carefully selected facts and makes the claim that high-fructose corn syrup is fine in moderation.  Man takes the treat from Woman and all is right in their 30-second commercial world. It is truly an Adam and Eve moment – although that’s probably not the allusion the sponsors of the commercial were after.

In 2018, as a field and as an industry, educational measurement finds itself in much the same place as high-fructose corn syrup.  We developed an appealing, inexpensive product (i.e., large-scale standardized tests), exulted in its success, and then could do little when we lost control of the product that defines us.   For much of this century, we have taken the role of the woman in the HFCS commercial.   We assure people that there is nothing harmful or evil about educational measurement used properly and in moderation; all the while watching test use soar beyond anything that can be called moderation. Assuming that it will be impossible to produce a cute 30-second video on the benefits of educational measurement that will be as effective as the counterargument that John Oliver has already produced, where do we go from here?

I think that the only solution is to engage aggressively in rebranding. Educational Measurement is ripe for a makeover or perhaps even a complete do-over.  Now is the time to change not only the surface image of educational measurement, but to actually change what we mean when we talk about educational measurement.

Two assessment industry icons have already started this rebranding.  College Board began by changing the name of the SAT (much like KFC), conducted a major overhaul of their flagship instrument, and then created a new suite of products and services aimed at a new market.  ACT has gone even further as it redefines itself as a learning company rather than an assessment company.  As described in an EdSurge article earlier this year, ACT CEO Marten Roorda “wants the ACT to become more involved in the learning process, and provide more analytics solutions to teachers and students.”

Reforms to the ACT and SAT assessments, of course, are just the tip of the iceberg. Learning analytics, big data, personalized instruction, and adaptive learning are trending topics in education which are already impacting the measurement community.  At the ITC 2018 conference in July, John Hattie and Alina von Davier delivered keynote addresses on visible learning and computational psychometrics, respectively, which forced those listening to reconsider how we think about and do educational measurement.  As Kathleen Scalise explained at the 2018 NCME conference in April, it is not a question of if or when big data and learning analytics will impact educational measurement; they are already here and they already have.

Back to our roots

We must begin any attempt to rebrand, redefine, or refocus educational measurement by revisiting our roots.  And the best place to reconnect with those roots is the first edition of the so-called bible of our field, Educational Measurement, published in 1951.  We choose this as a starting point because as explained by E.F. Lindquist (editor), “prior to its publication … no book had yet been published that would even begin to fill an urgent need…for a comprehensive handbook and textbook on the theory and technique of educational measurement.”

It can also be argued that among the four editions of Educational Measurement (1951, 1971, 1989, 2006), the initial edition made the best attempt to ask and answer the Why? question; that is, to define the purpose of educational measurement.  It is from understanding the purpose of educational measurement that we are able to glean the core values and guiding principles of our field, which is the first step in rebranding.

In Part 1, The Functions of Measurement in Education, the 1951 edition begins with four chapters that address fundamental issues related to the primary functions of measurement in education at that time:

  • The Functions of Measurement in the Facilitation of Learning
  • The Functions of Measurement in Improving Instruction
  • The Functions of Measurement in Counseling
  • The Functions of Measurement in Educational Placement

Part 3, Measurement Theory, begins with a chapter on The Fundamental Nature of Measurement that ends with a section titled, Explanation as the End of Measurement, and the following admonition:

The primary concern of measurement, however, should be for an understanding of the entire field of knowledge rather than with statistical or mathematical manipulations upon observations.

Knowledge will be advanced by recognizing what the empirical methods of measurement ignore…The aim of measurement must ever be the explanation of, or the meaning for, observed phenomena.

A practical application of those statements is provided by Ralph Tyler in describing the organization of his chapter on the functions of measurement in improving instruction:

Since the purpose of this chapter is to outline the ways in which educational measurement, that is, achievement testing, can serve to improve instruction, we shall consider first what steps are involved in an effective program of instruction and then indicate the contributions that achievement testing can make to each of these steps.  In this connection it will be noted that educational measurement is conceived, not as a process quite apart from instruction, but rather as an integral part of it.

Tyler then goes on to describe four sequential phases of instruction:

  1. To decide what ends to seek; that is, what changes in student behavior to try to bring about.
  2. To determine the content and learning experiences likely to attain those ends.
  3. To determine an effective organization of those learning experiences to bring about the desired ends effectively and efficiently.
  4. To appraise the effects of the learning experiences to determine whether they have brought about the desired ends or changes in student behavior.

He argued in 1951 that educational measurement as a field had become stuck on Step 4 (documenting the effects of instruction); not focusing enough on how educational measurement can and should inform and support the other three steps of instruction.

This problem has only been exacerbated in the last 60+ years as our field has become more technical, more specialized, and more separated from instruction.  While we may pay lip service to the notion that a key purpose of educational measurement is to facilitate learning and improve instruction, we do little to understand and support that function.

Educational measurement must find a way to support all aspects of the instructional process; toward the ultimate goal of improving student learning.  And having taken on that task, we must find a way to convey the message that measurement is more than a test of student outcomes.

Rebuilding and Rebranding

Obviously, rebuilding and rebranding educational measurement will not be simple.  It will require more than a quick fix like a 30-second commercial, a catchy new slogan, or a name change. However, although not sufficient, I do think that a name change is necessary.  The term ‘educational measurement’ is too closely associated with achievement testing to continue to serve a useful purpose.  Additionally, it does not accurately reflect either what we have been doing as a field for the last 60 years or the new directions in which the field is moving (e.g., with a focus on personalization and computational psychometrics).

My suggestion for a starting point is to replace the term ‘measurement’ with ‘modeling’ – Educational Modeling.  What is the case for modeling? Just for starters …

  1. With a few notable exceptions, modeling is a much more accurate description of what we do as a field than measurement. (Yes, I see you out there Rasch folks.)
  2. By its very nature, the term modeling conveys a sense of concern with an entire process or an entire system and the interactions among the components of that system.
  3. Measurement, not just educational measurement, is an outdated 20th century concept. The 21st century world is just much too complex to measure. Our field, and psychology in general, latched on to the term measurement last century because it was cool and gave the field credibility. Modeling is the new measurement.
  4. Finally, through the Common Core State Standards (and its offspring) we have invested nearly a decade in spreading the word to K-12 educators, students, and the general public of the importance of modeling and its central role in all that we do as intelligent human beings.  Let’s take advantage of that and create coherence between what we say and what we do in education.

The Common Core defines modeling as “the process of choosing and using appropriate mathematics and statistics to analyze empirical situations, to understand them better, and to improve decisions.”  That sounds like what we are doing (or should be doing) in educational measurement.  In further describing modeling, the Common Core states, “Real-world situations are not organized and labeled for analysis; formulating tractable models, representing such models, and analyzing them is appropriately a creative process. Like every such process, this depends on acquired expertise as well as creativity.”

Again, isn’t that what we are supposed to be doing in educational measurement?

Let the games begin

To get this ball rolling, I call on NCME, consistent with their vision to be the recognized authority in measurement in education, to take the first step by changing their name to the National Council on Modeling in Education.  They won’t even have to change their logo, URL, or Twitter handle.

The next step would be for NCME and co-editors Linda Cook and Mary Pitoniak to make the upcoming 5th edition of our bible, Educational Measurement, a New Testament for our field.  Educational Modeling has a nice ring to it as a title.

We have to start somewhere to restore the reputation of educational measurement.

Are you ready for it?