assessment, accountability, and other important stuff

Implausible Values

Equating in the early part of the 21st century

Charlie DePascale

Equating

Our field is facing a crisis brought on by implausible values. The values which threaten us, however, are not the assessment results questioned above.  Those are only the byproduct of the values which our field and society have adopted with regard to K-12 large-scale assessment.

That is, the values which lead us to wait more than a year for the results of the 2017 NAEP Reading and Mathematics tests while expecting nearly instantaneous, real-time results from state assessments like Smarter Balanced, PARCC, and the custom assessments administered by states across the country.

NAEP, the “nation’s report card,” is an assessment program that does not report results at the individual student, school, or even district level (in most cases). NAEP results have no direct consequences for students, teachers, or school administrators.

State assessment results for individual students are sent home to parents.  School and district results are reported and analyzed by local media and may have very real consequences for teachers, administrators, and central office staff.

It is a paradox that equating for annual state assessment programs with a dozen tests and multiple forms is often carried out within a week while results for the four NAEP tests administered every other year can be delayed indefinitely with the explanation that

“Extensive research, with careful, detailed, sophisticated analyses by national experts about how the digital transition affects comparisons to previous results is ongoing and has not yet concluded.”

Of course, it is precisely because NAEP results have no real consequences or use that we are willing to wait patiently, or disinterestedly, until they are released.  Can anyone imagine a state department of education posting a Twitter poll such as this?

[Image: NAEP 2017 Twitter poll]

The reality, however, is that the NAEP model is much closer to the time and care that should be devoted to equating (or linking) large-scale state assessment results across forms and years than current best practices with state assessments.

Everything you wanted to know about equating but were afraid to ask

To a large extent, equating is still one of the black boxes of large-scale assessment.  It is that thing that the psycho-magicians do so that we can claim with confidence that results are comparable from one year to the next – not to mention valid, reliable, and fair.

Well, let’s take a quick peek inside the black box.

There are two distinct parts to equating – the technical part and the conceptual/theoretical part.

In reality, the technical part is pretty straightforward; at most it is a DOK Level 2 task.  There are pre-determined procedures to follow, most of which can be automated. It’s so simple that even a music major from a liberal arts college can pick it up pretty quickly (self-reference). That’s what makes it possible to “equate” dozens of test forms in a week; or made it possible for a former state department psychometrician to boast that he conducted 500 equatings per year.
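To make the "pre-determined procedures" point concrete, here is a minimal sketch of one textbook procedure, linear equating under a random-groups design: a score on Form X is mapped to the Form Y scale by matching means and standard deviations. It is an illustration only, not the method of any particular state program; the function name and the scores are made up.

```python
from statistics import mean, pstdev


def linear_equate(form_x_scores, form_y_scores):
    """Return a function that maps a Form X raw score onto the Form Y scale.

    Assumes the two forms were taken by randomly equivalent groups,
    so differences in the score distributions reflect form difficulty.
    """
    mu_x, mu_y = mean(form_x_scores), mean(form_y_scores)
    sd_x, sd_y = pstdev(form_x_scores), pstdev(form_y_scores)
    if sd_x == 0:
        raise ValueError("Form X scores have no variance; cannot equate.")
    slope = sd_y / sd_x

    def to_y_scale(x):
        # Same standardized distance from the mean on both forms.
        return mu_y + slope * (x - mu_x)

    return to_y_scale


# Hypothetical raw scores, for illustration only.
form_x = [10, 12, 14, 16, 18]
form_y = [12, 15, 18, 21, 24]
equate = linear_equate(form_x, form_y)
print(equate(16))
```

A dozen lines like these can be run across dozens of forms in an afternoon. Nothing in them says whether the random-groups assumption actually held.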

Unfortunately, the technical part leaves you few options when the results just don’t make any sense.

That brings us to the conceptual and theoretical part of equating, which involves few, if any, complicated equations, but is the much more complex part of equating.

As a starting point, it’s important that we don’t confuse the concepts and theory behind the technical aspects of equating with the theoretical part of equating.  That’s a rookie mistake or a veteran diversion.

The concepts and theories that should concern us are those related to how students will perform on two different test forms or sets of items; or on the same test form taken on two different devices; or on a test form administered with accommodations; or on a test form translated into another language or adapted into plain English; or on test forms administered under different testing conditions; or on test forms administered at different times of the year or at different points in the instructional cycle.  The list goes on and on.

Unfortunately, we know a lot less about each and every one of those conditions than we do about the technical aspects of equating.

In the past, our go-to solution was to try to develop test forms that required as little equating as possible. That approach, sadly, is no longer viable.  We have now moved beyond equating test forms to applying procedures to add new items to a large item pool; that is, to place the items on a “common scale” with the other items in the pool.

It was also tremendously helpful that in the past we didn’t really expect any change in performance at the state level from one year to the next.  That is, we had a known target, or a fixed point, against which to compare the results of our equating.  If the state average moved more than a slight wobble, we went back to find the problem in our analyses.  It was a simpler time.
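The "slight wobble" check described above amounts to comparing the year-to-year shift in the state average against a tolerance. A rough sketch follows; the function name, the tolerance, and the scale scores are all hypothetical, chosen only to show the shape of the check.

```python
def flag_implausible_change(prior_mean, current_mean, tolerance=1.0):
    """Return True when the year-to-year shift in the state mean exceeds the tolerance.

    The tolerance is an assumption for illustration; in practice it would
    have to come from what is known about how much performance can move.
    """
    return abs(current_mean - prior_mean) > tolerance


# Hypothetical state mean scale scores.
print(flag_implausible_change(240.0, 240.4))  # small shift
print(flag_implausible_change(240.0, 243.5))  # large shift, go back and recheck
```

The check only works when there is a defensible expectation for the tolerance, which is exactly the fixed point that has been lost.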

Where we go from here

We cannot return to that simpler time, but neither can we abandon some of its basic principles.

When developing new technology, as we are doing now with large-scale and personalized assessment, it is important to have a known target against which to evaluate our results.  When MIT professor Harold ‘Doc’ Edgerton was testing underwater SONAR systems, he is quoted as saying that one of the advantages of testing the systems in the Boston Harbor was that the tunnels submerged in the harbor didn’t move.  He knew where they were and they were always in the same place.

We need the education equivalent of a harbor tunnel against which to evaluate our beliefs, theories, procedures, and results.  We are now in a situation where the amount of change in student performance that has occurred from one year to the next is determined solely by equating.  There is no way to verify (or falsify) equating results outside of the system.  That is not a good position from which to operate a high-stakes assessment program; particularly at a time when so many key components of the system are in transition.

Finding such a fixed target is not impossible, but it is not something that can be done on the fly. We cannot continue to move from operational test to operational test.

Our current model of test development for state assessment programs rarely includes any opportunity for pilot testing.  That has to change.

We need to rely less on the technical aspects of equating and invest more in understanding the concept of equating.

We need a better understanding of student learning, student performance, and how student performance changes over time before we build our assessments and equating models.

We need to be humble and acknowledge our limitations.  A certain degree of uncertainty is not a bad thing, if its presence is understood.

Finally, we need to move beyond the point where whenever I think about equating, this scene from Apollo 13 immediately comes to mind.

My 12 Memories of Christmas

Charlie DePascale

 

As time goes by, certain memories of Christmases past become stronger than others.  Most are filled with family, food, music, and fun; but a few other things manage to creep in as well.  On Christmas 2017, here is a list of my 12 memories that mean Christmas to me.

  1. The First Noel 

My sister, 2 years younger than me, was at a stage where she recognized letters but could not yet read.  One Sunday afternoon before Christmas, my father was at the sink washing dishes and my sister was in the pantry looking at the box of Christmas cookies with the word NOEL written across the package.  She asks my father, “What does N-O-E-L spell?”  She hears his response, “No L,” and responds, “Yes, there is an L.”  After 5 minutes of my father trying to explain, my sister becoming increasingly frustrated, and me laughing hysterically, I knew that Christmas always was going to be one of my favorite holidays.

  2. The Meaning of Christmas

Beginning with Rudolph (1964) and Charlie Brown (1965), followed by the Grinch (1966) and Santa Claus is Coming to Town (1970), I learned the “specials” meaning of Christmas –

There’s always tomorrow for dreams to come true, believe in your dreams come what may

Maybe Christmas, he thought… doesn’t come from a store.  Maybe Christmas, perhaps… means a little bit more!

Christmas Day is in our grasp! So long as we have hands to clasp!

You put one foot in front of the other, And soon you’ll be walking ‘cross the floor. You put one foot in front of the other, And soon you’ll be walking out the door

 

And, of course.

 

 

And just in time in 1974, along came The Year Without A Santa Claus with the Heat Miser, Cold Miser and the advice to “just believe in Santa Claus like you believe in love.”

 

  3. Smile and say Santa

It wouldn’t be Christmas, of course, without the photo card of me and my sister (and later the dog).  In the days before Shutterfly, digital cameras, and even 1-hour photo, that meant family “photo shoots” beginning in late summer or early fall to ensure a good picture.  Most years that involved bringing a few Christmas decorations up from the basement and dressing us in nice clothes.  One year, my Mom decided that an outside shot would be nice.  So, in August we were hanging tinsel and decorations on the evergreen tree in my aunt’s front yard and donning our winter coats.

 


 

  4. 1968

In 1967, I was aware of the space program and the Red Sox.  In 1968, the rest of the outside world came crashing through the door.  From the Pueblo to the Tet Offensive to Martin Luther King to Bobby Kennedy to the Democratic Convention to the protest at the Olympics to the election of Richard Nixon, everything seemed to be spinning out of control.  And then came Apollo 8 and its Christmas Eve broadcast while orbiting the moon.

 

  5. Grand Funk Railroad – We’re an American Band

Throughout my childhood the annual Smyth family Christmas party brought together three generations of my mother’s side of the family. Grandparents, aunts and uncles, and most of our gaggle of 17 cousins gathered for an afternoon of food, games with Christmas-themed trinkets as prizes, Christmas music, etc. In the mid-1970s one of my older teenage cousins decided to replace the Christmas album on the record player with a pretty yellow album and music I had never heard before.  A few minutes later, parents entered from the kitchen, the album was flying across the room toward the wall, and our annual Christmas parties were no more.  I still use We’re an American Band as the song on my alarm clock.


 

  6. Here We Come a Caroling

From 1971 through 1977, I attended Boston Latin School – the oldest public school in the country and a place steeped in tradition.  One of the more recent, informal, and beloved traditions was members of the senior class singing Christmas carols on the balcony overhanging the cafeteria during lunches on the day before Christmas vacation.  Nothing, even traditions, however, lasts forever. In 1972, the school became coed.  Sometime in that period, they took away the one minute of silent meditation and reflection at the beginning of the day.  In my senior year, they told us Christmas carols were no longer allowed in school and administrators and faculty did everything they could to thwart their singing.  Growing up in Boston, I had not realized that Christmas was controversial.

  7. Our 2nd Christmas Together

My wife, Lisa, and I were married in September 1984, and our first two years of married life were spent in Minnesota while I completed coursework for my PhD program.  This meant leaving our apartment in Minneapolis in mid-December to return “home” to Boston for Christmas. The first year, we decided not to decorate the apartment because we would not be there for Christmas.  A bad idea. The second year, around Thanksgiving we headed to the Goodwill store up the street, picked up a silver aluminum tree, and all of the boxes of old ornaments that we could get for $25 (and could fit into our Sentra).  We had our first Christmas tree and have had one every year since then (although never silver again).  Upon returning to Minneapolis after Christmas break, we donated the tree and ornaments back to Goodwill.

  8. The Brown Box

Lisa is one of those lovely people with the tremendously annoying ability to pick up a wrapped present and know immediately what it is.  One year, my parents were determined to stymie her.  They bought her a sewing box from Italy made of beautiful brown wood.  It looked like they had her.  As she removed the wrapping paper and stared at the brown cardboard box with no clue what it contained, she reported what she saw – a brown box.  My Dad, mistakenly thinking that she had guessed that it was the “brown sewing box,” was beside himself and gave away the surprise, hilarity ensued, and an annual Christmas story was born.

  9. Christmas Eve (part 1)

Everywhere else, December 24 was Christmas Eve.  For our family, however, it was first and foremost my grandfather’s birthday.  That meant a wonderful Italian dinner followed by an evening with his entire family (see #5 above) filling the two-family home we shared with them.  After lots of food, laughter, and probably just a little drinking, the evening invariably ended in our living room with my sister at the piano and my uncles leading the singing of Christmas carols and old standards from the 1930s, 40s, and 50s.

  10. Christmas Eve (part 2)

After my grandfather’s passing, Christmas Eve became a night to celebrate with my wife’s family.  That became a bit more complicated when her sister married a “nice Jewish boy from Long Island” and they were raising their children in that faith.  In the spirit of family peace and harmony, the compromise was “Chanukah presents”, wrapped in a limited variety of blue Chanukah paper (buy as many rolls as you can when you see it), and placed carefully under the Christmas tree.  Nothing confusing there for a child, right?  One year, when my niece was around 4 years old, I asked her what all the presents were for.  She looked up and replied matter-of-factly, “After dinner.”

  11. Dashing Through The Snow

In 2002, our Christmas Day visit to my parents’ house was threatened by an impending snowstorm.  The storm was expected to start by late morning and dump about a foot of snow on us in Southern Maine.  Christmas was canceled this year.  Not so fast.  In the spirit of Rudolph, we loaded up the sleigh (car with presents) right after breakfast and started out on the 90-minute drive to my parents’ house.  We called my parents from the car as we were approaching their house (during which call my Mom told us to hold on, someone was pulling into their driveway).  They got to give Christmas presents to their granddaughter in a whirlwind visit (which was not quite as much of a whirlwind as it should have been), and we made it almost all the way back home to Maine before the roads were covered with snow.

  12. Our child is born

 


In December 1993, we were preparing for the birth of our first child – who was due in late January.  Mary had other plans and was born on December 15.  Starting out at just a bit under 4 pounds, she spent the first ten days of her life in the hospital in an isolette (or baby incubator).  When we arrived at the hospital on Christmas morning, however, she was out of the isolette for the first time and dressed in a bright green Christmas onesie.

 

We spent that day holding our Christmas miracle and starting a whole new set of Christmas memories with her.  To my staples of Rudolph, Charlie Brown, etc. were added Elmo Saves Christmas, Arthur’s Perfect Christmas, Elf, and The Polar Express.

So, as the song says ….

 

Have yourself a merry little Christmas
Let your heart be light
From now on, our troubles will be out of sight
Have yourself a merry little Christmas
Make the Yuletide gay
From now on, our troubles will be miles away
Here we are as in olden days
Happy golden days of yore
Faithful friends who are dear to us
Gather near to us once more
Through the years we all will be together
If the fates allow
So hang a shining star upon the highest bough
And have yourself a merry little Christmas now

The Physics of Psychometrics

Charlie DePascale

A metaphysical tale of the past, present, and future relationship between psychometrics and educational assessment.

 

A cautionary tale


Charlie DePascale

Earlier this month I traveled to Lawrence, Kansas to attend the NCME special conference on the confluence of classroom assessment and large-scale psychometrics. In a panel discussion titled “I’ve a feeling we’re not in Kansas anymore,” Kristen Huff, Karen Barton, Paul Nichols, and I shared the perspective that when bringing together classroom assessment and large-scale psychometrics, co-existence might be a better goal than confluence.  Our goal was to engage in a critical discussion about those differences between the purposes and uses of classroom and large-scale assessment that give each unique measurement and psychometric needs. To accompany that discussion, we offered the following look over the rainbow at what might result if we blindly attempt to apply large-scale psychometrics in the classroom.

 

Gillikin County Schools Embrace Classroom Psychometrics

Emerald City –  (September 13, 2017)

As a new school year opens, teachers and administrators in Gillikin County, just north of Emerald City, are clicking their heels with excitement over the impact of their use of a new process called Classroom Psychometrics. Some call it science, others magic, but all agree that it has changed the way that they look at teaching and at their students.

Superintendent Glinda Marvel describes how the district made the leap to classroom psychometrics:

Our teachers have been hooked on data since we introduced TestWiz ™ back at the turn of the century.  Over time, however, they found that the data wizard just did not provide them with enough information – did not let them see what was going on behind the curtain.  Then last year, this new computerized adaptive test (CAT) system – TOTO is dropped in our laps.  Now that’s a horse of a different color.

At first, all of the thetas and sigmas were just Greek to me. And many of our teachers were fearful.  They didn’t understand how test questions that discriminate and just getting rid of the misfits could be a good thing.  But then we saw that with TOTO none of our little munchkins ever have to get a question wrong.  The little lights on their individual dashboards are always green.

S. C. Strawman, doctor of thinkology and president of Emerald Associates, explains that the beauty of classroom psychometrics is that it untangles the messy webs one often encounters with learning progressions and popular learning map models – “once teachers realize that the manifest monotonicity of classroom psychometrics is their students’ manifest destiny not even a horde of flying monkeys can pull them off course.”

Or as Castle High School teacher, Almira Gulch put it, “understanding that everything can be explained by just one dimension made our lives so much easier.”  And principal Frank Marvel added, “accepting that there will be some kids who just don’t fit the model no matter what we do, took so much pressure off of our teachers.”

Of course, not everyone is sold on TOTO and classroom psychometrics. Dorothy Gale, longtime school board member, summed it up this way:

Well…I think that it’s just that … Sometimes we go looking over the rainbow for the answer when it was really right there in our back yard all along.  Classroom Psychometrics really isn’t telling us anything new about our classrooms and students.  We just needed a little help seeing what was already there right in front of our noses.

Remember the Alamo


The Alamo

 

Charlie DePascale

This spring I returned to San Antonio to attend the 2017 NCME conference.  The trip brought back memories of my many visits to the Harcourt office there as a member of the MCAS management team for the Massachusetts Department of Education. My last MCAS trip was in August 2002.  Some things in San Antonio were exactly as I remembered them from fifteen years ago, others had changed a little, and some are now merely memories.

HEM Badge

Harcourt, of course, is gone.  Schilo’s still has the best root beer I’ve ever tasted.  The Alamo is still nestled in the midst of office buildings, souvenir shops, and tourist “museums”.  I still would have found the convention center with the street map I used in 2002 (no smart phone back then).  However, unlike the Alamo, the convention center had moved just a bit from where I remembered it being in 2002.

Like Harcourt, the original MCAS program is gone; finally replaced this spring by the next generation MCAS tests. But, what about assessment and accountability, in general?  How have they changed since the summer of 2002?  What is the same, what has changed a little, and what is now a memory?

Still crazy after all these years

Like the Alamo, much of the facade of assessment and accountability remains unchanged.

  • State content standards, achievement standards, annual testing, and accountability ratings are still the foundation of our work.  
  • Individual state assessments and achievement standards are still the norm despite the best efforts of a lot of earnest folks.
  • NAEP and its trends are still the gold standard. The nation’s report card appeared to be teetering on the brink of irrelevancy back in 2009 with the advent of the Common Core and state assessment consortia.  Like the cockroach and Twinkies, however, it appears that NAEP is impervious to any kind of attack.

What’s new

NCLB has come and gone, but its effects on state assessment and accountability remain. Technology has also had an impact on how we test, and to a lesser extent for the time being, what we test.

  • Annual testing of all students at grades 3 through 8 remains in place – the primary legacy of NCLB.  

We have not yet come up with a good reason for testing all students in English language arts and mathematics every year, but there is surprisingly strong support for doing so. No, telling parents the truth and being able to compute growth scores are not good reasons for administering state assessments to every student every year.

  • Growth as a complement to status and improvement – I place growth scores as a close second to the annual testing of all students when considering the legacy of NCLB; but second and not first only because annual testing was a sine qua non for growth scores.  

Despite all of the value they add to the interpretation of student and school performance, I remain skeptical about our use of growth scores. Much like the original ‘1% cap’ on alternate assessments, growth scores emerged as a means to include more students in the numerator when computing the “percent proficient” for NCLB.  Yes, there are sound reasons for giving schools credit for growth, but growth scores would have died a quick death if they could not have been used to help schools meet their AMO.

I am worried about what I have referred to as the slippery slope of growth; that is, the use of growth scores as the 21st century approach to accepting different expectations for different groups of students.

I am worried about the use of growth scores as another example of our predilection for treating the symptom rather than the disease.  People are not fit – take a pill to lose weight.  College debt is too high – improve student loans.  Students have no reasonable chance to master grade level standards within a school year – compute growth scores.

  • Online testing (née computer-based testing) replacing paper-and-pencil testing.  For years, it appeared that online testing was the proverbial carrot being dangled in front of the horse – always just a bit out of our reach.  Now, however, it appears that we have finally reached a tipping point.  Although there are still glitches, in states across the country the infrastructure is in place to support large-scale online testing.

Online testing itself, of course, is only one way in which technology has impacted large-scale assessment.  Figuring out the best way to make use of things like automated scoring, student information systems, adaptive testing, dynamic reporting, and the wealth of process data available from an online testing session should keep the next generation of testing and measurement specialists quite busy for years to come.

  • Communication across states – There is now constant communication among states, and that communication occurs at multiple levels (commissioners/deputy commissioners, assessment directors, content specialists, etc.).  Some might argue that there is often more and better communication within levels across states than across levels within a state, but let’s save that discussion for another day.

The increased communication across states can be attributed to the common requirements and pain of NCLB, technology, the assessment consortia, or all of the above.  Cross-state communication, however, deserves its own place in any list of things that are new since 2002.   The bottom line is that although it may not have resulted in common assessments to the extent expected, increased communication across states has changed how we think about and how we do state assessment and accountability.

Gone, but not forgotten (I hope)

Nothing lasts forever (except NAEP trend lines, see above), but there are a few things that faded away or simply disappeared over the last fifteen years that surprised me.

  • District accountability – It feels as if there is a lot less of a focus on district accountability as something distinct from school accountability now than there was in 2002; even after NCLB was first enacted.   

District report cards are often simply aggregated school report cards, with districts evaluated on the same metrics and indicators as schools.  Although technology and laws/regulations have made it much easier for states to interact directly with schools, there is something critical in the hierarchy among states, districts, and schools that must be maintained.  Perhaps as the accountability pendulum swings slightly back from an exclusive focus on outcomes (i.e., test scores), the impact of district inputs, processes, and procedures on student performance will receive greater attention.

  • Standardization – So critical to large-scale assessment that it was actually part of the name (i.e., “standardized testing”), standardization is dead; and it will not be returning any time soon.  To some extent, standardization, in general, was a casualty of the backlash against traditional “fill in the bubble tests” and test-based accountability, but that is only part of the story.  

We (i.e., the assessment and measurement field) gave up standardization in administration conditions in stages.  Time limits were abandoned in the name of proficiency, “standards-based” education, and allowing kids to show what they could do  (don’t ask me to explain how that made sense).  Standardized administration windows were abandoned to accommodate the use of technology.  Concerns about validity and the appropriateness of accommodations for students with disabilities and English language learners gave way to “accommodations” for all in the name of equity and fairness.

We were never truly invested in standardization of content for individual students, so we gave that up willingly as it allowed us to play with measurement toys like adaptive testing, matrix sampling, and “extreme equating” that is worthy of the name.

As for standardization of scoring, well, if we limit the concept of scoring to mean the scoring of an item then as soon as we moved away from machine-scored, multiple-choice items, standardization of scoring took a hit.  The larger concept of scoring, however, is a bit too complicated to address adequately near the end of a post such as this.  For now, let’s just end our discussion of standardization of scoring with the question that has been posed in countless stories and songs, can you lose something that you never really had?

  • Time – I believe that the single most important change in large-scale assessment since 2002 may be the loss of time available to design, develop, and implement an assessment program.  

The RFP for the original MCAS program was issued in 1994; after some delay the contract was awarded in 1995; and the first operational MCAS tests were administered in spring 1998.  As new tests were added to the MCAS program, it was the norm for a test to be introduced via a full-scale pilot administration one year prior to the first operational administration.  

In contrast, the RFP for the next generation MCAS tests was issued in March 2016; the contract was awarded in August 2016; and the first operational MCAS tests were administered in spring 2017.

And Massachusetts is not alone.  In states across the country, it is becoming normal practice to go from RFP to operational test in less than a year.  This would not be as much of a problem if states were simply purchasing ready-made, commercial tests that had been carefully constructed and validated for use in the state’s particular context.  For most state assessments, however, that is not the case.

Four years from initial design to implementation may or may not be too long, but there is no question that 10 months is too short.  And I don’t want to hear “the perfect is the enemy of the good” or “building the plane while we are flying it” as arguments for minimizing the test development cycle.  First, those concepts don’t really apply to things like planes (and high-stakes assessments).  Second,  they do not apply in one-sided relationships in which mistakes are met with fines, lawsuits, and loss of contracts. Third, what’s the rush?

Where do we go from here?

I identified the loss of time as the most significant change not so much because of its negative impact on the quality of state assessments, but rather because of what it signifies.  The requirement to design, develop, and implement custom assessments in less than a year is a clear indication that the measurement and assessment community may have lost what little control or influence we had over assessment policy.  Recent trends in the area of comparability are another example.  We are more likely to be asked after the fact “how to make things comparable” or to “show that they are comparable” than to be asked to offer input in advance on “whether an action will produce results that are comparable.”  To paraphrase an old expression, policymakers find it much easier to ask the measurement community for justification or a rationalization than permission.

In part, this may be a reflection of policymakers facing many constraints and limited options. In part, however, I believe that we have ourselves to blame; we are victims of our own success, or at least of our own public relations machine.  Psychometricians can do anything with test scores!  Unfortunately, the field cannot survive as a scientific discipline by simply saying yes to every request that is made, regardless of how unrealistic or unreasonable that request may be.  (Note that I am making a distinction between testing companies surviving and the assessment/measurement field surviving.)

The fate of the field, however, may have already been sealed by the emergence of Big Data. Only time will tell.  

I will meet you at Schilo’s in 2032 and we can discuss it over a frosty mug of root beer. The first round is on me.


 

Charlie DePascale

 

Badges

12 days, 3 conferences: PowerPoint presentations, posters, uncomfortable chairs, and a few random thoughts.

  1. Conference presentations are an art form: whether it’s a keynote address, a 15-minute research presentation, an “electronic board,” or a poster, a good presentation must tell a story, make a point, and deliver a message. A picture can be worth thousands of words.  Thousands of words on a slide are not worth quite as much.

 

  1. Long breaks between sessions are nice. Finding the right break:session ratio can make or break a conference.

 

  1. With the right speakers and topics, two days of plenary sessions can be magical.

 

  1. It can be good to put yourself in situations where you’re the oldest person in the room for two days. (but you know, not in a creepy way)

 

  1. I hate effect size. It may have been an improvement over reporting simple significance levels, but that’s not enough.  You still have to be able to explain the impact that your treatment will have.  Even dreaded econometricians tell me that a certain treatment will result in a student earning $5 a week more for the next 40 years.  I don’t believe any of it, but I appreciate the effort.

 

  1. Every conference should have a few sessions where we share all of the things that didn’t work as planned, describe all of the treatments that had absolutely no impact at all. We can learn so much from things that didn’t work.

 

  1. I miss overhead transparencies. Sure they were messy and limited, but they seemed to breed spontaneity.  Pretty much every session in the transparency era included somebody grabbing a blank transparency and a marker to rebut or support a presenter. That just doesn’t happen with PowerPoint.

 

  1. How about a conference with “anything can happen Thursday”? Imagine a session on Thursday afternoon with a block of concurrent sessions, but no descriptions.  You don’t know what the topic is or who the presenters will be.  You just pick a room, sit back, and engage with the presentations.  The downside is minimal, and you may learn something new.

 

  1. Researchers have to ask big, important questions; or at least they have to know and understand the big important questions. Even if your project is to make a better widget, you should have some idea how the better widget might be used.

 

  1. As Seen on TV: Nothing ever works the same way at home as it did on TV.  That’s the feeling I get with many presentations.  This treatment will work with high quality instruction, appropriate administrative support, sufficient training, proper implementation, and kids who want to learn.  And the cold remedy will work if taken for 10 days with lots of fluids and plenty of rest.

 

And why is it so hard to get a comfortable temperature in hotel conference rooms?

Bridging the Gaps

Charlie DePascale

Apparently, it’s all about gaps.

I have attended two research conferences so far this month; and at both conferences there was lots of discussion about lots of gaps.  At the NEERO conference, the discussion focused on achievement and opportunity gaps.  At the CEC convention, the gap between educational research and practice as well as the gap between the promise and reality of technology were added to the conversation.  Each of those gaps was exacerbated by communication gaps and, ultimately, policy gaps.

In this post, let’s focus on the achievement and opportunity gaps.  At the conferences, they were often presented as a forced choice test: you could choose to focus on the achievement gap OR you could focus on the opportunity gap; but not both.  On one level that makes sense to me.  After you have confirmed the existence of an achievement gap five, ten, or fifteen times, there is little to gain from simply showing that the gap exists once again.  At some point, the focus has to shift to identifying and eliminating the causes of that achievement gap.  That conversation should quickly lead to the opportunity gap.  There is little doubt that factors within and outside of schools that impact students’ opportunity to learn are significant contributors to the achievement gap.  In that sense, I view the achievement gap and opportunity gap as inseparable issues – two sides of the same coin. I am concerned about the opportunity gap because it leads to gaps in academic achievement and other inequities.

There is a danger, however, in focusing totally on the opportunity gap and ignoring the existence of the achievement gap (or worse, denying its existence). Eliminating the opportunity gap is difficult and expensive, requires long-term commitments from a variety of stakeholders, and will not occur overnight. Statistically controlling for the impact of the opportunity gap, on the other hand, is relatively simple. And when faced with a choice between a difficult and a simple solution, well, we know how that game ends.

There are certainly legitimate uses for approaches that attempt to control for differences in opportunities when reporting results, and particularly when holding schools and districts accountable.  The similar school bands of the 1980s and 1990s or the more recent value-added models reflect attempts to make fair comparisons between schools or to hold schools to reasonable standards.  The danger is that under the wrong circumstances conditional expectations can easily morph into lowered expectations.  It is a slippery slope.  And that is why it is important to keep a balanced focus on both the opportunity gap and the achievement gap.

Going too far down the path of explaining away the achievement gap also increases the likelihood that people will fall back on the ability gap rather than the opportunity gap as the cause of differences in student achievement. (Again, policymakers, like other physical objects, seek the path of least resistance.) In many ways, it is a belief in the ability gap, and not the existence of an achievement gap, which should be regarded as the real threat to improving opportunities and eliminating inequities in education. Always lurking below the surface, emerging on occasion, the ability gap renders the achievement gap immutable; and in a strange way there is something comforting, albeit destructive, in allowing oneself to think of a difficult challenge as unsolvable.

There can be little doubt that eliminating opportunity gaps (within and outside of school) is by far the most important factor in improving student learning and eliminating achievement gaps. In an age of accountability, however, it can be quite difficult for stakeholders to focus simultaneously on long-term solutions and short-term ratings. One need only read the statement of purpose for Title I (see below) to get a sense of why this is so difficult. The constant shifting back and forth between the need for programs to provide equal opportunities and the need for accountability systems and assessments to measure gaps is enough to make your head spin. But that is the task before us.

We need to bridge the current gap between the programs being implemented under Title I and the measures of the effectiveness of those programs. Bridging that gap will require an understanding of the realistic, research-based outcomes that can be expected from those programs in the short term, when they are implemented under existing (less than ideal) conditions, and over the long term. Acquiring that understanding will require honest communication about both the opportunity gap and the achievement gap. And that understanding and communication will have to lead to sound policies. That’s the annoying thing about bridging gaps. There are no shortcuts.

 

 


Title I — Improving The Academic Achievement Of The Disadvantaged

SEC. 101. IMPROVING THE ACADEMIC ACHIEVEMENT OF THE DISADVANTAGED.

Title I of the Elementary and Secondary Education Act of 1965 (20 U.S.C. 6301 et seq.) is amended to read as follows:

TITLE I–IMPROVING THE ACADEMIC ACHIEVEMENT OF THE DISADVANTAGED

SEC. 1001. STATEMENT OF PURPOSE.

The purpose of this title is to ensure that all children have a fair, equal, and significant opportunity to obtain a high-quality education and reach, at a minimum, proficiency on challenging State academic achievement standards and state academic assessments. This purpose can be accomplished by —

(1) ensuring that high-quality academic assessments, accountability systems, teacher preparation and training, curriculum, and instructional materials are aligned with challenging State academic standards so that students, teachers, parents, and administrators can measure progress against common expectations for student academic achievement;

(2) meeting the educational needs of low-achieving children in our Nation’s highest-poverty schools, limited English proficient children, migratory children, children with disabilities, Indian children, neglected or delinquent children, and young children in need of reading assistance;

(3) closing the achievement gap between high- and low-performing children, especially the achievement gaps between minority and nonminority students, and between disadvantaged children and their more advantaged peers;

(4) holding schools, local educational agencies, and States accountable for improving the academic achievement of all students, and identifying and turning around low-performing schools that have failed to provide a high-quality education to their students, while providing alternatives to students in such schools to enable the students to receive a high-quality education;

(5) distributing and targeting resources sufficiently to make a difference to local educational agencies and schools where needs are greatest;

(6) improving and strengthening accountability, teaching, and learning by using State assessment systems designed to ensure that students are meeting challenging State academic achievement and content standards and increasing achievement overall, but especially for the disadvantaged;

(7) providing greater decisionmaking authority and flexibility to schools and teachers in exchange for greater responsibility for student performance;

(8) providing children an enriched and accelerated educational program, including the use of schoolwide programs or additional services that increase the amount and quality of instructional time;

(9) promoting schoolwide reform and ensuring the access of children to effective, scientifically based instructional strategies and challenging academic content;

(10) significantly elevating the quality of instruction by providing staff in participating schools with substantial opportunities for professional development;

(11) coordinating services under all parts of this title with each other, with other educational services, and, to the extent feasible, with other agencies providing services to youth, children, and families; and

(12) affording parents substantial and meaningful opportunities to participate in the education of their children.