Frederick M. Hess sat down recently with EdNews.org to discuss the state
of education reform efforts.
Q: Rick, you recently published an article in Educational Leadership
arguing that the ways in which we rely on data to drive decisions in
schools have changed over time. Yet, you note that we have unfortunately only
succeeded in moving from the “old stupid” to the “new stupid.” What do you
mean by this?
A: A decade ago, it was only too easy to find education leaders who dismissed
student achievement data and systematic research as having only limited utility
when it came to improving schools. Today, we’ve come full circle. You can’t
spend a day at an education gathering without hearing excited claims about
“data-based decision making” and “research-based practice.” Yet these phrases
can too readily serve as convenient buzzwords that obscure more than they
clarify and that stand in for careful thought. There is too often an unfortunate
tendency to simply embrace glib solutions if they’re packaged as “data-driven.”
Today’s enthusiastic embrace of data has waltzed us directly from a petulant
resistance to performance measures to a reflexive reliance on a few simple
metrics–namely, graduation rates, expenditures, and grade three through eight
reading and math scores. The result has been a race from one troubling mindset
to another–from the “old stupid” to the “new stupid.”
Q: Can you give us an example of the “new stupid”?
A: Sure, here’s one. I was giving a presentation to a group of aspiring
superintendents. They were eager to make data-driven decisions and employ
research to serve kids. There wasn’t a shred of the old stupid in sight. I
started to grow concerned, however, when our conversation turned to value-added
assessment and teacher assignments. The group had recently read a research brief
highlighting the effect of teachers on achievement and the inequitable
distribution of teachers within districts. They were fired up and ready to put
this knowledge to use. One declared to me, to widespread agreement, “Day one,
we’re going to start identifying those high value-added teachers and moving them
to the schools that aren’t making AYP.”
Now, I sympathize with the premise, but the certainty worried me. I started
to ask questions: Can we be confident that teachers who are effective in their
current classrooms would be equally effective elsewhere? What effect would
shifting teachers to different schools have on the likelihood that teachers
would remain in the district? Are the measures in question good proxies for
teacher quality? My concern was not that they lacked firm answers to these
questions–that’s natural enough even for veteran superintendents–it was that
they seemingly regarded such questions as distractions.
Q: What’s a concrete example of where educators and advocates
overenthusiastically used data to tout a policy, but where the results didn’t
pan out? What went wrong?
A: Take the case of class-size reduction. For two decades, advocates of
smaller classes have referenced the findings from the Student Teacher
Achievement Ratio (STAR) project, a class-size experiment conducted in Tennessee
in the late 1980s. Researchers found significant achievement gains for students
in small kindergarten classes and additional gains in first grade. The results
were famously embraced in California, which in 1996 adopted a program to reduce
class sizes that cost nearly $800 million in its first year. But the dollars
ultimately yielded disappointing results, with the only major evaluation–by AIR
and RAND–finding no effect on achievement.
What happened? Policymakers ignored nuance and context. California encouraged
districts to place students in classes of no more than 20–but that class size
was substantially larger than those for which STAR found benefits. Moreover,
STAR was a pilot program serving a limited population, which minimized the need
for new teachers. California’s statewide effort created a voracious appetite for
new educators, diluting teacher quality and encouraging well-off districts to
strip-mine teachers from less affluent communities. The moral is that even
policies or practices informed by rigorous research can prove ineffective if the
translation is clumsy or ill considered.
Q: You and I both know that experimental studies are
based on very rigid standardized approaches where there is experimental control.
What is wrong with using these studies and trying to apply them to the “real
world”?
A: The class-size example cited above points to one enormous
challenge–generalizing findings across place and time. Details about policies
and the contexts in which they are implemented vary across locales. Just as an
employee pay system that has been shown to work for Google will not necessarily
work for Citigroup, so too a pay system that a study shows to work in one
school or district shouldn’t be casually presumed to translate everywhere.
Why is this? Employees in two different organizations may have been attracted
by different incentives, have varying levels of trust in management, be
organized differently, and so forth. Meanwhile, policies may take time to
mature, or early success may be due to the skill of early adopters, enthusiasm
associated with hot ideas, and foundation support. None of this will necessarily
translate. The lesson is that research findings must be interpreted thoughtfully
and with an eye toward what may change from the experimental site to other
Q: In medical research, there are specific parameters when it comes
to devising and conducting experiments. Do these same parameters apply to
A: Efforts to adopt the medical model in schooling have been plagued by a
flawed understanding of just how the model works in medicine and how it
translates to education. The randomized field trial, in which drugs or therapies
are administered to individual patients under explicit protocols, is enormously
helpful when recommending interventions for particular medical conditions. But,
for example, it is far less useful when determining how much to pay nurses or
how to hold hospitals accountable.
In education, curricular and pedagogical interventions can certainly be
investigated through randomized field trials. Even here, however, there is a
tendency for educators to be cavalier about research-based practice. When
medical research finds a certain drug regimen to be effective, doctors do not
casually tinker with the formula. Yet, in areas like reading instruction,
schools routinely alter the sequencing and elements of a curriculum, while still
touting their practices as research based. When it comes to policy, officials
must make tough decisions about questions like management and compensation that
cannot be examined under controlled conditions. Although research can provide
valuable insights, studies of particular school choice plans or accountability
systems (for the reasons I discussed a moment ago) are unlikely to answer
whether such policies “work.”
Q: In your mind, what are some of the main limitations of research as
they apply to schooling?
A: First, let me be clear: Good research has an enormous contribution to
make–but, when it comes to policy, this contribution is more tentative than we
might prefer. Scholarship’s greatest value is not the ability to end policy
disputes, but to encourage more thoughtful and disciplined debate.
In particular, rigorous research can establish parameters as to how big an
effect a policy or program might have, even if it fails to conclusively answer
whether it “works.” For instance, quality research has quieted assertions that
national-board-certified teachers are likely to have heroic impacts on student
achievement or that Teach For America recruits might adversely affect their
Especially when crafting policy, we should not expect research to dictate
outcomes but should instead ensure that decisions are informed by the facts and
insights that science can provide. Education leaders should not expect research
to ultimately resolve thorny policy disputes over school choice or teacher pay
any more than medical research has ended contentious debates over health
insurance or tort reform.
Q: Let me take a kind of liberal stance for a minute. You have two
schools–one has good test scores, but a lot of dropouts and juvenile
delinquents and teenage pregnancy. The second school has some academic problems,
but the kids are involved in sports, extra-curricular activities, and those kids
are good citizens, dare I say “church goers” and drug/alcohol abuse is minimal.
What does the data say to us in regard to these two schools?
A: Ultimately, it depends on whether we are collecting the right data and how
we want to read those data. When judging schools, do we think tests gauging
student achievement should hold pride of place or do we regard those results as
one part of a broader body of information? Personally, I think that
determination depends on how much confidence we have in those achievement tests
and on the nature of the school in question. Many of our state assessments are
so lacking that I would be skeptical of results that ran counter to a number of
other data points. At the same time, our inability to reliably measure other
kinds of outcomes forces us to rely more heavily on simple achievement metrics
than we might like.
In your example, if the low-achieving school is catering to low-performing
students or an at-risk population, for instance, we should be careful to weigh
positive trends and other student outcomes when considering the level of
Similarly, if the seemingly high-achieving school is attracting advantaged
students, then the good test scores should provide little comfort in the face of
the other evidence. The real question for both schools would be how much
confidence we have that students are learning and growing in the course of their
studies–and that requires finding ways to gauge the “value-added” of these
classrooms (whether in terms of achievement, cognition, behavior, etc.).
Q: Why do we seem to be giving “short shrift” to management data and
what does that imply about how school leaders make decisions on a daily
basis?
A: While embracing student achievement data, policymakers and practitioners
have paid scant attention to collecting or using data that are more relevant to
improving schooling. State tests provide results that are too coarse to offer
more than a snapshot of student and school performance, and few district data
systems link student achievement metrics to teachers, practices, or programs in
a way that helps determine what is working. Ultimately, student achievement
measures are largely irrelevant to judging the performance of many district
employees. It simply does not make sense to evaluate the performance of a
payroll processor or human resources recruiter–or a foreign language
instructor–primarily on the basis of reading and math test scores for grades 3
Student achievement data alone really only tell us what comes out of the
“black box” of schooling–they don’t tell us what is happening inside that box.
They illustrate how students are faring but do not enable an organization to
diagnose problems or manage improvement. It is as if a CEO’s management
dashboard consisted of only one item–the company stock’s price. Helping schools
and school systems improve operations and teaching and learning requires
tracking an array of indicators, such as how long it takes books and materials
to be shipped to classrooms, whether schools provide students with accurate and
appropriate schedules in a timely fashion, how quickly assessment data are
returned to schools, and how often the data are used. A system in which leaders
possess that kind of data is far better equipped to boost school performance
than one in which leaders merely have a palette of achievement data.
Q: How should we “steer clear,” as you put it, of the “new stupid”?
A: It requires at least three key things. First, educators should be wary of
allowing data or research to substitute for good judgment. When presented with
persuasive findings or promising new programs, they must ask the simple
questions: What are the benefits of adopting this program or reform? What are
the costs? How confident are we that the promised results are replicable? What
might complicate projections? Data-driven decision making does not simply
require good data; it also requires good decisions.
Second, schools must seek out the kind of data they need as well as the
achievement data external stakeholders need. Despite leaps in state assessment
systems and continuing investment in longitudinal data systems, school and
district leaders are a long way from having the management data they require to
support high-performing schools and systems. In practice, there is a rarely
acknowledged tension between collecting data with an eye toward external
accountability (measurement of performance) and doing so for internal management
(measurement for performance).
Third, school systems should reward education leaders and administrators for
pursuing more efficient ways to deliver services. Indeed, superintendents who
use data to eliminate personnel or programs–even if these superintendents are
successful and vindicated by the results–are often more likely to ignite
political conflict than to reap professional rewards. So long as leaders are
revered only for their success at consensus building and gathering stakeholder
input, moving from the rhetorical embrace of data to truly data-driven decision
making will remain an elusive goal in many communities.
Q: What do you see as the main motivation behind the “new stupid”? Is
it simply an example of good intentions gone awry?
A: In a word: yes. It’s a strategy pursued with the best of intentions. But
the problem is threefold. First, as we’ve discussed, too many times those of us
in K-12 are unsophisticated about what a particular study or a particular data
set can tell us. Second, the very passion that infuses the K-12 sector creates a
sense of urgency. People want to fix problems now, using whatever tools are at
hand–and don’t always stop to realize when they’re trying to fix a Swiss watch
with a sledgehammer. Third, the reality is that we still don’t have the kinds of
data and research that we need. So, too often, the choice is to misapply extant
data or simply go data-free. Everyone involved means well; the trick is to
provide the right training and the right data, and for practitioners,
policymakers, and reformers to ensure that compassion doesn’t swamp common sense.
Frederick M. Hess is a resident scholar and the director of education
policy studies at AEI.
© 2016 American Enterprise Institute for Public Policy Research