- Encode claims the long stretches of DNA, previously thought of as junk, are crucial to the way our genome works.
- The old theory states a small percentage of the human genome contains the instructions for how we look, feel and act.
- Most mammalian DNA, more than 98 percent of it, was considered accumulated evolutionary “junk.”
- Professor Dan Graur claims the Encode group confused biological activity with functional importance in the cell.
As Jon Entine of the Genetic Literacy Project reports, the boxing gloves have come off in the battle over the meaningfulness of ENCODE’s assessment of how the human genome works.
Is most of our genetic material akin to evolutionary detritus that has accumulated over the course of millions of years during which single-celled organisms evolved into modern humans, as most geneticists have long believed? Or is it something a great deal more functional, as a recently fashionable theory hyped by the media has it?
To quote from one prominent scientist watching this fast-developing Battle Royale play out: The critics of the new theory “haven’t just poked a hole in the balloon, they’ve set it on fire (the humanity!), pissed on the ashes, and dumped them in a cesspit.”
Rough stuff. Here’s the skinny.
Last fall, scientists with the international ENCODE (Encyclopedia of DNA Elements) consortium, launched by the National Human Genome Research Institute in 2003, announced what the authors said was a breakthrough in identifying all the functional elements in the human genome sequence. In findings published across 30 papers in Nature, the consortium claimed that long stretches of DNA, previously dismissed as “junk,” are in fact crucial to the way our genome works.
“In 2000, we published the draft human genome and, in 2003, we published the finished human genome and we always knew that was going to be a starting point,” said Dr. Ewan Birney, of the European Bioinformatics Institute near Cambridge, one of the project’s principal investigators. “We always knew that protein-coding genes were not the whole story.”
The Encode analysis was actually tedious reading designed for the genetic über-insider. After all, it was a sprawling project designed to deliver a reference manual for the genome. Suddenly, the scientists were global rock stars, and reveling in the attention.
Nature played up the story big time, with a number of firsts: cross-publication topic threads, a dedicated iPad/eBook app and website, and a virtual machine. Journalists, by and large, were rapturous, devoting pages of articles and elaborate mockups and online tutorials to explaining this apparent breakthrough. “Far From ‘Junk’,” headlined Gina Kolata of The New York Times, credulously, with nary a hint of doubt about the paradigm-shifting conclusions. Robin McKie, The Guardian (UK)’s top-flight science editor, as recently as this past weekend, gushed that last September’s announcement was the scientific surprise of 2012.
Actually, the broad strokes of what the consortium found had been known for years. Evolution is unforgiving. If the roulette wheel of genetics lands on our number, and we get a beneficial mutation, our descendants are likely to thrive and reproduce. Future generations unlucky enough to inherit a harmful mutation are history. But where amongst our 20,000 or so genes can we find the DNA material that matters: the sequences that code for our physical and behavioral characteristics?
Until the past decade or so, it had been accepted wisdom in the genetics community that only the tiniest percentage of the human genome contains the instructions that determine how we look, feel and act—whether we (or our ancestral population group) are more likely to be grumpy or gregarious, impetuous or cautious, generous or a Grinch, a speedster or a marathoner, slow-witted or a math ace. Most mammalian DNA—more than 98 percent of it—was considered accumulated evolutionary “junk.”
Scientists long likened this genetic material, which they sometimes called the “dark matter” of the human genome, to a discarded heap of outdated books with the relevant wisdom incorporated in newer, revised volumes squeezed into the most usable 2 percent or less. This vast majority of genetic material, known as functional noncoding DNA and mostly embedded within and around the genes, was thought to play an important but murky role in regulating how the coding genes go about their business.
That Encode announcement last September—contested by many scientists but embraced mostly uncritically by science journalists—challenged the established view. The focus of most researchers had largely been on looking for glitches within genes themselves. The Encode research suggested we should look elsewhere in our DNA sequence—to the junkyard. It was said to usher in a new chapter in our understanding of how genes operate.
The consortium claimed to have identified more than 10,000 new “genes” that code for components that control how the more familiar protein-coding genes work. Up to 18% of our DNA sequence is involved in regulating the less than 2% of the DNA that codes for proteins, they asserted. Encode scientists claimed about 80% of the DNA sequence can be assigned some sort of biochemical function—a startling claim that marked the supposed “death of junk DNA” as Discover magazine put it.
This new perspective, they said, would open new leads for scientists looking for treatments for conditions such as heart disease, atherosclerosis, type 2 diabetes, psoriasis and Crohn’s disease that have their roots partly in glitches in the DNA.
Less sexy reality
Most scientists had expected that the Encode researchers would uncover some new functions for non-coding DNA, but the 80% figure was way out of proportion to what everyone had expected. The problem was that they used a very low bar for “function.” In a rebuke startling to credulous journalists but not to the genetics community, a caustic and often sarcastic critique, astonishingly titled “On the immortality of television sets: ‘function’ in the human genome according to the evolution-free gospel of ENCODE” and just published in the journal Genome Biology and Evolution, takes down the Encode scientists in language usually encountered only in late-night pub brawls.
“Everything that Encode claims is wrong,” charged the lead author of the paper, Professor Dan Graur, of the University of Houston. Graur and his co-authors, who are among the leading geneticists in the world today, claim the Encode group made a Genetics 101 mistake: confusing biological activity with functional importance in the cell. “They completely exaggerated the amount of human DNA that has a role to play inside our cells. Most of the human genome is devoid of function and these people are wrong to say otherwise.”
“This is not the work of scientists,” he added, scathingly and ungenerously. “This is the work of a group of badly trained technicians.”
Birney immediately fired back. “The nature of the attacks against us is quite unfair and uncalled-for,” he said. “Our work has very important implications for understanding disease susceptibility.” The Encode project involved 442 researchers, based at 32 institutes around the world, and required 300 years of computer time and five years in the lab to get its results.
For perspective, let’s be clear that even Birney himself was chagrined by the media coverage in the days after the release of the Encode report last September. As he pointed out at the time on his blog Genome Informatician, what should have gotten the most attention—the publication of years of accumulated raw data for general use by the scientific community—was lost in all the hullabaloo of the hyped conclusion, which was never what the study was supposed to be about.
“The overall importance of consortia science can not be assessed until years after the data are assembled,” Birney wrote in his Nature article last September. “But reference data sets are repeatedly used by numerous scientists worldwide, often long after the consortium disbands.” As a result, the data production scientists, the real heavy lifters, got short shrift when credit was being distributed.
Birney also identified the elephant in the room: the wording of the claim that the genome is 80% functional. He wrote it was a real mistake, although he helped craft the original news release. Yes, there might be biological activity, but “functional”? Much of it was almost certainly just “biological noise,” he said—although this candor did not make its way into the news release that spurred thousands of overheated stories.
As PZ Myers, a respected biologist at the University of Minnesota-Morris, noted over the weekend on his popular Pharyngula blog, the Encode research consortium that claimed to have identified function in 80% of the genome actually discovered that a formula of 80% hype gets you the attention of the world press, a point he made pointedly in his analysis last fall. The “Encode delusion,” he called it. Within days of the Encode announcement, a US Circuit Court heard arguments challenging California’s warrantless DNA collection program based on the claim that most of our DNA is functional.
Myers, like most sentient geneticists (almost all of whom were ignored by the major media), was grateful for the raw data but underwhelmed by the overarching conclusion. It’s “patently ridiculous,” he wrote. “That isn’t function,” he said of what was identified. “That isn’t even close. And it’s a million light years away from ‘a critical role in controlling how our cells, tissue and organs behave’. All that says is that any one bit of DNA is going to have something bound to it at some point in some cell in the human body, or may even be transcribed. This isn’t just a loose and liberal definition of ‘function’, it’s an utterly useless one.”
What’s the real story? In a desire to create a neat narrative, the Encode team appeared to have bewitched themselves. Our DNA is incomprehensibly complex—like “opening a wiring closet and seeing a hairball of wires,” said Mark Gerstein, an Encode researcher from Yale University, last fall. “We tried to unravel this hairball and make it interpretable.” In their understandable zeal to ‘make things comprehensible,’ many key scientists in the project and most journalists stumbled badly.
This saga has yet to fully play out. More recriminations and nastiness (some of it over the top) undoubtedly lie ahead. But the biggest disappointments are the many science journalists, who by and large let their critical instincts lapse, exchanging the greyer and perhaps duller reality for a sensationalistic headline.
Jon Entine, executive director of the Genetic Literacy Project, is a senior fellow at the Center for Health & Risk Communication and STATS (Statistical Assessment Service) at George Mason University.