Junk DNA and creationist lies

by Donald Prothero, Feb 27 2013

One of the common tropes you hear among modern creationists is the denial of the idea that there is any non-coding DNA, or “junk DNA.” To them, the idea that a large part of the genome is simply unread leftovers, carried along passively from generation to generation without doing anything, clearly contradicts the idea of an “Intelligent Designer.” So the Discovery Institute and numerous other creationist organizations that are actually sophisticated enough to recognize the issue (including Georgia Purdom of Ken Ham’s “Answers in Genesis” organization) keep spreading propaganda that “junk DNA is a myth” or “every bit of DNA has function, even if we don’t know what it is.” Moonie Jonathan Wells, who has written crummy books misinterpreting fossils and embryology, wrote a whole book denying the subject, even though he hasn’t done any research in molecular biology since 1994. Do they actually do any research to explore this topic, or try to test the hypothesis that all DNA is functional? No, their labs and their “research” are not that sophisticated. Instead, their entire output on the topic (just as on every other topic) is based on cherry-picking statements from the work of legitimate scientists, quote-mined to distort the meaning of the original scientific publication. Either they don’t understand what they are reading and their confirmation bias filters screen out all but a few words that seem to agree with them, or they are consciously lying and distorting the evidence, or both.

First, some background on the issue. Before 1966, nearly all biologists were “panselectionists,” convinced that every part of an organism was under the constant scrutiny of natural selection, even if we couldn’t detect it. Even in the 1970s, I had professors in evolutionary biology at Columbia University who were hard-core panselectionists, and could not imagine the possibility that nature was not very efficient, but could carry along structures from one generation to the next that either had no function, or were suboptimal in their function (as Stephen Jay Gould had been advocating). But as early as 1966, Lewontin and Hubby (using the then-new technique of gel electrophoresis, which has long since been replaced by direct DNA sequencing) showed that the variability in the protein sequences in many organisms was far in excess of what was needed to explain their anatomical complexity, and that there was no correlation between complexity and genetic information: there were simple worms with huge genomes, and complex organisms with small genomes.

I vividly remember the next step in the debate, because it was raging when I was in grad school in the 1970s. As soon as the genetic code (the nucleotide sequences that specify each amino acid in a protein) was deciphered, it became apparent that a large number of mutations could be “silent.” Typically, in the three-nucleotide “codon” that specifies a given amino acid, the first two “letters” (nucleotides) determine which amino acid will be added to the protein, but the nucleotide in the third position can often be any of the four possibilities (adenine, guanine, cytosine, and thymine or uracil) and it makes no difference: the same amino acid, and therefore the same protein, results. Thus, a mutation in the third position is usually invisible to natural selection, and could randomly change from one condition to another without any phenotypic effect. Also, at this time scientists were first discovering evidence of “molecular clocks,” which would only work if a large portion of the genome were not under the supervision of natural selection, but operating like a ticking “clock,” randomly changing without any external modification. By the late 1970s, “neutralism” (the idea that a lot of the genome was selectively neutral) was all the rage in evolutionary biology, spearheaded by Motoo Kimura’s book on the topic.

The genetic code. The first two “letters” in the triplet sequence are normally sufficient to specify a particular amino acid; changes in the third “letter” usually make no difference to the resulting protein, and thus are selectively neutral.
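
For readers who want to check the third-position claim for themselves, here is a minimal Python sketch. It assumes the standard genetic code, written with DNA letters (T standing in for RNA's U), and treats the three stop codons as a single "stop" meaning. The function names are invented for this illustration and do not come from any library or from the studies cited here.

```python
from itertools import product

# Standard genetic code, written with DNA letters (T in place of RNA's U).
# Codons are enumerated with the third base varying fastest, in the order T, C, A, G.
# Each character of AMINO_ACIDS is a one-letter amino acid code; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_position_outcomes(prefix):
    """Return the set of amino acids (or stop) reachable by varying the third base."""
    return {CODON_TABLE[prefix + base] for base in BASES}


def fully_degenerate_prefixes():
    """List the two-letter prefixes for which the third base makes no difference."""
    prefixes = ("".join(pair) for pair in product(BASES, repeat=2))
    return [p for p in prefixes if len(third_position_outcomes(p)) == 1]


def silent_third_position_changes():
    """Count third-position substitutions that leave the codon's meaning unchanged."""
    silent = total = 0
    for codon, aa in CODON_TABLE.items():
        for base in BASES:
            if base == codon[2]:
                continue  # not a change
            total += 1
            if CODON_TABLE[codon[:2] + base] == aa:
                silent += 1
    return silent, total


if __name__ == "__main__":
    print(third_position_outcomes("GG"))  # glycine whatever the third base is
    print(third_position_outcomes("AT"))  # isoleucine, or methionine for ATG
    print(len(fully_degenerate_prefixes()), "of 16 prefixes are fully degenerate")
    silent, total = silent_third_position_changes()
    print(f"{silent} of {total} third-position changes are silent")
```

Under the standard code, 8 of the 16 two-letter prefixes are fully degenerate, and 128 of the 192 possible third-position substitutions (two thirds) are silent. That is why "usually" rather than "always" is the right word for third-position neutrality.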

Fast-forward to the present day, and as more and more molecular studies have been undertaken, more and more evidence of non-coding DNA has been discovered. These discoveries come from three major lines of evidence. First, as Lewontin and Hubby noticed almost 50 years ago, there is no correlation between the amount of genetic material and the complexity of the organism. As one textbook put it:

The DNA content per cell also varies considerably among closely related species. All insects or all amphibians would appear to be similarly complex, but the amount of haploid DNA in species within each of these phyla varies by a factor of 100. The same variation in DNA content per cell is common within groups of plants that have similar structures and life cycles. For example, the broad bean contains about three to four times as much DNA per cell as the kidney bean.

Other examples abound: one species of deer has 20% more DNA than its closest relative. One species of fugu (pufferfish) even has almost 100x as much DNA as another. Clearly, since there are no significant differences between such pairs of species that could account for so much variability in DNA content, much of it must be junk. This is one line of evidence that the creationists never address, since they have no functional explanation for it (and neither does any legitimate biologist).

Second, it’s possible to delete some of this repetitive non-coding junk sequence and nothing happens. In 2004, Nóbrega et al. deleted more than two megabases of the mouse genome that appeared to be non-coding, and the mice continued to reproduce with no ill effects. If this DNA were functional, how could the mice keep on reproducing without it?

But the most convincing evidence has come from our greatly improved understanding of entire DNA sequences, and what they consist of. Creationists play this game of “we just don’t know what most of the DNA codes for, so we can’t rule out that it has a function,” but this has been a lie for decades. There are many different kinds of DNA sequences that are clearly non-functional. These include: 1) pseudogenes, which look like they were once functional DNA, but have lost the ability to be expressed; 2) transposons, or “jumping genes,” which can jump from one part of the DNA to another and yet are not expressed; 3) SINEs (short interspersed nuclear elements) and LINEs (long interspersed nuclear elements), which are segments of DNA stuck in the middle of a coding sequence that have no function or ability to code for proteins; 4) highly conserved non-coding non-essential DNA, which is very consistent in the sequences of many organisms, suggesting that it is important, yet can be removed with no effect whatsoever (Westphal, 2004); and 5) repetitive DNA, which repeats the same sequence over and over again hundreds of times, and none of this repetitive genome seems to code for anything. Perhaps the most interesting and surprising of all of these junk sequences are endogenous retroviruses (ERVs). These are gene sequences of retroviruses that once infected us by inserting their DNA into our genome, but are no longer active. Instead, every time one of our cells divides we make new copies of this “fossil DNA” from a long-ago viral infection and carry it on through millions of generations. Clearly, these “DNA fossils” hiding in our genome no longer code for the virus, or for anything else; they are “junk” that we passively carry around with no ill effects. In the few cases where the retroviral DNA can be re-activated, it usually causes disease or other bad effects. Either way, it is of no comfort to creationists.

Naturally, creationists have trouble addressing all this evidence, and instead cling to any bit of biology that seems to support their belief system. In 2012, the ENCODE Consortium studies were published, which made a big splash because they posited that maybe 80% of the genome did produce some kind of protein. Naturally, both the media and creationists jumped on this to confirm their belief that all of the DNA was functional. But this still concedes that at least 20% of the DNA is clearly non-coding, which is no comfort to creationists and the “intelligent design” idea. The creationists haven’t noticed this, and instead proclaimed that they had been vindicated. Sorry, but it turned out that the ENCODE studies were too good to be true. Graur et al. (2013) have just published a paper that completely demolishes their work, and reaffirms that indeed most of the genome (at least 90% of it, perhaps as much as 98%) is non-coding. P.Z. Myers goes over the study in detail. The salient point is that all the ENCODE study managed to show is that some of the genome called “junk” codes for a protein. What they didn’t show is whether these random isolated proteins actually are part of a functional biochemical pathway, or lead to phenotypic consequences. In fact, if a protein results from “junk DNA” but doesn’t do anything, it’s still “junk.”

But if I’ve learned anything from the battle with creationists, not only will they continue to misinterpret the ENCODE Consortium studies to argue that all DNA is functional, but they will ignore the debunking by Graur et al. (2013) and all future studies with the same effect. Creationists have consistently demonstrated that their confirmation bias filters are very strong, and their habit of cherry-picking and distorting the meaning of real scientific studies is part of their DNA.



  • Graur, D., Zheng, Y., Price, N., Azevedo, R.B.R., Zufall, R.A., and Elhaik, E. 2013. On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE. Genome Biology and Evolution (published online).
  • Nóbrega, M.A., Zhu, Y., Plajzer-Frick, I., Afzal, V., and Rubin, E.M. 2004. Megabase deletions of gene deserts result in viable mice. Nature 431:988-993.
  • Westphal, S.P. 2004. Life still goes on without ‘vital’ DNA. New Scientist No. 2450, p. 18.
17 Responses to “Junk DNA and creationist lies”

  1. Luara says:

    The scientific theories of religion are hedged around with the delicacy people show towards others’ religious beliefs. That is a lot of what helps them to persist.
    There’s also a lot of fringe science that isn’t religious, like alien intervention in our evolution.
    Someone commented, this might come from popularization of science: people read handwaving popularizations of science; they take science to be the handwaving popularization; so they believe any-old handwaving theory is credible.
    People don’t have a meta-understanding of science, in other words; they don’t understand the scientific process.

  2. Crabe says:

    “The first two “letters” in the triplet sequence normally is sufficient to specify a particular protein; changes in the third “letter” make no difference in protein, and thus are selectively neutral.”

    About the third letter of the triplet sequence, the system is still rather optimal: 20 “standard” amino acids are commonly expressed by living cells. If you try to code them with two base “letters” you only have 4×4=16 possibilities and no start and stop codons. The third base “letter” is required, even if it allows the coding of 64 possibilities, which is far too many.
    You can argue that in general the differences in chemical properties between amino acids whose codons differ only in the third letter are small, but still there is a difference, and in the case of the stop codon, you can switch from a normal protein to a truncated inactive one.

    Concerning the non-coding DNA, part of it has a structural role too, and perhaps some of it also regulates the expression of coding sequences.

    You are right in your arguments, but some of your examples are flawed here.

    PS I am not working in genetics, but concerning the triplet, it is basic mathematics.

  3. Michael Finfer, MD says:

    The Encode results, along with some other new data, suggest what I think is a rather elegant answer to the question of how entirely new genes evolve. Since most of the genome, including regions that have no function, is being transcribed, then what might happen is that a transcript acquires a function by mutation. If that function is beneficial, then natural selection will refine it, eventually producing a gene. There is now some evidence that this happens much more frequently than had been assumed in the past.

  4. Sarah says:

    You don’t quite understand the ENCODE project. They didn’t find that 80% of the genome produces protein. They said that 80% of the genome is FUNCTIONAL. There is a difference. They found a lot more pervasive transcription, but they didn’t provide evidence that those transcribed segments are also translated. There are a lot of DNA sequences that are functional but non-protein coding (microRNAs, snRNA, rRNA to name a few) and in some cases you can make arguments that regulatory sequences are also functional. In fact the rebuttal paper gives the example of the TATA box, which is a DNA element that is not transcribed but binds transcription factors.

    There are a couple arguments to be made about the ENCODE project:
    1) is the data correct and unbiased? I’ve read the paper, but only glanced so far at the rebuttal, but there is an expectation that authors will report on the data itself, and that data is data. In order to be biased they would have had to expect that there would be that much extra transcription and DNA that bound TF. However if their methodology used was not the correct methodology, that is a useful critique (the rebuttal paper correctly identifies a number of problems with their methods, like the use of pluripotent stem cells)

    2) is the data real? Is it an artifact of the system and methodology? Can it be replicated by outside labs using different cells and tissues? The problem is that most labs don’t have the funding and facilities for large projects like this, which is why ENCODE is a consortium, involving a large collaboration. What other labs can do is try to replicate smaller portions of what ENCODE found.

    3) is the data meaningful? This is a large part of what the rebuttal paper was discussing, which is the meaning of the word “functional”. Because simply being transcribed does not necessarily make it functional. There is the case of pervasive transcription: the cell is not perfect, and transcription can occur but the transcript may be degraded and ultimately recycled. And even if it is made into a protein, if the protein is ultimately degraded without performing a function, is it still functional? There are a lot of ways the cell regulates the transcription and translation of RNA and protein.

    I personally think that the ENCODE project was very interesting, and I hate the fact that the creationists use it as a way of promoting Intelligent Design. But it sparked a lot of discussion, conversation and research. And that’s what research is supposed to be for! There was a time, back in the day, when researchers would publish on rather hypothetical work. And sometimes they would be wrong, and sometimes right. But it was important to acknowledge that even wrong data has a place and purpose in science, if only to help point us down the paths that will lead to new and more correct hypotheses.

  5. The creationists exploit increased complexity in our understanding of DNA. Some features that were thought to be non-functioning might have a function, but that does not mean that all junk DNA has a function. Donald points out compelling lines of evidence for why not.

    One thing not mentioned is that recent research has found that some “silent” mutations – changes to the third nucleotide that do not change the amino acid coded for – can affect transcription rates and how much of the protein is made, and this can have phenotypic effects. So silent mutations are not always entirely silent.

    But this is eating around the edges of junk DNA, not refuting the idea, as creationists would like to believe.

  6. Mike G says:

    As always, though, creationists are successfully able to exploit research because the vast majority of their audience has little, if any, science literacy. Because of jargon and the lack of plain language (depending on the source), many would not know what to do with accurate information about DNA, genomes, transcription, ribosomes etc. even if they had it.
    The audience for creationist literature depends on someone who at least appears to know what they’re talking about. As the saying goes, “In the land of the blind, the one-eyed man is king.”

    • RCAF says:

      They can exploit it the same way they do with the Bible – their adherents rarely read the source material. Of those that do, few seem to understand what they are reading.

      • tmac57 says:

        Robert Price likes to say that they use the Bible as a ventriloquist’s dummy. They make it say what they want it to say.

  7. Jack S says:

    As a non-Creationist but a computer programmer, I see most ‘Junk DNA’ as being ‘not-Junk’ but ‘data’ or ‘inactive code’. In fact, I would not at all be surprised to find that there is no such thing as ‘junk-DNA’.

    I see ‘inactive DNA’ as being used as one-off code, much as a program might have an ‘initialisation’ or ‘banner’ function, something which is used just once when the program starts, then never again, no matter how long the program runs. Similarly for ‘data DNA’, it may be used just once to specify how something is to be shaped or constructed, then never again.

    • Cobus van Eeden says:

      Jack, I would say the Junk DNA is much more like a function in your code which you used to call, but because it was buggy and too complex to debug it was rewritten but since nobody knew if it was still being used it was left in the source just in case, while in reality nobody calls it anymore and it is entirely unused, yet we still maintain it in source control and we still back it up and perhaps even document it.

      If it was used, even once, it would still be functional.

      Large portions of our DNA are entirely non-functional. Telomeres are a pretty obvious example of this. Telomeres have an obvious function but the number of repetitions can vary greatly, essentially without affecting any chromosome function. Excessively long telomeres represent a good and very well understood repetition example.

  8. Gail H says:

    Lay person here, and I hope you all don’t mind an ignorant question. The discovery that great swathes of the genome are nonfunctional or noncoding is fascinating to me, but not too surprising, and when thinking about how it could happen, I have a homely analogy: It’s like comments in computer code. I’m no more of a programmer than I am a geneticist, but I have occasion to write scripts in Matlab, Stata and other computational languages. And my scripts are loaded with comments, not only to help me read the program, but to tweak the script to do something different by commenting out (or in) chunks of code. One question is, why do believers in ID insist that every bit of the genome is functional? My computer code is loaded with nonfunctional statements, but it’s clearly a product of design, intelligent or not. I don’t understand why ID advocates (of which I am not one) burden themselves with the hypothesis that all code is functional when such a hypothesis doesn’t seem to be necessary. Especially when that hypothesis, as other commenters have noted, is testable. Is there some requirement by ID folks that intelligence in design requires perfection or elegance?

    • Luara says:

      So if DNA is commenting, what are its comments ;)
      God is humming to his/her self?
      composing poetry on the margins of the book of life?
      “why do believers in ID insist that every bit of the genome is functional”
      Why does anyone try to fulfill an emotional need by coming up with a theory about biology? In other words, why ID at all?
      It seems an obvious error to conflate one’s emotional needs and the origins and evolution of life.
      ID isn’t really a theory in other words, it’s rationalized wishful thinking. So, no reason to expect it to make sense in the way you suggest.

  9. ttaerum says:

    As always, I am puzzled at what creationists and evolutionists consider to be evidence in favor of their positions. I throw the dice and it, being random, comes up random. Was there an intelligent being behind the throwing of the dice? Some might reasonably argue there was not. :-)

    What makes the “debate” particularly discomforting is when the side I’d like to root for uses arguments based on waggles, codons, and short repeating sequences while ignoring start codons and mitoDNA conservation. Why does the conversation get bogged down on introns when the biochemical process of replication and transcription is enough to boggle and amaze the mere mortal mind? It is so astonishing that I stand speechless – in awe. And yet, we debate the equivalent of, “how many random angels can stand on the head of a codon”.

  10. Mike G says:

    Not that I’m particularly inclined to look this up, but I do wonder what creationists make of the fact that roughly 8% of our DNA is from ancient retroviral germ cell infections. I don’t know what’s particularly intelligent about carrying around useless viral DNA.

  11. S DuBois says:

    @Mike G

    Totally off topic, but do you find HERVs incredibly creepy? The ancient husks of diseases that managed to not quite kill us. Hitching a ride with us while they slowly rot away into background noise in the genome. Perhaps an as yet undiscovered HERV could activate and kill us all. We would have lost all resistance to it in the years since it was last active.

    • Mike G says:

      Ha! Yeah, no doubt about it being creepy. Makes me think of shingles, which results from reactivated chicken pox viruses that lie dormant in nerve root ganglia. It’s not known what makes them go dormant or what wakes them up again.

  12. Bill Rabara says:

    The creationist propagandists focus on technical scientific data that requires a base knowledge that their target audience lacks, e.g. genetics, radioactive decay, lithification, etc. They avoid at all costs talking about the horrifying reality of the geologic column, i.e., the oldest rocks show no life, slightly younger ones only bacteria, a tad younger, primitive plants, then jawless fish, all the way up to when you finally find dinosaurs and still no mammals, then you finally get to some mammals, then eventually some apes, a little shallower some apish things, then finally humans. This evidence when fully presented is conclusive and requires virtually no technical knowledge and only minimal intelligence.