Authors: Beth Shapiro
METHODS IN MOLECULAR BIOLOGY™
John M. Walker
School of Life Sciences
University of Hertfordshire
Hatfield, Hertfordshire, AL10 9AB, UK
For further volumes:
Methods and Protocols
Department of Ecology and Evolutionary Biology, University of California Santa Cruz, A414 Earth & Marine Sciences, Santa Cruz, CA 95064, USA
Michael Hofreiter
Department of Biology, The University of York, Wentworth Way, Heslington, York YO10 5DD, UK
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2011944024
© Springer Science+Business Media, LLC 2012
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Humana Press, c/o Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed on acid-free paper
Humana Press is part of Springer Science+Business Media (www.springer.com)
Research in ancient DNA began more than 25 years ago with the publication of short mitochondrial DNA sequence fragments from the quagga, an extinct subspecies of the plains zebra. This publication was soon followed by a study reporting a 3.4 kilobase sequence of human nuclear DNA from an Egyptian mummy. Although today many researchers believe this latter finding was the result of contamination with modern DNA, it nevertheless had substantial influence on the early phase of ancient DNA research. Despite the attention received by these early studies, research on ancient DNA only really gained momentum after the invention of the polymerase chain reaction, or PCR. This technology suddenly allowed millions of copies to be made of the few remaining ancient DNA molecules that in fortunate circumstances were preserved in fossils and museum specimens. In fact, without the invention of PCR, it is unlikely that ancient DNA research would ever have resulted in more than a few reports of short DNA fragments with little biological significance.
The use of PCR in ancient DNA research has been a double-edged sword. It has not only made possible many interesting studies, but has also facilitated the publication of some spectacularly wrong results. The best-known example of this is probably the publication of presumed dinosaur DNA sequences, which were later shown to be derived from modern human contamination. Presumed ancient DNA sequences were also reported from insects and plants embedded in pieces of amber and from waterlogged plant fossils that were many millions of years old. Today, all of these are assumed to have been the result of contamination of samples, reagents, or experiments with modern DNA. These false positive results, which at the time were mostly published in high-profile journals, damaged the scientific reputation of the field, and it has taken many years to recover from this damage.
To some extent, these spectacular failures obscured the many sound, albeit less dazzling, studies that were published at the same time. The first Pleistocene-age DNA sequences from mammoth and cave bears were reported in 1994, and the first attempt to determine the phylogenetic position of the extinct moa within ratite birds was published in 1992. The potential of ancient DNA to investigate temporal changes in genetic diversity in populations was recognized even earlier: the first study, albeit only spanning a temporal period of approximately 70 years, was published in 1990. This was followed some years later by a study of European rabbits that extended the time frame for population genetics using ancient DNA to the Pleistocene/Holocene boundary, some 10,000 years ago.
For the next 10 years, the field of ancient DNA saw steady progress with regard to the age and type of samples used, the length of sequence analyzed, and the number of specimens included. In 2000, the first population study using Pleistocene-age DNA was published. This study, which focused on brown bears in Alaska, was important in that it showed that long-held beliefs regarding the evolution and establishment of modern phylogeographic patterns (the spatial structure of genetic diversity in a species) were incorrect. This work had a profound influence on the understanding of long-term population dynamics and dispersals during the Pleistocene and Holocene, and was followed by numerous studies showing that populations are far more dynamic units than previously assumed.
Only a year later, the first complete mitochondrial genomes of an extinct species were published independently by two research groups working on moa. These studies showed that despite the fragmented and damaged nature of ancient DNA molecules, it is possible to obtain longer DNA sequences from at least some ancient samples.
In parallel to the overall increase in length of the ancient DNA sequences obtained, the field also saw a significant increase in the age of the samples from which DNA sequences could be retrieved. Although, as noted above, all the extreme claims of million-year-old DNA were later shown to be false positives, the age of truly endogenous ancient DNA sequences was increasing considerably. The only authenticated ancient DNA sequences from the pre-PCR era, those of the quagga, were only 140 years old. Soon after the introduction of PCR, maize sequences about 1,000 years old were reported in 1988, and by 1994 the oldest authentic DNA sequences dated to 40,000 years. At the time of writing, the oldest published sequences come from a Greenland ice core and date to at least 500,000 years.
Overall, over the lifetime of ancient DNA as a research field, the age of the investigated sequences has increased by more than three orders of magnitude.
Finally, the types of substrates used for ancient DNA extraction have also broadened tremendously. The first ancient DNA studies used soft tissue, building on the assumption that because tissues such as muscle contain a lot of DNA in living organisms, they should also retain more DNA than other, less DNA-rich tissues. As with many assumptions made about ancient DNA, this proved to be false. The first ancient DNA sequences isolated from bone were reported in 1989, and, as it turned out, ancient bone contains on average much more DNA than ancient soft tissue, even though bone in the living organism contains much less DNA. Bone appears to preserve DNA much better than soft tissue, presumably because DNA adheres to the bone hydroxyapatite, and part of the DNA may even be preserved inside small hydroxyapatite crystals where it is protected from degradation. For almost 10 years, researchers concentrated mostly on bone as a source of ancient DNA, not only because it preserves DNA quite well, but also because it is rather abundant in the fossil record. In 1998, another, more unusual source of ancient DNA was opened up: coprolites, or subfossil faeces, which are found most often in cave sites in dry areas, especially in south-western North America. Since then, the variety of ancient DNA sources has increased steadily, with hair in 2001, packrat middens in 2002, sediment in 2003, feathers in 2009 and, most recently, eggshells in 2010. Thus, it is probably fair to say that most available substrates have by now been probed for ancient DNA and almost all yield DNA at least occasionally.
All the progress described above was mainly driven by the invention of and subsequent modifications to PCR. However, in 2005, a second revolution in ancient DNA research began with the introduction of the first of many so-called next-generation sequencing (NGS) technologies. The first generation of NGS machines resulted in an approximately 300-fold increase in DNA sequence throughput compared to traditional Sanger sequencing. Since then, DNA sequence throughput of NGS technologies has increased by another four orders of magnitude. Similar to PCR, these new technologies were rapidly adopted by the ancient DNA research community, and the first publication reporting ancient DNA sequences obtained by NGS appeared only a few months after the technology itself had been published. Although this first publication was a mere proof-of-principle study, as it reported "only" 13 million base pairs of mammoth nuclear DNA, it paved the way for more ambitious projects. Thus, in 2008, the first low-coverage (0.8-fold) draft genome of an extinct species, the mammoth, was published, and in 2010, the first high-coverage (20-fold) ancient human genome, obtained from the hair of a 4,000-year-old Palaeo-Eskimo, was released. This was followed by 1.3- and 1.9-fold coverage genomes of Neanderthals and another, previously unrecognized hominid from Denisova Cave in Siberia.
NGS has done more than allow genomes to be sequenced from ancient remains. It has also enabled the reconstruction of multiple complete ancient mitochondrial genomes, either via shotgun sequencing or in combination with multiplex PCR or hybridization capture approaches. Multiple (up to 30) complete or almost complete mitochondrial genomes have been obtained for cave bears, mammoths, and Neanderthals, and smaller numbers of mtDNA genomes have been obtained from ancient remains of other species including mastodon, short-faced bear, aurochs, Tasmanian tiger, and polar bear, and also from fossils of anatomically modern humans.
While the inventions of PCR and NGS clearly mark the two major revolutions in ancient DNA research thus far, progress has also been made in many smaller steps, including improved DNA extraction techniques, modifications to the PCR such as two-step multiplex PCR, and analytical approaches facilitating the analysis of time-structured data.
Progress in ancient DNA research has been inherently technology-driven. It may therefore come as a surprise that, despite the importance of appropriate methodological approaches in ancient DNA research, no publication exists so far that summarizes current approaches toward the retrieval and analysis of ancient DNA sequences. This book attempts to close this gap. The chapters that follow describe a wide range of technologies, beginning with guidelines for the setup of an ancient DNA laboratory, continuing with extraction protocols for a wide range of different substrates and instructions for PCR and NGS library preparation, and finally suggesting appropriate analytical approaches for making sense of the sequences obtained. The chapters are written in a protocol-like style to make them accessible for everyday use in the lab. In addition, several chapters describe case studies linked to a protocol that illustrate what can actually be done using the described approaches. Given these comprehensive yet easily accessible protocols and illustrative case studies, we hope this book will be an interesting and useful source of information for beginners and experienced researchers in ancient DNA alike.