Passed down through the generations, fragments of the genomes of long-gone ancestors exist today in the genomes of their living descendants.
Those fragments can be used to recover parts of those ancestors’ genomes – without resorting to more morbid techniques for obtaining their DNA. That could make it easier to trace your freckles, for example, back to a particular ancestor.
The AncestryDNA science team is excited to unveil new developments in the science of ancestral human genome reconstruction using genetic data of living people. Using an approach similar to reassembling a shredded document, we have attributed an unprecedented proportion of a human genome to a 19th-century American and his two successive wives using the genome-wide genetic material of their descendants.
This scientific feat is a step forward in the use of consumer genetics in family history.
Attributing pieces of DNA from living individuals to a particular ancestor requires both pedigree data and genetic data – and lots of each. AncestryDNA, with over half a million DNA samples and over 60 million family trees, was thus in a unique position to attempt it.
For all pairs of individuals in the member database, AncestryDNA determines whether they “share” DNA; that is, whether they share a nearly identical haplotype, likely because they both inherited it from a common ancestor. Then, we leverage our database of member family trees to search for their shared ancestors.
Recently, we released a new statistical algorithm that integrates these two analyses to identify groups of likely descendants of a particular ancestor. These groups, called DNA Circles™, are sets of individuals who all share DNA with one another and all have a particular individual as their most recent common ancestor. To date, we have identified over 150,000 DNA Circles – connecting groups of descendants of over 150,000 distant ancestors.
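To make the definition of a DNA Circle concrete, here is a minimal Python sketch of the two conditions described above: every pair of members shares DNA, and every member's tree contains the ancestor in question. It is a toy illustration only, not AncestryDNA's statistical algorithm. The function name, the data layout, and the member names are all hypothetical, and the check for "most recent" common ancestor is simplified to the ancestor merely appearing in each tree.

```python
from itertools import combinations


def is_candidate_circle(members, sharing_pairs, ancestors_by_member, ancestor):
    """Toy check (hypothetical): do `members` form a circle around `ancestor`?

    members             -- list of member identifiers
    sharing_pairs       -- set of frozensets, one per pair detected as sharing DNA
    ancestors_by_member -- dict mapping each member to the set of ancestors in their tree
    ancestor            -- the candidate shared ancestor
    """
    # Condition 1: every pair of members shares DNA with one another.
    all_pairs_share = all(
        frozenset(pair) in sharing_pairs for pair in combinations(members, 2)
    )
    # Condition 2 (simplified): the ancestor appears in every member's tree.
    all_have_ancestor = all(
        ancestor in ancestors_by_member.get(member, set()) for member in members
    )
    return all_pairs_share and all_have_ancestor


# Made-up example data: three members who all share DNA, plus one who does not.
sharing_pairs = {
    frozenset(pair) for pair in [("ann", "bob"), ("ann", "cy"), ("bob", "cy")]
}
ancestors_by_member = {
    "ann": {"David Speegle"},
    "bob": {"David Speegle"},
    "cy": {"David Speegle"},
    "dee": {"David Speegle"},
}

print(is_candidate_circle(["ann", "bob", "cy"], sharing_pairs,
                          ancestors_by_member, "David Speegle"))
print(is_candidate_circle(["ann", "bob", "cy", "dee"], sharing_pairs,
                          ancestors_by_member, "David Speegle"))
```

In this toy data, "dee" lists the same ancestor but shares no detected DNA with the others, so adding her breaks the first condition. A real method would have to tolerate missing or noisy matches rather than demand a strict all-pairs rule.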
To test genome reconstruction in practice, we first used DNA Circles to identify a set of individuals whose pedigree and genetic evidence suggested that they were likely descendants of a man named David Speegle, born in the early 1800s in Alabama.
Why David Speegle? With many children between his two successive marriages to wives Nancy and Winifred, Mr. Speegle and his spouses were excellent candidates for reconstruction. That’s because having lots of children means having a large number of living descendants potentially carrying pieces of their DNA.
Validating the statistical methods underlying DNA Circles, Ancestry family historian Crista Cowan conducted extensive genealogical research, verifying that the individuals in David Speegle’s DNA Circle were indeed his descendants. In the process, she also discovered how most of them were related to one another genealogically.
The science team then applied two different statistical models of genetic inheritance to these identified genetic and genealogical relationships. They were able to infer, with a high degree of confidence, which parts of David, Nancy, and Winifred’s genomes were passed down to these descendants – thereby piecing together parts of their genomes.
Developed by the science team in-house, the first method used is a computational algorithm that efficiently identifies and stitches together chunks of DNA from a set of ancestors (like David, Winifred, and Nancy) given genetic data shared among their descendants. Unlike the first method, the second approach requires a pedigree linking all descendants and uses methods specifically designed in the academic research community for inferring inheritance of DNA in family trees.
Both methods showed strong agreement in identifying DNA segments that could be attributed to the Speegle family ancestors and used to piece together portions of their genomes. For example, they identified pieces of the genome indicating that David Speegle or one of his spouses may have had a version of a gene associated with a higher likelihood of male pattern baldness. And while Mr. Speegle likely did not pass down versions of the genes for darker hair or freckles to his descendants, he likely did pass along the version of a gene needed for blue eyes.
In addition to these selected traits, for nearly half of the length of the human genome the team was able to find representation of at least one of the copies of the genomes of David, Nancy, and Winifred Speegle. And, because David Speegle had two wives, for roughly 12% of the length of the genome the team was able to identify genomic material that likely belonged to David Speegle himself: DNA shared between descendants of both marriages most plausibly came from him, the ancestor those two lines have in common.
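A proportion such as "nearly half of the length of the genome" can be pictured as the total length of the attributed segments, with overlaps counted once, divided by the genome length. The Python sketch below illustrates that arithmetic under simplifying assumptions: a single linear coordinate instead of separate chromosomes, half-open `(start, end)` segments, and made-up numbers. The function names are hypothetical, and this is not the team's actual pipeline.

```python
def merge_intervals(segments):
    """Merge overlapping or touching (start, end) segments into disjoint ones."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous segment: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(start, end) for start, end in merged]


def coverage_fraction(segments, genome_length):
    """Fraction of the genome covered by at least one segment."""
    covered = sum(end - start for start, end in merge_intervals(segments))
    return covered / genome_length


# Made-up segments on a toy "genome" of length 100.
attributed_segments = [(0, 30), (20, 50), (70, 80)]
print(merge_intervals(attributed_segments))
print(coverage_fraction(attributed_segments, 100))
```

Merging first matters because segments recovered from different descendants often overlap, and summing their raw lengths would overstate the proportion covered.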
These proportions are remarkable considering the number of generations separating Speegle from his descendants – an average of over six generations (that’s your great-great-great-great-grandparent!). The fact that the team could reach these numbers attests to the power of AncestryDNA’s massive dataset – and to its potential as the dataset continues to grow.
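To see why six generations is a long way, recall that a child inherits half of their autosomal DNA from each parent, so the expected share from any one ancestor halves with every generation. The short Python sketch below works through that standard rule of thumb. It gives only an average (actual inheritance varies from person to person because of recombination), and the function name is a hypothetical one chosen for illustration.

```python
def expected_fraction(generations):
    """Expected share of autosomal DNA inherited from one ancestor
    who is `generations` steps back (1 = parent, 2 = grandparent, ...)."""
    return 0.5 ** generations


# Print the expected share for each step back to six generations.
for g in range(1, 7):
    print(g, expected_fraction(g))
```

At six generations the expected share is 1/64, or under 2% per descendant, which is why pooling segments across many descendants is needed to cover a large part of an ancestor's genome.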
Although we’re still refining our methods for reconstructing pieces of the genome of human ancestors from genetic material from their descendants, we’re excited about the implications of this research in genetic genealogy and in the genomics industry. Future insights gained may come in the form of tracing the source of particular traits in a population, reaching a better understanding of recent population history, and enabling more targeted genetic genealogy research.
The new DNA Circles experience and the genome reconstruction project are just a fraction of the AncestryDNA science team’s ongoing research efforts to further personalize findings from big data. By leveraging AncestryDNA’s continually expanding database of DNA samples paired with Ancestry family tree data, the team will continue to innovate to provide unique insights to both consumers and the scientific community – potentially even elucidating the genetic makeup of many more distant ancestors.