AncestryDNA Scientists Achieve Advancement in Human Genome Reconstruction

Posted by Julie Granka on December 16, 2014 in DNA, Science, Uncategorized

Passed down through the generations, fragments of the genomes of long-gone ancestors exist today in the genomes of their living descendants.

Those fragments can actually be used to recover parts of those ancestors’ genomes – without having to resort to some more morbid techniques for obtaining their DNA.  That means a potentially easier way for you to be able to trace your freckles, for example, back to a particular ancestor.

The AncestryDNA science team is excited to unveil new developments in the science of ancestral human genome reconstruction using genetic data of living people. Using an approach similar to reassembling a document that has been shredded, we have attributed an unprecedented proportion of a human genome to a 19th-century American and his two successive wives using the genome-wide genetic material of their descendants.

This scientific feat is a step forward in the use of consumer genetics in family history.

Diving deeper

Attributing pieces of DNA from living individuals to a particular ancestor requires both pedigree data and genetic data – and lots of both.  AncestryDNA, with over half a million DNA samples and over 60 million family trees, was thus in a unique position to attempt it.

For all pairs of individuals in the member database, AncestryDNA determines whether they “share” DNA; that is, whether they share a haplotype that is nearly identical, most likely because they both inherited it from a common ancestor.  Then, we leverage our database of member family trees to search for their shared ancestors.
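As a rough illustration of what “sharing” DNA means here, the sketch below scans two phased haplotypes for long runs of identical alleles. It is a toy, not AncestryDNA's matching algorithm: the function name `shared_runs`, the marker strings, and the `min_len` threshold are all assumptions made for this example. Real matching also has to tolerate genotyping and phasing errors (the “nearly identical” part), which this exact comparison ignores.

```python
def shared_runs(hap_a, hap_b, min_len):
    """Return (start, end) marker index pairs, end exclusive, for runs where
    the two haplotypes carry identical alleles at min_len or more
    consecutive markers."""
    runs = []
    start = None  # index where the current run of matching markers began
    for i, (a, b) in enumerate(zip(hap_a, hap_b)):
        if a == b:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that reaches the end of the shorter haplotype.
    n = min(len(hap_a), len(hap_b))
    if start is not None and n - start >= min_len:
        runs.append((start, n))
    return runs


# Hypothetical haplotypes that differ only at marker index 4.
print(shared_runs("AACGTTGCA", "AACGATGCA", min_len=4))
```

Raising `min_len` makes the toy stricter: with a threshold of 5, neither run in this example is long enough to count as shared.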

Recently, we released a new statistical algorithm that integrates these two analyses to identify groups of likely descendants of a particular ancestor.  These groups, called DNA Circles™, are sets of individuals who all share DNA with one another and all have a particular individual as their most recent common ancestor. To date, we have identified over 150,000 DNA Circles – connecting groups of descendants of over 150,000 distant ancestors.
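To make the definition concrete, here is a minimal sketch that checks the two criteria as literally stated above: members name the same ancestor in their trees, and every pair of members shares DNA. It is not the statistical algorithm behind DNA Circles, which is not described here. The function names, the `min_size` cutoff, and the sample data are hypothetical, and the sketch does not verify that the ancestor is the group's most recent common ancestor.

```python
from itertools import combinations


def is_fully_connected(members, dna_matches):
    """True if every pair of members appears in dna_matches,
    a set of frozenset({member_a, member_b}) pairs."""
    return all(frozenset(pair) in dna_matches
               for pair in combinations(members, 2))


def candidate_circles(tree_ancestors, dna_matches, min_size=3):
    """Group members by an ancestor named in their trees and keep the
    groups in which all pairs share DNA. min_size is an assumed cutoff."""
    by_ancestor = {}
    for member, ancestors in tree_ancestors.items():
        for ancestor in ancestors:
            by_ancestor.setdefault(ancestor, set()).add(member)
    return {
        ancestor: members
        for ancestor, members in by_ancestor.items()
        if len(members) >= min_size
        and is_fully_connected(sorted(members), dna_matches)
    }


# Hypothetical members, tree ancestors, and pairwise DNA matches.
trees = {
    "ann": {"ancestor_1"},
    "bob": {"ancestor_1"},
    "cal": {"ancestor_1", "ancestor_2"},
    "dee": {"ancestor_2"},
}
matches = {frozenset(p) for p in [("ann", "bob"), ("ann", "cal"), ("bob", "cal")]}
print(candidate_circles(trees, matches))
```

In this toy data only `ancestor_1` yields a group: three members name that ancestor and all three pairs match, while `ancestor_2` is named by only two members.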

As a proof of concept for genome reconstruction, we first used DNA Circles to identify a set of individuals whose pedigree and genetic evidence suggested that they were likely descendants of a man named David Speegle, born in the early 1800s in Alabama.

 

Image obtained from therestorationmovement.com

Why David Speegle?  With many children between his two successive marriages to wives Nancy and Winifred, Mr. Speegle and his spouses were excellent candidates for reconstruction.  That’s because having lots of children means having a large number of living descendants potentially carrying pieces of their DNA.

Validating the statistical methods underlying DNA Circles, Ancestry family historian Crista Cowan conducted extensive genealogical research, verifying that the individuals in David Speegle’s DNA Circle were indeed his descendants.  In the process, she also discovered how most of them were related to one another genealogically.

The science team then applied two different statistical models of genetic inheritance to these identified genetic and genealogical relationships.  They were able to infer, with a high degree of confidence, which parts of David, Nancy, and Winifred’s genomes were passed down to these descendants – thereby piecing together parts of their genomes.

Developed by the science team in-house, the first method used is a computational algorithm that efficiently identifies and stitches together chunks of DNA from a set of ancestors (like David, Winifred, and Nancy) given genetic data shared among their descendants. Unlike the first method, the second approach requires a pedigree linking all descendants and uses methods specifically designed in the academic research community for inferring inheritance of DNA in family trees.
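The “stitching” idea can be pictured with a much simpler operation: merging overlapping shared segments into contiguous blocks. The sketch below does only that. It is an illustrative stand-in rather than the in-house algorithm, and the `(chromosome, start, end)` tuple layout, the function name `stitch_segments`, and the coordinates are assumptions made for this example.

```python
def stitch_segments(segments):
    """Merge overlapping or touching (chrom, start, end) segments into
    contiguous blocks, returned sorted by chromosome and start."""
    merged = []
    for chrom, start, end in sorted(segments):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps or touches the previous block: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical segments shared among descendants.
segments = [(1, 150, 300), (1, 100, 200), (2, 100, 200), (1, 400, 500)]
print(stitch_segments(segments))
# [(1, 100, 300), (1, 400, 500), (2, 100, 200)]
```

Two descendants who each carry a partly overlapping piece of the same ancestral chromosome contribute one longer reconstructed block, which is the shredded-document picture in miniature.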

Our discoveries

Both methods showed strong agreement in identifying DNA segments that could be attributed to the Speegle family ancestors and used to piece together portions of their genomes. For example, they identified pieces of the genome indicating that David Speegle or one of his spouses may have had a version of a gene associated with a higher likelihood of male pattern baldness. And while Mr. Speegle likely did not pass down versions of the genes for darker hair or freckles to his descendants, he likely did pass along the version of a gene needed for blue eyes.

In addition to these selected traits, for nearly half of the length of the human genome the team was able to recover at least one of the genome copies carried by David, Nancy, and Winifred Speegle.  And, because David Speegle had two wives, for roughly 12% of the length of the genome the team was able to identify genomic material that likely belonged to David Speegle himself.
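One plausible reading of why the two marriages matter: a descendant of Nancy and a descendant of Winifred have only David as a common Speegle ancestor, so DNA they share points to him alone, whereas DNA shared within one wife's line could have come from either David or that wife. The sketch below encodes that reasoning. It assumes the descendants are related only through the Speegle family, and the function name and sample data are hypothetical rather than the team's actual method.

```python
def attribute_shared_segments(shared_segments, wife_line):
    """For each (descendant_a, descendant_b, segment) triple, list which
    ancestors could have contributed the shared segment."""
    results = []
    for desc_a, desc_b, segment in shared_segments:
        wife_a, wife_b = wife_line[desc_a], wife_line[desc_b]
        if wife_a != wife_b:
            # Different wives: David is the only common ancestor.
            candidates = {"David"}
        else:
            # Same wife: the segment could come from either spouse.
            candidates = {"David", wife_a}
        results.append((segment, candidates))
    return results


# Hypothetical descendants and the wife each one descends from.
wife_line = {"d1": "Nancy", "d2": "Nancy", "d3": "Winifred"}
shared = [
    ("d1", "d2", ("chr1", 100, 250)),
    ("d1", "d3", ("chr1", 400, 520)),
]
for segment, candidates in attribute_shared_segments(shared, wife_line):
    print(segment, sorted(candidates))
# ('chr1', 100, 250) ['David', 'Nancy']
# ('chr1', 400, 520) ['David']
```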

These proportions are remarkable considering the number of generations separating Speegle from his descendants – an average of over six generations (that’s your great-great-great-great-grandparent!).  The fact that the team could reach these numbers attests to the power of AncestryDNA’s massive dataset – power that will only increase as the dataset continues to grow.
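To see why six generations is a long way, recall the standard expectation that autosomal DNA halves with each generation, so a descendant is expected to carry about 1/2^g of the DNA of an ancestor g generations back. The short calculation below works through that arithmetic. The function names are illustrative, and the figures are averages that ignore the randomness of recombination, so any one descendant may carry more or less.

```python
def expected_fraction(generations):
    """Expected share of a descendant's autosomal DNA that comes from one
    ancestor the given number of generations back."""
    return 0.5 ** generations


def ancestors_at(generations):
    """Number of ancestor slots in the pedigree at that generation."""
    return 2 ** generations


for g in range(1, 7):
    print(g, ancestors_at(g), f"{expected_fraction(g):.4%}")
# The last line printed is: 6 64 1.5625%
```

At six generations each descendant is expected to carry only about 1.6% of the ancestor's DNA, which is why a large number of descendants is needed to cover a substantial part of the genome.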

Although we’re still refining our methods for reconstructing pieces of the genome of human ancestors from genetic material from their descendants, we’re excited about the implications of this research in genetic genealogy and in the genomics industry. Future insights gained may come in the form of tracing the source of particular traits in a population, reaching a better understanding of recent population history, and enabling more targeted genetic genealogy research.

The new DNA Circles experience and the genome reconstruction project are just a fraction of the AncestryDNA science team’s ongoing research efforts to further personalize findings from big data. By leveraging AncestryDNA’s continually expanding database of DNA samples paired with Ancestry family tree data, the team will continue to innovate to provide unique insights to both consumers and the scientific community – potentially even elucidating the genetic makeup of many more distant ancestors.

