Tech Roots » Science
http://blogs.ancestry.com/techroots
Ancestry.com Tech Roots Blogs
Mon, 14 Apr 2014 16:49:18 +0000

AncestryDNA Regions by the Numbers
http://blogs.ancestry.com/techroots/ancestrydna-did-you-know/
Tue, 25 Mar 2014 22:37:20 +0000
Julie Granka

The post AncestryDNA Regions by the Numbers appeared first on Tech Roots.

Since May of 2012, when we first released AncestryDNA, we’ve returned results to over a quarter of a million customers.

Based on feedback that we have received, those 300,000 customers have learned a great deal about their family history – their deep ancestral origins and their genetic relatives.

As it turns out, AncestryDNA has also learned a great deal from our customers.  We’ve uncovered some interesting statistics about ethnicity estimates that may help you to learn a bit more about your own family history – and we’ll share them with you in this blog post.

At AncestryDNA, we estimate a customer’s genetic ethnicity as a set of percentages in 26 regions around the world. See a map of these regions below.

Ethnicity-all-regions-map

We estimate the amount of DNA that a customer likely inherited from each of these regions by comparing a customer’s DNA with a reference set of DNA samples – with corresponding documented family trees – from each of these regions. For a deeper dive into the science of ethnicity estimation, take a look at my previous blog post on the subject.

Below is an example of an AncestryDNA ethnicity estimate.  In this post, we’ll explore what AncestryDNA ethnicity estimates look like across all of our customers – specifically, how many of these 26 regions show up in someone’s estimate?

Ethnicity example

Based on the percentages estimated for a customer, we place each region into one of three categories.  Main Regions are the primary regions from which you likely inherited DNA (the regions, pictured above, that you see when you first view your ethnicity estimate); Trace Regions have less evidence of being part of your genetic ethnicity (and are viewed by clicking on the “+” button); Other Regions Tested have even less or no evidence, and do not show up as part of your ethnicity estimate.
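To make the three categories concrete, here is a minimal sketch in Python. The post does not publish the actual rule, so the two percentage cutoffs and the sample estimate below are made up for illustration; the real categories also weigh how much evidence supports each region, not just a single percentage.

```python
# Hypothetical cutoffs: the post does not state the real thresholds.
MAIN_MIN_PERCENT = 5.0
TRACE_MIN_PERCENT = 0.5


def categorize_regions(estimate):
    """Split a {region: percent} estimate into main, trace and other regions."""
    categories = {"main": [], "trace": [], "other": []}
    for region, percent in estimate.items():
        if percent >= MAIN_MIN_PERCENT:
            categories["main"].append(region)
        elif percent >= TRACE_MIN_PERCENT:
            categories["trace"].append(region)
        else:
            categories["other"].append(region)
    return categories


# Made-up estimate, for illustration only.
example_estimate = {
    "Great Britain": 70.0,
    "Senegal": 18.0,
    "Mali": 9.0,
    "North Africa": 2.0,
    "East Asia": 1.0,
    "Nigeria": 0.0,
}

result = categorize_regions(example_estimate)
for name, regions in result.items():
    print(name, regions)
```

With these invented numbers, three regions land in the main list, two in the trace list, and one is merely "tested".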

In exploring the aggregated genetic ethnicity results of customers who opted in to scientific research, here are a few fun facts we’ve found about the diversity of regions found in customers’ estimates:

  • Ethnicity at a continental level – First, it’s interesting to view a person’s ethnicity estimate by continent. Our 26 regions can be broken into six different continental regions – such as Africa, Europe, and West Asia (see the estimate above). On average, we see that customers can trace their DNA back to 2.3 different continents.  While half of our customers have 2 continents or more as part of their ethnicity estimate, some have only one continent — and others have all six!
  • Main Regions in an ethnicity estimate – According to U.S. Census data on census.gov, “the overwhelming majority (97 percent) of the total U.S. population reported only one race in 2010. This group totaled 299.7 million. Of these, the largest group reported white alone (223.6 million), accounting for 72 percent of all people living in the United States.”  This is thought-provoking because while most Americans self-identify with only one ethnicity, our database shows that some customers can be linked to as many as 11 main regions (or ethnicities), and the average is nearly four regions!  See a graph representing the number of main ethnicity regions per customer, here.  A person’s ethnicity is likely far more nuanced than they may report on a census.
  • Expanding to include Trace Regions – While main regions are those with strong evidence that they are part of someone’s genetic ethnicity, trace regions are those that have a smaller amount of evidence (and that you must click on the “+” sign to view). When we count up regions in both of these categories, customers can be traced back, on average, to 8.5 different regional ethnicities.  This really affirms that our customers hail from a variety of cultures and regions across the world.  Some customers even have 24 out of the possible 26 regions as part of their estimate!
  • African regions – We made an exciting new finding recently that African Americans have, on average, more than three African regions in their estimates. This shows that African Americans too are a melting pot of many unique African ethnicities.
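For readers curious how summary numbers like these are produced, the sketch below computes an average, a median, and the share of customers with two or more continents. The list of counts is invented purely to show the arithmetic; it is not AncestryDNA data.

```python
from statistics import mean, median

# Made-up number of continents per customer, for illustration only.
continent_counts = [1, 2, 2, 3, 1, 6, 2, 3]


def summarize(counts):
    """Return the mean, the median, and the share of customers with 2+ continents."""
    return {
        "mean": mean(counts),
        "median": median(counts),
        "share_two_or_more": sum(c >= 2 for c in counts) / len(counts),
    }


summary = summarize(continent_counts)
print(summary)
```

The mean is what a phrase like "on average" refers to, while a statement about "half of our customers" is a statement about the median.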

These statistics and averages demonstrate the diversity of regions often found in an AncestryDNA customer’s ethnicity estimate – and show that Americans are truly a mix of cultures and influences from across the globe.

Advances in science and DNA research are just now beginning to make a significant impact on how we understand ourselves and society at large. While DNA testing often confirms the expected, it can also reveal the completely unexpected. How do your AncestryDNA results compare to our findings?

DNA and the Masses: The Science and Technology Behind Discovering Who You Really Are
http://blogs.ancestry.com/techroots/dna-and-the-masses-the-science-and-technology-behind-discovering-who-you-really-are/
Wed, 12 Mar 2014 19:02:58 +0000
Melissa Garrett

The post DNA and the Masses: The Science and Technology Behind Discovering Who You Really Are appeared first on Tech Roots.

Originally published on Wired Innovation Insights, 3-12-14.

There is a growing interest among mainstream consumers in learning more about who they are and where they came from. The good news is that DNA tests are no longer reserved for large medical research teams or plot lines in CSI. Now, the popularity of direct-to-consumer (DTC) DNA tests is making self-discovery a reality, and is leading individuals to learn more about their genetic ethnicity and family history. My personal journey has led to discoveries about my family history outside of the United States. On a census questionnaire I am White or maybe Hispanic. My genetics, however, show I am Southern European, Middle Eastern, Native American, Northern African, and West African. And who knew that DNA would connect me with several cousins who have family living within 20 miles of where my mom was born in central Cuba?

Major strides have been made in recent years to better understand and more efficiently analyze DNA. Where are we today?

  • Easier: DNA testing once required a blood draw. Now, you can spit in a tube in the comfort (and privacy) of your own home.
  • Cheaper: In 2000, it took about 15 years and $3 billion to sequence the genome of one person. Today you could get your genome sequenced for a few thousand dollars. To put that into context, if a tank of gas could get you from New York to Boston in 2000, and fuel efficiency had improved at the same pace as DNA sequencing, today you could travel to Mars (the planet) and back on the same tank of gas.
  • Faster: Companies of all kinds are quickly innovating to keep up with demand and to make DNA testing more readily available and affordable. Illumina recently announced a whole-genome sequencing machine that could sequence 20,000 entire genomes per year.
  • More information: We can now tell you things about your ethnicity, find distant cousins, tell you whether a drug is likely to benefit or harm you, and tell your risk of diseases like breast and colon cancer.

It isn’t all roses. There is a joke among the genetic community that you can get your DNA sequenced for $1,000, but it will cost $1,000,000 to interpret it. DNA is complex. Each of us contains six billion nucleotides that are arranged like letters in a book that tell a unique story. And while scientists have deciphered the alphabet that makes up the billions of letters of our genome, we know woefully little about its vocabulary, grammar and syntax. The problem is that if you want to learn how to read, you need books, lots of them, and up until recently we had very few books to learn from.

To illustrate how complex it can be, let’s look at how to determine a person’s genetic ethnic background. Say you are given three books written in English, Chinese and Arabic. Even if you don’t speak the languages you can use the letters in those books to determine what percent of a fourth book is written in each of the respective languages, since those three languages are so distinct. But that is like determining whether someone is African, White or Asian, which doesn’t require a genetic test. What if the three books were written in English, French and German that use a similar alphabet? That is like telling someone that is White that they are a mix of various ethnic groups. That is a much harder problem and one that usually requires a genetic test.
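The "easy" version of the book analogy can be sketched in a few lines of Python. Under the assumption that each language is identified only by its script, the code counts what fraction of the letters in a text are Latin, Chinese (CJK) or Arabic. The sample string is invented, and the same trick would fail for English, French and German, which is exactly the point of the analogy.

```python
import unicodedata
from collections import Counter

# Scripts standing in for the three "books": English, Chinese and Arabic.
SCRIPTS = ("LATIN", "CJK", "ARABIC")


def script_fractions(text):
    """Return the fraction of letters in `text` that belong to each script."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits and punctuation
        name = unicodedata.name(ch, "")
        for script in SCRIPTS:
            if name.startswith(script):
                counts[script] += 1
                break
    total = sum(counts.values())
    if total == 0:
        return {}
    return {script: counts[script] / total for script in SCRIPTS}


# Invented "fourth book" mixing the three scripts.
fourth_book = "family 家族 عائلة"
print(script_fractions(fourth_book))
```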

So how do we distinguish the different ethnicities using DNA? Since we don’t have a genetic dictionary that tells us what we are looking for, scientists use the genetic signatures of people who have a long history in a specific region, religion, language, or otherwise practiced a single culture as a dictionary. Once enough of those genetic sequences are gathered, teams of geneticists and statisticians use the dictionary to define what part of your genome came from similar regions.

How does big data play into all of this science?

DNA was “big data” before the term became popular. The real question should not be about how much data you have, but what you do with the data. Big data allows companies like Ancestry.com to compare 700,000 DNA letters for a single individual against the 700,000 DNA letters of several hundred thousand other test takers to find genetic cousins. That’s a lot of computational power, and the problem grows rapidly, because every new test taker must be compared against everyone already in the database. To make all of this possible, big data and statistical analytics tools, such as Hadoop and HBase, are used to reduce the time associated with processing DNA results.
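A rough way to see why the matching workload balloons: if every test taker is compared against every other, n people produce n(n-1)/2 pairs. The sketch below assumes a naive all-pairs comparison and takes the 700,000 letters per person from the text; it is an illustration, not a description of how Ancestry.com's pipeline actually works.

```python
MARKERS_PER_PERSON = 700_000  # "DNA letters" per individual, from the text


def pairwise_comparisons(n_people):
    """Number of distinct pairs among n_people test takers."""
    return n_people * (n_people - 1) // 2


for n in (1_000, 100_000, 300_000):
    pairs = pairwise_comparisons(n)
    letter_comparisons = pairs * MARKERS_PER_PERSON
    print(f"{n:>7} people -> {pairs:,} pairs, {letter_comparisons:,} letter comparisons")
```

Doubling the number of test takers roughly quadruples the number of pairs, which is why distributed tools become attractive.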

Given how far we have come in such a short time, what should we expect for the future of consumer DNA? The technology is moving so fast that it is almost worthless to predict. But what is clear is that we won’t come out of this genetic revolution the same. We are going to live better, healthier lives, and we are going to learn things about our species and ourselves we never dreamed of. And importantly, putting genetic ethnicity and family connection in the hands of individuals is going to tear down our notion of race and show how we are all family – literally. Maybe we’ll even treat each other a little better.

Ken Chahine is Senior Vice President and General Manager for Ancestry.com DNA.

 

Imagine Future Technology for Family History Simulations
http://blogs.ancestry.com/techroots/imagine-future-technology-for-family-history-simulations/
Tue, 19 Nov 2013 19:14:22 +0000
Lincoln Cannon

The post Imagine Future Technology for Family History Simulations appeared first on Tech Roots.

Ancestry.com is a technology company that knows family history – not just a family history company, and not even a family history company that just happens to use technology. Technology, and particularly computing, is essential to our mission to help everyone discover, preserve and share family history. Without it, we could still tell family stories to our children, but we certainly couldn’t substantiate those stories from 12 billion historical records into 55 million family trees through the work of 2.7 million subscribers, as Ancestry.com does today across all its websites.

In the 1960s, Intel co-founder Gordon Moore observed that the ratio of computing capacity to cost was doubling predictably, every couple of years or faster. In other words, a computer built in 1969 had twice as much capacity as a computer built at the same cost in 1968, and over a hundred times as much capacity as a computer built at the same cost in 1962; a computer built in 1969 would also reliably have half the capacity of a computer built at the same cost in 1970, and less than a hundredth the capacity of a computer built at the same cost in 1976.
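The numbers in that comparison follow from simple doubling arithmetic. A minimal sketch, assuming capacity per dollar doubles once a year (the rate the 1968/1969 example implies):

```python
def capacity_ratio(year_from, year_to, doubling_years=1.0):
    """How many times more capacity per dollar year_to has than year_from."""
    return 2 ** ((year_to - year_from) / doubling_years)


print(capacity_ratio(1968, 1969))  # 2.0
print(capacity_ratio(1962, 1969))  # 128.0
print(capacity_ratio(1969, 1976))  # 128.0
```

Seven doublings give a factor of 128, which is where "over a hundred times" and "less than a hundredth" come from.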

Moore's Law

By Courtesy of Ray Kurzweil and Kurzweil Technologies, Inc. (en:Image:PPTMooresLawai.jpg) [CC-BY-1.0 (http://creativecommons.org/licenses/by/1.0)], via Wikimedia Commons

That trend, known as Moore’s Law, has continued to the present. Today, a $150 smartphone can store about a million times more data and process that data about a thousand times faster than the $150K Apollo Guidance Computer that took astronauts to the moon in 1969. The smartphone also has wireless access to extended computing capacity on the Internet, including systems like Ancestry.com, which stores over 10 petabytes of data, and processes over 40 million searches daily.

Suppose Moore’s Law continues. Within decades, whatever replaces smartphones would have millions, billions and then trillions of times the overall computing capacity at the same cost. Within a century, $150 could purchase more computing capacity than that of all human brains combined. If that were to happen, what might the intersection of family history and technology look like? What might Ancestry.com look like? Of course we don’t really know, but let’s imagine.
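To put rough dates on "millions, billions and then trillions", the doubling formula can be inverted. The two-year doubling period below is an assumption chosen for illustration; a faster or slower pace shifts every answer proportionally.

```python
import math


def years_to_reach(factor, doubling_years=2.0):
    """Years needed for capacity per dollar to grow by `factor`."""
    return math.log2(factor) * doubling_years


for label, factor in (("million", 1e6), ("billion", 1e9), ("trillion", 1e12)):
    print(f"a {label} times: about {years_to_reach(factor):.0f} years")
```

Under that assumption the three milestones arrive after roughly 40, 60 and 80 years.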

Moore's Law Projected

By Courtesy of Ray Kurzweil and Kurzweil Technologies, Inc. (en:PPTExponentialGrowthof_Computing.jpg) [CC-BY-1.0 (http://creativecommons.org/licenses/by/1.0)], via Wikimedia Commons

One of the things we might do is tell stories about our family and ancestors at a much more massive scale and at a far deeper level, by computing highly detailed family history simulations. Maybe they would be something like a mix of Google Earth enhanced with a full history of maps derived from geological and astronomical research; Oculus Rift enhanced with brain-computer interfacing for an immersive tactile experience; and Second Life enhanced with avatars generated from family trees, photos, journals, and DNA, and abstracted to sub-neuronal degrees of detail to enable artificial intelligence. In deeper, more meaningful ways, we could understand and even feel our family history, as the characters, settings, plots and conflicts unfold before us – as our stories come to life, and we walk in our ancestors’ shoes (literally?).

As it turns out, if ever we compute such family history simulations, detailed to the point of enabling the characters with fully immersive consciousness, there would be a rather shocking philosophical ramification – more on that next time I post.

Unraveling the Science Behind Ethnicity Estimation
http://blogs.ancestry.com/techroots/unraveling-the-science-behind-ethnicity-estimation/
Thu, 24 Oct 2013 15:54:02 +0000
Julie Granka

The post Unraveling the Science Behind Ethnicity Estimation appeared first on Tech Roots.

A small tube of your saliva can reveal a lot about your family history hundreds and even thousands of years ago.  At AncestryDNA, we study the DNA in that saliva – using sophisticated science – to reveal your ethnic origins.  We recently announced an update to our ethnicity results which provides customers with a more in-depth look at where their ancestors once lived.

How does the DNA in your saliva record your family history in the first place?

To understand how, we’ll turn to language, since there are quite a few parallels with genetics.

Language and Geography

You “inherit” your dialect, using similar phrases, sayings, and words as your parents and the people around you.  For example, there are a number of words people use to describe a “sweetened, carbonated beverage.”  The colors in the map below show how often people living in the U.S. use each of three particular words.

Pop soda coke

You can see some clear geographic patterns. Based on their term for soda (I’m a Northeasterner), coke-drinkers from the South cluster together, as do pop-drinking Midwesterners.

So if we met a person who called the sugary drink in his hand a “coke,” we could feel confident in guessing he was from the South. If he used the word “pop,” he would probably be from somewhere in the Midwest.
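That guessing game is an application of Bayes' rule: combine how often each region uses a word with how likely a person is to come from each region in the first place. The usage rates below are invented for illustration and are not taken from the dialect map.

```python
# Hypothetical P(word | region); each row sums to 1. Not real survey data.
WORD_GIVEN_REGION = {
    "South": {"coke": 0.70, "pop": 0.05, "soda": 0.25},
    "Midwest": {"coke": 0.05, "pop": 0.75, "soda": 0.20},
    "Northeast": {"coke": 0.05, "pop": 0.05, "soda": 0.90},
}


def region_posterior(word, prior=None):
    """Return P(region | word) using Bayes' rule; the prior defaults to uniform."""
    regions = list(WORD_GIVEN_REGION)
    if prior is None:
        prior = {region: 1 / len(regions) for region in regions}
    unnormalized = {
        region: prior[region] * WORD_GIVEN_REGION[region][word] for region in regions
    }
    total = sum(unnormalized.values())
    return {region: value / total for region, value in unnormalized.items()}


for word in ("coke", "pop", "soda"):
    posterior = region_posterior(word)
    best = max(posterior, key=posterior.get)
    print(word, "->", best, round(posterior[best], 3))
```

A single word gives a confident guess only when one region uses it far more than the others, which is why more data helps.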

Back to DNA

When AncestryDNA estimates your genetic ethnicity, we use a similar approach – but instead of comparing your language patterns to those of other people, we’re comparing your DNA.

Just like certain regions of the U.S. appear different based on dialect, human groups can often be distinguished based on lots and lots of genetic data.  By finding the clusters of human groups to which you are similar, based on your DNA (rather than your dialect), we can estimate your genetic ethnicity.

Both DNA and language can help to trace someone’s origin, since both DNA and language are inherited.

But unlike language, which you can “inherit” from people around you, you only inherit DNA from your parents, who inherited their DNA from their parents, and so forth. Thus, our DNA is a mosaic of the DNA of our ancestors.  That DNA tells us about where our ancestors came from.

This is because the variation in our DNA reflects ancient and modern migrations of humans as we populated the globe. As humans moved from Africa to Europe, Asia, and the Americas, settling new areas, groups split apart, taking their DNA with them. By chance, the DNA of groups settling one area could be different than the DNA of those that settled in another.

Over time, individuals from a group of people usually had children with people from the same group.  In so doing, they passed their DNA to their children – generation after generation.  And if a group of people remained relatively isolated from other groups, there wouldn’t be much new DNA entering that group from others.  In this process, the DNA of human populations becomes slightly differentiated.

Going back to our analogy, southerners may have started to say “coke,” and in passing the word to their neighbors and kids, have continued to do so generation after generation.  Similarly, chance movements of humans across the world allow us to see DNA evidence of this history.

At AncestryDNA, we leverage the fact that the DNA of individuals from across the globe shows evidence of human population history.

We examine DNA samples of thousands of people from all over the world who have deep ancestry in a specific global location – for example, individuals whose grandparents were all born in Spain. We then cluster their DNA into 26 overlapping worldwide regions based on DNA patterns observed between and within the regions.

More simply, we construct a DNA map, similar to a soda/pop/coke dialect map. Some DNA samples represent the Great Britain region, some represent East Asia, and others represent North Africa.  

Ethnicity

Then, we compare your DNA to these individuals to identify from which of the 26 regions you are likely to have ancestry.  When you have DNA that is similar to the DNA of people with deep ancestry in a specific location, you very likely also had ancestors from that same place.  Similar to the linguistics map, we have a good idea of where you might be from if we hear you say “pop.”
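As a toy version of that comparison, the sketch below scores one person's DNA against reference allele frequencies for three regions and reports the best match. Everything numeric here is invented, the positions are treated as independent, and the output is a single best region rather than a set of percentages, so this is a simplified stand-in and not AncestryDNA's actual method.

```python
import math

# Hypothetical frequency of one allele at five DNA positions, per reference region.
REFERENCE_FREQS = {
    "Great Britain": [0.10, 0.80, 0.30, 0.60, 0.20],
    "East Asia": [0.70, 0.20, 0.90, 0.10, 0.50],
    "North Africa": [0.40, 0.50, 0.50, 0.90, 0.80],
}


def log_likelihood(genotype, freqs):
    """Log-probability of allele counts (0, 1 or 2 per position) given frequencies."""
    total = 0.0
    for count, p in zip(genotype, freqs):
        # Binomial probability of `count` copies out of 2, on a log scale.
        total += (
            math.log(math.comb(2, count))
            + count * math.log(p)
            + (2 - count) * math.log(1 - p)
        )
    return total


def best_matching_region(genotype):
    """Return the region whose reference frequencies best explain the genotype."""
    scores = {
        region: log_likelihood(genotype, freqs)
        for region, freqs in REFERENCE_FREQS.items()
    }
    return max(scores, key=scores.get), scores


# Made-up individual: number of copies of the allele at each position.
region, scores = best_matching_region([0, 2, 1, 1, 0])
print(region)
for name, score in scores.items():
    print(f"  {name}: {score:.2f}")
```

With only five positions the scores are close to guesswork; the real comparison draws on hundreds of thousands of positions and a much larger reference panel.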

In the most recent update to AncestryDNA ethnicity results, we have increased the number of individuals to whom we compare you as well as the amount of your DNA used in the comparison – allowing us to get even more specific in certain regions. This gives us a highly refined estimate of your genetic ethnicity.

It’s important to note that DNA differences between human groups are subtle: the DNA sequences of two random people are on average 99.9% identical.  But, that still means that two random individuals differ at about 3 million DNA positions.  This makes for an often difficult, but exciting challenge in determining ethnicity.
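The 3 million figure is a one-line calculation once a genome length is fixed. The sketch assumes about 3 billion DNA positions in one copy of the human genome, a commonly quoted round number that this post does not state.

```python
GENOME_POSITIONS = 3_000_000_000  # assumption: about 3 billion positions per genome copy
IDENTICAL_PER_THOUSAND = 999  # 99.9% identical, from the text


def expected_differences(positions=GENOME_POSITIONS, identical=IDENTICAL_PER_THOUSAND):
    """Number of positions at which two random people are expected to differ."""
    return positions * (1000 - identical) // 1000


print(f"{expected_differences():,}")  # 3,000,000
```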

Interpreting your genetic ethnicity

There are a few other important parallels and differences between the linguistics example and a genetic ethnicity estimate.

Let’s say you currently live in the Midwest, but since your parents grew up in the Northeast, you use the word “soda.”  While you identify as a Midwesterner, your dialect might indicate that you’re a Northeasterner instead – like your parents.

Similarly, your genetic ethnicity estimate tells you about your historical origins, not about where you live today.  AncestryDNA estimates go back hundreds to a thousand years, when “populations” and their boundaries were very different than those we know today. This might cause you to have a different genetic ethnicity estimate than you might expect.

But while an individual’s dialect may change when he or she moves to a new location, an individual’s DNA doesn’t.  This also affects your genetic ethnicity.  For instance, if the ancestors of your Italian ancestors migrated from Eastern Europe hundreds of years ago, you might show up as having Eastern European ethnicity instead of Italian.

Pop soda coke

Take one final look at the linguistic map and notice that there are areas that appear to be a mix of others.  For instance, in Oklahoma, people use a combination of “pop” and “coke,” influenced by the regions around them.  This means that it would be difficult to identify someone specifically as an “Oklahoman.”

The genetics of human populations can be similarly affected by migrations between neighboring groups.  This makes it harder to disentangle genetic ethnicity from some regions, like Western Europe, where people and borders have moved quite a bit in the past thousand years.

All of this – estimating someone’s ethnicity from genetics – involves cutting edge science.  By looking at more data, developing novel methodologies, and discovering new patterns in our DNA, we continue to advance AncestryDNA.

That means that the AncestryDNA science team will be up late, drinking pop, soda, coke, and, according to the British scientist on our team, fizzy drink.

AncestryDNA: part of the scientific community
http://blogs.ancestry.com/techroots/ancestrydna-part-of-the-scientific-community/
Wed, 16 Oct 2013 21:33:19 +0000
Julie Granka

The post AncestryDNA: part of the scientific community appeared first on Tech Roots.

Next week, the AncestryDNA science team will be flying across the country with a tube full of posters.

Scientific posters, that is.  We’ll be presenting them at the annual American Society of Human Genetics conference (ASHG) in Boston.  This will mark AncestryDNA’s second year presenting our latest research at the largest worldwide conference in human genetics.

Over 6,000 researchers are projected to be at the conference – from academia and industry alike.  Over five days, the science team will be listening to scientific talks, discussing our research with other scientists, and staying abreast of the newest and coolest topics in the field.

At AncestryDNA, we strongly believe in being highly involved in the scientific community.

Discussions with other scientists can lead to eureka moments and plant the seeds for novel research ideas and possibilities.  By engaging with other scientists, we can get feedback on our current endeavors at AncestryDNA – ensuring that we are incorporating the latest developments in population genetics into our own research.

But it’s a two-way street.  Good science requires give and take and the exchange of ideas and criticism.  We too will share our experiences and knowledge.  Just as AncestryDNA learns from other scientists, much of our research can inform the future research of other human geneticists. In some cases, we’ll be collaborating with other scientists to do the research together.

Most importantly, we will maintain an ongoing rapport and relationships with other scientists from academia and industry. As a community, we can together continue to advance our knowledge about human genetics and how it relates to family history.

The ASHG conference is just one of many opportunities for these important interactions with the scientific community.  Throughout the year, through other conferences, guest lectures at nearby universities, discussions with our scientific advisory board, and research collaborations, we’re keeping AncestryDNA’s science fresh and of the highest caliber.

We’re excited for a week of genetics!

Ancestry.com Employee Honored with Women Tech Award
http://blogs.ancestry.com/techroots/ancestry-com-employee-honored-with-women-tech-award/
Thu, 03 Oct 2013 22:44:21 +0000
Melissa Garrett

The post Ancestry.com Employee Honored with Women Tech Award appeared first on Tech Roots.

Recently, Catherine Ball, VP of Genomics and Bioinformatics for AncestryDNA, was announced as a winner of the Women Tech Awards, presented by the Women Tech Council, in the Trailblazer category. The award recognizes technology-focused women who are driving innovation, influencing technology companies, and are passionate about the community. Other award winners came from companies such as ATK Aerospace Group, Domo, and eBay.

Cathy is a truly remarkable woman. For almost two decades, she has worked as a genomic scientist to help physicians, citizens and other scientists get the most out of genome data. From analyzing data related to large-scale biomedical experiments – well before the current “Big Data” trend – to collaborating on the annotation of the first eukaryotic genome (brewer’s yeast), Cathy has positioned herself and her team on the cutting edge of technology and science. Over the course of her career, Cathy has authored scores of scientific publications, organized several scientific conferences, given dozens of guest lectures, and reviewed hundreds of federal grant proposals and scientific manuscripts that have been key to shedding further light on diverse research topics.

Cathy’s most recent efforts include leading a team of population geneticists, statisticians, and computer scientists to create the analytical approaches behind the AncestryDNA direct-to-consumer genotyping services.

She was born and raised in a small beachside town on the island of Oahu in Hawaii.  Exploring tide pools, streams and rain forests helped spark Cathy’s interest in biology; and living in such a multicultural community provided insights into the ways a person’s life can be affected by family history.

Cathy applies the scientific method to everything she does, which is why this award and The Women Tech Council truly embody Cathy’s desire for each of us to stay curious, be honest, have a sense of humor and search out ways to help those around you grow.

Thanks to the Women Tech Council for providing recognition to women who are driving innovation and influencing technology companies. The Women Tech Council’s mission to provide leadership, resources and mentoring for women, while maintaining a strong bond with the business community, has been pivotal to developing top technology talent.

Cathy photo 2

AncestryDNA Makes Scientific Breakthrough in West African Ethnicity
http://blogs.ancestry.com/techroots/ancestrydna-makes-scientific-breakthrough-in-west-african-ethnicity/
Thu, 12 Sep 2013 16:54:27 +0000
Julie Granka

The post AncestryDNA Makes Scientific Breakthrough in West African Ethnicity appeared first on Tech Roots.

The AncestryDNA science team presented the results of their latest research today at the Smithsonian Institution’s symposium on The African Diaspora in Washington D.C. Using unique proprietary DNA samples and a variety of statistical approaches, our science team has been able to separate West Africa into six separate population groups based on genetic data. This advancement will provide a finer-resolution genetic ethnicity estimate for individuals with West African ancestry.

West African ethnicity

AncestryDNA’s six new ethnicity regions of West Africa include Senegal, Mali, Ivory Coast/Ghana, Benin/Togo, Nigeria, and Cameroon/Congo, each of which has a distinct set of tribal affiliations.  The division of West Africa into these groups marks the first time that West African genetic ethnicity estimates can achieve this level of detail, bringing AncestryDNA’s total number of reported genetic ethnicity regions in Africa to ten.

The new genetic ethnicity regions were announced at The African Diaspora event earlier today by Dr. Jake Byrnes, population genomics senior analyst on the AncestryDNA science team. Although these new ethnicity updates will not be made available to all AncestryDNA users for a few more months, we wanted to give the inside scoop on Jake’s Smithsonian presentation detailing the West African ethnicity update as well as additional research findings on the genetics of African Americans.

It can be extremely difficult to research one’s African ancestry using historical records alone, as most African American individuals in the U.S. are unable to find detailed records of their ancestors before the 1870s. Our AncestryDNA test can help family historians use genetics to pick up where the paper trail ends.

AncestryDNA leverages a unique proprietary collection of DNA samples from individuals with well-documented family trees to conduct innovative research in population genetics, human evolution, and migration. The science behind AncestryDNA is continually evolving and improving. During this ongoing process, the science team demonstrated that genetic data reliably shows population structure in Western Africa.  What this means is that the DNA of individuals from Western Africa clusters into a number of distinct groups. As a result, AncestryDNA can now more finely define genetic ethnicity regions in Western Africa. (See the visual representations below.)

 Caption: The graph on the left depicts the distinct genetic clusters of individuals from West Africa. Each point is an individual with deep ancestry in West Africa from our proprietary sample database. The color of each point corresponds to the country (shown in the map on the right) where a majority of that individual’s ancestors lived. The x and y axes indicate two primary axes of genetic differentiation (called principal components, or PCs) as inferred from sample DNA. Points closer together on the plot are more similar genetically. Comparison of the graph on the left and the map on the right reveals the similarity of the genetic and geographic structure.



Population structure such as this is not new, and even exists in the U.S. today. Here’s an example from the 2010 census data. Each point is an individual, colored by self-reported ethnicity.

West African ethnicity 3

You’ll notice that people of similar backgrounds tend to stay and live in the same general geographic areas.  Imagine now if we could roll this map back in time to see where an individual’s ancestors immigrated to the U.S.!

The AncestryDNA science team is looking toward a future where we could reveal, in the absence of a family tree, the most probable locations where one’s ancestors lived – both in the U.S. and abroad.  To do this, the science team hopes to harness the power of collectively analyzing family trees of individuals with similar genetic profiles.

Though this project is still in its infancy, the science team has made some progress. First, we looked at the birth locations of individuals in the trees of all African Americans. Then, we looked for locations where, relative to all African Americans, there appeared to be an over-representation of birth locations in trees of individuals with a particular West African ancestry. For individuals with Senegalese genetic ethnicity, we found what seems to be an over-representation of birth locations in South Carolina and Georgia in the 1700s and 1800s.
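The comparison described above can be sketched as an over-representation ratio. This is only a minimal illustration, not the science team's actual method: the function name `overrepresentation`, the toy records, and the choice of a simple ratio of shares are all assumptions made for the example.

```python
from collections import Counter

def overrepresentation(subgroup_places, baseline_places):
    """Ratio of each birth location's share in a subgroup to its share in the baseline.

    A ratio above 1 means the location shows up more often in the subgroup's
    trees than the baseline would suggest. Hypothetical helper for illustration.
    """
    sub = Counter(subgroup_places)
    base = Counter(baseline_places)
    sub_total = sum(sub.values())
    base_total = sum(base.values())
    ratios = {}
    for place, count in sub.items():
        if base[place] == 0:
            continue  # location absent from the baseline, so the ratio is undefined
        sub_share = count / sub_total
        base_share = base[place] / base_total
        ratios[place] = sub_share / base_share
    return ratios

# Toy data (invented): birth locations collected from family trees.
baseline = ["SC"] * 20 + ["GA"] * 20 + ["VA"] * 40 + ["NC"] * 20
subgroup = ["SC"] * 4 + ["GA"] * 4 + ["VA"] * 2

ratios = overrepresentation(subgroup, baseline)
for place, ratio in sorted(ratios.items()):
    print(place, round(ratio, 2))
```

A real analysis would also need to ask whether a ratio above 1 is larger than chance alone would produce, for example with a statistical test and a correction for the many locations being compared.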

This might be an example where the genetics matches up with history. In the 18th century, plantation owners in South Carolina and Georgia knew little about rice cultivation and preferred to import slaves from Sierra Leone, Gambia, and Senegal (the Windward Coast), where rice is a commonly grown crop. Some scholars think that the Gullah people, who today live in coastal Georgia and South Carolina, descend from slaves imported from the Windward Coast to work specifically on rice plantations.

Providing more detailed ethnicity estimates for West African populations is crucial for American family historians.  Approximately 85-90% of today’s African Americans are descendants of enslaved Africans brought to America between 150 and 450 years ago – leaving many African Americans without a known family history prior to this time. AncestryDNA’s new West African ethnicity update will help to link African American individuals to specific locations in West Africa. In the future, more detailed analyses of genetic data and family trees have the potential to reveal important historical stories.

West African ethnicity 4

Thanks to the science team’s findings of genetic structure in West Africa, the new African ethnicity regions will be a breakthrough for many African Americans and may even help reconnect disrupted families with their origins. But there is more to come, as we are only scratching the surface of what is possible.

The post AncestryDNA Makes Scientific Breakthrough in West African Ethnicity appeared first on Tech Roots.

]]>
http://blogs.ancestry.com/techroots/ancestrydna-makes-scientific-breakthrough-in-west-african-ethnicity/feed/ 17
How Ancestry.com Practices Agile to Solve Challenges with Consumer DNA Testinghttp://blogs.ancestry.com/techroots/how-ancestry-com-practices-agile-to-solve-challenges-with-consumer-dna-testing/ http://blogs.ancestry.com/techroots/how-ancestry-com-practices-agile-to-solve-challenges-with-consumer-dna-testing/#comments Thu, 29 Aug 2013 18:47:54 +0000 Aaron Ling http://blogs.ancestry.com/techroots/?p=1072 A typical web application starts with a blank page. Then in further sprints, you can add features to it. (I sound like one of your Agile coaches, don’t I?) But in reality, the business needs you to deliver more value than a blank page. So, how can you quantify the minimum value you are delivering… Read more

The post How Ancestry.com Practices Agile to Solve Challenges with Consumer DNA Testing appeared first on Tech Roots.

]]>
A typical web application starts with a blank page. Then in further sprints, you can add features to it. (I sound like one of your Agile coaches, don’t I?) But in reality, the business needs you to deliver more value than a blank page. So, how can you quantify the minimum value you are delivering in a product release?

Here is how we approached the release of AncestryDNA, using Agile processes with our DNA Backend Engineering team:

Let me start with some background. In May of 2012, Ancestry.com produced a revolutionary new DNA testing service—AncestryDNA. At a high level, this test gives users a percentage breakdown of their ethnicity, and connects them to distant cousins based on DNA matches.

In preparation for the launch, we kicked off the software development of the DNA backend pipeline late in 2011. We faced two main challenges: first, the pipeline needed to be able to process the raw DNA data to yield ethnicity and matching predictions; second, the performance needed to be acceptable.

The first challenge was straightforward: we defined acceptance criteria for ethnicity and matching prediction accuracy and used Test-Driven Development (TDD) to reach the done-done stage.
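As a rough sketch of what an accuracy acceptance criterion can look like when written test-first, consider the snippet below. The helper names, the 0.9 threshold, and the toy labels are hypothetical; the post does not state the actual criteria or tooling.

```python
def accuracy(predictions, expected):
    """Fraction of samples whose predicted label matches the expected label."""
    if len(predictions) != len(expected):
        raise ValueError("predictions and expected must have the same length")
    if not expected:
        raise ValueError("need at least one sample")
    correct = sum(p == e for p, e in zip(predictions, expected))
    return correct / len(expected)

def meets_acceptance(predictions, expected, threshold=0.9):
    """Acceptance criterion: accuracy must be at least `threshold` (assumed value)."""
    return accuracy(predictions, expected) >= threshold

def test_pipeline_meets_accuracy_threshold():
    # In TDD this test is written first and fails until the pipeline is good enough.
    expected = ["A", "B", "A", "C", "B", "A", "C", "B", "A", "A"]
    predicted = ["A", "B", "A", "C", "B", "A", "C", "B", "A", "B"]  # 9 of 10 correct
    assert meets_acceptance(predicted, expected, threshold=0.9)

test_pipeline_meets_accuracy_threshold()
```

Writing the threshold into a test makes "done-done" checkable: the story is finished when the test passes, not when someone judges the output to look right.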

The second challenge, performance, proved more difficult because the honest answer is “it depends” on multiple factors. Our pipeline processes DNA samples in batches. As our business grows and the size of the DNA database increases, we will need bigger batches. We calculated that if we didn’t improve the pipeline, the numbers would be “X” in two months. On top of this, different parts of our DNA pipeline respond differently to growth: some stay constant, some scale linearly and some scale quadratically.
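A “what-if” projection of this kind can be sketched with a simple additive model. The coefficients below are invented, and the assumption that the three kinds of pipeline parts simply add up is ours, not something stated in the post.

```python
def projected_runtime(batch_size, constant, linear, quadratic):
    """Total running time for one batch under a simple additive model.

    constant:  time that does not depend on batch size
    linear:    time per sample
    quadratic: time that grows with the square of the batch size
    All coefficients are hypothetical.
    """
    return constant + linear * batch_size + quadratic * batch_size ** 2

# Invented coefficients, in hours.
coeffs = dict(constant=1.0, linear=0.01, quadratic=0.00001)

for n in (1000, 2000, 4000):
    print(n, round(projected_runtime(n, **coeffs), 2))
```

Doubling the batch size doubles the linear term but quadruples the quadratic one, so the quadratic parts dominate as batches grow, which is why they are the first candidates for re-engineering.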

Our next step included a plan to address the growth: first, upgrade the hardware; second, adopt Apache Hadoop to address ethnicity; third, improve disk management to adopt HBase for the academic algorithm Germline, which finds hidden family relationships within a reservoir of DNA (my colleague’s series of posts addresses how we scaled this academic algorithm). As you can imagine, this original Agile plan evolved as our “what-if” scenarios changed. We then juggled these scenarios again and planned performance enhancement features to solve the next “what-if” scenarios.

Final_Results

The above chart illustrates a snapshot of the running time of all pipeline parts at the end of 2012, when we resolved our scalability challenges. We made the pipeline scale horizontally in almost every part (we really love the “stable” flat line there). The pipeline turned out to be under constant modification. Because we reached done-done and rolled code frequently, we increased the batch size several times throughout the period, so the overall performance improvement was greater than the chart suggests. Our hard work on this project, appropriate planning and performance goals enabled us to deliver value to the business and customers early on. Creating a scalable pipeline also saved us from overinvesting in engineering resources. 2012 had a happy ending for the DNA team – we now have, in hand, a capable and steady pipeline that allows us to process DNA samples at scale.

Now that you have the background on our DNA pipeline, in future posts, my coworkers and I will blog about other development efforts in DNA.

 

The post How Ancestry.com Practices Agile to Solve Challenges with Consumer DNA Testing appeared first on Tech Roots.

]]>
http://blogs.ancestry.com/techroots/how-ancestry-com-practices-agile-to-solve-challenges-with-consumer-dna-testing/feed/ 0
2013 WITI Summit – Where Tech Is Going Videohttp://blogs.ancestry.com/techroots/2013-witi-summit-where-tech-is-going-video/ http://blogs.ancestry.com/techroots/2013-witi-summit-where-tech-is-going-video/#comments Sat, 17 Aug 2013 00:24:50 +0000 Melissa Garrett http://blogs.ancestry.com/techroots/?p=1037 Recently, Ancestry.com Vice President of Genomics and Bioinformatics, Cathy Ball, participated in a panel discussion at WITI’s Women Powering Technology Summit where executives from leading tech companies shared thoughts on the future of where technology is going.  The session was moderated by Liz Gannes of AllThingsD and included executives from Qualcomm, CA Technologies, EMC, and… Read more

The post 2013 WITI Summit – Where Tech Is Going Video appeared first on Tech Roots.

]]>
Recently, Ancestry.com Vice President of Genomics and Bioinformatics, Cathy Ball, participated in a panel discussion at WITI’s Women Powering Technology Summit where executives from leading tech companies shared thoughts on the future of where technology is going.  The session was moderated by Liz Gannes of AllThingsD and included executives from Qualcomm, CA Technologies, EMC, and AT&T Labs, in addition to Ancestry.com.

The panel, titled “Where Technology Is Going,” was recorded and is now available on YouTube – link included below.  Topics of discussion covered the latest developments in technology, Big Data trends and the importance of using technology and data to tell stories.

Storytelling has always been at the heart of our business.  Our AncestryDNA product is backed by a brain trust of ten scientists with a passion for telling stories about family history.  As our DNA sample database grows, our science team continues to innovate and make new discoveries about where our ancestors lived hundreds, and perhaps even thousands, of years ago.

The post 2013 WITI Summit – Where Tech Is Going Video appeared first on Tech Roots.

]]>
http://blogs.ancestry.com/techroots/2013-witi-summit-where-tech-is-going-video/feed/ 0
The Science Team at AncestryDNAhttp://blogs.ancestry.com/techroots/the-science-team-at-ancestrydna/ http://blogs.ancestry.com/techroots/the-science-team-at-ancestrydna/#comments Fri, 02 Aug 2013 22:54:17 +0000 Julie Granka http://blogs.ancestry.com/techroots/?p=962 If we already had all the answers, there wouldn’t be any more science to do. Pie charts and percentages tell AncestryDNA customers the story of where their ancestors probably lived, and lists of DNA matches help them to find living relatives and expand their family trees.  Behind those results are terabytes of data, years of… Read more

The post The Science Team at AncestryDNA appeared first on Tech Roots.

]]>
If we already had all the answers, there wouldn’t be any more science to do.

Pie charts and percentages tell AncestryDNA customers the story of where their ancestors probably lived, and lists of DNA matches help them to find living relatives and expand their family trees.  Behind those results are terabytes of data, years of hard work, and a lot of rigorous science.

Behind this science is the AncestryDNA science team. This team of ten really smart people, with 15 advanced degrees among them (nine of which are PhDs), makes current and future features of AncestryDNA possible.

On the science team, we ask lots of questions.  To answer those questions, we analyze vast amounts of genetic data, write and test (and debug) a lot of computer programs, calculate volumes of telling statistics, and make tons of detailed graphs. We then get really excited about all of those graphs and statistics, discuss our findings, generate more questions, and do it all over again.

This science is at the forefront of major advancements in human genetics.  Its ultimate goal is to provide our customers with the best information for making discoveries about their family stories.

By nature, our scientific research is iterative and cyclical.

Genetic ethnicity and relationships based on genetic data have to be estimated.  Our estimates deal with uncertainties, since inheriting DNA from your parents and through the generations is a random process.  We also make our estimates based on sets of assumptions and approximations that model recent human genetic history.
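To illustrate why such estimates carry uncertainty, the toy simulation below treats the half of a genome passed down by one parent as a fixed number of independent segments, each coming from one of that parent's two parents with equal probability. Real inheritance involves recombination along chromosomes, so the segment count and the independence assumption are simplifications introduced here, not part of the AncestryDNA models.

```python
import random

def fraction_from_grandparent(n_segments, rng):
    """Simulate the share of a grandchild's genome inherited from one grandparent.

    The half of the genome passed down by one parent is modeled as n_segments
    independent segments; each comes from that parent's mother or father with
    probability 1/2. n_segments is an assumed, simplified parameter.
    """
    from_this_grandparent = sum(rng.random() < 0.5 for _ in range(n_segments))
    # The parent contributes half of the genome, so scale by 1/2.
    return 0.5 * from_this_grandparent / n_segments

rng = random.Random(42)  # fixed seed so the run is repeatable
shares = [fraction_from_grandparent(40, rng) for _ in range(10_000)]
mean_share = sum(shares) / len(shares)
print(f"mean {mean_share:.3f}, min {min(shares):.3f}, max {max(shares):.3f}")
```

On average a grandparent contributes about a quarter of the genome, but individual simulated grandchildren land above or below that figure, which is the kind of spread an estimate has to account for.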

As you might imagine, that history is really complicated, and there are a number of different ways that we could model it. With further scientific advancements, we can continue to construct better and better approximations of the way things happen in the real world. That means better AncestryDNA estimates.

That brings us back to the science team’s experiments and tests.  When we gather new data or develop methods that will improve DNA results, we push to implement them.  Our customers’ DNA results are based on a lot of thorough science, and our curiosity and scientific rigor continue to drive us.  Future advancements you’ll see in the AncestryDNA experience benefit from the scientific breakthroughs that we are making today.

So as the current science underlying our estimates of genetic ethnicity and genetic relatives progresses, so will the ethnicity and matching results.  Iterative improvements to AncestryDNA results are evidence of the sweat of the science team, and AncestryDNA customers directly benefit from our research.

By studying the unique collection of data that is part of our experiments, we are also generating novel scientific knowledge.  As part of this unique collection of data, our customers are helping us to uncover a great deal of fascinating information about human migrations and genetic genealogy.  Our findings have been and will continue to be released to the broader scientific community through publications, conferences, and this blog.

In future posts, I’ll give the technical details of our research projects. I’ll reveal our latest scientific conclusions, explain the challenging issues we are seeing and how we are addressing them, and describe how our research advances AncestryDNA – and ultimately affects you.

In the meantime, we’ll be doing some really cool science.

The post The Science Team at AncestryDNA appeared first on Tech Roots.

]]>
http://blogs.ancestry.com/techroots/the-science-team-at-ancestrydna/feed/ 6