A small tube of your saliva can reveal a lot about your family history going back hundreds, and even thousands, of years. At AncestryDNA, we study the DNA in that saliva – using sophisticated science – to reveal your ethnic origins. We recently announced an update to our ethnicity results that gives customers a more in-depth look at where their ancestors once lived.
How does the DNA in your saliva record your family history in the first place?
To understand how, we’ll turn to language, since there are quite a few parallels with genetics.
Language and Geography
You “inherit” your dialect, using similar phrases, sayings, and words as your parents and the people around you. For example, there are a number of words people use to describe a “sweetened, carbonated beverage.” The colors in the map below show how often people living in the U.S. use each of three particular words.
You can see some clear geographic patterns. Grouped by the term they use for soda (I’m a Northeasterner, so “soda” is my word), “coke” drinkers from the South cluster together, as do “pop”-drinking Midwesterners.
So if we met a person who called the sugary drink in their hand a “coke,” we could feel fairly confident guessing that they were from the South. If they used the word “pop,” they would probably be from somewhere in the Midwest.
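The guess described above can be sketched in a few lines of Python. The survey counts here are made up for illustration (they are not read off the map), and the sketch assumes a speaker is equally likely to come from any of the three regions before we hear them speak.

```python
# Hypothetical survey counts: how many of 100 respondents in each region
# use each word. These numbers are invented for illustration only.
word_counts = {
    "South":     {"coke": 70, "pop": 5,  "soda": 25},
    "Midwest":   {"coke": 5,  "pop": 75, "soda": 20},
    "Northeast": {"coke": 5,  "pop": 10, "soda": 85},
}

def guess_region(word):
    """Return (region, probability) for the most likely region given a word.

    Assumes every region is equally likely before the word is heard.
    """
    # Fraction of people in each region who use this word.
    likelihood = {
        region: counts[word] / sum(counts.values())
        for region, counts in word_counts.items()
    }
    total = sum(likelihood.values())
    # With equal starting odds, the probability of each region is just
    # its share of the total likelihood.
    posterior = {region: p / total for region, p in likelihood.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior[best]

region, probability = guess_region("coke")
print(region, round(probability, 3))
```

With these toy numbers, hearing “coke” points to the South, but not with certainty: some people in every region use every word.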
Back to DNA
When AncestryDNA estimates your genetic ethnicity, we use a similar approach – but instead of comparing your language patterns to those of other people, we’re comparing your DNA.
Just as certain regions of the U.S. look different based on dialect, human groups can often be distinguished based on lots and lots of genetic data. By finding the clusters of human groups to which you are similar, based on your DNA (rather than your dialect), we can estimate your genetic ethnicity.
Both DNA and language can help to trace someone’s origin, since both DNA and language are inherited.
But unlike language, which you can “inherit” from people around you, you only inherit DNA from your parents, who inherited their DNA from their parents, and so forth. Thus, our DNA is a mosaic of the DNA of our ancestors. That DNA tells us about where our ancestors came from.
This is because the variation in our DNA reflects the ancient and modern migrations of humans as we populated the globe. As humans moved out of Africa to Europe, Asia, and the Americas, settling new areas, groups split apart and took their DNA with them. By chance, the DNA of groups settling one area could be different from the DNA of those that settled in another.
Over time, individuals from a group of people usually had children with people from the same group. In so doing, they passed their DNA to their children – generation after generation. And if a group of people remained relatively isolated from other groups, there wouldn’t be much new DNA entering that group from others. In this process, the DNA of human populations becomes slightly differentiated.
Going back to our analogy, Southerners may have started to say “coke,” and in passing the word to their neighbors and kids, have continued to do so generation after generation. Similarly, the chance movements of human groups across the world, recorded in DNA passed down generation after generation, left evidence of that history that we can still see today.
At AncestryDNA, we leverage the fact that the DNA of individuals from across the globe shows evidence of human population history.
We examine DNA samples of thousands of people from all over the world who have deep ancestry in a specific global location – for example, individuals whose grandparents were all born in Spain. We then cluster their DNA into 26 overlapping worldwide regions based on DNA patterns observed between and within the regions.
More simply, we construct a DNA map, similar to a soda/pop/coke dialect map. Some DNA samples represent the Great Britain region, some represent East Asia, and others represent North Africa.
Then, we compare your DNA to these individuals to identify from which of the 26 regions you are likely to have ancestry. When you have DNA that is similar to the DNA of people with deep ancestry in a specific location, you very likely also had ancestors from that same place. Similar to the linguistics map, we have a good idea of where you might be from if we hear you say “pop.”
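To make the comparison step concrete, here is a toy sketch in Python. It is not AncestryDNA's actual method: the region names come from the post, but the four markers, the allele frequencies, and the function names are all hypothetical. The sketch also assigns a single best-matching region, whereas a real ethnicity estimate reports a mix of regions.

```python
import math

# Hypothetical frequencies of one DNA variant at four markers in three
# reference regions. These values are invented for illustration.
reference_freqs = {
    "Great Britain": [0.80, 0.30, 0.60, 0.10],
    "East Asia":     [0.20, 0.70, 0.50, 0.60],
    "North Africa":  [0.50, 0.50, 0.10, 0.90],
}

def log_likelihood(genotype, freqs):
    """Log-probability of a genotype if the person came from one region.

    genotype[i] is how many copies (0, 1, or 2) of the variant the person
    carries at marker i; freqs[i] is that variant's frequency in the region.
    """
    total = 0.0
    for copies, p in zip(genotype, freqs):
        # Chance of drawing this many copies in two independent picks.
        prob = math.comb(2, copies) * p**copies * (1 - p)**(2 - copies)
        total += math.log(prob)
    return total

def most_similar_region(genotype):
    """Return the region whose frequencies best explain the genotype."""
    scores = {
        region: log_likelihood(genotype, freqs)
        for region, freqs in reference_freqs.items()
    }
    return max(scores, key=scores.get), scores

best, scores = most_similar_region([2, 0, 1, 0])
print(best)
```

The idea is the same as hearing someone say “pop”: each marker is a small clue, and many clues together point toward the region whose reference samples look most like you.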
In the most recent update to AncestryDNA ethnicity results, we have increased the number of individuals to whom we compare you as well as the amount of your DNA used in the comparison – allowing us to get even more specific in certain regions. This gives us a more refined estimate of your genetic ethnicity.
It’s important to note that DNA differences between human groups are subtle: the DNA sequences of two random people are on average 99.9% identical. But that still means that two random individuals differ at about 3 million DNA positions. This makes determining ethnicity an often difficult but exciting challenge.
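The arithmetic behind that figure is short enough to check directly. The one number not stated above is the length of the human genome; this sketch assumes the commonly cited approximate figure of 3 billion DNA positions.

```python
# Assumption: the human genome is roughly 3 billion DNA positions long.
GENOME_LENGTH = 3_000_000_000

# From the text: two random people are on average 99.9% identical.
FRACTION_IDENTICAL = 0.999

# The remaining 0.1% of positions are the ones where they differ.
differing_positions = round(GENOME_LENGTH * (1 - FRACTION_IDENTICAL))
print(f"{differing_positions:,}")
```

So a tiny percentage difference still leaves millions of positions to compare.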
Interpreting your genetic ethnicity
There are a few other important parallels and differences between the linguistics example and a genetic ethnicity estimate.
Let’s say you currently live in the Midwest, but since your parents grew up in the Northeast, you use the word “soda.” While you identify as a Midwesterner, your dialect might indicate that you’re a Northeasterner instead – like your parents.
Similarly, your genetic ethnicity estimate tells you about your historical origins, not about where you live today. AncestryDNA estimates go back hundreds to a thousand years, when “populations” and their boundaries were very different than those we know today. This might cause you to have a different genetic ethnicity estimate than you might expect.
But while an individual’s dialect may change when he or she moves to a new location, an individual’s DNA doesn’t. This also affects your genetic ethnicity. For instance, if the ancestors of your Italian ancestors migrated from Eastern Europe hundreds of years ago, you might show up as having Eastern European ethnicity instead of Italian.
Take one final look at the linguistic map and notice that there are areas that appear to be a mix of others. For instance, in Oklahoma, people use a combination of “pop” and “coke,” influenced by the regions around them. This means that it would be difficult to identify someone specifically as an “Oklahoman.”
The genetics of human populations can be similarly affected by migrations between neighboring groups. This makes it harder to disentangle genetic ethnicity from some regions, like Western Europe, where people and borders have moved quite a bit in the past thousand years.
All of this – estimating someone’s ethnicity from genetics – involves cutting-edge science. By looking at more data, developing novel methodologies, and discovering new patterns in our DNA, we continue to advance AncestryDNA.
That means that the AncestryDNA science team will be up late, drinking pop, soda, coke, and, according to the British scientist on our team, fizzy drink.
About Julie Granka
Julie has been a population geneticist at AncestryDNA since May 2013. Before that, Julie received her Ph.D. in Biology and M.S. in Statistics from Stanford University, where she studied genetic data from human populations and developed computational tools to answer questions about population history and evolution. She also spent time collecting and studying DNA using spit-collection tubes like the ones in an AncestryDNA kit. Julie likes to spend her non-computer time enjoying the outdoors – hiking, biking, running, swimming, camping, and picnicking. But if she’s inside, she’s baking, drawing, and painting.