Posted on January 11, 2013 in AncestryDNA

AncestryDNA™ is one of the most advanced autosomal DNA tests on the market, but that doesn’t mean our job is done. We are constantly working to improve our genetic ethnicity prediction models by deciphering the unique language of the human genome and employing some of the top geneticists and latest technology to help determine what it can tell us.

Dr. Ken Chahine, Sr. VP of AncestryDNA and one of our top scientists, explains some of the challenges we face when using DNA to predict ethnicity, including the work we do to innovate in this field and deliver the best product possible to our customers. Here’s some perspective on genetic ethnicity from Dr. Chahine.

Before AncestryDNA, ethnic origins were largely a breakdown of continental ethnicities. Most of us, however, don’t need a genetic test to determine whether we are European, African, or Asian. So, we challenged ourselves to push the boundaries of the science and attempt a more granular ethnic breakdown, especially within Europe.

Below is a map showing the detailed ethnicity coverage of AncestryDNA.

AncestryDNA Ethnicity Results Coverage

Why are we one of the first to launch a product that breaks down ethnic origins beyond the continental level? Simply put, it is very difficult. Europeans, Africans, and Asians are genetically very distinct. However, it is not as easy to ethnically distinguish between a British, a German and a French person, and it is especially difficult to decipher the ethnicity of an individual with ancestry from all three or some other comparable mixture.

The Language of Genetic Ethnicity

Since the first draft of the human genome was announced in 2000, we have made great strides in our understanding of human genetics and inheritance. The truth is, however, that the human genome is still largely a language that we don’t understand. Sure, we’ve deciphered the alphabet that makes up the 3 billion letters of our genome, but we know woefully little about its vocabulary, grammar and syntax.

We know, for example, that height is inherited. Yet it is still difficult to predict height from genetics alone. Most of the genetic signatures (i.e., alleles) that we have identified as being associated with height contribute only a small percentage to one’s ultimate vertical fate. In other words, while we understand how to read the letters of the human genome, we don’t always know what they are telling us.

Piecing the Puzzle Together

To continue with the language metaphor, let’s assume I give you three books written in languages that you do not speak. Then I tell you, as a point of reference, what language each one is written in: one in English, one in Arabic, and the third in Chinese. Then imagine I give you another book written in one of those three languages and ask you to tell me which language it is. Using only the letters in the reference books, it would be relatively easy to not only determine which language the book is written in, but even what percent of the book was written in each language, if it contained a mixture of all three. This is because the alphabets of the three languages are distinct and don’t overlap.
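
To make the analogy concrete, here is a minimal sketch in Python of the distinct-alphabet case. It assumes, purely for illustration, that every Latin letter counts as English, every Arabic letter as Arabic, and every CJK ideograph as Chinese. The function names `script_of` and `script_mix` are hypothetical, and nothing here describes how AncestryDNA analyzes genetic data.

```python
import unicodedata
from collections import Counter


def script_of(ch):
    """Return a coarse language label for one character, or None for non-letters.

    Assumption for this toy example: the Unicode script alone identifies
    the language (Latin -> English, Arabic -> Arabic, CJK -> Chinese).
    """
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "English"
    if name.startswith("ARABIC"):
        return "Arabic"
    if name.startswith("CJK"):
        return "Chinese"
    return "other"


def script_mix(text):
    """Return the percentage of letters in the text that belong to each label."""
    counts = Counter(label for label in map(script_of, text) if label is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}
```

Because the three alphabets never overlap, counting characters is enough to recover the mixture: a text with four Latin letters, four Arabic letters, and two Chinese characters comes out as 40%, 40%, and 20%.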

Now imagine the same puzzle, but instead of English, Chinese and Arabic, the books are written in English, French and German. In this case, it is clearly more difficult to discern where one language ends and another begins, since all three use mostly the same basic alphabet. We must then rely on three basic strategies to distinguish the languages. First, the frequency of certain letters, which appear to be used more or less often in French, English and German. Second, the relative position of letters, such as the combinations “ch,” “sch,” and “ing.” Third, letters such as “ç,” “ß,” and “ü,” which are unique to certain languages. As you can see in the graph below, even though the languages are different, the frequency of the letters used in all three languages is relatively consistent. Therefore, most of the letters are of little use in distinguishing the languages.

Frequency of letter usage across three languages
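
The three strategies described above can be sketched in a few lines of Python. This is only an illustration of the language analogy, not of how genetic data is analyzed. The function names are hypothetical, the letter combinations and special letters are the ones mentioned in the text, and `most_informative` uses a simple max-minus-min spread as an assumed way to rank letters.

```python
from collections import Counter

COMBOS = ("ch", "sch", "ing")  # letter combinations named in the post
UNIQUE = ("ç", "ß", "ü")       # language-specific letters named in the post


def letter_freq(text):
    """Strategy 1: relative frequency of each letter in the text."""
    letters = [c for c in text.lower() if c.isalpha()]
    total = len(letters)
    if total == 0:
        return {}
    return {c: n / total for c, n in Counter(letters).items()}


def combo_counts(text):
    """Strategy 2: how often each letter combination occurs.

    Note that every "sch" also contains a "ch", so it is counted under both.
    """
    lowered = text.lower()
    return {combo: lowered.count(combo) for combo in COMBOS}


def unique_counts(text):
    """Strategy 3: how often each language-specific letter occurs."""
    lowered = text.lower()
    return {c: lowered.count(c) for c in UNIQUE}


def most_informative(references, top=5):
    """Rank letters by how much their frequency differs across reference texts.

    `references` maps a language name to a sample text. Letters used about
    equally often everywhere get a spread near zero and sort last.
    """
    freqs = {lang: letter_freq(text) for lang, text in references.items()}
    letters = set().union(*freqs.values())
    spread = {
        c: max(f.get(c, 0.0) for f in freqs.values())
        - min(f.get(c, 0.0) for f in freqs.values())
        for c in letters
    }
    return sorted(spread, key=lambda c: (-spread[c], c))[:top]
```

Given one long reference text per language, `most_informative` would surface the handful of letters whose usage actually differs, while the letters that are used about equally often in every language drop to the bottom of the ranking.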

There is one more important point to make: we don’t have a dictionary! That’s right, there is no genetic dictionary that tells us the frequency of the letters, the relative position of the letters, or even the unique letters that occur in different European populations. AncestryDNA is building this genetic dictionary by analyzing the genetic signatures of people who have a long cultural history in a specific country or region, have spoken a certain language, and have practiced a single religion. Once we have the genetic sequences, our team of Ph.D. scientists in genetics, bioinformatics, machine learning, and statistics works to find clues that help us distinguish genetic ethnicity and provide our customers with their ethnic make-ups.

The European Challenge

The good news is that the genetic ethnicity prediction is working, albeit with some challenges. Central Europeans present the most significant difficulty, especially the French, Germans, and Dutch. With few geographic barriers and extensive human population movement, their genetic signatures are very similar and difficult to distinguish. The British Isles and Scandinavia are more genetically distinct, but their signatures partially overlap with each other, as well as with parts of Central Europe. All of this makes it difficult to assign predicted ethnicities. So, if your German ancestry doesn’t seem to be showing up in your DNA ethnicity results, or it seems like you’re getting a bit too much Scandinavian, know that the ethnicity prediction can be updated over time as we make advancements in this area.

This is just one example of why the ethnicity prediction portion of the AncestryDNA test is continually evolving. We are using the largest set of DNA reference samples from around the world and deeper genetic coverage in order to find those unique “letters” that will aid our analysis. In the meantime, we’re excited to have our AncestryDNA customers be a part of the breakthroughs as we continue to improve our prediction algorithms. And as they evolve, we will send you updates as new findings are discovered. The AncestryDNA test could have easily predicted your continental ethnicity as European, Asian, or African, but why settle for results based on the status quo? As Michelangelo is quoted as saying, “The greatest danger for most of us is not that our aim is too high and we miss it, but that it is too low and we reach it.” It’s one of the many benefits of AncestryDNA. So don’t be surprised if your ethnicity results get updated over time. This is a good thing and it just means our science team is working hard to better your experience.

If you’re interested in learning more about the new AncestryDNA, or would like to order a DNA test, you can click here.

40 Comments

A 

When will customers be able to download their DNA information directly? I know it has been mentioned early 2013, but I was wondering if a date has been set.

Thanks.

January 11, 2013 at 11:30 pm
LynnFDR 

From the Ancestry Aces Facebook page I do know that the skew towards Scandinavian ancestry seems to be a problem many people believe is occurring concerning their genetic ancestry determinations. I know my own 76% seems to be, well, 76% too high.

I expect that some Scandinavian ancestry might creep in; but when your ancestors come from Poland, Niedersachsen, Baden, Brabant, Luxembourg, Lorraine, Franche-Comte, and Ct Jura, one would expect at least some Central European ancestry to creep into the mix.

Instead I get 76% Scandinavian, 14% Persian/Turkish/Caucasus, and 10% Southern European. Not one designation commensurate with the locales that are known for my ancestors. Perhaps it is a deep ancestry in time that is showing through in my DNA mix; but one would intuit that there would be at least some congruence with ancestral locales.

January 12, 2013 at 12:58 am
BEE 

It’s understandable that my DNA would be 75% Eastern European, since three of my grandparents were born in Poland, and the fourth, of parents born in Poland. I would suppose the 25% Scandinavian comes from the many invasions, but I really expected a little bit of influence from either the east or west from other invasions. Having said that, I have two 5th to 8th cousin matches. The first, my grandfather and his grandmother came from the same little village in southern Poland. The second, we actually have three matching surnames from that same village. Why the first is “moderate confidence” and the second “low confidence”, I have no idea. However, some of the many other matches, even though they are very distant, are ridiculous. I think the strangest so far is the 62% Western African, 24% Scandinavian, 6% South Asian, 8% uncertain. I’ve found at least a dozen almost as strange, all showing up on the Eastern European side. If “Europeans, Africans, and Asians are genetically very distinct” – it makes no sense to me. All the names on my tree are Polish. There is no way to make any connection to any of the names that come in on the Scandinavian side, so I hardly bother looking at them, but I do look forward to more updates and hopefully, further “matches”.

January 12, 2013 at 7:21 am
Audrey Babbitt 

I am truly having much difficulty understanding my DNA test results. I have had two DNA tests completed.

I guess I need some help.

Help!

January 12, 2013 at 8:44 am
Bwilson 

I guess admitting it is the first step, now let’s fix the problem with your AncestryDNA Ethnicity Prediction software.

January 12, 2013 at 9:16 am
Marie Kovac 

Mine shows 31% European Jewish but I don’t know of any matches yet. Seems odd that there wouldn’t have been anyone that has European Jewish DNA.

Although this dna comes as a shock to me.

January 12, 2013 at 12:31 pm
Chris Ferch 

I hope that 2013 brings some updates in our results. I appreciate your explanation – I never did expect to find links to the cities of my ancestors’ births but I do still have trouble understanding how my 100% German-surnamed tree could result in 90% British Isles/10% Persian/Turkish/Caucasus. The British Isles seems like a “leap” – sorry, had to say that:)

Adding to my disappointment, with over 1250 matches, I’ve had zero confirmed/zero hints. There are 4 people with 96% confidence matches and healthy trees (like mine) but no common names or locations.

I’m hesitant to think of NPE’s occurring from all sides of my family – I’ll be patient and hope for an update.

January 12, 2013 at 3:07 pm
Debbie Kennett 

Thank you for the interesting explanation. I thought the language analogy was particularly apt. It does seem that these predictions are in their infancy, and I’m sure they will improve in time. It would be a big help if you could provide us with some technical details about your test, and specifically details of the reference populations that you are using. I am particularly interested to know what you are using for your “British Isles” reference populations. I live in England and all my ancestors are from the UK going back for several hundred years. However, according to your test I am 25% British Isles, 58% Central European, 24% Eastern European and 4% uncertain. I’ve noticed that many of the Americans I match have much higher percentages of “British” than me, though often they only have a small percentage of documented British ancestry with a significant percentage of Continental European ancestry. Are you perhaps using reference populations that consist of Americans of British ancestry, which might explain the discrepancies?

January 12, 2013 at 4:28 pm
CeCe Moore 

Hi Stephen and Ken,
Very nice article! When might AncestryDNA be providing more details in regard to the specifics of the reference populations being used for the Genetic Ethnicity tool?
Thank you.

January 12, 2013 at 4:50 pm
Kathy 

I appreciate the finetuning of your DNA process. However I almost feel like my results may have been a mixup. 40% Scandinavian when I only have one relative several generations back and 10% Eastern European when I have none in my ancestry is strange. Also I have many German ancestors on both sides for generations yet no German ethnic results. How can that be? Should I try another test?

January 12, 2013 at 7:17 pm
Mark 

Before getting this test done, people should really educate themselves a little on how DNA is passed down from your ancestors to you and also how the science of DNA analysis/deciphering works.

January 13, 2013 at 3:39 am
Diane 

I appreciated the article, and will spend more time going over it. It is likely I am not understanding the whole autosomal DNA process. My test results showed 71% Scandinavian, 17% Southern European, 10% Eastern European, and 2% uncertain. My mother is 100% Scandinavian, so that part of it fits. But 71%? My paternal grandmother came from Czechoslovakia – hence Eastern European I guess. They were German. But I have no southern European. Basically, I have worked on my grandfather’s family tree for 30 years. I KNOW where these people came from! They were of English origin, and some German – none of which show up in DNA results. Way too much Scandinavian, absolutely no German or English at all??? I have often wondered if I got the right results.

January 13, 2013 at 8:59 am
Ruth Detjen 

I think this was a great explanation and I enjoyed reading it and sharing it on Facebook. At first I was a bit shocked that there was no discernible or definitive Central European in my DNA results, BUT I do have 7% uncertain. I’m looking forward to the day when my uncertain will come into focus. The other percentages – 70% British Isles, 15% Eastern European and 8% West African are more or less correct.

January 13, 2013 at 2:15 pm
Angie Bush 

The language analogy was a great one to use. I’m hopeful that as ethnicity predictions continue to be revised that they will coincide better with the genealogical information in my trees. I am also curious as to the amount (if any) of the SMGF data that is used in the ethnicity prediction. It would be nice to have a better understanding of the reference populations being used. Keep up the great work!

January 13, 2013 at 9:08 pm
Colleen Lukoff 

I believed I was part of an ongoing experiment – to build a body of knowledge and learn a bit about myself and my heritage if I was very lucky. I am pleased that, as more becomes known, the results of my test may be changed to reflect the increased learning. What an exciting time to be part of family history:-)

January 14, 2013 at 12:56 pm
Tom S. 

I am 50% Italian and 25% Hispanic. I have traced my ancestors living in southern Italy back to the early 1700s. My DNA results came back ZERO PERCENT Southern European but 42% Central European and 21% Turkish/Persian. I understand that there was a lot of migration in Europe during the last thousand years but having no Southern European DNA makes me doubt the results.

January 16, 2013 at 10:17 am
Audrey 

My dad is 100% Italian, with his grandparents all being from the same Ligurian mountain comune. So I should be 50% southern European or thereabouts–because half my genetic makeup is from him (and there is a blue-eyed line, this is outside Genoa, though they are NOT sailors). I came back at 11% southern European, 9% unknown. How is that possible? (And fyi, I very much resemble my dad, and people notice!). I am not that surprised that no German showed up because you only inherit 50% from each parent. But I got 74% British Isles, 11% southern European, 9% unknown, and 6% Scandinavian. I expected 50% southern European (Italian), 20% BI, 10% central European (German)–with possibly a touch of Scandinavian or Eastern European. How off it is really confuses me.

January 17, 2013 at 1:47 pm
Anne 

This was not what I expected at all no matches anywhere???

January 17, 2013 at 4:59 pm
Christian Zank 

To be honest, it is still confusing for me after reading the article. But I learned some new stuff though. Will definitely try this DNA test later.

January 19, 2013 at 7:53 am
Cheryl 

I had my DNA done and it pretty much matched. I was excited to get back my son’s DNA and that is where it got interesting. His father’s side is directly from Ireland and Sicily so I expected to see that along with some mixture of mine. However, instead he had Scandinavian and Persian along with my European side. I was concerned the results were right and so I called. I was reminded that these results may not show your most current ethnic backgrounds but may go way back. When I started looking up the history of northern Ireland, I found that the Vikings were a major presence there for a long time. Way before we started recording anything. The same with Sicily. It was a major strategic area in that area when the Greeks, Romans, Turks, Persians and Moors were fighting over land. So when I dug deeper, it does make sense. I have learned that you have to know the history of the area you are researching and to remember that the country borders that we know today are really not what they used to be.

As far as matches, I have found one direct match and I have connected a couple of distant matches. Again, I feel the results are correct, I just think that the connections are farther back than we have researched.

January 19, 2013 at 9:53 am
Lydia 

Exactly Cheryl!!!!
People need to research these areas before they jump to the conclusion that the DNA is wrong. They also need to know how the DNA actually works… Every piece of DNA your parents have, does not get passed down & cut in half in you. My mother has DNA that I don’t have, & DNA is different in siblings as well. You can also carry DNA that is inactive in you, but may become active in your child. For example neither of my parents have an uncertain, but I do. You also can have a higher percentage of something than your parent has. They say it takes about 6 children, for all of a person’s DNA to get passed down.
If you are confused, call AncestryDNA support, they have been educated & can answer these questions. They made me understand things that I didn’t know before.

January 20, 2013 at 10:50 am
Michele 

I can no longer see the “Ethnicity” data for my matches. What is going on?

January 23, 2013 at 2:10 pm
Trevor Thacker 

Hi Michele, The issue has been reported and we’re working to resolve it at this time. Sorry for the inconvenience.

January 24, 2013 at 3:08 pm
Suzanne 

I can see how DNA can be used for making ID info for solving much in todays matching for cases of modern reasons such as in criminal activity? As for ancestory back to creation that may need much more done yet. Especially when so many believe by faith in a Father Son & Holy GHOST SPIRIT SOUL. HIS written Word from the Dead Sea Scrolls. Today almost all accept all the 62 Books Of the Holy Bible Genesis-Malachi + Matthew Mark Luke &John with Joseph Mary,Peter Paul Abraham Sarah Hagar Isaac Ishmael Jacob Esau. JOHNS REVELATION.OF Jesus Christ Cross Gospel.+

January 27, 2013 at 11:17 am
Ed Rocco 

“Before getting this test done, people should really educate themselves a little on how DNA is passed down from your ancestors to you and also how the science of DNA analysis/deciphering works.”

Well said Mark.

Further: “Sizing Up the Family Gene Pool” should be read; http://www.nytimes.com/2012/02/26/magazine/ethicist-dna.html?_r=0.

“A Government Accountability Office investigation into so-called direct-to-consumer genetic testing found inaccurate results and exaggerated claims about how much those results could really tell you. One expert declared the testing so inaccurate that, when it came to medical inquiry, “the most accurate way for these companies to predict disease risks would be for them to charge consumers $500 for DNA and family-medical-history information, throw out the DNA and then make predictions based solely on the family-history information.”

“As for the privacy issue, your concern is well founded. Many of these companies do use customers’ data for medical research or commercial applications, or they sell it to third parties whose interests you might never know. Legally they can’t do that without your consent, but the fine print on those consent forms goes by so quickly that it can be hard to follow.”

January 28, 2013 at 12:49 pm
Alice 

So here is a question. I was adopted at birth. I know my maternal side but I have little information on my paternal side. Many of my DNA matches make sense as they appear on the maternal side of my tree. But these results make me question the validity of the DNA matches: (1) are some of the DNA hits I’m getting, that don’t appear to have any connection to my tree, possibly those of my birth father’s family? and (2) Some DNA matches/hits appear on my adopted family’s tree where I have no biological connection at all – so the DNA matches to that tree seem to prove that the results aren’t reliable.
Any insight?

February 26, 2013 at 8:02 pm
Jackie 

Sorry, my results were less than satisfactory. Okay, the Vikings were a highly mobile and prolific bunch, but I think the %’s quoted are extremely high, certainly for my profile. The other two geographic areas for mine were certainly plausible, but actually gave no “ethnic” information, as many ethnic groups lived or passed through the geographic areas. And no mention of any Native American background, even though my NA heritage has been researched and documented by several different individuals and genealogists.

March 17, 2013 at 10:05 am
Phyllis 

A person with a possible 5-8 cousin was e-mailed for my review. When I looked they had 69% British Isles and 31% Scandinavian and no uncertain %.

My percentages are 83% Eastern European and 12% Central European and uncertain is 5%.

Does this possible match suggest my uncertain percentage may consist of one or both of her genetic ethnicities?

March 17, 2013 at 5:21 pm
AncestryDNA celebrates DNA Day. Wait, what’s DNA Day? | Ancestry.com 

[...] how DNA and family history come together. Check out this recent blog post that discusses how DNA works and some of the challenges we are constantly working on to advance the [...]

April 25, 2013 at 1:28 pm
Arlene Miles 

So when do we get to decipher the 700,000 SNPs of our DNA?
I already knew the ethnicity of my parents, grandparents and great grandparents. I wanted to know more about the ancestors from Germany and Ireland. Where did they come from and how long ago?
Would it be wiser for me to study Anthropology?

June 17, 2013 at 12:19 pm
Francine Long 

Could you provide a higher resolution version of the ethnicity map you show here that includes all the ethnicity classes that you test for? The map is very grainy and impossible to even read the labels. When I click on it to bring up a better view, I just get redirected to my genetic test results page. There I only get to see the two ethnicity areas that came up in my own test.

June 25, 2013 at 5:15 pm
Jaffe 

I am upset about my results because I have a 3rd grandmother who was from California and Mexico from 1850-1930. From my understanding of Mexican history, shouldn’t there be ethnic diversity in my test? No, no reference to anything Latino-Hispanic. My great-grandfather was a full blooded Mikmaq Canadian. Is there any mention of Native American? No. Ancestry.com probably didn’t even know that NorthEastern Native American fit haplo group q but are genetically distinct from all other Natives in the US and Canada. My Italian grandma, English-Scottish heritage, They got the Swedish 2nd great grandma but rumors are suggesting Ancestry.com can’t miss that one. Irish roots didn’t show and I know I’m not as French as my bro’s dad but my results said Central European, Scandinavian, unsure. What!? That’s all, huh? Considering my dad’s background has some French but they said I was really Central European, and I have to wonder, where’s the British, Irish, Italian, Native American, Latino, in these results. They tried to feed me some line about how I inherit DNA. Oh, so I’m of all things simply Central European and Scandinavian. OK, they are flubbed results, they are screwing us into buying crap and then who knows what with our genetic information. If a lawsuit opens up I’m jumping on that bandwagon, Ancestry.

July 2, 2013 at 7:24 pm
Benny 

I fell victim to the discounted rate for the DNA test earlier this year by not fully researching the type of test it was. I personally do not care what my ethnic background is, I just want to trace the individuals in my family history, wherever that may lead. I should have taken the test for males only to help confirm connections on my father’s line that some others claim connection to. I can’t see how ethnicity can help with this. The only connection I have confirmed with the autosomal test is one of my first cousins and our percentages aren’t even close. The only thing I can figure there is that the differences must have come from our mothers.

August 22, 2013 at 7:46 pm
katya 

Well both my parents are from Latin American countries but my test put me at 45 %Native American 35% European mostly Eastern Europe with only 8% Southern European! !! 7% West African and 13% Unknown? !

August 29, 2013 at 10:43 pm
Kelly Corrales 

I just received my results for the AncestryDNA test and was very shocked to see that I am 50% Scandinavian, 21% Southern European, 13% Central European, 11% Eastern European. I was very surprised to see that I had no British Isles with a confirmed British/Irish pedigree. I know that these percentages may change with new discoveries but I would advise everyone to do some research in the history of Europe. England and Ireland were invaded by the Vikings and Danes and that may absolutely explain my 50% Scandinavian heritage. One of my genetic matches is solving a mystery in my family of an adoption, turns out my g grandfather may not have been adopted but taken in by his natural father. Am looking forward to new discoveries and grateful for this new technology.

September 9, 2013 at 9:08 am
sally 

Like most people here I was surprised and doubtful when I first saw my results. All of my lines go back to germany, holland and france at least 200 years, but I only had 18% central european. But when I thought about it and did some research I could accept the eastern european, scandinavian and even the persian-turkish due to my dad’s mother, but I’m still at odds with 25% british isles. There is no one in my family that came from there, at least 200 years back. I think that dna probably was confused with my german DNA since the central european is so low. Its fun though and I’m excited to see the updates.

October 14, 2013 at 7:28 pm
Robin 

The Ethnicity V 2.0 Ancestry has just released has a “White Paper” on the development and procedure of ethnicity predictions. You can find it on your DNA homepage by clicking on the red question mark found on the get help section of your results. Although not written in standard scientific journal format, it explains how exactly this re-evaluation of ethnicity was done. Ignore the math, but if you have a good science background you should be able to get something out of it and understand why predicting ethnicity is so difficult and ever-changing, and will continue to be so.

October 18, 2013 at 6:20 pm
Robin 

Whoops. The question mark is on your main Ethnicity Estimate page. Upper right corner,

October 18, 2013 at 6:25 pm
William 

I can imagine that I have some Scandinavian DNA as a large number of my ancestors came from England and Ireland but my results show 28% Scandinavian with only 5% British and Irish. I have about 6% Pennsylvania German ancestry along with about 2% Dutch. Nothing appeared in my DNA to account for either of these groups. The Pennsylvania Dutch come from three separate families.

I attempted to download my raw data to try some of the other software that is available but was unable to do so. The Ancestry.com representative said that this was a known issue and they were working on it. They have apparently been working on it for more than a month and a half. Perhaps they should hire one of the local high school kids to solve this. It is clearly beyond their capability.

February 8, 2014 at 5:33 pm