I’ve received a number of emails and comments (see, e.g., here) complaining about Ancestry.com’s new test, AncestryDNA. Specifically, several test-takers believe that the Genetic Ethnicity Prediction provided by Ancestry.com does not reflect the numbers that they expected based on their own research.
“I just got my DNA test results back from Ancestry.com and I am concerned. I was born in England and I have gone back many generations and have found that all my ancestors as far back as the 1600s in most cases are English. According to the results I have no British Isles DNA. It states that I have 60% Central Europe, 30% Scandinavian and 7% Southern Europe. I also have 3% unknown. How can this be?”
“Just received my results: 21% Southern European and 79% Central European which doesn’t follow years of work on my family history.”
Do these comments reflect errors in AncestryDNA’s Genetic Ethnicity Prediction, or are there other factors at play?
Although I am not privy to the ‘behind-the-scenes’ at Ancestry.com, I don’t believe that there are serious issues with AncestryDNA’s Genetic Ethnicity Prediction. Ancestry.com’s DNA arm has a solid scientific team and a large and valuable reference database.
Indeed, Ancestry.com is well aware of the limitations and challenges that their Genetic Ethnicity Prediction brings:
We use cutting-edge science as a base for our predictions, but that comes with its own inherent challenges. It’s an emerging field with exciting new discoveries and developments constantly changing the landscape. Right now, your genetic ethnicity may not look quite right, with some ethnicities under or over-represented. As scientists gain a deeper understanding of the data, our prediction models will evolve to provide you with more accurate and relevant information about your family history.
It’s important to understand that biogeographical estimates, which are still relatively new, are notoriously difficult and complicated. Ten different researchers analyzing the same genome can come up with ten different estimates based on a number of different factors, including their algorithm, the reference populations used for comparison, and many others.
Here are just a few factors that can influence a biogeographical estimate, and any one or more of these may be the reason that your Genetic Ethnicity Prediction does not match estimates you make based on your paper trail.
- Different Reference Populations and Algorithms
As I suggested above, different companies use different reference populations and algorithms to create a biogeographical estimate, which can result in varying estimates.
For example, in my previous review of AncestryDNA’s Genetic Ethnicity Prediction, I compared my genetic ethnicity results from three companies (Ancestry.com, 23andMe, and FTDNA), and found that their results varied considerably. I’m not surprised by this, but I do expect that over time, as the industry arrives at more standard reference populations and algorithms (which the cheap whole-genome sequencing revolution will enable), estimates from different companies will align much more closely. Be patient and enjoy being a pioneer.
- You Have TWO Family Trees!
Remember that “Everyone Has Two Family Trees – A Genealogical Tree and a Genetic Tree.” Your Genealogical Tree is the tree containing ALL of your ancestors. However, only a tiny subset of these individuals actually (randomly) contributed DNA to the genome that you walk around with today. These ancestors are the only individuals in your Genetic Tree. It has been estimated, for example, that at 10 generations, only about 10-12% of ancestors in your Genealogical Tree are actually in your Genetic Tree!
Accordingly, even if a decent percentage of your ancestors at 10 generations originated in the British Isles, there is a possibility that your DNA – and thus your Genetic Ethnicity Prediction – could include very little or absolutely no British Isles ancestry, simply because of the rules of genetics.
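To put rough numbers on the two trees, here is a minimal Python sketch. It only does the arithmetic behind the 10-12% estimate quoted above; it is not a simulation of inheritance. The function names are my own, and the sketch assumes every ancestor slot at a generation is a distinct person (it ignores the same ancestor appearing in more than one line of descent).

```python
def genealogical_ancestors(generation: int) -> int:
    """Ancestor slots in the Genealogical Tree at a given generation.

    Each person has two parents, so the count doubles every generation.
    Assumption: every slot is a distinct person (no pedigree collapse).
    """
    return 2 ** generation


def genetic_ancestor_range(generation: int,
                           low_share: float = 0.10,
                           high_share: float = 0.12) -> tuple[int, int]:
    """Approximate count of ancestors who are also in the Genetic Tree.

    The default shares are the 10-12% estimate quoted for generation 10;
    they are not meant to apply to other generations.
    """
    total = genealogical_ancestors(generation)
    return round(total * low_share), round(total * high_share)


total = genealogical_ancestors(10)
low, high = genetic_ancestor_range(10)
print(f"Generation 10: {total} genealogical ancestors")
print(f"Roughly {low} to {high} of them are in the Genetic Tree")
```

In other words, of the 1,024 ancestor slots at 10 generations, only something like 100 to 125 would have passed DNA down to you, so the great majority of those ancestors leave no trace in a Genetic Ethnicity Prediction.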
Ancestry.com tries to explain this as well (I’m biased, but I think my “Everyone Has Two Trees” explanation is a little clearer; I’ve had great luck explaining this to newbies):
So if you look at your family tree, it may indicate a pedigree-based ethnicity of 30% English, 20% Scandinavian, and 50% Italian (based on birth locations of your great-great-great grandparents). While this is one valid way to look at ethnicity (and in fact has been the only way until recently), DNA analysis can reveal the actual percentage of your DNA that is reflected by these ethnic groups. So your genetic-based ethnicity might reveal you are 40% British Isles, 15% Scandinavian, and 45% Southern European. Both measures are accurate and informative—but they are measuring different things.
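The pedigree-based side of that comparison is simple counting, which a short Python sketch can make concrete. The birthplaces below are hypothetical and chosen only for illustration, and the function name is my own. Note that with 32 great-great-great grandparents, each ancestor accounts for 3.125% of the total, so pedigree percentages can only move in steps of that size.

```python
from collections import Counter


def pedigree_percentages(birthplaces: list[str]) -> dict[str, float]:
    """Share of ancestors born in each region, as a percentage.

    This measures the paper trail only: every ancestor counts equally,
    regardless of how much DNA (if any) they actually passed down.
    """
    counts = Counter(birthplaces)
    total = len(birthplaces)
    return {region: 100 * n / total for region, n in counts.items()}


# Hypothetical birthplaces for 32 great-great-great grandparents.
ggg_grandparents = ["England"] * 10 + ["Scandinavia"] * 6 + ["Italy"] * 16

for region, pct in pedigree_percentages(ggg_grandparents).items():
    print(f"{region}: {pct:.2f}%")
```

A DNA-based estimate has no such constraint, because the amount of DNA inherited from each of those 32 people is not an equal 1/32 share. That is one more reason the two sets of percentages will rarely match exactly.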
- Misleading Labels
Another issue with any biogeographical estimate is the labels used to describe a population. For example, what does “Scandinavian” or “Central European” really mean? Does “Scandinavian” mean that great-grandpa must have been a Swede, or does it mean something else?
Ancestry.com defines “Scandinavian” as the modern-day locations of Norway, Sweden, and Denmark, but explains in their FAQ that it can mean much, much more:
Ethnic groups moved around. Because people move over time, (and when they do they take their DNA with them), a group may contribute DNA to other groups at different times. So ethnic groups can be defined by time and place—not just location. For example, if you have German or British ancestors in your family tree, it’s a possibility that your genetic ethnicity may be partly Scandinavian. The Viking invasions and conquests about a thousand years ago are likely responsible for occurrences of Scandinavian ethnicity throughout other regions. And there are similar examples for other ethnicities. With your results, we provide historical information describing migrations to and from the regions to give you a broader picture of the origins of your DNA.
Similarly, the “Central European” label covers an enormous swath of Europe, including the modern-day locations of Austria, Belgium, France, Germany, the Netherlands, Switzerland, Slovenia, the Czech Republic, Luxembourg, and Liechtenstein.
I certainly don’t think of France as being “Central Europe,” which shows that a test-taker shouldn’t rely on the labels alone. Dig a little deeper.
- Non-Paternal Events (NPEs)
I won’t dwell on non-paternal events, because I believe they have become too much of a scapegoat. Non-paternal events, or NPEs, can be broadly defined as secret or unknown breaks in your Genealogical Tree (adoption, infidelity, etc.). At some point every single Genealogical Tree has an NPE, although current estimates of how often they occur vary widely. Consider the possibility of a break in your tree, but focus on the other factors presented here as the more likely explanation for your unexpected results.
Reviewing My Genetic Ethnicity Prediction
I have a fairly well-documented Genealogical Tree. My documented ancestors were mostly from the British Isles (England and Ireland) and France, with far fewer ancestors from Germany and Central America. Years ago, based on my paper trail, I might have predicted 65% British Isles, 20% Irish, 15% French, and 5% German.
In light of the above, let’s review my AncestryDNA Genetic Ethnicity Prediction:
- Scandinavian – 78%
- Central European – 12%
- Uncertain – 10%
At first glance and without any of the knowledge above, these numbers seem way out of whack. I don’t have a single documented ancestor from Scandinavia or the area I think of as “Central Europe.”
However, when I learn that “Central Europe” includes France and Germany, a contribution of 12% “Central European” doesn’t seem far-fetched. Further, considering that ancestry in the British Isles can include “Scandinavian” ancestors as a result of relatively recent Viking conquests (on a genetic timescale), perhaps the 78% Scandinavian isn’t so far-fetched either.
While I am still surprised that my results don’t report any British Isles DNA, that could simply be because of difficulties in distinguishing between Scandinavian and British Isles DNA, or perhaps because of the random inheritance of DNA from some ancestors rather than others.
And where’s my confirmed Native American and African DNA? Well, these percentages are rather small (around or below 5% each) and I’m sure they’re contained within the “Uncertain” category.
In any event, I’m not discouraged by my results, and I fully expect my results to change over time.
Lastly, as Ancestry.com has warned, don’t forget that your results are subject to change with revisions of their algorithms and new discoveries. And if Ancestry.com is dedicated to the best and latest results, your results almost certainly will and should change.
What are your percentages? Do they match your expected percentages? If you were unhappy with your AncestryDNA Genetic Ethnicity Prediction, does any of the above change your view?