Today, I saw an interesting table posted to Facebook, summarizing a genealogist’s family tree. It listed a handful of generations along with the number of possible ancestors in each generation, and the individual’s known ancestors for that generation.
Out of curiosity, I generated a similar table with my own data:
There are many interesting data points in the table. For instance, between the 7th and 8th generations, I drop from knowing 71% of all of my ancestors to knowing just 51% of my ancestors. At 10 generations, with 2046 total ancestors in all generations, I know only a quarter of them. And while I feel very confident about the first 6 or 7 generations, after that I’m much less confident in my family tree.
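For anyone who wants to build a similar table, the "possible ancestors" columns are simple arithmetic. Here is a minimal Python sketch, assuming parents are counted as generation 1 (the convention that produces the 2046 total at 10 generations). The function names are my own for illustration, and the sketch ignores pedigree collapse, where the same person fills more than one slot.

```python
def possible_ancestors(generation: int) -> int:
    """Ancestor slots in a single generation (parents = generation 1)."""
    return 2 ** generation


def cumulative_ancestors(generation: int) -> int:
    """Total ancestor slots from generation 1 through the given generation."""
    return 2 ** (generation + 1) - 2


def percent_known(known: int, generation: int) -> float:
    """Share of one generation's slots that have been identified."""
    return 100 * known / possible_ancestors(generation)


if __name__ == "__main__":
    # Print the two "possible" columns of the table for 10 generations.
    for g in range(1, 11):
        print(g, possible_ancestors(g), cumulative_ancestors(g))
```

To fill in the "known" column, count the identified ancestors in each generation of your own tree and pass that count to `percent_known`.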
So this is an interesting exercise, but is a table like this at all applicable to genetic genealogy?
The Importance of Knowing Your Family Tree
Having a chart like this for your own family tree, or at least some knowledge of the information in such a chart, is vital to deciding how much confidence to place in a conclusion made using DNA.
I recently stated the following in a thread in the ISOGG Facebook group:
“No atDNA paper or proof argument should EVER make a conclusion based on shared segments without at least a sentence or two about the lack of known overlap in other family lines. Or, alternatively, addressing known overlap. That is a fatal error.”
My point was that whenever we make a conclusion about a particular ancestor or ancestral couple based on segments of DNA shared with a relative, we absolutely must address whether we do, or could, share other ancestors with that relative.
For example, when I’m reviewing someone’s conclusion, I need to know whether they’ve at least considered the possibility that they could share DNA from another line instead of or in addition to the line on which they’ve focused. I need that information to evaluate their conclusions.
Now, maybe I’ll go one step further. Perhaps I’d also like to see how likely it is that they might share DNA via more than one recent line of their family tree; one vital factor in that probability determination is the extent to which they and their match have researched their family tree. And this is the topic I’d like to get other people’s thoughts on.
Of course there are many caveats, but ultimately they don’t really change my question. For example, we all know that family trees are notoriously error prone and can be subject to misattributed parentage. As a result, even the most complete family tree can be misleading. Or maybe your family lines are so diverse that you can be fairly certain there’s no unknown overlap affecting your conclusion. In either case, providing that information will help others evaluate your conclusions.
I know what some of you are thinking right now: “I’m not going to publish my conclusions anyway.” Or perhaps: “It doesn’t matter what others think of my conclusions.” Maybe you aren’t writing for a publication, and maybe you’re doing this for yourself, but ultimately your genealogical conclusions will be judged and evaluated. Whether by a relative we decide to share them with, or by a relative going through our files after we’re gone, all of our conclusions are ultimately evaluated.
Let’s use a quick example.
Joe, Julie, and John are all predicted to be about 5th cousins with each other (although I use 3 cousins, it could be 4 or 5, or even more). They all share a 22 cM segment of DNA on chromosome 3 which they’ve triangulated to their shared 4th great-grandparents, George and Susan (Gold) Silber. The triangulation of the segment of DNA is a definite plus, and it looks like a good conclusion.
But let’s factor in their family trees:
- Joe: an experienced genealogist, Joe knows 85% of his family tree out to the 6th generation
- Julie: an intermediate genealogist, Julie knows about 45% of her family tree out to the 6th generation
- John: a beginning genealogist, John knows about 10% of his family tree out to the 6th generation
How confident do you feel about their conclusion now? Julie is missing 55% of her family tree and John is missing 90% of his family tree. Doesn’t that increase the likelihood that their shared ancestor could be located within an unknown area of their family trees?
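To put rough numbers on those gaps, here is a small Python sketch. It rests on my own reading of the example: that "X% of the tree out to the 6th generation" means X% of all 126 ancestor slots in generations 1 through 6, with parents as generation 1. The function names are hypothetical, and the result says nothing about how likely a hidden shared ancestor actually is, only how many slots are unaccounted for.

```python
def cumulative_ancestors(generation: int) -> int:
    """Total ancestor slots from generation 1 (parents) through `generation`."""
    return 2 ** (generation + 1) - 2


def missing_slots(percent_known: int, generation: int = 6) -> int:
    """Approximate number of unidentified ancestor slots, rounded to a whole person."""
    total = cumulative_ancestors(generation)
    return round(total * (100 - percent_known) / 100)


# Percent of the tree known out to the 6th generation, taken from the example.
known = {"Joe": 85, "Julie": 45, "John": 10}

if __name__ == "__main__":
    for name, pct in known.items():
        print(f"{name}: about {missing_slots(pct)} of {cumulative_ancestors(6)} ancestors unknown")
```

Under that reading, Joe has roughly 19 unidentified ancestors in his first six generations, Julie about 69, and John about 113.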
Is this information we should have when we evaluate a genealogical hypothesis or conclusion using DNA?
Does this example suggest that beginning genealogists shouldn’t make conclusions using DNA? Of course not! Does this example suggest that we need to ignore any conclusion unless its author has completed 100% of their tree? Of course not!
Does this suggest that everyone should create and share a chart like the one I’ve provided when they want others to evaluate their genetic genealogy conclusion? Perhaps, yes. I can say that it only took about 30 minutes for me to create these graphs, so the burden is extremely low (especially when compared to the hundreds of hours spent on DNA!) and the return on investment is extremely high.
So what are your thoughts on the matter?