As you might recall, a few weeks ago I sent out a call for information about the amount of DNA shared by people with a known genealogical relationship. I was hoping to get a better picture of the range of DNA shared in each of these relationships (out to about third cousins). Although people like Tim Janzen have gathered this type of data and kindly made it available for everyone, I felt that more data was needed.
What is the range of cMs shared by third cousins? What does the distribution within that range look like? Does the longest segment factor into that at all? If so, how?
These are the types of questions I wanted to examine. And to entice submissions, I offered a free Family Finder kit to one lucky person who submitted data prior to April 1, 2015.
So, I sent out a call for information to be submitted through a portal. And boy did you respond!!
And the Winner is…
I randomized the list of 6,078 submissions in Excel, and then I used random.org to select a random number between 1 and 6,078. The number was 2,358, and the winner was email address [email protected]! Congratulations, and thank you for your submission!
And the REAL Winner is…
Everyone! There is a wealth of information in these 6,000 submissions. For example, here is a breakdown of the number of submissions for certain relationships (not all):
Third Cousins – A Preview
It’s going to take me some time to review all the data and make it available for publication here on the blog and on the ISOGG Wiki. However, in the meantime, I wanted to take a quick look at the 501 third cousin submissions (you’ll see that 1 had to be thrown out due to a data entry error) to get a feel for the issues I will encounter and the results we can expect to see.
So here are some statistics for the 500 usable third cousin submissions:
Both the median and the average show a large increase between ‘no endogamy reported’ and ‘endogamy reported.’ Third cousins are predicted to share 53.13 cM of DNA, and the median and average for the ‘no endogamy reported’ category are close to that value.
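For anyone curious where a prediction like 53.13 cM comes from, here is a minimal sketch of the usual arithmetic. It assumes a total of 6,800 cM for the comparison, which is not stated in this post but is consistent with the 53.13 figure, and it assumes full cousins who descend from a shared ancestral couple. The function names are only illustrative.

```python
from fractions import Fraction

# Assumption: total autosomal length used for the comparison, in cM.
# This value is not given in the post; it is chosen because it
# reproduces the 53.13 cM prediction quoted for third cousins.
TOTAL_CM = 6800


def expected_fraction(degree: int) -> Fraction:
    """Expected fraction of DNA shared by full n-th cousins.

    Each cousin is (degree + 1) generations below the shared ancestral
    couple, so 2 * (degree + 1) meioses separate the two cousins through
    each ancestor, and there are two shared ancestors.
    """
    return 2 * Fraction(1, 2) ** (2 * (degree + 1))


def expected_cm(degree: int, total_cm: int = TOTAL_CM) -> float:
    """Expected shared cM for full n-th cousins under the assumed total."""
    return float(expected_fraction(degree) * total_cm)


if __name__ == "__main__":
    for degree in (1, 2, 3):
        frac = expected_fraction(degree)
        print(f"cousin degree {degree}: {frac} of the total, "
              f"{expected_cm(degree):.3f} cM")
```

For third cousins this gives 1/128 of the total, or 53.125 cM under the assumed 6,800 cM. These are averages only; the spread of real values around them is exactly what the submitted data is meant to show.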
Here is the distribution of the total shared cM between the minimum of 0 and the maximum of 334.1:
And here’s the distribution of the longest cM segment between 0 and 127.3:
These are just rough numbers for fun, and are subject to change. For instance, I will probably recalculate the averages using only the submissions that reported shared DNA, meaning that the values of 0 cM won’t be included. And there will probably be more lessons along the way.
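To show how much leaving out the 0 cM values can move the numbers, here is a small sketch that computes the average and median both ways. The sample values are made up for illustration and are not taken from the submissions, and the `summarize` helper is a hypothetical name.

```python
from statistics import mean, median


def summarize(shared_cm):
    """Return (average, median) for all values and for non-zero values only."""
    sharing_only = [value for value in shared_cm if value > 0]
    return {
        "all": (mean(shared_cm), median(shared_cm)),
        # If nobody shared any DNA there is nothing to average.
        "sharing_only": (
            (mean(sharing_only), median(sharing_only)) if sharing_only else None
        ),
    }


if __name__ == "__main__":
    # Made-up total shared cM values, including two pairs with no match.
    sample = [0.0, 0.0, 20.0, 40.0, 60.0, 80.0]
    for label, stats in summarize(sample).items():
        print(label, stats)
```

With this made-up sample, dropping the two zeros raises the average from about 33.3 cM to 50 cM and the median from 30 cM to 50 cM, so the choice matters when comparing against a predicted value.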
Once again, a big thank you to everyone who contributed! Be sure to stay tuned for more updates!