I’ve recently noted a trend among genealogists to discount unexpected (or unwanted?) DNA test results in order to make the results fit an existing hypothesis, instead of properly re-evaluating the hypothesis in light of the new DNA evidence. (This is NOT made in reference to any specific person, post, or question; it is rather something I’ve been mulling over for some time).
Let’s take third cousins as an example. According to Family Tree DNA’s FAQ, you will share detectable DNA with approximately 90% of your third cousins under FTDNA’s threshold. According to AncestryDNA’s help page (see “Should other family members get tested?”), you will share detectable DNA with 98% of your third cousins under AncestryDNA’s threshold. In other words, if you have 100 third cousins and they all get tested (how’d you do that?), you will share DNA with 98 of them.
So what does this mean? Very simply, it means that if predicted third cousins do NOT share DNA, then there is a very strong presumption that they are NOT third cousins. It is easy to believe that the third cousins could be in the 2-10% of third cousins that don’t share DNA, but this is statistically unlikely; it is far, far more likely that they are not actually third cousins.
The presumption is rebuttable, of course, but it would take a great deal of very strong evidence to convince me that the third cousins are in the 2-10%. And I’m not sure I could be convinced without some type of DNA evidence.
So while there aren’t many absolutes in genetic genealogy, I do believe that there are a few:
- Second cousins and closer (nuclear family, grandparents, aunt/uncles, first cousins, great-grandparents, second cousins, etc.) – if you don’t share DNA with them, then you aren’t second cousins or closer. The amount of DNA you share will shed light on the relationship, but there is no relationship at this level if you don’t share DNA detectable by a testing company. Period.
- Third cousins – Since 90 to 98% of third cousins share detectable DNA (see above), predicted third cousins are almost certainly NOT third cousins if they don’t share DNA above the company threshold. A lot of very good evidence will be needed to rebut this presumption (and may not be possible without some type of DNA evidence).
- Fourth cousins – Since 50% (FTDNA) to 71% (AncestryDNA) of fourth cousins share detectable DNA, predicted fourth cousins are probably not fourth cousins if they don’t share DNA. This presumption is much easier to rebut. However, discerning genealogists should always begin with the assumption that if predicted fourth cousins don’t share DNA, it is statistically more likely than not that they are not fourth cousins.
- Fifth cousins and further – Since only about 10% to 32% (or fewer) of relationships at the fifth cousin level and beyond share detectable levels of DNA, it is entirely likely that actual cousins at this level will not share DNA. Lack of sharing at the fifth cousin level or beyond is unlikely to support or refute a hypothesis (with a tiny handful of very rare exceptions).
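The match rates quoted in the list above can be gathered into a small lookup, sketched here in Python. This is only an illustration: the dictionary name, the function name, and the level labels are assumptions made for this example, and the figures are simply the ranges quoted in the post (FTDNA at the low end and AncestryDNA at the high end for third and fourth cousins), not independently verified values. The post does not say which source supplies each end of the fifth-cousin range.

```python
# Approximate share of true cousins who show detectable DNA, as (low, high).
# Figures are the ranges quoted in the post; the names here are illustrative.
MATCH_RATE = {
    "second cousin or closer": (1.00, 1.00),
    "third cousin": (0.90, 0.98),
    "fourth cousin": (0.50, 0.71),
    "fifth cousin or further": (0.10, 0.32),
}


def miss_rate_range(level):
    """Return (low, high) share of true cousins at this level with no detectable match."""
    low, high = MATCH_RATE[level]
    return (round(1 - high, 2), round(1 - low, 2))


for level in MATCH_RATE:
    lo, hi = miss_rate_range(level)
    print(f"{level}: {lo:.0%} to {hi:.0%} of true cousins expected not to match")
```

Read this way, a non-match is rare for true third cousins, close to a coin flip for fourth cousins, and the expected outcome for fifth cousins and beyond.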
Of course, there’s a lot missing here (because they aren’t absolutes). For example, deciphering between a half-sibling and a full-sibling relies on the amount and/or pattern of shared DNA. And how does a third cousin once removed fit into this? Or a second cousin twice removed? And so on.
What do you think? Are these fundamental principles that we can use for genealogical research? How would you change them? Without getting into the amount of shared DNA, are there any that you would add?
I like it. I had never known the percentages before asking if atDNA tests could detect the Catholic consanguinity marriage impediment (3rd cousins or closer). These are very useful to know.
A known paper trail 3C1r has a small shared segment (7cM) and a 4th cousin about the same. Yet a family of 5 half siblings with whom I can find no paper link are sharing 25cM with me and my siblings. Is it possible to set such hard and fast rules yet?
So you are saying those of us who think we are making dna matches in Colonial times, 1600s, are not making dna matches, we are only making paper matches?
No, I’m not sure how that message might have been conveyed from the post. If you share DNA with someone at the fifth cousin level or beyond, none of the points above apply. I mean, none of the bullet points above address in any way the scenario where you DO share DNA with a fifth cousin level or beyond (which would be the Colonial scenario you describe).
Often families in Colonial times who stayed in a certain physical location would marry among a group of families, and those families would marry back into their original family, so then again at some point descendants would marry into the similar families and so on. So the dna basically stays very similar, and you can easily match people way back who come from these original families. When people spread out to other parts of the country or married distinctly different dna people, then the breakdown occurs without reintroducing the same dna back into the family line. Mutations occur and dna drops off or is replaced or doubled or whatever. Moves to other countries of non similar dna would probably affect this even more. Saying all that to say, on paper they may be missing the last name of a for sure relative or you may be missing that line totally, so regardless of dna matches sometimes they can match and you can’t figure it out; other times it’s obvious because the original families kept marrying each other back and forth down multiple generations. I think how long you live in the same gene pool makes a big difference. Hope that makes sense…
The solution to the problem of the 3rd or 4th cousin from the paper trail who doesn’t show as a DNA match is to test more people. A 3rd cousin of my mother recently tested. Four of his supposed 3rd cousins had previously tested and matched each other appropriately (my mother, her sister, her first cousin, her second cousin). The new 3rd cousin matched 3 of the 4. I have been building a group of test subjects at each level before going to a more distant level to confirm relationships.
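To put a rough number on a result like "matched 3 of the 4", a binomial calculation is one option. The sketch below assumes each comparison is an independent trial with the 90% match rate quoted in the post for FTDNA third cousins. Real comparisons among close relatives are not independent, so treat the output as a ballpark only. The function name is made up for this example.

```python
from math import comb


def prob_matches(k, n, p):
    """Binomial probability of exactly k matches in n independent comparisons."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p = 0.90  # assumed per-comparison match rate for true third cousins
exactly_3 = prob_matches(3, 4, p)
at_least_3 = exactly_3 + prob_matches(4, 4, p)
print(f"exactly 3 of 4: {exactly_3:.3f}")
print(f"at least 3 of 4: {at_least_3:.3f}")
```

Under these assumptions a true third cousin matches exactly three of four about 29% of the time and at least three of four about 95% of the time, so one missing match in a group of four is not surprising.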
Great way to explain this and really good explanation for understanding how to make the best use of our matches. Thanks
I like Blaine’s analysis (I usually do). Don’t necessarily accept non-Matches with 3rd and 4th cousins – dig deeper. Create your own clusters to see if the non-Match doesn’t match other cousins.
When a potential/probable/paper trail 3rd or 4th cousin does not show up as a Match at a company, the raw data should be uploaded to GEDmatch and compared with yourself and others in the one-to-one utility with the threshold set to 300 SNPs and 3cMs. Look for triangulation with other known cousins to verify one of the small segments.
On a technical note, I believe we share some DNA with almost all of our ancestors back 8 generations (7th cousin range) per calculations by Graham Coop – about 215 of our 256 ancestors at that level will share some DNA. However, at that level the average (repeat: average) would be about 0.1cM, or perhaps 100,000bp. Clearly this is not detectable with any of the companies today. However, with “sticky segments”, there is a lot of range, including segments well above average: and we do see many of our 6th to 8th cousins as Matches. In fact most of our Matches are in this range. Think about it… if most of our Matches were 5th cousins or below, we would be confirming a lot more Common Ancestors than we are.
If we took your scenario one step further, we assumed there were 100 3rd cousins all tested (I don’t know how I did that). For every other “known” 3rd cousin that lacked common DNA with the person in question, the odds of them being a true 3rd cousin would become even less likely. If it turned out he/she had common DNA with several of the cousins, the more likely he/she was truly in the 2% – 8% group.
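The intuition here can be sketched numerically. Under the simplifying assumption that each comparison is independent, the chance that a true third cousin fails to match every one of k tested third cousins is the miss rate raised to the power k. The 10% miss rate is the FTDNA figure quoted in the post, and the function name is illustrative.

```python
def prob_true_cousin_misses_all(miss_rate, k):
    """Chance a true cousin shows no match with any of k cousins, assuming independence."""
    return miss_rate**k


for k in (1, 2, 3):
    print(f"no match with {k} cousin(s): {prob_true_cousin_misses_all(0.10, k):.4f}")
```

Each additional non-match makes the "true cousin, just unlucky" explanation ten times less plausible under this assumption, which is why testing more cousins is so informative.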
Thank you very much for this article!
18 months ago I began corresponding with a gentleman via Ancestry.com. On paper he showed us to be 3rd cousins, 1 x removed. I was skeptical due to inconsistencies in one of his direct ancestor’s names as listed in historic documents. The fact that we did not show as matching via AncestryDNA convinced me that we were not in fact cousins.
He challenged me to take a Y-DNA test. While my initial 12 marker Y-STR test results were identical to his, this only provided for the possibility that we might have had a common male ancestor in 30 generations. After additional DNA testing on my part (111 Y-STR markers) it is all but certain that we are in fact 3rd cousins. Subsequently my paternal uncle took both the AncestryDNA test and Y-STR tests. My uncle does match this gentleman, both on the autosomal test and the Y-DNA test. We subsequently uncovered additional historic documents that listed his ancestor’s name correctly. I have no idea why two official records (US Census and state probate records) with only a year between them would list a child by completely different first names, but they do.
At the 3rd cousin level I would suggest that a lack of a match with a single individual does not meet that threshold of ‘proof.’
Sorry, Blaine, I can’t share your certainty for third cousin and more distant relations. I have two examples in my own family (including my sister) where both paper genealogy and DNA from multiple tests show that third cousins don’t always match, and I have not had anywhere near 100 tests done. Also, based on interactions with various cousins involved in one of the 3rd cousin relationships, I can say that if these kinds of absolute statements are presented as agreed upon and “true”, there are going to be a lot of people claiming the non-existent non-paternal events “must” have happened. I think it is much better to stick to the 10-50-90% figures.
I had to chuckle at your premise because my very first experience testing a 3rd cousin was an exception to your rule. In trying to resolve time and place of an NPE, I immediately asked my nearest living (and elderly) 3rd cousin on that side of the family to test. She did not match at the FTDNA threshold and that opened up any number of other possibilities regarding the NPE. I eventually located another 3rd cousin to test – the 2nd cousin of the first (they shared a great-grandmother). This one clearly matched the first 3rd cousin and clearly matched me. Only when I uploaded the data to GEDmatch was I able to see that I match the first of these cousins at above 3 cM in 14 places, the largest being 7 cM, 5.3 cM, and 4.4 cM (total of 53.7cM). True, by the narrowest of margins we didn’t make the cut at the accepted threshold, but if FTDNA had provided some additional information they would have saved me a lot of angst, effort, and time.
BTW, on a note related to part of your post, you might want to take a look at FTDNA forum (advanced autosomal section) and enjoy the thread by Georgian1950 titled “are the majority of Americans related.” (He also has posted on the Gedmatch forum.) At the very least you’ll shake your head in despair as you read his hypothesis, and it might even provide you with a new blog topic.
I think it’s important to make a distinction here re: 3rd cousins match 9 out of 10 times. That means several things:
1. 1 of 10 doesn’t meet the matching threshold with you.
2. That one 3C almost certainly shares some DNA with you below threshold (you can check at GEDmatch)
3. So in a group of ten 3C – on average – the one that you don’t match will match the other 9 (thus confirming they are cousins from the CA)
4. Each of the ten – on average – will only have a shared segment(s) with 9 of the others.
5. It pays to share and track all of the cousins of a given ancestor who test (like Ancestry Circles)
6. Extrapolate this to 4C, 5C, etc. About 8C or so, you’ll finally – on average – find a cousin who really doesn’t share any DNA with you.
You are certainly correct, Jim, but implicit in your statement is that we are living in a near-perfect world where a.) we have ten 3rd cousins in one ancestral line, b.) they are still alive, and c.) they have tested or are willing to test. As it happens in the situation to which I referred, the two I mention are the only ones still around and they are well into their dotage (and I’m not far behind.) And you use the phrase “all the cousins of those who test.” The simple fact is that testing is still relatively new and “those who test” a relatively small group – even at several million – which, arguably, is concentrated in the US and, perhaps, the UK. I posit, without any negative intent, that many of you who are intricately involved in genetic genealogy are living in your own rarefied world at this moment.
Please note that I have found matches and have paper records supporting connections at the 4th through 6th cousin levels but I am more stumped by the many 2nd-3rd cousin matches for which no support can be found. Unless I have many more cousins in unexpected parts of the US, I suspect that there are many more IBS matches above that 7cM threshold than is currently hypothesized. But it’ll all come out in the wash eventually!
I believe there are IBS shared segments in the 7-10cM range (and below), but I don’t believe there are any (well almost none) which triangulate which are IBS.
My point had nothing to do with testing ten 3rd cousins… It’s about the fact that virtually all 3rd to 6th cousins will be genetic cousins, albeit some will be below threshold at the 3 testing companies; and that the shared segments can often be found at GEDmatch.
I have found a number of Common Ancestors born in the UK with Matches in TGs. In fact, I believe some of our TGs have “sticky segments” which are probably 10-12 generations, or more, back. As such, these segments may well have UK CAs.
I agree that all of this will come out in the wash.
Some of the conclusions seem a bit off-base. The detectable DNA you share with 98% of your third cousins under AncestryDNA’s threshold does indeed mean that 98/100 of your third cousins should appear as a match. It says nothing about the false positive rate. That is, it could capture 98% of your cousins, miss 2% of your real cousins and falsely capture 22% of people who are not your cousins. (The 22% is just an example). As far as I know they don’t publish their false positive rates, just what is known as the sensitivity of the test (the 98% in the example.) The false positives could be due to endogamy or sheer chance, just to list a couple of reasons.
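The distinction between sensitivity and false positive rate can be made concrete with a small confusion-matrix calculation. The counts below are hypothetical, chosen only to reproduce the 98% and 22% figures used as an example above; they are not published company data, and the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Share of real cousins that the test reports as matches."""
    return true_pos / (true_pos + false_neg)


def false_positive_rate(false_pos, true_neg):
    """Share of non-cousins that the test wrongly reports as matches."""
    return false_pos / (false_pos + true_neg)


# Hypothetical counts: 100 real third cousins and 100 unrelated people.
print(sensitivity(true_pos=98, false_neg=2))
print(false_positive_rate(false_pos=22, true_neg=78))
```

The two rates are computed from different groups of people, so knowing one tells you nothing about the other.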
Andrew – I’m not sure I understand how a false positive rate would change the proposed (rebuttable) assumption that you aren’t third cousins if you don’t share DNA.
There probably is a false positive rate, although due to the amount of DNA sharing at the third cousin level, the error is not in identifying you as a match, but in the prediction of the relationship.
It looks like I misread some things the first time. My apologies. Interesting post!
The statement that if you don’t match your predicted 3rd cousins, you are almost certainly not 3rd cousins is misleading. This post implies that 90-98% of the time this is the case. But what you haven’t explained in this blog post is that while not matching them is an unexpected event, you are comparing the probabilities to another unexpected event, an NPE. And you haven’t gone into the actual probabilities of that being the case, which was necessary for you to make the statement that the NPE is more likely. Either by NPE or by phasing, not matching a 3rd cousin has resulted in an unexpected event which has already 100% occurred. Your fallacy was in comparing its likelihood to the expected event, not to another unlikely event.
Sorry if I didn’t get the phasing definition right. Put another way, if I have 100 3rd cousins and get them all tested, I’ll expect to have 2-10 third cousins who don’t match at the thresholds put forth by the testing companies. The important number to compare that to is the number of third cousins I could normally expect would not be related due to an NPE. If you want to say it’s more than 2-10 that’s fine but you have to show that. You can just ignore the 3rd cousins who match, and only consider the ratio of genealogical non-testing 3rd cousins to NPE 3rd cousins. But you have to look at the probabilities of what the likelihood of NPEs is. For instance if it is a maternal line 3rd cousin, not all that probable.
Kimberly – the probability of an NPE is irrelevant, because we have the overall probability that any set of third cousins will match: 90-98% of the time. The probability of an NPE, or simply a lack of shared DNA, is therefore 2-10%.
No, the probability of an NPE is not irrelevant. The 2-10% does not represent the number of expected NPEs in a group of 100 paper trail 3rd cousins. It is calculated to represent the number of genealogical 3rd cousins who no longer share any segments of the same DNA from their common ancestors due to the random processes of inheritance. In other words, when you and a paper trail 3rd cousin test and you come up not a match, there are then two possibilities. The first is that you and your 3rd cousin did not inherit the same segments of DNA, large enough to be detected, from your common ancestors. This happens 2-10% of the time based on the testing algorithm. The second possibility is that you are not genealogical 3rd cousins due to an NPE or other mistake in the tree. In order to consider which is more likely, you have to establish the second number, which is how many times out of 100 your paper trail 3rd cousins are not related to you due to an NPE. Say on average 10 of your 100 3rd cousins are not related due to NPE. Then using the 90% algorithm, you are equally likely to be looking at an NPE or a case of the test not picking up shared DNA between two genealogical relatives. It is a common but total fallacy to look at the 90% number once you have established that you are in the 2-10%; you have to look at the probability of the other possibility.
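This comparison is an application of Bayes' rule, and it can be sketched as below. The sketch uses the example figures from the comment (10 of 100 paper-trail third cousins unrelated, a 90% detection rate) and adds one assumption not stated there: that a paper cousin who is not actually related never shows as a match. All names are illustrative.

```python
def prob_true_cousin_given_no_match(prior_true, detect_rate, unrelated_match_rate=0.0):
    """Posterior probability that a non-matching paper cousin is a real cousin."""
    true_and_miss = prior_true * (1 - detect_rate)
    false_and_miss = (1 - prior_true) * (1 - unrelated_match_rate)
    return true_and_miss / (true_and_miss + false_and_miss)


# 90 of 100 paper cousins are real; the test detects 90% of real cousins.
posterior = prob_true_cousin_given_no_match(prior_true=0.90, detect_rate=0.90)
print(f"P(real cousin | no match) = {posterior:.2f}")
```

With these inputs the result is 9/19, or about 0.47: of 100 paper cousins, 9 real ones go undetected and 10 unrelated ones do not match, so the two explanations come out close to even. A lower error rate in the tree pushes the answer toward "real cousin, not detected"; a higher one pushes it toward "not a cousin".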
See my comment below about the rebuttable presumption.
I see your point and agree with much of it, but in terms of the presumption that 3rd cousins that fail to share DNA are probably not 3rd cousins, the rate of NPE would only come into play if you wanted to know approximately what % of the 2-10% are NPE, and what % of the 2-10% are below the company threshold, and what % of the 2-10% just don’t share DNA even though they’re third cousins. For the presumption, it doesn’t matter which of these categories you fall into. Once you’ve fallen into the 2-10%, my theory states, then you should assume that you are not third cousins unless you confirm it with other data (either DNA and/or paper).
I would like to know the various percentages just for the sake of knowing them, but that wouldn’t help me rebut my presumption in any way. In other words, if I have a set of third cousins that don’t share DNA, then probabilities of NPE versus ‘below the threshold’ versus ‘lack of sharing by third cousins’ won’t help me identify the DNA and/or paper records I need to prove or disprove the third cousin relationship.
Sounds like you are interpreting the stated facts differently. Blaine is assuming the 2-10 percent is for all purported cousins. Personally, with the statements given by these companies, I would disagree, and would assume that they mean out of all proven actual 3rd cousins.
Having said that, however, I also disagree with Kim’s assertion that a non-match is only due to 2 events: either a true cousin who doesn’t share enough dna, or an NPE. There is a third option: your paper trail is simply wrong. It happens. Therefore, it is not enough to “consider the ratio of genealogical non-[matching] 3rd cousins to NPE 3rd cousins.” It is nearly impossible to pin down the likelihood of the paper trail (or assumption) being wrong or inadequate, so there can be no ratio assigned to Kim’s scenario.
I agree with Blaine that NPE is not part of this discussion. With an IBD shared segment you are a biological cousin with that Match (whether it’s an NPE or not). A false positive is an IBC (or IBS) “shared segment”, made up by a computer algorithm. This means the “segment” you or your Match, or both, “share” is not from one ancestor – it’s not IBD. And so, at least at this area of your chromosome, you are not genetically related.
It is relevant, because what Blaine is saying here is that most likely people who don’t match, are not genealogical 3rd cousins. It is entirely possible to be genealogical 3rd cousins but not share any large segments of DNA that can be picked up by the testing algorithms. This happens 2-10% of the time at the 3rd cousin level. They are not making any statements about the likelihood of it being an NPE or other genealogical error when they calculate that 2-10% figure, otherwise the 2nd cousins wouldn’t be so close to 100% probability.
Just to clarify, and I think it will help the discussion, but technically I said that the assumption should be that they are most likely not genealogical 3rd cousins. It is a rebuttable presumption, however, for exactly the reasons you state: they may be the extremely rare third cousins that don’t share detectable DNA, or the amount of DNA is below the company threshold. Personally, I think that represents the vast majority of the 2-10%, with NPEs being a very small contributing factor.
I think if I said something like 3rd cousins that don’t share CANNOT be third cousins, then the rate of NPE would be important.
Ok, this is going somewhere. The statement that any two random people who test and don’t match are probably not 3rd cousins holds for all kinds of reasons. The statement that two people who believe they have evidence to prove that they are 3rd cousins, are probably not 3rd cousins if they test and don’t match has to be matched by the probability that their evidence is wrong, vs. the probability that the test was unable to pick up the genealogical relationship. You seem to be going back on yourself a bit here by stating that in the 2-10% the vast majority just aren’t picked up by the test and NPEs are a very small factor. I would tend to agree that is likely but I don’t have the statistics at hand. My point is that NPEs are not included at all in the 2-10%. The comparison here is between the likelihood that the evidence of genealogical relationship is wrong, vs the 2-10% non detection rate. Not the fact that it was unlikely in the first place to have it not test positive. Neat discussion thanks.
I agree, again, with Blaine. Don’t mix apples and oranges. Virtually all 3rd, 4th and 5th true cousins (whether or not they are NPE) are genetic cousins – they share some DNA from the CA. That’s apples. About 10% of them won’t share enough DNA to be called a match at FTDNA. That’s oranges. Ancestry claims that only 2% 3C won’t match there – because they set the threshold so low (which threshold also includes many false Matches).
My issue with the FTDNA threshold, 7cM I believe, is that it creates false negatives that you never know exist and cannot be identified unless both sets of data are at Gedmatch. If FTDNA provided the means to look at matches at the margins by varying the threshold, it would be a significant benefit. The FTDNA chromosome browser will let one examine small segments between matches but only among those with at least one segment greater than 7cM. So which is worse – having too many matches to cull or unnecessarily thinking you are not related to a person for whom you have paper documentation and thereby have an NPE in the last three or four generations? I’m no fan of Ancestry and I know they don’t provide info on the segment sizes to help trim the population, but at least you get to see them.
Check your facts before you declare them; it’s certainly not true that all 4th cousins share any of the same DNA from their common ancestor. It is estimated at 50%. With 5th cousins the number sharing DNA from their common ancestors drops off even further to 5%.
Check out http://www.theroot.com/articles/history/2014/11/cousin_relatedness_how_much_dna_do_they_actually_share.html
Kim,
To whom was that blast directed? I don’t see anything preceding it to which it might apply.
And does it really drop from 50% at the 4th (!) cousin level that you specify to 5% at the 5th cousin level. Wow, that’s quite a drop-off. Or perhaps you inadvertently misstated your facts . . . 😉
Sorry, pushed the reply button to the post above yours, and I guess it lists below your reply to the same post. Check the article for the evidence: it is consistent both with the stochastic nature of inheritance and the statistical probabilities.
Kim, I think your post “nailed” it. I’m still early in learning genetics and their meanings but I have a lot of experience in statistics. The threshold values must come from a stat analysis. What I wonder about is the contributions of the non-family line parent. They certainly must add combinations of dominant and recessive genes from possibly far back in their family line. My two paternal aunts have children that it is easy to see different physical traits from both parents. It is not hard to see that that sort of diversity could be magnified at the level of third, fourth and so on cousins. My understanding is that we humans are about 1% genetically different than some of the ape family so what real percentage are we dealing with? Just my humble thoughts as I know I have much to learn yet. Thanks for all the posts here. jt
Jim,
NPE aside (I agree it has no relevance here), you have just introduced a new thought into the discussion – that an IBS match can be (always is?) no more than a construct of the computer algorithm used for the analysis. Earlier you also agreed that an IBS match could be in the 7-10 cM range. Along these lines, is it not possible that a false positive or IBS match could exceed 10cM simply because the algorithm is not sufficiently sophisticated to avoid them on occasion? Something like the old saw of a bunch of monkeys playing with a typewriter might eventually duplicate a Shakespearian play, only more likely even if rare.
Yes, I checked the article and I see where you found your numbers. However I would be a bit careful calling that statement a fact – it’s only an opinion or an estimate – based presumably on reasonable deduction but still an estimate even if it was the august CeCe who said it. I’d also interpret the article by CeCe Moore and Dr. Gates a bit differently than you – I believe it says that 4th cousins will “match” 50% of the time, not that they will share their common ancestor’s dna 50% of the time. It also says that “when you get out BEYOND the 5th cousin level” the odds decrease to 5% – it doesn’t state what BEYOND means but clearly it’s not AT the 5th cousin level. Finally, a “match” depends entirely on the threshold used to determine the match – as was stated earlier, FTDNA has a 7cM threshold which is the number I believe CeCe usually uses, while Ancestry uses a much lower number. I match one of my two 3rd cousins in one line at a level lower than 7cM and FTDNA didn’t consider us a match. Only by testing my only other living 3rd cousin on that line was the relationship “proved.”
I don’t think it’s an opinion… There are robust statistical ways to calculate these likelihoods based on available data sets. You can look at the likelihood that any particular segment will be inherited by the next generation intact, not at all, or cut down, and extrapolate that across the generations. I have come across those figures in other places as well. By the time you get down to likely sharing only one segment with a given relative, it is more likely than not that it will either get passed down intact or not at all, than having it further cut down in size.
Kim – please check your facts, and wording, and re-read my post.
There are a very few, percentage wise, in the 10-15cM range. No one has reported, yet, any shared segment over 15cM which is IBS.
Just reconnecting my subscription to comments as I accidentally pressed the unsubscribe button…oops…
Kim – I hope you have re-read my statement and CeCe’s blog. CeCe was explaining the percentage for a match at FTDNA. This percent varies by company and depends on the matching threshold they use. I stand by my statement that virtually all 3rd (and 4th and 5th) cousins share DNA. Most of them can be found using a low threshold at GEDmatch. At FTDNA the threshold appears to be 7.7cM. Do you really believe that true 3rd cousins who don’t match at FTDNA have 0cM from a CA? Many 3rd cousins that don’t match at FTDNA do match at 23andMe or AncestryDNA; and most will match at GEDmatch. So in our statements it’s important not to mix apples and oranges.
My question should have read “… 0cM shared DNA…”
I believe the difficulty once you get into the range around 7cM becomes separating the wheat from the chaff. The conventional wisdom is that if one has a match well above 7cM the two almost certainly share a common ancestor but if below 7cM the likelihood is much lower so the inverse is not true – one cannot say “less than 7cM implies almost certainly no shared CA,” only that a conclusion would be tenuous without additional (documentary) evidence. You or Blaine said earlier that Ancestry uses a low threshold of about 2cM. Do you know what the 23andMe threshold is? It would be nice if all the labs were transparent and provided their parameters, and if those parameters were adjustable as they are on Gedmatch.
I have found these comments to be invaluable, and my green notebook is bulging. Thank you.
Someone mentioned the lowest threshold at AncestryDna is 2 cMs.
There are 5 Confidence levels there: Extremely High, Very High, High, Good and Moderate. I have noted in my notebook that their Moderate Confidence is 6cMs, and Moderate is their lowest level. Do not remember the source. But, if indeed, their lowest level is 2cMs, that is egregious, and shame on them!
The companies do not post their criteria for an atDNA match, but several of us have noted the following from our experience:
FTDNA: 7.7cM plus 20cM total plus 5 shared segments.
23andMe: 7cM; 5cM in some cases at CoA. As you get more Matches, the threshold inches up.
AncestryDNA: it used to be 5Mbp (which sometimes allowed segments down to 2cM). They now use 5cM plus they “phase” your data with a likely reference population (which clears out some IBC segments, and drops out some good ones too). Population phasing may be helpful, but it’s not the same as (nor as accurate as) phasing with parents.
IMO, Triangulation weeds out IBC segments, and shared segments over 7cM in a TG are virtually always IBD.
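As a rough sketch of how thresholds of this kind turn shared segments into a reported match or a non-match, consider the Python below. The 7.7cM longest-segment and 20cM total figures are the commenters' observations listed above, not published company criteria. The "5 shared segments" condition is left out because its exact meaning is not given, and the function name is invented for this example. The sample values (longest segment 7 cM, total 53.7 cM) come from the third-cousin case described earlier in the thread.

```python
def is_reported_match(longest_cm, total_cm, min_longest_cm, min_total_cm=0.0):
    """True if the longest segment and the total both reach the thresholds."""
    return longest_cm >= min_longest_cm and total_cm >= min_total_cm


# Third-cousin example from the thread: longest segment 7 cM, total 53.7 cM.
# Assumed FTDNA-like setting (7.7 cM longest, 20 cM total):
print(is_reported_match(7.0, 53.7, min_longest_cm=7.7, min_total_cm=20.0))
# Low GEDmatch-style setting (3 cM longest, no total requirement):
print(is_reported_match(7.0, 53.7, min_longest_cm=3.0))
```

The same pair of people is a non-match under the first setting and a match under the second, which is the apples-and-oranges problem raised throughout this thread.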
Great to have this information, Jim. Thank you!
I do believe that it is likely that there are only 50% of 4th cousins who share the same DNA from their common ancestor, and that this is what is being referred to in these articles. And that when you do share DNA with your 4th cousin using lower thresholds, it is not because it came from your common ancestor; it is because it is occurring by chance or detected artificially due to a happenstance same sequence of SNPs.
I can give you an example from my own precise experience. I have a 4th cousin on 23andme who is verified by paper trail. He shows 0cM relationship with me. But: He shows an expected 0.9% shared over 4 segments with my paternal great uncle, who is related through the same line. I am confirmed related to this paternal great uncle by the expected amount. When I look at my DNA, the segments that my 4th cousin shares with my great uncle were all contributed to me by my maternal grandmother. Start playing around with the family inheritance advanced tool in 23andme and you will see how easy it is for a segment of DNA to get entirely shuffled out of DNA for the next generation. Jim you are making the mistaken and common assumption that if your parents share a certain amount, the kids will share half, when once you get down to a certain segment size it is more likely that they will either have all or none of the segment in question. I have advanced statistics of permutations and combinations under my belt from my graduate work, and when you look at that with the rather large nature of the segments created from DNA recombination, you don’t have a perfect world where everyone gets exactly half from the preceding generation.
*Sorry I made a misleading statement in the story. My great uncle is not my paternal great uncle, he is my maternal grandfather’s brother. Somehow that came out wrong in my head, but would be very key to correctly interpreting my DNA results*
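The "all or none" point can be put in rough numbers. The sketch below is only an illustration, not anything a testing company publishes: it assumes crossovers fall along a segment as a Poisson process with no interference (Haldane's model), so a segment of length d Morgans passes through one meiosis uncut with probability e^(-d), and an uncut segment is then either passed on whole or not at all with equal chance. The function name `transmission_odds` is made up for this example.

```python
import math

def transmission_odds(segment_cm):
    """Return (all, none, partial) probabilities for one parent-to-child step.

    Assumption: crossovers are Poisson with no interference, so a segment
    of d Morgans sees no crossover with probability exp(-d).
    """
    d = segment_cm / 100.0        # convert centimorgans to Morgans
    uncut = math.exp(-d)          # chance no crossover lands inside the segment
    whole = 0.5 * uncut           # uncut, and the child got this copy
    none = 0.5 * uncut            # uncut, and the child got the other copy
    partial = 1.0 - uncut         # at least one crossover inside the segment
    return whole, none, partial

for cm in (5, 10, 20, 50):
    whole, none, partial = transmission_odds(cm)
    print(f"{cm:>3} cM: all {whole:.1%}, none {none:.1%}, partial {partial:.1%}")
```

Under these assumptions a 10 cM segment is passed on whole about 45% of the time, lost entirely about 45% of the time, and split only about 10% of the time, which is why small segments tend to vanish outright rather than halve with each generation.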
An easier way to picture it is this: Take a typical map of shared segments between a grandparent and grandchild. On average, half of those segments came from your great grandmother and half from your great grandfather, with some segments being a combination of both. Then you can do the same again, refining the quarter of your DNA that you got from your grandparent down to the contribution from each of your great great grandparents. Then do it again, and you have the segments refined down to the contribution of the 8 ggg grandparents that you are related to through that grandparent. You then have a map of the segments you share with one particular set of ggg grandparents, or about 3% of your DNA filled in from them. On average, this comes in fairly large segments of DNA. Yes, there can be a few teeny segments from them, but that is not relevant to the question of IBD segments.
If you have a 4th cousin who also got 3% of their DNA from the same couple, you can imagine taking a transparency of your shared segments and overlaying it on a transparency of theirs. You are much more likely to overlap with that 4th cousin for an IBD stretch in the larger segments that you have from those ancestors. It does not mean that you can’t overlap by chance in a smaller random patch; it is just less likely. What the 50% figure for 4th cousins shows is this: if you are only marking 3% of your genome transparency, and your cousin is marking 3% of theirs, and that 3% comes from a limited number of relatively large segments, then when you lay one on top of the other you are only 50% likely to have an overlap with that cousin. Even though you both got 3% of your DNA from the same couple, there is only a 50% chance that any of it will be the same 3%.
See this excellent post for an example. Slightly different parameters in the math to arrive at 77% instead of 50% chance of sharing DNA with 4th cousins, but correct in its principles nonetheless. http://ongenetics.blogspot.ca/2011/02/genetic-genealogy-and-single-segment.html
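The transparency picture can also be played with numerically. What follows is a toy Monte Carlo sketch, not a reconstruction of how the 50% or 77% figures were derived. Its assumptions are mine: the genome is one straight line of about 3400 cM, each cousin carries the same number of equal-length segments dropped at uniformly random positions, and any overlap of at least `min_overlap_cm` counts as a match. It ignores chromosome boundaries and the fact that overlapping segments must trace to the same ancestral copy to be IBD, so the numbers it prints are illustrative only.

```python
import random

def overlap_probability(n_segments, seg_len_cm, genome_cm=3400.0,
                        min_overlap_cm=7.0, trials=20000, seed=0):
    """Estimate the chance that two cousins' segment maps overlap.

    Toy model: each cousin gets n_segments segments of seg_len_cm,
    placed uniformly at random on a single line of genome_cm.
    """
    rng = random.Random(seed)
    last_start = genome_cm - seg_len_cm   # latest position a segment can begin
    hits = 0
    for _ in range(trials):
        mine = [rng.uniform(0, last_start) for _ in range(n_segments)]
        theirs = [rng.uniform(0, last_start) for _ in range(n_segments)]
        # Two equal-length segments starting at s and t share
        # seg_len_cm - |s - t| centimorgans (when that is positive).
        if any(seg_len_cm - abs(s - t) >= min_overlap_cm
               for s in mine for t in theirs):
            hits += 1
    return hits / trials

# Roughly 3% of the genome (about 100 cM), cut into few large or many small pieces.
for n, length in ((2, 50.0), (5, 20.0), (10, 10.0)):
    print(n, "segments of", length, "cM:", overlap_probability(n, length))
```

Changing the number and size of the segments while keeping the total near 3% shows how strongly the chance of a detectable overlap depends on how the shared DNA is distributed, which is the point of the transparency analogy.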
Interesting discussion!
I have a perfectly typical 3rd cousin match to a known 3rd cousin but no match at all (at default settings) with his brother. Their respective DNA matches prove beyond doubt that they are full brothers. So the younger brother must fall into that 2-10% group.
I am still learning so I was surprised when known 3rd cousins, once removed, brother and sister, tested and got different results. I share enough DNA with the brother to come up as a match, but not the sister. Both share DNA with our shared cousins, so there isn’t anything fishy going on with the sister’s parentage. These results are on both FTDNA and GEDMatch.
Hi,
I would have to disagree with your premise, because you are dealing with probabilities and not absolutes. Here’s why: Let’s say you have 1,000 living 3rd cousins walking around who have tested their DNA. You will match most of them. But there are 2 to 10 percent who will not match you. That means anywhere from 20 to 100 of those 3rd cousins will not match you. Since not every one of these 1,000 3rd cousins has tested, the ones not matching you are simply part of that 20 to 100 who don’t share DNA with you. I have a case of two documented 3rd cousins not matching each other. Many supporting records link them together, and they do share some mutual matches.
Also, not everyone has the same number of 3rd cousins. In my case, I descend from Roman Catholics in Louisiana who produced large families. The number of children was as many as 20 to 30 in certain cases. So the number of 3rd cousins I could have would be in the thousands. Having 3,000 3rd cousins wouldn’t be out of the question for me. So I wouldn’t match 60 to 300 of them, depending on which of them test.
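The arithmetic behind those ranges is easy to check. This is a minimal sketch that takes the 90% (FTDNA) and 98% (AncestryDNA) third-cousin match rates quoted in the post as given; the function name `expected_non_matches` is made up for the example.

```python
def expected_non_matches(n_tested, match_rate_low=0.90, match_rate_high=0.98):
    """Expected number of true 3rd cousins who show no detectable match.

    The default rates are the 90% and 98% figures quoted in the post.
    Returns (fewest, most), rounded to whole people.
    """
    fewest = n_tested * (1 - match_rate_high)   # best case: 98% match
    most = n_tested * (1 - match_rate_low)      # worst case: 90% match
    return round(fewest), round(most)

print(expected_non_matches(1000))   # (20, 100)
print(expected_non_matches(3000))   # (60, 300)
```

These are expected counts across the whole group of cousins. For any one named pair, the chance of being a true non-matching 3rd cousin is still only 2 to 10 percent.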