How Many Segments Do You Share?

I have told people in the past that we share a single segment of meaningful IBD DNA with the vast majority of our genetic matches (where IBD means Identity-by-Descent, or a valid matching segment of DNA from a recent genealogical relationship). I usually say that we share a single segment of DNA with 99% of our matches, but that has been an off-the-cuff estimate. I wanted better data to cite, so I took a closer look at this issue.

At Family Tree DNA (FTDNA), you can download a list of all of your matches.

I downloaded my list and removed all of my targeted test-takers (anyone that I tested or I asked to test). These close test-takers would skew the data.

After removing them from my match list, I have a total of 2,491 matches at Family Tree DNA.

Family Tree DNA also allows you to download a list of all the segments you share with your matches.

I downloaded my list of total shared segments, and removed all segments smaller than 7 cM (these segments are either too small to be valid or too old to be genealogically relevant). This is perhaps the most controversial aspect of this analysis, but see “Sharing Large Segments With a Match Does Not Validate Small Segments Shared With That Match.”

After I removed segments smaller than 7 cM, I shared a total of 2,554 segments of DNA with my 2,491 matches.

I share more than 1 segment of DNA with 63 matches, or 2.5% of my matches. Accordingly, I share a single segment of DNA with 97.5% of my 2,491 matches at Family Tree DNA.

Of those 63 matches, I share multiple segments with the following breakdown:

  • 2 segments = 55 matches
  • 3 segments = 6 matches
  • 4 segments = 2 matches

Thus, among my matches, it is extremely rare for a non-targeted test-taker to share more than a single segment of DNA with me (only 2.5% of my matches). It is even rarer to share more than two segments, with just 8 people (<1%) sharing 3 or more segments.
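For readers who would rather script this than work in a spreadsheet, the filtering and counting steps above can be sketched in Python. This is only an illustration under assumptions: the column names MATCHNAME and CENTIMORGANS and the sample rows are hypothetical, so check them against the header row of your own segment download.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the shape of a segment export; the column names
# MATCHNAME and CENTIMORGANS are assumptions about the file's header row.
SAMPLE_CSV = """MATCHNAME,CHROMOSOME,CENTIMORGANS
Alice,1,12.4
Alice,7,8.1
Alice,9,3.2
Bob,2,9.0
Carol,5,6.5
Carol,11,7.0
Dave,3,4.9
"""


def segments_per_match(rows, min_cm=7.0, name_col="MATCHNAME", cm_col="CENTIMORGANS"):
    """Count the segments of at least min_cm shared with each match."""
    counts = Counter()
    for row in rows:
        if float(row[cm_col]) >= min_cm:
            counts[row[name_col]] += 1
    return counts


def breakdown(counts):
    """Map a number of segments to the number of matches sharing that many."""
    return Counter(counts.values())


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
counts = segments_per_match(rows)
by_count = breakdown(counts)

# Matches sharing more than one qualifying segment, as a share of all
# matches that keep at least one qualifying segment.
multi = sum(n for segs, n in by_count.items() if segs > 1)
share_multi = multi / len(counts)
print(dict(by_count), f"{share_multi:.1%} share more than one segment")
```

Note that in this sketch a match with no segment at or above the threshold (Dave in the sample) drops out of the denominator, which may differ from counting every row in the downloaded match list.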

Of course these numbers will vary from person to person, and will vary significantly in endogamous populations. Interestingly, this might be a way to identify people that come from endogamous populations, but may not know it. It will be interesting to see this analysis from people that have endogamous backgrounds.

Using a 5 cM Threshold

Notably, when I only removed segments smaller than 5 cM, I had 212 matches (8.5%) that shared more than 1 segment of DNA, and they broke down like this:

  • 2 segments = 192 matches
  • 3 segments = 15 matches
  • 4 segments = 4 matches
  • 5 segments = 1 match

Thus, most of these additional matches had a second segment between 5 cM and 7 cM, which is a danger zone for DNA evidence.
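One way to see which matches only count as multi-segment because of a segment in the 5 cM to 7 cM range is to run the same count at both thresholds and compare. The sketch below uses made-up names and segment lengths; it is not my actual data.

```python
from collections import defaultdict

# Hypothetical (match name, segment length in cM) pairs for illustration only.
SEGMENTS = [
    ("Alice", 12.4), ("Alice", 8.1),
    ("Bob", 9.0), ("Bob", 5.6),
    ("Carol", 10.2),
    ("Dave", 7.5), ("Dave", 6.9), ("Dave", 5.1),
]


def multi_segment_matches(segments, min_cm):
    """Return the set of matches sharing more than one segment of at least min_cm."""
    counts = defaultdict(int)
    for name, cm in segments:
        if cm >= min_cm:
            counts[name] += 1
    return {name for name, n in counts.items() if n > 1}


at_7 = multi_segment_matches(SEGMENTS, 7.0)
at_5 = multi_segment_matches(SEGMENTS, 5.0)

# Matches that are multi-segment only when 5-7 cM segments are allowed in.
only_at_5 = at_5 - at_7
print(sorted(at_7), sorted(only_at_5))
```

The set difference isolates the matches whose extra segments fall entirely in the danger zone, which are the ones worth treating with the most caution.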

Conclusions

After this analysis, I maintain that most people will share a single segment of IBD DNA with their genetic matches (endogamous populations will vary).

Among my Family Tree DNA matches, I share a single segment of DNA with 97.5% of my matches.

It will be interesting to see whether I can find common ancestry with the 63 matches I’ve identified as sharing more than 1 segment of DNA, and whether that is easier than with other matches. Unfortunately, I have a ton of questions I don’t have time to answer right now. For example, what is the total amount of DNA shared by these matches compared to the average of all of my matches? Do any of these matches match both of my parents? What are the segment size distributions for the segments shared with these 63 matches? Is it usually one large segment and one small segment, or does it vary?
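I have not run these follow-up analyses, but two of the questions (total shared DNA, and large-versus-small segment pairs) could be approached along these lines. This is a rough sketch with hypothetical names and segment lengths, not results from my data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (match name, segment length in cM) pairs for illustration only.
SEGMENTS = [
    ("Alice", 20.0), ("Alice", 8.0),
    ("Bob", 9.0),
    ("Carol", 11.0),
    ("Dave", 10.0), ("Dave", 10.0),
]


def group_by_match(segments, min_cm=7.0):
    """Collect the lengths of all segments of at least min_cm for each match."""
    grouped = defaultdict(list)
    for name, cm in segments:
        if cm >= min_cm:
            grouped[name].append(cm)
    return grouped


grouped = group_by_match(SEGMENTS)
totals = {name: sum(cms) for name, cms in grouped.items()}
multi = {name: cms for name, cms in grouped.items() if len(cms) > 1}

# Average total shared cM: all matches versus multi-segment matches only.
mean_total_all = mean(totals.values())
mean_total_multi = mean(totals[name] for name in multi)

# (largest, smallest) segment for each multi-segment match.
largest_smallest = {name: (max(cms), min(cms)) for name, cms in multi.items()}
print(mean_total_all, mean_total_multi, largest_smallest)
```

Plotting the (largest, smallest) pairs would show whether multi-segment matches tend to pair one large segment with one small one.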

I would also like to repeat this process using the Tier 1 Matching Segment Search tool at GEDmatch. Using the tool, I get a total of 4,732 segments of 7 cM or greater shared with non-targeted test-takers (it reports that it analyzes my 9,787 matches). However, it’s a lot trickier because I have to be wary of repeated uploads and re-used pseudonyms, so identifying the total number of matches is an issue that requires more time.

Of course, this analysis is strongly influenced by FTDNA’s matching algorithm (see “Family Tree DNA Updates Matching Thresholds”), so please keep that in mind.

11 Responses

  1. Marge Swanson 30 October 2017 / 9:05 am

    Looking forward to learning more about this fascinating topic.

  2. Chuck Haine 30 October 2017 / 9:28 am

    Blaine, thanks for contributing to our understanding of genetic genealogy!

  3. Bruce G Harlow 30 October 2017 / 9:28 am

    Using your methodology, I had 3824 matches with 3930 segments. 96 matches had 2 segments, 5 had 3 segments, and none had more than 3. So 2.6% of my matches had more than 1 segment, a result very similar to yours.

    I created a scatter graph using the smallest segment size as the X axis and the largest segment size as the Y axis. The resulting formula for the trendline was y=1.1235x + 1.3884. In other words, when the smallest segment was small (e.g., less than 10 cM), the largest segment also tended to be small.

  4. Tammie Gregori 30 October 2017 / 12:26 pm

    I used the method on my father, who comes from a somewhat endogamous population (Quakers), and whose parents were related (3rd cousins). After removing myself (his daughter) and 1 targeted test-taker, he had 1088 matches. Of those, 108 share more than 1 segment of 7 cM or greater, which is 9.93%. (So, quite a bit higher than the 2.5% of Blaine’s matches who share more than 1 segment.)
    Of those 108:
    85 share 2 segments
    15 share 3 segments
    5 share 4 segments
    1 each share 5, 7, and 8 segments.

  5. Louis Kessler 30 October 2017 / 5:10 pm

    Okay. Here’s my Ashkenazi endogamy for you. At Family Tree DNA I match 11,401 people. Removing my uncle and my 3rd cousin who are the only two I know genealogically how I’m related, I share 229,829 segments with 11,399 people.

    For 7 cM segments and above, I share 20,121 segments with 11,399 people.
    Of those 11,399 people:
    I share 1 segment with 5,637
    I share 2 segments with 3,630
    I share 3 segments with 1,504
    I share 4 segments with 483
    I share 5 segments with 102
    I share 6 segments with 35
    I share 7 segments with 4
    I share 8 segments with 4

    So I share multiple segments >= 7 cM with 50.5% of my matches compared to Blaine’s 2.5% and Tammie’s 9.93%.

  6. CathyBaumgartner 30 October 2017 / 11:45 pm

    This is interesting, Blaine. Thanks for sharing. I will have to do this analysis on my own data. I’m assuming that since you mention IBD that you are also excluding those > 7 cM segments that FTDNA does not match to your parents?

    • Blaine Bettinger 31 October 2017 / 10:11 pm

      I didn’t go that far into the analysis, no, I didn’t have time.

      • CathyBaumgartner 1 November 2017 / 7:41 pm

        I ended up finding that most of my second segments were on the X chromosome, which raised a red flag, as so many of my X matches on FTDNA don’t match either parent. So I ended up cross-checking by using Philip Gammon’s Match Maker Breaker tool, the results of which mostly got rid of those second-segment X matches. Excluding close relatives and targeted matches, and using only IBD segments >= 7 cM, 98.2% of my matches match me on just 1 segment. Restricting to segments of at least 10 cM (my typical exclusion these days), the percentage is almost the same: 97.8%. (Roberta Estes’ blog provides a link to the Match Maker Breaker tool: https://dna-explained.com/2017/04/06/introducing-the-match-maker-breaker-tool-for-parental-phasing/ )

  7. ananthavally 31 October 2017 / 12:00 am

    It is possible for a child to have matches that their parents do not due to compound DNA segments.

  8. Lee Goodman 8 November 2017 / 12:32 am

    Hello,
    I’m not very Excel savvy. How do I go about eliminating the matches with segments less than 7 cM, and how do you then sort them by number of shared segments?
    Thanks,
    Lee G
