Have you experienced this? You’ve identified a very clear cluster that includes numerous DNA matches that all descend from a single family, but you have no idea how this family links into your family tree. Try as you might, and despite building numerous trees, you can’t seem to figure out how these DNA matches and this single ancestral family link into your family tree. If this sounds familiar, you have an Unlinked Family Cluster!
Defining “Unlinked Family Cluster”
An Unlinked Family Cluster is a very specific phenomenon in genetic genealogy, one that is becoming increasingly common. We see more and more of these clusters for a couple of reasons. First, the matching databases keep getting larger, which means that these clusters get larger and easier to identify. Second, the more we work with our closest DNA matches, the more we are left with very promising “left behind” matches that don’t fit into our known ancestry. Often, these “left behind” matches form a shared match cluster or a triangulated cluster around a specific family. This is an Unlinked Family Cluster.
This is how I define an Unlinked Family Cluster:
- Forms a cluster – The cluster may be formed by shared matching (without segment data), triangulation (with segment data), or a combination of the two.
- Not recent ancestry – The cluster does not represent recent ancestry (no parents, grandparents, great-grandparents).
- No close matches – Related to the previous point, the matches in the cluster are more distant matches usually in the range of about 20-50 cM. Although a very small number of matches in the cluster can be higher or lower, if there were many closer matches then placement of the cluster in the family tree should be solvable. Similarly, if the cluster is made up of more distant matches (less than 20 cM), it may be more likely to be a pile-up region rather than recent common ancestry.
- Large number of matches – The cluster includes a large number of matches, typically 25 or more DNA matches, sometimes 50 or more. A cluster may have fewer people, but then it may be difficult to reliably determine that a hypothesized common ancestor/couple of the cluster is actually the common ancestor/couple responsible for the cluster. Further, DNA matches from as many lines of descent from the identified common ancestor/couple as possible (preferably through multiple different children) are preferred, as they lend further support to the identification. Many Unlinked Family Clusters have members from more than one testing company, but this isn’t a requirement.
- Same ancestral family – The members of the cluster have trees (that they built or that you built!) that include the same common ancestor/couple. While it’s rare that all members of the cluster can track their ancestry back to the identified common ancestor/couple, usually most members of the cluster can do so.
- Not in your tree – The identified common ancestor/couple is not found in your known family tree. If they were, this would be a Linked Family Cluster!
I want to emphasize that an Unlinked Family Cluster can be formed by shared matching, triangulation, or a combination of the two. It is also important to note that neither method of forming a cluster is better than the other. Especially considering the size of these clusters, a shared match cluster is just as valid as a triangulation cluster, and vice versa. Both methods can help you identify members of the cluster, and then the genealogy steps in to solve the rest of the mystery!
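For readers who keep their match lists in a spreadsheet or script, the criteria above could be turned into a rough screen along these lines. This is only a sketch: the `Match` fields, the function name, and the 80% and 50% cut-offs are assumptions made for illustration, and the cM range and cluster size are the approximate figures from the list, not hard rules.

```python
from dataclasses import dataclass


@dataclass
class Match:
    # Hypothetical field names, chosen for this illustration only.
    name: str
    shared_cm: float
    traces_to_family: bool  # tree reaches the hypothesized ancestor/couple


def looks_like_unlinked_family_cluster(matches, family_in_my_tree,
                                       min_size=25, low_cm=20.0, high_cm=50.0):
    """Rough screen based on the criteria listed above."""
    # If the family is already in the tree, it is a linked cluster;
    # if the cluster is small, the common ancestor is hard to confirm.
    if family_in_my_tree or len(matches) < min_size:
        return False
    in_range = sum(low_cm <= m.shared_cm <= high_cm for m in matches)
    traced = sum(m.traces_to_family for m in matches)
    # Only "a very small number" of matches should fall outside the range,
    # and "most" should trace to the family. The 0.8 and 0.5 cut-offs are
    # assumptions for this sketch.
    return in_range / len(matches) >= 0.8 and traced / len(matches) > 0.5
```

A screen like this only flags candidates. Whether the hypothesized ancestor/couple is really responsible for the cluster still has to be settled by the genealogy.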
My Example – The Unlinked Zufelt Cluster
I ran into my first Unlinked Family Cluster maybe 5-6 years ago while working through my mother’s DNA test results at AncestryDNA. I was working my way down her list of matches, assigning them to maternal and paternal (in the “old days” before AncestryDNA automated this process!) when I ran into a match not too far down her match list. “Iva Smith” (a pseudonym) shared 48 cM with my mother. Of course, when I first worked with the match, there was no “Paternal Side” (i.e., Parent 2) label, nor my notes!
I examined the match, first looking at the match’s tree and then at shared matching. The tree wasn’t very informative, except that Iva had ancestry from Upstate New York and Canada (as does my mother on her paternal side). None of the people/surnames in Iva’s tree matched mine, or even looked familiar.
Interestingly, TIMBER had reduced this match from 72 cM to the reported 48 cM:
When I looked at shared matching, I saw a number of matches in the list, all sharing less than this 48 cM. While there were also some close family members, none of them were helpful in assigning a line (these close matches were either my mother’s siblings, descendants of my mother’s siblings, or my mother’s descendants, so completely unhelpful).
As I started to open the trees of these shared matches, and as I built out trees of these shared matches, I began to notice a pattern develop: the Zufelt surname (and variations thereof) appearing in every one of these trees. Over and over again, I would explore a shared match and would review their tree (although usually I would be building their tree!) and run into the Zufelt surname.
Not only would I run into the Zufelt surname, but it was the same Zufelt family over and over again. Each match (if a tree could be built!) could be traced back to Adam and Neeltje (Freer) Zufelt of Upstate New York. As the cluster grew with more tree building and an occasional new shared match showing up as more people tested, I could track these matches through three different children of Adam and Neeltje (Anthony, Elizabeth, and Henrich). I also found some matches through two of Adam’s brothers (Johann and Jacob).
Currently there are more than 50 matches in this cluster, all or most appearing as shared matches to each other, and nearly all being descendants of this same Zufelt family. Of course, there are several matches in the cluster that cannot be tied into the family, usually due to MPEs (misattributed parentage events), tree errors, and so on.
Armed with the Zufelt surname, I also found matches at Family Tree DNA and MyHeritage that tied into this family.
The breakdown of matching (pre-TIMBER for AncestryDNA matches) is as follows:
Despite the one very close match (Iva Smith at 72 cM), I was still unable to tie this Zufelt family into my family tree.
To help me keep track of the matches and the cluster, I created a descendancy chart in LucidChart that had each of the matches (including their username, the testing company, and the shared cM total) and their lineage back to the Zufelt couple. Below is a screenshot of the LucidChart for the Zufelt cluster. Please note that this is intentionally blurred to avoid identification of the matches.
Unfortunately, despite all these matches, I could not place the Zufelt family anywhere within my family tree. Since these are my mother’s matches, I knew this family was on my maternal side. I could further narrow down the connection to my mother’s father’s family given both the location of this family (i.e., Upstate NY) and the fact that my mother’s mother came from Honduras and all of those matches are endogamous (and none of the Zufelt descendants shared this ancestry).
Although not particularly helpful to this particular shared match-based cluster analysis, I also had segment data for the cluster from matches at MyHeritage and Family Tree DNA:
Remember that these are my mother’s matches and thus my mother’s segment data.
I didn’t match any of the Zufelts on chromosome 1. Indeed, when I look at this region of chromosome 1 on my Visual Phasing chromosome map (which shows me which grandparent I inherited my DNA from at any given chromosome location), I can see that I inherited DNA from my mother’s mother (Jane Garcia) at this region, further supporting the hypothesis that these matches are via my maternal grandfather.
I DID match the Zufelts on chromosome 7, however, and I can see that this region comes from my mother’s father (Theodore LaBounty):
Despite all of these matches, all of these trees, and the segment data, I could not place the Zufelt family within my maternal grandfather’s family tree. I hypothesize that the match is on Theodore’s mother’s side (Goldiah Blanchard’s line), as the other line is all French Canadian and usually very distinct.
What now? What do I do with this cluster to be able to tie it into my family tree?
Working with Unlinked Family Clusters
There is no magic tool when it comes to Unlinked Family Clusters. Sometimes a new match will come along and allow you to tie in the family. For example, if I found shared matches to the Blanchard family in the Zufelt cluster, that might suggest they tie in there. Or if I’m able to map the segments on chromosomes 1 or 7 to the Stevens family, that might suggest the Zufelts tie in there.
Often, however, we don’t yet have that new magic match and we can’t yet map the segment. So what can we do?
- Genealogy, genealogy, genealogy.
The single most important thing you can do with an Unlinked Family Cluster is to research the family and build out their descendants. Discover everything you can about the identified common ancestor or ancestral couple, and build out the tree forward. The goal here is to find a branch of the family that could conceivably tie into your known family. Hopefully you will get lucky and find a direct connection (for example, a great-grandchild married your great-grandfather).
Of course, it is possible that your link to the family is via an MPE (misattributed parentage event), in which case tree building may not directly lead to an answer. However, it will still help you locate the various branches of the family in time and space, which may be very beneficial as you continue the investigation.
- Work across companies
There’s no reason to limit a cluster to matches found within a single testing company. There are multiple ways to find relevant matches at other testing companies. The goal is to find as many matches to the cluster as possible, with the hope that new matches will reveal the unknown link to your family tree.
Let’s assume you’ve identified this cluster at company X, and you’ve identified the hypothesized common ancestor/couple of the cluster. You can search for this family among the trees of company Y. Obviously, if the family is Johnson or Smith or another common surname, this is going to be more complicated. However, you’re looking for a specific Johnson family, so this method can still work; it will just require more tree review and building. I found the Zufelt cluster at Ancestry, and I’ve found members of this Zufelt family at MyHeritage and Family Tree DNA. If you’re really lucky, you might even find some of the same test-takers from company X (in the known cluster) at company Y. You can then use tools like shared matching at company Y to identify matches that are shared by you and that match.
If you’re working with segment data from company Y, you can look for matches that share that segment (or segments) at company X, remembering of course that overlap does not equal triangulation (see “A Triangulation Intervention”).
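To make the overlap-versus-triangulation point concrete, here is a minimal sketch. It assumes each segment is exported as a (chromosome, start, end) tuple in base pairs; the function names and the `a_matches_b_here` flag are hypothetical. Two matches overlapping on your chromosome is only the first test. They also have to match each other on that same stretch before the group can be called triangulated.

```python
def overlap_bp(seg_a, seg_b):
    """Return the number of base pairs two segments have in common.

    Each segment is a (chromosome, start_bp, end_bp) tuple. Returns 0 when
    the segments are on different chromosomes or do not overlap.
    """
    chrom_a, start_a, end_a = seg_a
    chrom_b, start_b, end_b = seg_b
    if chrom_a != chrom_b:
        return 0
    return max(0, min(end_a, end_b) - max(start_a, start_b))


def could_triangulate(seg_me_a, seg_me_b, a_matches_b_here):
    """Overlap alone is not enough: A and B must also match each other there.

    a_matches_b_here is something you have to check separately, using
    whatever comparison tool the testing company provides.
    """
    return overlap_bp(seg_me_a, seg_me_b) > 0 and a_matches_b_here
```

If the two matches overlap on your map but do not match each other, they may sit on opposite copies of the chromosome (one on your maternal copy, one on your paternal copy).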
- Find the commonality
Researching the identified common ancestor or ancestral couple in great detail, and tracking the descendants forward, will also help you identify any commonality that could explain how you link to the family. For example, does a branch of the family end up in one of your ancestor’s towns or counties?
In my example, many descendant lines from the Zufelt ancestral couple end up in Upstate New York where my mother’s father’s family is from. If I didn’t already know that they tied into his ancestry at some point, this would be extremely helpful information.
- Walk back the cluster.
Jim Bartlett coined the phrase “walking back” to refer to pushing a segment or shared matching back one generation at a time. Here, walking back may refer to mapping the segment(s) back in generations to help identify the link between the known family and the Unlinked Family Cluster. For example, I’ve mapped back the Zufelt segments to my maternal grandfather, but that’s as far as I’ve gotten. If I can find a way to map the segments back further, I might be able to push that back another generation or two. Since I’ve tested my mother and several of her siblings, I could do Visual Phasing to possibly push the segment back. Otherwise, I can continue to map matches as they come in (at 23andMe, Family Tree DNA, and MyHeritage) and hope that eventually I find matches with both the matching segments and the tree I need.
Similarly, I can walk back the shared match cluster by passive or active shared matching means. Passive means would be waiting for shared matches to show up on the cluster that provide a clue to the connection (for example, a descendant of a known line in my tree that is also a shared match to the Zufelt Unlinked Family Cluster). Active means would be testing descendants from different lines in hope of finding a new shared match to the cluster. This is, however, a bit of a gamble unless I have a reason to suspect a particular line (and even then it remains a big gamble as to whether they share any DNA with the Zufelt line).
- Examine the branches
Although I haven’t had much success with this approach, it’s a reasonable hypothesis that the line of descent with the closest matches, if statistically significant, could be the line of descent that ties into your ancestry. For example, let’s say that the Unlinked Family Cluster is the Snodgrass family and there are four grandchildren. The average shared cM for the descendants of grandchildren #1, 2, and 4 is about 20 cM. The average shared cM for the descendants of grandchild #3 is 40-50 cM. This might suggest that you are more closely related to this line and thus could focus your research on this line. However, it could just as easily be random chance that you and these descendants inherited a larger segment from the ancestral couple. So you might pursue this, but don’t put too much emphasis on this possibility.
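If you track the cluster in a spreadsheet, the branch comparison is simple arithmetic. The sketch below assumes you have labeled each match with its line of descent; the function name and input shape are made up for this illustration.

```python
from statistics import mean


def average_cm_by_branch(matches):
    """Average shared cM per line of descent.

    matches is an iterable of (branch_label, shared_cm) pairs, where the
    branch label is whatever you use to name a line (hypothetical here).
    """
    by_branch = {}
    for branch, cm in matches:
        by_branch.setdefault(branch, []).append(cm)
    return {branch: mean(values) for branch, values in by_branch.items()}
```

With only a handful of matches per branch, one unusually large segment can dominate an average, which is one more reason to treat a standout branch as a lead rather than a conclusion.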
Are these pile-up clusters?
A “pile-up” is a region of your DNA that shares statistically more matches in a database than expected, usually due to old shared ancestry. Given the randomness of DNA inheritance and the randomness of who tests at any given database, we expect our matches to be roughly evenly distributed along our chromosomes. However, there are huge spikes in matches, sometimes 10s or 100s of matches, that stick out like sore thumbs. Working with a match at this location can be problematic, as it is likely to be a smaller segment/match and much older common ancestry (usually not identifiable due to spotty trees, poor records, etc.).
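One rough way to see such spikes in your own segment data is to count how many matching segments touch each stretch of a chromosome. The sketch below is an assumption-laden illustration, not an established tool: it takes (chromosome, start, end) tuples in base pairs and uses an arbitrary 1,000,000 bp bin size.

```python
from collections import Counter


def matches_per_bin(segments, bin_size=1_000_000):
    """Count how many matching segments touch each fixed-size bin.

    segments is an iterable of (chromosome, start_bp, end_bp) tuples.
    The bin size is an arbitrary choice for this sketch.
    """
    counts = Counter()
    for chrom, start, end in segments:
        # A segment is counted once in every bin it touches.
        for bin_index in range(start // bin_size, end // bin_size + 1):
            counts[(chrom, bin_index)] += 1
    return counts
```

Bins whose counts tower over the rest of the chromosome are candidates for a pile-up. How high is "too high" is a judgment call and depends on the database.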
It is possible, therefore, that a cluster of shared matches (which VERY often form around a single segment or two, as with the Zufelt example) could be the result of a pile-up region rather than very recent shared ancestry. However, the fact that every match in my Zufelt cluster tracks back to the same Zufelt family suggests that rather than being a segment that is common within a population, this segment is more specific to Zufelt descendants and relatives and is therefore not just a pile-up region. Regardless, I will consider this possibility as I continue to search for a possible genealogical connection to the Unlinked Family Cluster.
What approaches have you found helpful as you work with an Unlinked Family Cluster?