Five-Part Series on Visual Phasing:
- Part I – Explaining visual phasing and identifying/labeling recombination points (November 21, 2016)
- Part II – Assigning segments of DNA (November 22, 2016)
- Part III – Using cousin matches to identify which grandparent provided the segments
- Part IV – Mapping my own chromosome using the visually phased paternal chromosomes
- Part V – Using the mapped DNA with new matches
This weekend, I spoke at a meeting of the New England chapter of the Association of Professional Genealogists, and it was a wonderful group. One of my talks was about “Chromosome Mapping.” Unfortunately, since the talk was only an hour, we didn’t have time to discuss “Visual Phasing,” a chromosome mapping methodology. Instead, I promised to finish this blog post to explain the process. As I was writing, the blog post turned into a 5-part series!
- What is it? A method to assign segments of DNA to the test-taker’s four grandparents.
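- Why use it? To identify which grandparent gave the test-taker which segments of DNA. Knowing the grandparent behind a segment eliminates 75% of the family tree when searching for the most recent common ancestor (MRCA) shared with a match on that segment.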
- What do you need? Autosomal DNA of three siblings uploaded to GEDmatch.
Visual Phasing is a process by which the DNA of three siblings is assigned to each of their four grandparents using identified recombination points, without requiring the testing of either the parents or grandparents. Although the process does not automatically reveal which segment belongs to which of the four grandparents, matching with cousins provides this identification as a further step of the process.
Kathy Johnston developed this process some time ago, and first posted about it in the Family Tree DNA Forum. As shown in the figure below, there are two PDF documents available for download to explain the method with both images and text. However, note that you must be a registered member of the forum (free) to download the documents.
My understanding is that Randy Whited also independently developed this process. I attended his excellent lecture on visual phasing at SCGS Jamboree in June 2016, and an audio recording of that talk is available for purchase here (Session# TH 023 entitled “Reconstructing Grandparent DNA Using Sibling Results” for $11.00).
Visual Phasing is an incredibly valuable tool. Although requiring three siblings creates a considerable barrier for many people, the method pays off for genetic genealogists interested in chromosome mapping. For example, as we’ll see, I have an adopted great-grandmother, and using visual phasing I can identify entire portions of my chromosomes that came from her, which could prove beneficial to my search.
NOTE: although visual phasing can potentially be performed with just two siblings and close cousin(s), it is considerably more challenging. I strongly recommend starting out with three siblings (either your own family or someone else’s family).
In addition to Kathy’s PDF documents and Randy’s recording, there are several other resources. Joel Hartley has been publishing the results of his visual phasing (see “My Big Fat Chromosome 20”), as has Ann Raymont (“Chromosome Mapping with siblings – part 1” and “Chromosome Mapping with siblings – part 2”). Ann’s blog posts contain a lot of details about the whys and hows of visual phasing.
What You Need
You need three siblings who have done autosomal DNA testing and transferred their results to GEDmatch. The testing company doesn’t really matter, even if you’ve tested all three at three different companies. Trust me!
I perform visual phasing in PowerPoint because it gives me a great deal of freedom to manipulate screenshots and add annotations, but it isn’t perfect. I’d love for this process to be semi-automated, at least creating an output comparison for Step #1, below.
STEP 1 – Setting Up
Visual Phasing works by identifying recombination points in the DNA of the three siblings. As will become clear, a recombination event in one sibling will affect how she shares DNA with the other two siblings.
Accordingly, the first step in visual phasing is to compare the DNA of the three siblings to each other in GEDmatch using the One-to-One tool. We’re going to work on one chromosome at a time, and I recommend starting with the X chromosome (especially if one of the siblings is male, since he’ll only have one X chromosome) or one of the shorter chromosomes such as 20 through 22.
Capture screenshots of the comparisons, and paste them into PowerPoint.
In this example, we’re going to be looking at Chromosome 21 in three siblings, Brooke, Felix, and Susan:
With this information, you can identify most or all of the recombination events that took place when the sperm and egg were created for each of the three siblings.
In a One-to-One comparison, you’ll usually see both half-identical (yellow) and fully-identical (green) sharing (but not on every chromosome). Remember that we’re actually comparing TWO CHROMOSOMES of each person at each and every point. In some regions full siblings will share DNA on only one copy of the chromosome pair (half-identical), in other regions they will share DNA on both copies (fully identical), and in still others they will share no DNA at all.
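To make the three sharing states concrete, here is a minimal Python sketch. It assumes that, at a single position, each sibling carries one paternal copy and one maternal copy, and it labels each copy with the grandparent it came from. The labels (`PGF`, `PGM`, `MGF`, `MGM`), the function name, and the assignments given to Brooke, Felix, and Susan are all invented for illustration; they are not the real results from this example.

```python
def sharing_status(sib_a, sib_b):
    """Classify sharing between two siblings at one position.

    Each sibling is a (paternal_copy, maternal_copy) tuple, where each copy
    is labeled with the grandparent it came from (hypothetical labels).
    Returns 'none', 'half' (yellow), or 'full' (green).
    """
    # Count how many of the two copies came from the same grandparent.
    matches = (sib_a[0] == sib_b[0]) + (sib_a[1] == sib_b[1])
    return ("none", "half", "full")[matches]


# Invented assignments at one position, for illustration only.
brooke = ("PGF", "MGM")
felix = ("PGF", "MGF")
susan = ("PGM", "MGF")

print(sharing_status(brooke, felix))  # half: same paternal copy only
print(sharing_status(brooke, susan))  # none: neither copy matches
print(sharing_status(felix, susan))   # half: same maternal copy only
```

Visual phasing works this logic in reverse: we observe the yellow, green, and empty regions and infer which grandparent labels could have produced them.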
STEP 2 – Identify and Label the Recombination Points
Now we can identify and label the recombination points. Here is the first key point:
KEY POINT #1 – Anywhere there is a change in the sharing status between two siblings, there must be a recombination event in at least ONE of the siblings (and sometimes both!).
For example, a switch from sharing a yellow (half-identical) segment to sharing no DNA means there was recombination at that point in one or both siblings. A switch from a yellow segment to a green (fully-identical) segment means there was recombination at that point in one or both siblings. And so on.
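Key Point #1 can be sketched in a few lines of Python. This assumes you have already transcribed one pairwise comparison into a list of contiguous `(start, end, status)` segments; that data layout and the function name are my own assumptions, and the positions below are made up rather than taken from GEDmatch.

```python
def change_points(segments):
    """Return the positions where the sharing status changes.

    `segments` is a list of (start, end, status) tuples sorted by start and
    covering the chromosome without gaps; status is 'none', 'half', or 'full'.
    """
    points = []
    for previous, current in zip(segments, segments[1:]):
        if previous[2] != current[2]:
            # A change in status implies recombination in at least one sibling.
            points.append(current[0])
    return points


# Made-up segments for one pair of siblings on one chromosome.
brooke_vs_felix = [
    (0, 20_000_000, "half"),
    (20_000_000, 31_600_000, "full"),
    (31_600_000, 46_000_000, "none"),
]

print(change_points(brooke_vs_felix))  # [20000000, 31600000]
```

Running this on all three pairwise comparisons gives the full set of candidate recombination points for the chromosome.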
In the following figure, each of the recombination points (i.e., each change in sharing) is identified by an arrow:
Now, in PowerPoint, draw a long vertical line through each recombination point. Each line should intersect at least two recombination points:
Often, this is where you’re going to run into the first problem, namely that the line doesn’t always seem to intersect at least two recombination points. This happens for a variety of reasons. Most commonly, the chromosome visualizations don’t always line up perfectly. Second, sometimes the start and stop locations are fuzzy (Ann Raymont mentions this in her terrific blog post here: “Chromosome Mapping with siblings – part 2”). Third, sometimes there are recombination events in two siblings at one place, which can cause some difficulty.
KEY POINT #2 – Do NOT get too stuck on recombination points. Trust me, getting frustrated with recombination points that don’t line up can quickly derail a phasing project! “Close enough” is just fine when trying out the first few chromosomes.
For example, there is an issue aligning the recombination event shown below, which is at position 30,574,043 according to the comparison of Susan v. Felix, but at position 31,604,127 according to the comparison of Brooke v. Felix. This is unlikely to be “fuzziness.”
It’s very easy at this point to throw your hands up and jump to another chromosome. But for now, we’ll put a recombination point around 31,000,000 or so.
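If you are keeping the positions in a list rather than eyeballing them, the “close enough” approach can be sketched as follows. The 1.5 Mb tolerance is an arbitrary value I picked for this illustration, not a recommended threshold, and the function name is hypothetical.

```python
def merge_breakpoints(positions, tolerance=1_500_000):
    """Merge breakpoints that lie within `tolerance` base pairs of each other.

    Each group of nearby positions is replaced by its rounded average.
    The default tolerance is an arbitrary assumption for illustration.
    """
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] <= tolerance:
            groups[-1].append(pos)  # close to the previous position: same group
        else:
            groups.append([pos])    # otherwise start a new group
    return [round(sum(group) / len(group)) for group in groups]


# The two mismatched positions from the Chromosome 21 example above.
print(merge_breakpoints([30_574_043, 31_604_127]))  # [31089085]
```

The average lands at about 31.1 Mb, which agrees with placing the recombination point around 31,000,000.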
I also like to label the recombination points with a number for easy reference. The start and stop positions for each yellow segment (sharing on one chromosome) are provided by GEDmatch.
However, you’ll need to take an extra step to get the start and stop points for green segments (sharing on both chromosomes). A great trick that I learned recently (via Sue Griffith at “Obtaining FIR Boundaries on GEDmatch using the Little Tick Marks”) is to perform a One-to-One comparison, but to click “Full resolution.”
This will expand the chromosome and show megabase positions, and you can obtain the start or stop position:
So now we have positions for each of the recombination points:
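Now, let’s assign each recombination point to the sibling in whom that recombination occurred. This is usually as straightforward as identifying which sibling appears in both of the comparisons that change at that point. A recombination in one sibling changes how she shares DNA with each of the other two siblings, but not how those two share with each other. For example, Felix owns the first recombination point: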
Now all recombination points are identified and labeled.
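The ownership rule used in this step can also be written as a small Python sketch. It assumes you have noted, for each recombination point, which pairwise comparisons change there. The function name is hypothetical, and the input below only illustrates the rule rather than reproducing the actual screenshot.

```python
def recombination_owner(changed_pairs):
    """Return the sibling who owns a recombination point, or None if unclear.

    `changed_pairs` lists the pairwise comparisons whose sharing status
    changes at this point, e.g. [("Brooke", "Felix"), ("Susan", "Felix")].
    The owner is the one sibling who appears in every changed comparison.
    """
    if len(changed_pairs) < 2:
        return None  # only one comparison changes: cannot tell who recombined
    common = set(changed_pairs[0])
    for pair in changed_pairs[1:]:
        common &= set(pair)
    # Exactly one sibling in common means a clear owner. No sibling in common
    # suggests two siblings recombined at about the same place.
    return next(iter(common)) if len(common) == 1 else None


# Illustrative input: both comparisons involving Felix change at this point.
print(recombination_owner([("Brooke", "Felix"), ("Susan", "Felix")]))  # Felix
```

A result of `None` flags the points that need a closer look by hand, such as a line that crosses only one change in sharing or a spot where two siblings recombined together.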
In the next post, we’ll start assigning segments of DNA based on the identified and labeled recombination points.