Today (or perhaps yesterday?) the popular DIY genomics website GEDmatch.com released a new tool for phasing DNA data. Listed under a link entitled “Generate phased data file,” the tool allows users of the GEDmatch.com site to phase their chromosomes if they have their parents’ raw data.
(A similar tool was previously created by David Pike at http://www.math.mun.ca/~dapike/FF23utils/; with David’s tool, users receive their results directly and do not need to upload their DNA test results. Accordingly, users can choose between the two tools depending on their privacy tolerance.)
What the Heck is “Phasing”?
Currently, SNP chip testing performed by 23andMe or Family Tree DNA is unable to attribute a test result to either one of your parents. For example, if your results for SNP rs00000 are “AG,” the test alone cannot determine whether the “A” came from your mother or father.
“Phasing” refers to the process of separating the mixed DNA results (the “AG”) into the DNA obtained from your mother (the “A”) and the DNA obtained from your father (the “G”). This is typically done by comparing your results to your parents’ results and determining which parent could have and/or must have contributed each SNP.
For example, if your results are “AG” at rs00000, mother’s results are “AA,” and father’s results are “GA,” then your data can be phased into “A” from mom and “G” from dad (since only dad could have contributed the “G”). Every once in a while, however, the data can’t be phased (say you’re “AG,” mom is “AG,” and dad is “AG”).
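For readers who like to see the logic spelled out, here is a minimal Python sketch of the parent-comparison idea described above, applied to a single SNP. It is only an illustration, not the method GEDmatch or David Pike's tool actually uses. The function name phase_snp is hypothetical, and the sketch assumes each genotype is a two-letter string of A, C, G, or T (no-calls and anything else are simply treated as unphaseable).

```python
from typing import Optional, Tuple

VALID_ALLELES = set("ACGT")


def phase_snp(child: str, mother: str, father: str) -> Optional[Tuple[str, str]]:
    """Return (allele_from_mother, allele_from_father) for one SNP.

    Returns None when the SNP cannot be phased: the genotypes are not
    plain two-letter calls, the trio is inconsistent, or both
    assignments are equally possible.
    """
    # Assumption: genotypes look like "AG"; skip no-calls such as "--".
    for genotype in (child, mother, father):
        if len(genotype) != 2 or not set(genotype) <= VALID_ALLELES:
            return None

    first, second = child[0], child[1]
    # The two possible ways to split the child's result between the parents.
    # For a homozygous child (e.g. "AA") the set collapses to one candidate.
    candidates = {(first, second), (second, first)}

    # Keep only the splits each parent could actually have contributed.
    possible = {
        (from_mom, from_dad)
        for from_mom, from_dad in candidates
        if from_mom in mother and from_dad in father
    }

    # Exactly one surviving split means the SNP is phased.
    if len(possible) == 1:
        return next(iter(possible))
    return None


print(phase_snp("AG", "AA", "GA"))  # the example from the text
print(phase_snp("AG", "AG", "AG"))  # everyone heterozygous: ambiguous
```

The first call reproduces the example in the text (“A” from mom, “G” from dad), and the second returns None because either parent could have contributed either letter. A real tool would run this kind of check across hundreds of thousands of SNPs and would also have to cope with no-calls and the occasional testing error.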
What Good is Phasing?
Many genealogists are using phased data to identify which DNA came from individual grandparents, great-grandparents, and beyond. I won’t get into that in detail now, but hope to at some point in the future (and eventually in the book I’m working on!).
As another example of using phased data, I used the new GEDmatch.com tool to phase my data. Both my parents and I had previously uploaded our 23andMe and FTDNA data into GEDmatch. I then performed some admixture analysis to compare unphased v. phased data.
Here is my unphased chromosome painting (Dodecad World9):
For comparison, here is my chromosome painting using the DNA I obtained from my father (same settings):
And here is my chromosome painting using the DNA I obtained from my mother (same settings):
Note that since this is so early I can’t say for certain whether using phased data creates some unwanted effects on the analysis (I’d love some input on that). It is interesting, however, to compare the results of phased v. unphased data.
What uses will you put your phased data to?