Using Semi-Anonymous Genetic Data in Genealogical Conclusions

The genealogical community has a serious issue we need to talk about.

We are amassing one of the largest collections of genealogical information ever created, in the form of DNA match data. As of October 2018, approximately 20 million people have taken autosomal DNA (atDNA) tests, and that number continues to grow rapidly. DNA evidence is being added as an additional record to support existing genealogical conclusions, being used to generate new hypotheses, and helping to break down decades-old brick walls.

However, since many genetic matches are either unwilling or unable to respond to communication or provide permission for public use of the genetic data, much of the massive database is potentially locked behind privacy walls such that the information can’t be utilized in scholarship and can’t be publicly shared. Indeed, Standard #8 of the Genetic Genealogy Standards (PDF) mandates the following: ... Click to read more!

Introducing “DNA Match Labeling” – A Sorting Mechanism for AncestryDNA Matches!

Many DNA test takers have a wealth of genetic relatives! For example, I have more than 50,000 different genetic cousins across all the genealogy DNA testing companies. Although test takers from many regions of the world do not yet have thousands of genetic cousins in their match lists, they will in the future as DNA testing grows increasingly popular and the testing companies target other countries.

Unfortunately, the testing companies have not provided users with the tools necessary to organize these matches. Indeed, clustering and organization of genetic cousins is a huge component of the future of DNA evidence. Clustering of our matches allows us to identify information that is not visible or apparent when the matches are unorganized.

This is where the DNA Match Labeling extension for Chrome comes in! I worked with a programmer to build this extension. ... Click to read more!

AncestryDNA Revises Ethnicity Estimates

AncestryDNA today (12 September 2018) released updated ethnicity estimates for all customers. Everyone in the AncestryDNA database will see some change in their estimate.

This update represents one of the most significant refinements of AncestryDNA’s ethnicity estimates. Both the reference populations and the ethnicity algorithm underwent significant development.

The size and makeup of the reference populations grew substantially, from ~3,000 reference samples to ~16,000 reference samples (many provided by test takers who consented to participating in AncestryDNA research). The update adds 17 new regions to the ethnicity analysis (from 363 to 380). Many more are needed in areas such as Asia and Africa, of course, but this is a great addition. In addition, many regions were redefined or renamed to reflect the region more accurately. ... Click to read more!

DNA Central is Live!

DNA Central (www.DNA-Central.com), your one-stop resource for DNA education, has just gone live! See below for your special discount membership offer, a thank-you for being a subscriber to The Genetic Genealogist!

What is DNA Central?

There is an enormous need for DNA education. Millions of people are testing their DNA every year, but there are so few educational resources for those test takers. Receiving your results from the test company is just the first step in the process. How do you understand those test results? What do you do with them?

DNA Central is my effort to provide that DNA education in multiple formats. The membership-only website will provide resources for people at all levels trying to understand their results and stay on top of the most recent developments in DNA. These resources include: ... Click to read more!

Board for Certification of Genealogists: Comments Sought for Proposed DNA Standards

Today (23 May 2018), the Board for Certification of Genealogists announced a 60-day public comment period for a set of proposed DNA standards:

The link for the Genealogy Standards is here: https://www.amazon.com/Genealogy-Fiftieth-Anniversary-Certification-Genealogists/dp/1630260185.

The PDF of the proposed standards is here: https://bcgcertification.org/DNA/Proposed_Standards.pdf.

The link for the survey is here: https://goo.gl/forms/57ahXLqkAYOBWDop2.

Via a Facebook post, BCG announced the following:

The Trustees of the Board for Certification of Genealogists (BCG) met in Grand Rapids, Michigan on 2 May 2018. The trustees…debated a proposal to update genealogy standards to incorporate standards related to genetic genealogy. As a result of this discussion BCG intends to move forward with the integration of genetic genealogy into Genealogy Standards. The board directed that the committee’s proposal be published for public comment. The proposed standards can be viewed at https://bcgcertification.org/DNA/Proposed_Standards.pdf. The public comment period ends on 23 July 2018. Fill out the survey at this link (https://goo.gl/forms/57ahXLqkAYOBWDop2) by 23 July 2018. Due to the expected volume of comments, we will not be able to acknowledge or respond to individual comments. ... Click to read more!

Heteroplasmies and Poly-Cytosine Stretches – An mtDNA Case Study

I don’t match my mother’s mtDNA test results.

How is that possible?

Last fall during the sales at Family Tree DNA, I ordered an mtDNA test for my mother. Now, usually there isn’t a need to test your mother’s mtDNA if you’ve tested your own, and vice versa. You should have the same mtDNA results, and in almost all cases you will.

However, I decided to test my mother because I have a mutation that puts me at an mtDNA genetic distance of 1 relative to matches that are likely related about 200 years ago in the Caymans where my mtDNA came from. That mutation always bugged me. Although it certainly isn’t unusual to have a GD=1 at 200 years, my maternal line is a work in progress so I want as much information as possible. So I decided to “walk back” the mtDNA line as far as I could. Although my mother was only one generation further back, it couldn’t hurt to have her mtDNA sequenced just in case it could shed light on this mutation (and, frankly, the sale was too good to pass up!). ... Click to read more!

Informed Consent Agreement and Beneficiary Agreement

Last year in the Facebook group Genetic Genealogy Tips & Techniques, we were discussing the need in the community for an informed consent agreement and a beneficiary designation form. Provided below are an informed consent agreement and a beneficiary designation form that I drafted with very helpful feedback from the GGT&T community early last summer. Feel free to use these agreements/forms pursuant to the disclaimers below, and pursuant to the CC license under which they are distributed!

Please note that this is NOT legal advice. I do NOT make any representation that these forms are legally binding or sufficient for their intended purpose. I highly recommend that you see an attorney if you have any questions or concerns.

Informed Consent Agreement ... Click to read more!

DNA Central! A New DNA Resource Launching April 2018!

DNA is hard. There’s no way around it. Sure, it’s easy to spit in a tube, but once you have all those percentages and centiMorgans and matches, how do you make sense of it? What does it all mean? And how can it help you understand where you came from and how to use it to explore your ancestry?

Announcing DNA Central, a new resource for everything DNA! This new subscription-based site will have blog posts about the latest and greatest, how-to content (including a “What Next?” series), short videos (such as the “3-Minute DNA” series), webinars, and forums! We’ll also have monthly giveaways and much more! It’s everything you need to finally understand and apply the results of your DNA testing! ... Click to read more!

TGG’s Top Posts in 2017

I started The Genetic Genealogist on February 12, 2007 with my first post, “New estimates for the arrival of the earliest Native Americans.” There were few educational resources for genetic genealogy back then, and all testing was Y-DNA and mtDNA. Although 23andMe would launch the first large-scale atDNA test a few months later in November of 2007 (see “23andMe Launches Their Personal Genome Service” announcing the $1,000 test), it would be a couple of years until they used the results for cousin matching. Today, almost 11 years later, there are 617 posts with more than 310,000 words.

Here’s a screenshot from the blog in December 2007:

This year I posted about 30 times about a wide variety of topics. Here are the most popular posts in 2017: ... Click to read more!

A Small Segment Round-Up

If you aren’t already a member of the coolest Facebook group ever, Genetic Genealogy Tips & Techniques, you really should be! We have a friendly and engaging environment, and everyone learns something new every day!

This post is meant to answer a question or issue that is raised almost daily in the group, and that is the issue of small shared DNA segments. Although these small segments are alluring, they are the mythological sirens of the genealogical world!

Small Segments Executive Summary

Here’s a bite-sized summary of the content below:

  • Many to most small segments (those of 7 cM or smaller, at least) are FALSE, meaning they are NOT actually shared by the two matches, and therefore do NOT indicate shared ancestry;
  • This is supported by a 2014 paper by 23andMe scientists showing that at least 33% of 5 cM phased DNA segments are false-positive (and it’s much worse for unphased segments or segments smaller than 5 cM);
  • This is further supported by evidence that anywhere from 20-35% of distant matches at a testing company are not shared with either tested parent;
  • This is further supported by evidence that phasing your DNA with two tested parents significantly reduces the number of matches below 10 cM (with proportionally more matches reduced as the segment size gets smaller);
  • There is currently no evidence that triangulating segments or finding a paper trail provides a mechanism for distinguishing between false segments and valid segments;
  • Since we can’t tell the difference between false small segments and valid small segments, we must avoid these small segments to avoid poisoning our genealogical conclusions with false data; and
  • Beware any research or conclusion that uses these small segments without specifically addressing the issues that are known – based on all the scientific research and evidence gathered to date – to surround small segments.
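
For readers who keep their shared-segment data in a spreadsheet or script, the advice in the summary above can be sketched as a simple filter. This is only an illustrative sketch, not a tool from any testing company: the Segment class, the split_by_size function, and the sample matches are all hypothetical, and the 7 cM default is just the cutoff used in this summary (a more conservative researcher might pass 10 instead).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One shared DNA segment with a match (hypothetical record layout)."""
    match_name: str
    chromosome: int
    length_cm: float


def split_by_size(segments, threshold_cm=7.0):
    """Separate segments into those above the cutoff and those at or below it.

    Segments at or below threshold_cm are set aside rather than deleted,
    so they are kept out of conclusions but remain available for review.
    """
    kept = [s for s in segments if s.length_cm > threshold_cm]
    set_aside = [s for s in segments if s.length_cm <= threshold_cm]
    return kept, set_aside


# Made-up example data, for illustration only.
segments = [
    Segment("Match A", 1, 34.2),
    Segment("Match B", 5, 6.1),
    Segment("Match C", 12, 7.0),
    Segment("Match D", 3, 10.5),
]

kept, set_aside = split_by_size(segments)
print("Use as evidence:", [s.match_name for s in kept])
print("Set aside as small:", [s.match_name for s in set_aside])
```

Note that the filter only sorts segments by size; it cannot tell a false small segment from a valid one, which is exactly why the small ones are set aside wholesale.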

If you’re interested in learning more, keep reading!

Small Segments In Detail

One of the most common questions in the group has to do with small segments. There’s no exact definition of “small” when it comes to small segments, but many of us define them as being a single segment of DNA of 7 cM or smaller. Others use 5 cM or smaller, while others use 10 cM or smaller. Personally, I consider segments of 7 cM or less to be “small,” although when I’m being very conservative I use a definition of 10 cM or smaller. ... Click to read more!