The genetic genealogy world is abuzz following recent reports in news outlets around the world (including CNN, Seattle PI, Daily Mail, etc.) that investigators have used public genetic genealogy DNA databases for leads in a 20-year-old cold case.
The Case
In December 1991, 16-year-old Sarah Yarborough was tragically murdered in Federal Way, Washington. Despite an extensive investigation, no suspect has ever been named. Investigators have sketches of a man they believe might have been involved, but there is no name to put to the pictures.
Investigators did find some important evidence, however: DNA left at the scene, possibly by Yarborough’s attacker.
The DNA
Late last year, investigators gave the DNA profile (apparently the Y-DNA profile) to California-based forensic consultant Colleen Fitzpatrick (who I’ve written about before here on TGG). Fitzpatrick, it appears, compared the Y-DNA profile to publicly-available Y-DNA databases, such as Ysearch, in an attempt to identify a potential match for the profile. After identifying potential matches, Fitzpatrick could then potentially identify the surname of the Y-DNA’s donor. For example, if all Bettingers have a particular Y-DNA profile and a sample Y-DNA profile closely matches that particular Y-DNA profile, then it is likely that the parties are either closely or distantly related (on a scale of 10s or 1000s of years), and they could potentially have the same surname.
Therefore, by comparing an unknown’s Y-DNA profile to public databases, it is possible to find matches and potentially identify a surname for the owner of that Y-DNA (but see “The Caveats,” below).
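To make the matching idea concrete, here is a minimal sketch of how an unknown Y-DNA profile might be compared against a public database. Everything in it is an assumption for illustration: the repeat counts and the tiny database are invented, the function names are hypothetical, and the distance measure (a simple sum of repeat-count differences) is just one common simplification. It is not a description of how Ysearch or Fitzpatrick actually performed the search.

```python
from collections import Counter

# Hypothetical database of Y-STR profiles: marker name -> repeat count.
# The marker names are commonly tested Y-STR loci; every value and
# surname pairing below is invented for illustration only.
DATABASE = [
    {"surname": "Bettinger",
     "markers": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS388": 12}},
    {"surname": "Bettinger",
     "markers": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10, "DYS388": 12}},
    {"surname": "Samuels",
     "markers": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS388": 12}},
    {"surname": "Smith",
     "markers": {"DYS393": 12, "DYS390": 22, "DYS19": 15, "DYS391": 10, "DYS388": 14}},
]


def genetic_distance(a, b, min_shared=3):
    """Sum of absolute repeat-count differences over the markers both
    profiles share. Returns None when too few markers overlap to compare."""
    shared = a.keys() & b.keys()
    if len(shared) < min_shared:
        return None
    return sum(abs(a[m] - b[m]) for m in shared)


def candidate_surnames(unknown, database, max_distance=1):
    """Count the surnames attached to database profiles that fall within
    max_distance of the unknown profile, most frequent first."""
    hits = Counter()
    for entry in database:
        distance = genetic_distance(unknown, entry["markers"])
        if distance is not None and distance <= max_distance:
            hits[entry["surname"]] += 1
    return hits.most_common()


# Hypothetical unknown profile, e.g. one developed from a crime scene sample.
unknown = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS388": 12}
result = candidate_surnames(unknown, DATABASE)
print(result)
```

In this made-up data the closest profiles carry two different surnames, so the output is a list of candidate surnames to investigate rather than a single answer.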
The Search
Fitzpatrick’s research determined that the suspect’s Y-DNA profile appears to match the Y-DNA profiles of individuals with the surname “Fuller.” Although unclear without more information, it further appears that the suspect’s Y-DNA profile specifically matches the Y-DNA profiles of purported descendants of Robert Fuller, who settled in Salem, Mass. in 1630.
Accordingly, Fitzpatrick’s research has merely suggested that the suspect MIGHT have the surname Fuller. Nothing more, nothing less. It is merely a lead, something that investigators will have to devote countless hours to following up on. The lead has not provided investigators with a magical solution to their mystery, and following this discovery they are likely not all that much closer to identifying a suspect than they were before.
The Caveats
It is important to note that there are some serious caveats to this process. Just because an unknown Y-DNA profile matches a group of surnames in a database does not automatically mean that the unknown Y-DNA donor had the same surname. Non-paternal events such as infidelity, adoption, name change, and others can – and have – resulted in surnames being jumbled throughout history. Thus, simply matching the unique Bettinger profile does not mean that your last name might be Bettinger; it could be Samuels as a result of great-grandpa’s roving eye, Smith as a result of your step-great-great-grandmother’s love for orphans, or Johnson because your father was tired of people spelling “Bettinger” wrong. For all these reasons surnames have changed over time.
It is even more vital to note, however, that Fitzpatrick’s research process is absolutely neither a new nor a groundbreaking technique! It is a familiar technique that has been done MANY times before, and continues to be done. People – including non-genealogists – have used public databases to attempt to identify their surname and/or family. Indeed, Family Tree DNA itself has noted that male adoptees have a 30-40% chance of identifying a likely surname by comparing their Y-DNA profile to FTDNA’s database (see here: “During the introduction Max [Blankfeld] stated that 30%-40% of male adoptees find their likely surname in FTDNA’s database”).
The Concern
Some, including both experienced genetic genealogists and people who have never had a DNA test, have expressed concern that their DNA was or could be used for this purpose, a purpose that it “wasn’t intended to be used for.” Some have stated that the search constituted an “illegal seizure” of their property, or that their DNA should not be used by “big brother.”
Further, as the ISOGG mailing list for project administrators has demonstrated, many project administrators are concerned that this hullabaloo will scare away potential test-takers.
The Past
Despite the concerns of the public, genetic genealogists, and project administrators, Fitzpatrick’s process is neither a new technique nor a frightening one. It has been done before. Further, Fitzpatrick’s process is simply a new twist on an old method. How is Fitzpatrick’s DNA search different, for example, from any of the following (and please don’t throw any genetic exceptionalism arguments my way!):
- Using a public reverse-phone lookup to identify the owner of a phone number? I didn’t authorize my phone number for that use;
- Searching through a public phone book to identify all the Bettingers in New York state? I didn’t authorize my phone book listing for that purpose;
- Using the census to identify my ancestors? I guarantee that NONE of my ancestors authorized the use of the census for genealogical research (indeed, just think of ALL the secrets that have been revealed in the census that our ancestors would have wanted buried forever!).
Interestingly, genealogists happen to be the biggest offenders when it comes to using public databases for purposes other than the ones for which they were intended.
My Thoughts
One of the most interesting points to me is where some genealogists have decided to draw their line in the sand. Comparing a person’s Y-DNA profile to public databases is fine if the person is an adoptee searching for his last name, but not if the person is a criminal that investigators need to identify.
I also believe that project administrators are overly concerned. These types of stories come and go, and this one will fade away just as all the others have. We are (I sincerely hope) heading into an era of genetic openness, not one of genetic fear.
Lastly, the answer to this dilemma is, as always, education. We have to educate the public and potential test-takers that if they decide to make their Y-DNA public, it will be public for any purpose any person sees fit. They should understand this when they send in their cheek swab. The danger to test-takers, however, is almost nil; a public Y-DNA profile is either incomprehensible or useless for 99.99% of the world. And keep in mind that if a criminal is identified using this method, it is the criminal activity that endangered him, NOT the public Y-DNA databases!
Your Comments
What I’m really looking for here is a conversation about the pluses and minuses of Fitzpatrick’s method and the use of public DNA databases. Are there valid concerns, or only concerns due to the lack of education? Why do you believe these methods are different from non-traditional uses of other public databases such as the examples I listed above? Why do you think people might be afraid of this use of their public DNA? And how can we better educate test-takers and the public to avoid these types of concerns?
[Note: I will immediately delete any comment that is aimed at Fitzpatrick herself. She did not invent these search methods, and should not be held responsible for their use. I’m looking for comments about the method, not the investigator].
I remember when I asked my brother to submit to a DNA test years ago and I told him, “If you have left your DNA at a crime scene you may not want to do this.” We laughed and he donated for our genealogy cause. I knew beforehand I would probably be putting his/our DNA out there “publicly” and I am pretty sure I know what the definition of “public” is without looking it up. If you don’t submit the DNA publicly and someone gets a hold of it and uses it in the manner talked about here, I would have an issue. But if a DNA-related person goes down for a crime because of DNA I submitted publicly, I think they should not have done the crime and I would be happy to keep the world a bit safer from a criminal.
I have no objection to the use of genealogical DNA databases for legitimate forensic purposes. But, I am as yet unconvinced that autosomal DNA data in genealogical databases couldn’t be used for HLA typing (organ transplant matching).
Hi, I’m one of those project admins that doesn’t see a problem with this. You are right on with genealogists using records that were not originally intended for that purpose! The DNA itself is often a big surprise; I found out my surname should be something different than it is! Within my project I am helping a group of seven men find the reason they do not have a certain surname they are matching to, and two different surnames match my uncle’s results at the 67-marker level. So the Fuller result could be a wild goose chase or lead to a viable suspect, which would bring great closure to the family dealing with a cold case. I agree the general lack of education is what causes fear in this situation and recruiting problems as well. Bottom line: if you have something to hide, don’t submit your DNA; however, this doesn’t stop your brother, cousin, etc. from submitting theirs! I hope we are moving towards genetic openness too, but I feel some privacy concerns and how the “public” is allowed to use our data do still need to be addressed.
Nicole Polk Macpherson/McPherson Project Ftdna
I dispute the probability assertions as to the surname and note that actual probabilities aren’t stated. It is yet another aspect which can’t be evaluated.
The seminal article on the subject (King & Jobling, 2009, “Founders, Drift,..”) found an inverse correlation between Y-DNA and surname frequency; the more common a surname, the less likely persons holding it will have a match. My own study of the subject suggests that, while there is an association between surname and Y-DNA, it is much less strong than commonly believed, including by DNA surname project administrators.
With common surnames, we often see those with very many matches. Even with high resolution and quality, the surnames aren’t in high agreement.
King and Jobling did propose just this method for identifying criminals but, as their own data shows, it would work well only for rare surnames. It’s of limited usefulness.
-ralph taylor, Taylor Family Genes at FTDNA
To my knowledge, you could not type HLA from SNPs as they are determined by STRs.
“And keep in mind that if a criminal is identified using this method [use of public genetic genealogy DNA databases for crime scene leads], it is the criminal activity that endangered him, NOT the public Y-DNA databases!”
So where is the objectivity in this method finding ‘matching profiles’ instead of ‘criminals’? And what about degree of certainty over probability, does that even matter? Those are the objections at the root of whether using this as a resource for law enforcement is plausible or fallible- me being just a layperson and not a scientist/paid-professional.
Soliciting public DNA databases should only yield ‘probabilities’ for matches, not the certain answer of a committed crime. Mark Minick found this out in 2008; Google his name and how his DNA found its way into a DailyMail-UK article (although there are 15 other hits for Mark Minick in the U.S. so you never know!). Although I’m sure his DNA was drawn from a ‘localized’ pool (criminal one), he was still found complicit in an event he did not take part in and paid the consequences nonetheless. And you feel it is a good idea to search ‘remote’ pools (such as ancestor/sibling profiles) and up the probabilities to incriminate a given profile to a match? That sounds like taking at face value your first course plot when navigating via dead-reckoning; you will only increase the degree of error to get to your final destination. Read commenter No. 5’s post.
If you want to bring to the table the positives or negatives of using public DNA as a ‘search’ tool, then you must objectively examine its true value to answering the given question, not whether it can answer it. Statistics can also yield answers, any of a thousand you wish… So when genealogists/paid-professionals such as Commenter No. 1 and No. 4 stop putting such smug feelings behind DNA being so wonderful at literally putting criminals and their brothers in jail so they can do cameo interviews in TruTV shows, then laypersons like us will not be so reticent to give out samples for things we want to consent to, like genealogy or genetic ailment tests or others- lest we also wind up like Mark Minick by proxy of increased probabilities within this method.
@6 – not so.
[a] HLA types are the names given to different haplotypes of the various genes in the HLA region. These are not easily determined by SNP typing, because the genes are too polymorphic – see e.g. http://www.ebi.ac.uk/imgt/hla/
[b] there are however several attempts ongoing to impute HLA types from SNPs on common GWAS chips – see e.g. https://oxfordhla.well.ox.ac.uk/hla/
[c] but, like the case at hand, these only really allow statistical conclusions, not ones that can be applied with confidence to a specific individual.
Yes, I have to agree with the fact that genealogists are the most frequent abusers of other people’s records, including DNA results. I have done the same with my Maltese ancestors by using the records of baptisms, marriages, and funerals from the Adami Collection. I have often thought how these ancestors would have felt with some person (me) poking around their records just because of some minor ancestor shared with them. I also think solving serious crime is more important than any individual’s concern about privacy. The unlawful killing of another person is a serious crime.
Recently I had my STR results, kit number and surname written about on dna forums. It was some bull about the mythology of some ethnic group’s creation stories, nothing based on science, the dna itself or my specific results. I was momentarily incensed. I left my surname group at FTDNA upsetting the Administrator as I am the closest match to her paternal line. I deleted my Ysearch account. I have reversed these things but I intend to leave FTDNA permanently, have them totally delete my account by the end of 2012. Using surname projects and STR results for silly racist ideas is definitely beyond the pale. I will eventually delete my 23andMe account this year or the next. I think dna testing has had its day in the sun.