(Jim Watson via Wikipedia)
As if there weren’t enough to worry about during the genetic revolution, researchers have found a way to characterize redacted genetic sequences from whole-genome or large-scale sequencing.
Here’s how it works. Let’s say that Mr. X has had his genome sequenced, but doesn’t want to know the results for some genes known to influence the development or progression of Alzheimer’s Disease. So when he receives his genomic sequence, these genes have been ‘redacted’, or removed from the data. This is exactly what James Watson decided to do when he received his data.
Characterizing Redacted Genes
However, researchers have characterized one of Watson’s redacted genes by examining the sequences surrounding the gene in question. Often, when we inherit a gene from our parents, we receive that gene as well as some of the surrounding genetic sequence. By examining the surrounding sequence, some insight into the redacted gene is gained. For example, if I gave you the quote “A penny _____ is a penny earned”, you can derive from the surrounding words that the missing word is “saved.”
From an article discussing the researchers’ work:
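To make the idea concrete, here is a minimal sketch in Python of guessing a hidden variant from its neighbors. It is not the method used in the study: the reference panel, the marker letters, the "risk"/"neutral" labels, and the function name `infer_hidden_allele` are all made up for illustration. The only assumption carried over from the text is that a gene tends to be inherited together with the sequence around it, so people who share the same flanking sequence often share the same version of the gene.

```python
from collections import Counter

# Hypothetical reference panel (invented data): each entry pairs the letters
# seen at two flanking markers with the variant found at the hidden site on
# that same stretch of DNA.
REFERENCE_PANEL = [
    (("A", "G"), "risk"),
    (("A", "G"), "risk"),
    (("A", "G"), "neutral"),
    (("C", "T"), "neutral"),
    (("C", "T"), "neutral"),
    (("C", "G"), "neutral"),
]


def infer_hidden_allele(flanking, panel=REFERENCE_PANEL):
    """Estimate how likely each hidden variant is, given the flanking markers.

    Counts the panel entries whose flanking markers match exactly and turns
    those counts into proportions. Returns an empty dict if nothing matches.
    """
    counts = Counter(hidden for markers, hidden in panel if markers == flanking)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {allele: n / total for allele, n in counts.items()}


if __name__ == "__main__":
    # Someone whose redacted gene is flanked by "A" and "G"
    print(infer_hidden_allele(("A", "G")))
```

In this toy panel, two of the three entries with the "A", "G" flanking pattern carry the "risk" variant, so the redacted site is estimated to be "risk" with probability two in three. Real analyses use many more markers and much larger reference populations, but the principle is the same as filling in the missing word of the proverb.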
“When the researchers told Watson about the paper’s results prior to publication, he redacted an additional 2 million DNA letters surrounding his APOE gene. This will make determining his redacted sequences much more difficult to decode – but not impossible, the authors write.”
This ability, of course, raises numerous ethical concerns. If we value the protection of privacy, even for people who make part of their genetic sequence available online, how do we protect their privacy? Asking people to avoid this type of analysis won’t work, of course. Is the only answer to redact huge portions of DNA surrounding redacted genes? Or are we faced with an all-or-nothing question: either people put their entire sequence online (or just portions but face the risk of this analysis) or they keep their sequence private?
The authors of the study are also concerned about the potential problems. From the paper:
“We believe the potential for such indirect estimation of genetic risk has considerable relevance to concerns about privacy, confidentiality, discriminatory and defamatory use of genetic data, and the complexities of informed consent for both research participants and their close genetic relatives in the era of personalized genomics.”
The article: Dale R Nyholt, Chang-En Yu, Peter M Visscher (2008). On Jim Watson’s APOE Status: Genetic Information is Hard to Hide. European J. of Human Genetics (DOI: 10.1038/ejhg.2008.198).