The Early Stages of the Genetic Genealogy Revolution – Part II


I’ve spoken before about the enormous effect that affordable SNP and whole-genome sequencing will have on genetic genealogy. In that previous article, I mentioned a study using SNP analysis to identify a person’s ancestry based on autosomal DNA (all the nuclear non-sex DNA). Another study, released today in PLoS Genetics, used SNP chips to identify SNP markers that are characteristic of particular ancestral origins. According to the authors:

“We have developed a novel algorithm to identify a subset of SNP markers that capture major axes of genetic variation in a genotypic dataset without use of any prior information about individual ancestry or membership in a population.”

To accomplish this, the researchers:

“…studied here 274 individuals from 12 populations (20 Mbuti, 20 Mende, 22 Burunge, 42 African Americans, 42 Caucasians, 20 Spanish, 11 Mala, 20 East Asians, 20 South Altaians, 20 Nahua, 20 Quechua, and 19 Puerto Ricans). Three of these populations are admixed (Caucasians, African Americans, and Puerto Ricans). All individuals were typed using the 10K Affymetrix array.”

The “10K Affymetrix array” is a chip that tests for about 11,500 SNPs (single nucleotide polymorphisms, or mutations) in each individual. Personally, I don’t know why more genetic genealogy companies aren’t doing this type of research themselves. The chips are relatively cheap these days, and there are plenty of people willing to send in DNA from all around the world with extensive ancestral information. This is the future of genetic genealogy, and they should be a part of it.
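The marker-selection idea in the quote above can be sketched in a few lines of Python. This is only an illustration, not the authors' algorithm: it assumes genotypes are coded as 0, 1, or 2 copies of an allele, it uses a plain power iteration to find the first axis of variation (the quote speaks of several axes; the sketch keeps one for brevity), and it scores each SNP by its squared weight on that axis. The function names and the toy genotype table are made up for this example.

```python
import math

# Toy genotype table (hypothetical data): rows are individuals, columns are
# SNPs, values are copies of one allele (0, 1, or 2). Columns 0 and 3 differ
# between the first three and last three individuals; the others do not.
GENOTYPES = [
    [2, 1, 0, 2, 1],
    [2, 0, 1, 2, 1],
    [2, 1, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 1, 1, 0],
]


def center_columns(rows):
    """Subtract each SNP's mean so the axes describe variation, not averages."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]


def leading_axis(centered, iterations=200):
    """Power iteration for the first principal axis (one weight per SNP)."""
    n_snps = len(centered[0])
    axis = [1.0 / math.sqrt(n_snps)] * n_snps
    for _ in range(iterations):
        # Project every individual onto the current axis ...
        projections = [sum(x * a for x, a in zip(row, axis)) for row in centered]
        # ... then map the projections back into SNP space.
        new_axis = [
            sum(row[j] * p for row, p in zip(centered, projections))
            for j in range(n_snps)
        ]
        norm = math.sqrt(sum(v * v for v in new_axis))
        if norm == 0:
            return axis  # no variation left to describe
        axis = [v / norm for v in new_axis]
    return axis


def rank_snps(genotypes, keep):
    """Return the indices of the `keep` highest-scoring SNPs and all scores."""
    axis = leading_axis(center_columns(genotypes))
    scores = [w * w for w in axis]  # squared weight on the leading axis
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:keep], scores


selected, scores = rank_snps(GENOTYPES, keep=2)
print(selected)  # [0, 3]
```

Note that the group labels are never passed to `rank_snps`; the two informative columns surface only because they carry most of the variation in the table, which is the sense in which no prior ancestry information is needed.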

The most interesting paragraph from the news release:

“Their program was more than 99 percent accurate and correctly identified the ancestry of hundreds of individuals. This included people from genetically similar populations (such as Chinese and Japanese) and complex genetic populations like Puerto Ricans who can come from a variety of backgrounds including Native American, European, and African.”

The researchers then used their data to analyze an admixed population to evaluate their results, with great success. Here’s a figure related to the paper, here is Dr. Petros Drineas’ lab website, and here’s the entire news release:

“A group of computer scientists, mathematicians, and biologists from around the world have developed a computer algorithm that can help trace the genetic ancestry of thousands of individuals in minutes, without any prior knowledge of their background. The team’s findings will be published in the September 2007 edition of the journal PLoS Genetics.

Unlike previous computer programs of its kind that require prior knowledge of an individual’s ancestry and background, this new algorithm looks for specific DNA markers known as single nucleotide polymorphisms, or SNPs (pronounced snips), and needs nothing more than a DNA sample in the form of a simple cheek swab. The researchers used genetic data from previous studies to perform and confirm their research, including the new HapMap database, which is working to uncover and map variations in the human genome.

“Now that we have found that the program works well, we hope to implement it on a much larger scale, using hundreds of thousands of SNPs and thousands of individuals,” said Petros Drineas, the senior author of the study and assistant professor of computer science at Rensselaer Polytechnic Institute. “The program will be a valuable tool for understanding our genetic ancestry and targeting drugs and other medical treatments because it might be possible that these can affect people of different ancestry in very different ways.”

Understanding our unique genetic makeup is a crucial step to unraveling the genetic basis for complex diseases, according to the paper. Although the human genome is 99 percent the same from human to human, it is that 1 percent that can have a major impact on our response to diseases, viruses, medications, and toxins. If researchers can uncover the minute genetic details that set each of us apart, biomedical research and treatments can be better customized for each individual, Drineas said.

This program will help people understand their unique backgrounds and aid historians and anthropologists in their study of where different populations originated and how humans became such a hugely diverse, global society.

Their program was more than 99 percent accurate and correctly identified the ancestry of hundreds of individuals. This included people from genetically similar populations (such as Chinese and Japanese) and complex genetic populations like Puerto Ricans who can come from a variety of backgrounds including Native American, European, and African.

“When we compared our findings to the existing datasets, only one individual was incorrectly identified and his background was almost equally close between Chinese and Japanese,” Drineas said.

In addition to Drineas, the algorithm was developed by scientists from California, Puerto Rico, and Greece. The researchers involved include lead author Peristera Paschou from the Democritus University of Thrace in Greece; Elad Ziv, Esteban G. Burchard, and Shweta Choudhry from the University of California, San Francisco; William Rodriguez-Cintron from the University of Puerto Rico School of Medicine in San Juan; and Michael W. Mahoney from Yahoo! Research in California.

Drineas’ research was funded by his National Science Foundation CAREER award.”

There’s still a long way to go, but this is a great start.

Non-Scientist Summary: A group of researchers used SNP (single nucleotide polymorphism) analysis to identify specific SNPs that are associated with a particular ancestry (for example, Caucasian, African American, Japanese, etc.). Using this information, they could test individuals with unknown ancestry for those SNPs, in effect characterizing their ancestry based on the SNPs that they possess.
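For the curious, the last step of that summary can be sketched in Python. This is a simple nearest-centroid rule swapped in for illustration, not the method used in the paper: it assumes each individual is described by allele counts (0, 1, or 2) at a handful of already-selected SNPs, and it assigns an unknown sample to the reference group whose average profile is closest. The group labels, the reference data, and the function names are all hypothetical.

```python
import math

# Hypothetical reference panels: allele counts at two selected SNPs.
REFERENCE_PANELS = {
    "panel_A": [[2, 2], [2, 1], [2, 2]],
    "panel_B": [[0, 0], [0, 1], [0, 0]],
}


def centroid(rows):
    """Average genotype profile of one reference group."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def distance(a, b):
    """Euclidean distance between two genotype profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_ancestry(sample, reference_panels):
    """Return the label of the reference group closest to the sample."""
    centroids = {label: centroid(rows) for label, rows in reference_panels.items()}
    return min(centroids, key=lambda label: distance(sample, centroids[label]))


print(assign_ancestry([2, 2], REFERENCE_PANELS))  # panel_A
print(assign_ancestry([0, 1], REFERENCE_PANELS))  # panel_B
```

A real analysis would use far more SNPs and would need to handle admixed individuals, who may sit between reference groups rather than close to one of them.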

13 Responses

  1. Hsien Lei 21 September 2007 / 9:35 am

    I wouldn’t say that genetic genealogy companies aren’t doing this type of research. AncestryByDNA also looks at ethnicity/biogeographical ancestry using SNPs. Some people may say their results are not accurate, but at least the company is continually innovating!

  2. Blaine Bettinger, Ph.D. 21 September 2007 / 9:45 am

    Very true, and a good point. As with all new technology, it takes quite a while to filter down to commercial availability. I just wonder if AncestryByDNA is concentrating on too few SNPs – their most advanced test so far only uses about 1,500 SNPs. Perhaps they have more in the works – I’d love to see what’s going on in their R&D department!

    And besides AncestryByDNA and 23andMe, are there any other Genetic Genealogy companies experimenting with SNP chips or the recent explosion of SNP information?

  3. Megan Smolenyak Smolenyak 21 September 2007 / 9:59 am

    All I can say is bring on the SNPs! I’m waiting on my hi-res Euro-DNA results now, but would love for us all to have as many options for as many SNPs as possible. 10k+?! Sign me up!

  4. Deeps 12 November 2007 / 4:11 am

    Yes, this is quite interesting. Since microarrays are affordable and a better technology, why aren’t they used in genealogy to cut the turnaround time? If I am not wrong, turnaround time (average 3-4 months) is the major concern for most genealogy customers.

    Does anyone know of any company/lab using microarray chips for genetic genealogy studies, especially for Y-STR and mtDNA studies?

    I am a keen follower of this technology and haven’t come across the technology trends that most of the companies are using in Genetic Genealogy.

  5. Blaine Bettinger 12 November 2007 / 7:45 am

    Deeps – I think the reason that genetic genealogy companies don’t use microarrays is that it’s not cost-effective for so few markers. Most of these chips test thousands or millions of SNPs, while genetic genealogy companies only screen a maximum of 67 markers.

  6. adam 1 December 2007 / 3:02 am

    Isn’t this just like what DNA Tribes do?
    Could it be that DNA Tribes are using the Petros method?

  7. Blaine Bettinger 1 December 2007 / 9:56 am


    You’re right, DNA Tribes uses autosomal markers to estimate ancestry, just as was done in this study. However, they test 14 markers by sequencing, rather than using a SNP chip. SNP chips can test thousands of markers.

    Affordable genome analysis is literally right around the corner (and already here if you can afford services like 23andMe and deCODEme). These types of low-cost tests will examine thousands and thousands of ancestral markers.

  8. adam 2 December 2007 / 3:21 am

    Wouldn’t it be better if they added more sequencing for other tribes and nations to the data? It would make the application slower but give better results.
    I hope they get financed to do that.

  9. Blaine Bettinger 2 December 2007 / 12:58 pm

    Absolutely – the more individuals, tribes, ethnicities that get tested and analyzed, the better the comparison.
