In 1915, Thomas Morgan published the book The Mechanism of Mendelian Heredity, summarizing his work with fruit flies and the revolutionary conclusion that genes are arranged in a linear fashion. Ninety years later, biologists are still obsessed with that linear arrangement, and correlating phenotypes with genotypes is the focus of entire institutes.

Thanks to the confluence of the genome-sequencing and digital revolutions, the genotyping craze is now entering a new phase. Just as printing circuit components on silicon chips spawned an entire branch of engineering, printing nucleic acids on chips is creating a completely new approach to biology.

The Eureka trio: a set of chips that support Illumina's whole-genome genotyping with their Infinium assay. (Courtesy of Illumina.)

Geneticists are already switching from time-consuming gels and microsatellite analysis to quick and accurate chip-based systems. Meanwhile, array-based tools are finally becoming available for sophisticated association studies, or 'genetics without families', which some experts claim will revolutionize human gene discovery. Laboratory tinkerers have also hacked some of the new gene chips into platforms for analyzing the copy numbers of genes, revealing surprising new levels of complexity in genetic regulation.

In the laboratory, researchers learning how to use these new tools face a bewildering array of options. Besides a well-designed experiment, genotyping now requires a careful analysis of the available technologies, most of which are optimized for particular types of projects.

Guilt by association

As a result of the publicly funded HapMap initiative (Box 1), researchers around the world now have access to a database of hundreds of thousands of single nucleotide polymorphisms (SNPs), single base differences in the human genome that can be used as genetic markers. Traditional genetics can be done with far fewer markers, but the main goal of HapMap was to allow an entirely new type of gene mapping.

“In family-based studies, you're looking at a couple of generations and really trying to track large chunks of chromosomes through the pedigree, and if you coinherit consistently a chunk of chromosome and a trait, then there must be something on that chromosome,” explains Dietrich Stephan, director of the neurogenomics division at the Translational Genomics Research Institute (TGEN; Phoenix, Arizona, USA).

Unfortunately, human geneticists must rely on 'found experiments', in which families with well-characterized pedigrees have also developed well-defined genetic diseases. Complex traits with environmental and genetic components, like heart disease, cancer and mental illness, have been notoriously difficult to study this way.

The completion of the HapMap now permits researchers to test a bold new approach. An association study entails “looking at the population as a whole as one humongous family, with the initial founders being Adam and Eve, or whoever,” says Stephan. At TGEN, a nonprofit institute that focuses on association studies, Stephan and his colleagues use new DNA chips from Affymetrix exclusively. The chips contain 500,000 SNPs apiece, so a single chip-based experiment can characterize a patient's entire genotype at high resolution.
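To make the idea concrete, the sketch below shows one simple way a single SNP could be tested for association: compare allele counts in cases and controls with a 2x2 chi-square test. This is an illustration under stated assumptions, not the analysis used at TGEN or by any vendor; the function names and the allele counts are hypothetical.

```python
import math


def allelic_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table
    of allele counts: rows are cases/controls, columns are alleles A/B."""
    table = [[case_a, case_b], [control_a, control_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_a + control_a, case_b + control_b]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: 100 cases and 100 controls, two alleles per person.
chi2 = allelic_chi_square(case_a=120, case_b=80, control_a=90, control_b=110)
print(f"chi-square = {chi2:.2f}, p = {p_value_1df(chi2):.4f}")
```

On a chip with hundreds of thousands of SNPs, this test would be repeated for every marker, so the significance threshold has to be made far stricter than 0.05; a Bonferroni correction (dividing the threshold by the number of SNPs tested) is one common, conservative choice.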

Affymetrix pioneered DNA chip technology, but in the past few years other companies have rushed into the field, often with highly sophisticated products for SNP analysis (Box 2). Perlegen and Illumina, for example, were both heavily involved in the HapMap project itself, and are continuously developing new tools for the emerging market of association studies.

Illumina's GoldenGate genotyping assay “was used for over 60% of the HapMap project. More recently, we just launched BeadChip, which is for whole-genome genotyping assays,” says Sarah Murray, genotyping manager at Illumina (San Diego, USA). The company's 109,000-SNP system is already on the market, and Murray says they are developing a 250,000-SNP version of the assay as well.

Even the planned 250,000-SNP version would carry only half as many SNPs as Affymetrix's top-end chip, but Illumina argues that sheer SNP quantity is not the most relevant standard for choosing a genotyping system. “We chose to define content to be SNPs in and around genes, [because] you're more likely to find association near a gene,” says Murray, adding that the 109,000-SNP product has “greater than 99.9% reproducibility, greater than 99.9% call rates, [and] the inconsistency rate is extremely low.”

Playing the odds

Besides choosing a technological platform, another challenge in designing an association study in humans is determining the minimum number of samples. Statisticians suggest that a set of only 200 people should be enough to characterize a disease caused mainly by a single gene variation. More complex traits will require more patients in the study.

As clinical research goes, association studies should be relatively cheap, even with large numbers of patients. Taking blood samples and extracting genomic DNA are simple enough procedures. The main expenses involve the SNP genotyping technology itself, but researchers are already finding new ways to economize.

Blue beads: the beads in wells that form the basis of Illumina's technology platform. (Courtesy of Illumina.)

“There are strategies where you can pool samples and get similar information ... on smaller chips,” says Stephan, adding that “this is the difference between spending $2 million on a study and spending $20,000.”

Even as the tools become cheaper and easier to use, though, many geneticists remain unconvinced that association studies will live up to their hype. Though the concept has been proven in relatively simple conditions involving mutations in one or a few genes, the real test will be the characterization of highly complex, multigene traits. Skeptics argue that the number of patients required for an association study of heart disease, for example, would exceed the bounds of any research grant, and that the approach will only work on simpler, much rarer conditions.

HapMap believers are confident they will soon be vindicated. Stephan advises fence-sitters to watch the SNP mapping literature closely, adding that the results from several pilot studies have already provided ample proof that the system will work on a larger scale.

All in the family

Outside of association studies, human geneticists today primarily rely on a strategy that would have been perfectly familiar to Morgan: linkage analysis. In the decades since the molecular biology revolution began, linkage analysis has mostly meant finding suitable test populations, then mapping the trait of interest based on its proximity to microsatellite sequences.

Results of allelic dosage analysis performed on a genotyping microarray, showing copy number variation across the genome in two cell lines. Top and middle show an increase in copy number at the MYC oncogene locus in one cell line. Bottom, a cell line that harbors a homozygous deletion at the CDKN2A (p16) locus has decreased copy number at this locus. (Reprinted from ref. 3. Copyright 2005, with permission from Elsevier.)

In the past few years, DNA chips with a carefully selected subset of SNPs have almost completely supplanted the often-painful gel electrophoresis marathon of microsatellite mapping. Most of the major DNA array manufacturers now offer simple, turnkey SNP-based systems for rapid genotyping in linkage studies. With a defined pedigree, the arrays can be orders of magnitude smaller, and the latest trend is to run multiple samples at once in multiplexed, automated systems.

Not only is SNP-based linkage analysis faster and easier than microsatellite mapping, it is also more precise. Citing work her company published a year ago (ref. 1), for example, Murray concludes that “you get more information using a SNP panel than using a microsatellite panel.”

Comparative hacking

Whenever a new technology enters the laboratory, the first instinct of hands-on experimentalists is to throw away the manual and see what the new toy can really do. DNA genotyping arrays were no different.

Production run of the Affymetrix chip. (Courtesy of TGEN.)

Soon after the first 10,000-SNP arrays became available, researchers discovered that they could be used for comparative genomic hybridization (CGH), a technique that reveals how many copies of a given gene are actually in a sample. The rise of this new method took chip makers completely by surprise.

“We did not anticipate this, but they've proven to be very useful in this application,” says Keith Jones, vice president for molecular genetics at Affymetrix (Santa Clara, California, USA). Jones adds that “now that we have sort of clued in that they're working well, we are thinking about next-generation designs that are more quantitative, higher resolution, and yet continue to include the allelic information that you get from SNP genotyping arrays.”

For any laboratory with access to DNA chips and readers, the method is essentially the same for CGH as for regular genotyping. The main difference is in data analysis, since the copy number of a gene correlates with the intensity of the signal where it hybridizes to the array. By comparing a test sample with a sample that is known to be diploid, the experimenter can determine the ploidy of the test sample.
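The comparison described above can be sketched in a few lines. The example assumes, as a simplification, that background-corrected signal intensity scales linearly with copy number and that the reference sample carries two copies at every locus; real arrays need normalization and saturate at high signal. The function names, thresholds and intensity values are hypothetical.

```python
import math
import statistics


def region_log2_ratio(test_intensities, reference_intensities):
    """Median log2(test/reference) over the probes covering one region."""
    if len(test_intensities) != len(reference_intensities):
        raise ValueError("test and reference must cover the same probes")
    ratios = []
    for test, ref in zip(test_intensities, reference_intensities):
        if test <= 0 or ref <= 0:
            raise ValueError("intensities must be positive")
        ratios.append(math.log2(test / ref))
    return statistics.median(ratios)


def estimate_copies(log2_ratio, reference_copies=2):
    """Copy number implied by a log2 ratio against a diploid reference."""
    return reference_copies * 2 ** log2_ratio


def call_state(log2_ratio, gain_threshold=0.3, loss_threshold=-0.3):
    """Label a region as gain, loss or neutral (thresholds are illustrative)."""
    if log2_ratio >= gain_threshold:
        return "gain"
    if log2_ratio <= loss_threshold:
        return "loss"
    return "neutral"


# Hypothetical probe intensities for one region in a test and a diploid sample.
test = [1480.0, 1530.0, 1510.0]
reference = [1000.0, 1000.0, 1000.0]
ratio = region_log2_ratio(test, reference)
print(call_state(ratio), round(estimate_copies(ratio), 2))
```

Taking the median across several neighboring probes, rather than trusting any single probe, is one simple way to dampen probe-to-probe noise.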

The most obvious application is in cancer biology, as tumor cells may rearrange or lose chromosomes. Losing regulatory genes in this way can turn a small, slow-growing tumor into an aggressive malignancy. “You can use the array both to classify tumors and to understand the pathophysiology behind them,” says Jones.

Because the Affymetrix arrays now used for CGH were originally developed for genotyping, the strategy can also uncover some surprises. For example, cells sometimes adapt to the loss of one chromosome of a homologous pair by duplicating the other chromosome. The cell then contains a normal-looking pair of homologous chromosomes, but the genotyping array reveals that both copies are from the same lineage, a condition called uniparental disomy. If the lost chromosome contained the only wild-type copy of a gene, and the replicated chromosome contained a mutant copy, the resulting cell will now have the homozygous mutant genotype even though the subject's germline genotype is heterozygous.
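A rough sketch of how genotype calls and copy number might be combined to flag the duplicated-chromosome scenario described above: a region where the germline sample is heterozygous, the tumor sample is homozygous, and the copy number is still about two. The function name, the genotype encoding ("AA", "AB", "BB") and the 90% cutoff are assumptions made for illustration only.

```python
def classify_region(germline, tumor, copy_number, min_loh_fraction=0.9):
    """Classify one chromosomal region from paired genotype calls.

    germline, tumor: genotype calls ("AA", "AB" or "BB") for the same SNPs.
    copy_number: estimated copy number of the region in the tumor sample.
    """
    if len(germline) != len(tumor):
        raise ValueError("germline and tumor must cover the same SNPs")
    # Only SNPs that are heterozygous in the germline can reveal a loss.
    informative = [t for g, t in zip(germline, tumor) if g == "AB"]
    if not informative:
        return "uninformative"
    lost = sum(1 for t in informative if t in ("AA", "BB"))
    if lost / len(informative) < min_loh_fraction:
        return "heterozygosity retained"
    rounded = round(copy_number)
    if rounded == 2:
        return "copy-neutral LOH"  # both copies from the same lineage
    if rounded < 2:
        return "LOH with copy loss"
    return "LOH with copy gain"


# Hypothetical calls for four SNPs in one region.
print(classify_region(["AB", "AA", "AB", "BB"], ["AA", "AA", "BB", "BB"], 2.0))
```

A dedicated copy-number assay would report this region as normal, which is why the allelic information from a genotyping array matters here.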

Though CGH surprised chip makers initially, they are now embracing the new market and offering a variety of products. Some of these are simply more carefully tailored versions of genotyping arrays, allowing simultaneous genotyping and CGH analysis, but others are dedicated systems that only assess gene copy number.

“People are asking us for a straight CGH approach; ... for whatever reason, they find it's better to have a dedicated assay, and they find there [are] improvements in performance or sample preparation methods using a dedicated copy-number assay,” says Emile Nuwaysir, vice president for business development at NimbleGen (Madison, Wisconsin, USA).

Indeed, each array manufacturer seems to be taking a slightly different approach to the CGH field, trying to fit the strengths of their specific technologies into the rapidly developing market niches. NimbleGen's products may be an especially good deal for researchers who only need a small number of custom-built genotyping or CGH arrays. “We can create any microarray instantaneously, and all we require is sequence input,” says Nuwaysir.

CGH has also gotten a makeover as a new method for DNA sequencing (ref. 2). The technique, called comparative genome sequencing (CGS), begins with a high-resolution CGH experiment, and then proceeds through additional hybridizations that ultimately yield detailed sequence information. “We can sequence and identify all of the changes in a microbial genome in just a few hybridizations,” says Thomas Albert, director of molecular research at NimbleGen and lead author on the study.

So far, CGS appears to work on genomes up to 25 megabases, which would allow it to be used on protozoans like the malaria-causing Plasmodium falciparum. Albert concedes that the technique may have an upper limit that would preclude sequencing larger genomes, but he and his colleagues have not found it yet.

As with any emerging technology, the early adopters are still finding new uses for genotyping systems, and facing new challenges as they test the boundaries of the present platforms. Whether the project is an association study, linkage analysis, CGH or CGS, the array-based tools continue to get cheaper, faster and more accurate. The HapMap project will have officially ended phase 1 by the time this article goes to press, but with new systems and applications still being developed, the era of rapid genotyping is clearly just getting started. (See Table 1)

Table 1 Suppliers guide: companies offering DNA array technologies