
Functional genomic annotation initiatives, such as the ENCODE project, are churning out an incredible number of regulatory elements in the human genome. The question that emerges now is how each of these elements can be connected to its target genes. Indeed, regulatory elements in mammalian systems are capable of acting on genes located far away on the same chromosome or even on a different chromosome.

A few years ago, Job Dekker, now at the University of Massachusetts Medical School, developed the chromosome conformation capture assay, known as 3C, precisely for the purpose of identifying physical associations between distant genomic regions. The principle of 3C is to capture these physical interactions by cross-linking associated regions of the genome (Fig. 1). By a remarkable molecular haute couture process, the cross-linked fragments are then cut with a restriction enzyme and stitched together to form a library of ligation products, each representative of a physical interaction and detectable by PCR. The problem with 3C is that pairs of interacting elements have to be tested one by one, which limits the scope of analysis.

Figure 1: The 3C method captures physical associations between distal genomic regions using cross-linking followed by restriction enzyme (RE) digestion and ligation (1), then cross-link reversal (2). A 3C library can now be copied and amplified into a 5C library by multiplex hybridization of 5C primers (3) and ligation-mediated amplification (4). Adapted from Dostie et al., 2006.

To simultaneously interrogate a 3C library for many interactions, why not make... carbon copies? This is the principle of 5C, which stands for 3C carbon copy, the new method from the Dekker laboratory, recently published in Genome Research.

The 'reproduction process', as you will have guessed, is more complicated than your usual trip to the Xerox machine. It relies on ligation-mediated amplification and careful design of primers at the cleavage sites of the restriction enzyme used to prepare the 3C library. A large number of these 5C primers are hybridized to the 3C library. When a 3C template allows them to do so, two 5C primers bind in close proximity and are ligated to each other; then the product is amplified by PCR (Fig. 1). The resulting 5C library can be further amplified and analyzed using a high-throughput readout such as microarray or 454 sequencing.
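For readers who think in code, the copying step can be pictured with a toy model. This is only a sketch under simplifying assumptions, not the authors' protocol or analysis software: each 3C ligation product is reduced to a pair of restriction fragment names, every fragment carries either a forward or a reverse 5C primer (never both), and a 5C product is counted whenever a forward and a reverse primer sit on the two sides of the same junction. The function name `make_5c_library` and the fragment names are hypothetical.

```python
from collections import Counter


def make_5c_library(junctions, forward, reverse):
    """Count forward/reverse primer pairs that could be ligated on 3C junctions.

    junctions: iterable of (fragment_a, fragment_b) pairs, one per 3C ligation product
    forward, reverse: disjoint sets of fragment names covered by forward and
                      reverse 5C primers (an assumption of this toy model)
    """
    products = Counter()
    for a, b in junctions:
        # A junction can be read in either orientation.
        for left, right in ((a, b), (b, a)):
            if left in forward and right in reverse:
                products[(left, right)] += 1
    return products


# Hypothetical 3C library: fragment names are made up for illustration.
junctions = [
    ("frag1", "frag7"),
    ("frag7", "frag1"),   # same interaction, opposite orientation
    ("frag1", "frag9"),
    ("frag3", "frag4"),   # no 5C primers on either fragment, so never copied
]
library = make_5c_library(junctions, forward={"frag1"}, reverse={"frag7", "frag9"})
for pair, count in sorted(library.items()):
    print(pair, count)
```

In this picture, junctions without a matching primer pair simply leave no trace in the 5C library, which is why the primer design determines which interactions can be interrogated.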

The new method can be used in several configurations. One can ask how one or a handful of 'fixed' elements interact with the neighboring chromatin, which the authors demonstrated on a 400-kb region containing the well-characterized β-globin locus. Alternatively, 5C is amenable to mapping a network of interactions between two large sets of elements. Dekker's group performed a proof-of-principle experiment examining a 100-kb conserved gene desert region, and they now plan to establish such maps for the ENCODE regions.

This is where the readout by sequencing comes in handy. Even for a single small chromosome, the number of possible interactions between its genes and regulatory elements can easily reach several million. “By sequencing,” explains Dekker, “you will only detect those interactions that are actually occurring, but with a microarray you would have to represent each possible combination.” Capitalizing on new sequencing technologies such as 454, which works optimally on short reads, Dekker's group specifically designed the 5C method so that short reads would be sufficient.
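The arithmetic behind "several million" is easy to check. The sketch below assumes the simplest case, in which every gene could in principle pair with every regulatory element; the counts used are hypothetical round numbers chosen for illustration, not figures from the study.

```python
def possible_pairs(n_genes: int, n_elements: int) -> int:
    """Number of distinct gene-element pairs a microarray would have to represent."""
    return n_genes * n_elements


# Hypothetical counts for one small chromosome (illustrative only).
n_genes = 500
n_elements = 5_000

print(possible_pairs(n_genes, n_elements))  # 2500000
```

A microarray would need a feature for each of these pairs whether or not the interaction exists, whereas a sequencing readout only spends reads on the pairs that were actually ligated.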

“With sequencing methods allowing you to read 100,000 or even 1 million sequences,” explains Dekker, “this really opens up the possibility to map this network of interactions that we believe occur in the genome... linking things that are far apart on a chromosome, but actually are functionally related and physically associated.”