In the last decade, DNA microarrays, by offering an unprecedented wealth of information on gene expression, have changed the pace of research. From a vehicle to analyze expression patterns of relatively few genes, microarrays have evolved into indispensable tools for scientists in biology and medicine. Cancer researchers, for example, are using microarrays to define expression profiles in tumors with the goal of discovering disease mechanisms, making diagnoses and selecting treatments. The more data are obtained from individual studies, the greater the power of the collective dataset, provided, and this is the great caveat, that these data can be meaningfully compared. Unfortunately, experiments on similar biological material often yield different results, as demonstrated in examples given by Gavin Sherlock in his News and Views article (p. 329). This lack of comparability has caused much frustration and even contention among microarray users and has left many with a major question: are such discrepancies inherent to the technology or can they be overcome?

An answer will likely be found only as a result of joint efforts by researchers in many laboratories, and in this issue, three reports illustrate such endeavors. The authors systematically analyze possible causes for discordant results and suggest practices for data comparison. These three papers collectively represent the efforts of researchers in 17 labs, using over 15 microarray platforms, and show different approaches to addressing the problem of reproducibility. In the first study (p. 337), John Quackenbush and his team evaluated whether the same biologically meaningful response could be achieved using two substantially different microarray platforms. In the second study (p. 345), ten research groups, coordinated by Rafael Irizarry, investigated the degree to which different labs or different array platforms contribute to data variability. To be true to a real-life situation, they did not standardize protocols, but evaluated the disparity when each group used the procedures it was most comfortable with. The third study (p. 351) is the effort of the Toxicogenomics Research Consortium, launched by Brenda Weis in 2001 to establish and share large gene expression profiles. This group sought to standardize protocols and define 'best practices' in analysis, to allow comparability of their data.

Together, these reports sound a note of guarded optimism. Contrary to some reports in the literature, they indicate that microarray data can indeed be reproducible and comparable between different platforms and labs. Rather than endorsing one specific protocol or array platform, the authors want to provide guidelines that will improve data quality. They advise that researchers should carefully choose methods for sample preparation and data analysis, depending on the biological question, and standardize protocols as much as possible if comparative studies are planned. They caution that comparing new data to existing results requires detailed information about the individual experiments so that, ideally, raw data can be analyzed with the same statistical methods.

Calls for more transparency in data documentation are not new. In 2001, the microarray community launched a collaborative effort to standardize how array data are reported. It established the minimum information about a microarray experiment (MIAME) standards, which require reporting of specifications of experiment design, sample treatment, hybridization conditions, data acquisition and normalization (Nature 419, 323; 2002). Some commercial array manufacturers have also responded to this need for information by making details of array probes available.

All of these efforts are stepping stones toward global data comparison; however, much remains to be done. Among the things most needed is a universal RNA standard to normalize intensity readings on different arrays. The External RNA Standard Consortium, sponsored by the US National Institute of Standards and Technology, is in the process of establishing such a standard. This group, comprising representatives from government, industry and academia, has agreed on the characteristics of the reference material that would span various transcripts at different concentrations. They are now working on making it available to scientists conveniently and at a reasonable cost. Other items on the wish list of microarray users include improved annotation of genomes and commercial arrays and better compliance with MIAME standards.

The hard work by members of the microarray community in recent years, including the reports in this issue, is testimony to the importance of this technology for the future of biomedical research. This development allows cautious optimism but also calls for continued collective efforts to address limitations so that microarrays can deliver on their tantalizing promise.