  • Matters Arising

Revisiting the use of structural similarity index in Hi-C

Matters Arising to this article was published on 05 December 2023

The Original Article was published on 19 October 2020

Fig. 1: Distributions of mean SSIM in Hi-C experiments.
Fig. 2: Intrinsic limitations of applying SSIM in Hi-C experiments.

Data availability

HIC files for both the DLBCL and healthy B cell datasets2 are available for download at https://github.com/vaquerizaslab/chess/tree/master/examples/dlbcl, while the raw FASTQ files can be accessed from the ArrayExpress archive under accession code E-MTAB-5875. COOL files for the reproduced and shuffled data can be accessed from https://github.com/hanjunlee21/StructuralSimilarity/tree/main/COOL and have been deposited at Zenodo (https://doi.org/10.5281/zenodo.7937194). HIC files for seven human cell types5,6 are available for download at the Gene Expression Omnibus under accession code GSE63525. FASTQ files for the GM12878 dataset are available for download at the Gene Expression Omnibus under accession code GSM2360314. The DNase I hypersensitivity assay dataset for GM12878 is available for download at https://www.encodeproject.org/experiments/ENCSR000EMT/. Source data are provided with this paper.

Code availability

All code required for the reproduction of our findings is available on GitHub (https://github.com/hanjunlee21/StructuralSimilarity) and has been deposited at Zenodo (https://doi.org/10.5281/zenodo.7937194). The HiCShuffle source code is publicly available at https://github.com/hanjunlee21/HiCShuffle and is indexed in PyPI as hicshuffle. The HiCShuffle source code has been deposited in Zenodo at https://doi.org/10.5281/zenodo.7937187. The CHESS source code1 is publicly available at https://github.com/vaquerizaslab/CHESS and is indexed in PyPI as chess-hic.

References

  1. Galan, S. et al. CHESS enables quantitative comparison of chromatin contact data and automatic feature extraction. Nat. Genet. 52, 1247–1255 (2020).

  2. Díaz, N. et al. Chromatin conformation analysis of primary patient tissue using a low input Hi-C method. Nat. Commun. 9, 4938 (2018).

  3. Van Der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).

  4. Ing-Simmons, E., Machnik, N. & Vaquerizas, J. M. Reply to: Revisiting the use of structural similarity index in Hi-C. Nat. Genet. https://doi.org/10.1038/s41588-023-01595-5 (2023).

  5. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  6. Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl Acad. Sci. USA 112, E6456–E6465 (2015).

  7. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).

  8. Müller, C. A. et al. The dynamics of genome replication using deep sequencing. Nucleic Acids Res. 42, e3 (2014).

  9. Van Steensel, B. & Belmont, A. S. Lamina-associated domains: links with chromosome architecture, heterochromatin, and gene repression. Cell 169, 780–791 (2017).

  10. Djekidel, M. N., Chen, Y. & Zhang, M. Q. FIND: difFerential chromatin INteractions Detection using a spatial Poisson process. Genome Res. 28, 412–422 (2018).

  11. Knight, P. A. & Ruiz, D. A fast algorithm for matrix balancing. IMA J. Numer. Anal. 33, 1029–1047 (2013).

  12. Wang, Z., Bovik, A. C., Sheikh, H. R. & Simoncelli, E. P. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004).

  13. Busby, M. A. et al. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics 12, 635 (2011).

Author information

Contributions

H.L., B.B., M.S.L. and T.S. conceptualized the study. H.L. designed the software and carried out the investigation. H.L. prepared and wrote the original draft of the manuscript. H.L., B.B., M.S.L. and T.S. reviewed and edited the draft. H.L., M.S.L. and T.S. supervised the study.

Corresponding authors

Correspondence to Hanjun Lee or Toshihiro Shioda.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Schematic of data shuffling.

To destroy any significant differences in chromatin contacts between the query and reference input files of CHESS, reads from the FASTQ files of Díaz et al.2 were shuffled to create hybrid FASTQ files containing identical fractions of reads from the DLBCL and NORMAL libraries. DLBCL, diffuse large B-cell lymphoma.

Extended Data Fig. 2 Assessment of the mean SSIM subtraction approach proposed by Ing-Simmons et al.4.

a, Distributions of mean SSIM in chromosome 2p for each chromatin-contact map comparison. The gray line indicates the subtracted mean SSIM value, defined as the difference between the mean SSIM value of diffuse large B-cell lymphoma versus healthy B cells (blue) and the mean SSIM value of the two shuffled datasets (red). b, Scatter plot of the relationship between the mean SSIM values of diffuse large B-cell lymphoma versus healthy B cells and the subtracted mean SSIM values (Pearson’s r = −0.012, P = 0.796; two-tailed test). DLBCL, diffuse large B-cell lymphoma; SSIM, structural similarity index measure.
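
The subtraction and correlation described in this legend can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the sign convention (observed minus shuffled) and the toy values are assumptions, and the inputs are taken to be per-region mean SSIM values that have already been computed.

```python
from math import sqrt


def subtracted_mean_ssim(observed, shuffled):
    """Per-region difference between two lists of mean SSIM values.

    observed: mean SSIM for DLBCL versus healthy B cells.
    shuffled: mean SSIM for the two shuffled datasets.
    The sign (observed - shuffled) is an assumption of this sketch.
    """
    if len(observed) != len(shuffled):
        raise ValueError("both inputs must cover the same regions")
    return [o - s for o, s in zip(observed, shuffled)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy values for illustration only; these are not data from the paper.
observed = [0.30, 0.42, 0.55, 0.61]
shuffled = [0.28, 0.45, 0.50, 0.66]
subtracted = subtracted_mean_ssim(observed, shuffled)
r = pearson_r(observed, subtracted)
```

A correlation near zero between the observed and subtracted values, as reported in panel b, would mean that the subtracted value is largely determined by the shuffled comparison rather than by the observed one.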

Extended Data Fig. 3 Assessment of the heuristic approach proposed by Ing-Simmons et al.4.

a, Scatter plots of three key metrics (mean SSIM, inverse of the Fano factor, and mean absolute fold change). Magenta dots indicate regions that passed the heuristically defined thresholds proposed by Ing-Simmons et al.4 (bottom 10th percentile for mean SSIM and 90th percentile for the Fano factor), while gray dots indicate regions that failed the thresholds. For each group, three representative regions were selected for further analyses (panels 1–6). b, Chromatin-contact maps for panels 1–6. Regions that passed the heuristically defined thresholds exhibited shallow read coverage and showed limited evidence of differential chromatin contact. DLBCL, diffuse large B-cell lymphoma; SSIM, structural similarity index measure.
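
A percentile-based filter of the kind described in this legend might look like the sketch below. It is illustrative only: the nearest-rank percentile definition, the function names, and the direction of the second cutoff (at or above the 90th percentile of the supplied metric) are assumptions, and the per-region metrics are taken as precomputed inputs.

```python
from math import ceil


def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a non-empty list.

    The nearest-rank definition is an assumption of this sketch.
    """
    ordered = sorted(values)
    rank = max(1, ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def passes_thresholds(mean_ssim, fano, ssim_q=10, fano_q=90):
    """Flag regions with low mean SSIM and a high second metric.

    mean_ssim, fano: per-region values in the same order.
    A region passes when its mean SSIM is at or below the ssim_q-th
    percentile and its second metric is at or above the fano_q-th
    percentile (the direction of this cutoff is assumed).
    """
    ssim_cut = percentile(mean_ssim, ssim_q)
    fano_cut = percentile(fano, fano_q)
    return [s <= ssim_cut and f >= fano_cut for s, f in zip(mean_ssim, fano)]
```

Regions flagged by such a filter would correspond to the magenta dots in panel a, and the unflagged regions to the gray dots.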

Extended Data Fig. 4 Schematic of data shuffling using HiCShuffle.

HiCShuffle is Python-based software that is indexed in PyPI as hicshuffle. HiCShuffle generates four GZIP-compressed shuffled FASTQ files for paired-end experiments. Each output FASTQ file contains half of the query FASTQ file and half of the reference FASTQ file. HiCShuffle is compatible with both FASTQ and GZIP-compressed FASTQ formats and with UNIX-based systems.
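
The shuffling scheme in this legend can be illustrated with a short sketch. This is not the HiCShuffle implementation or its interface: all function names and the in-memory approach are assumptions chosen for brevity, and a real tool would need to stream large files rather than load them.

```python
import gzip
import random
from itertools import islice


def read_fastq(path):
    """Yield 4-line FASTQ records from a plain or GZIP-compressed file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            record = list(islice(handle, 4))
            if not record:
                break
            # Normalize line endings so records can be concatenated safely.
            yield "".join(line.rstrip("\n") + "\n" for line in record)


def split_pairs(r1_records, r2_records, rng):
    """Randomly split read pairs into two halves, keeping mates together."""
    pairs = list(zip(r1_records, r2_records))
    rng.shuffle(pairs)
    half = len(pairs) // 2
    return pairs[:half], pairs[half:]


def write_pairs(pairs, r1_path, r2_path):
    """Write mates to two GZIP-compressed FASTQ files in matching order."""
    with gzip.open(r1_path, "wt") as out1, gzip.open(r2_path, "wt") as out2:
        for r1, r2 in pairs:
            out1.write(r1)
            out2.write(r2)


def shuffle_libraries(query, reference, out_a, out_b, seed=0):
    """Create two hybrid paired-end libraries (four output files).

    query, reference, out_a, out_b: (R1 path, R2 path) tuples.
    Each output receives half of the query pairs and half of the
    reference pairs.
    """
    rng = random.Random(seed)
    query_a, query_b = split_pairs(
        read_fastq(query[0]), read_fastq(query[1]), rng
    )
    ref_a, ref_b = split_pairs(
        read_fastq(reference[0]), read_fastq(reference[1]), rng
    )
    write_pairs(query_a + ref_a, *out_a)
    write_pairs(query_b + ref_b, *out_b)
```

Splitting at the level of read pairs rather than individual reads keeps the R1 and R2 files of each hybrid library in the same order, which paired-end mapping generally requires.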

About this article

Cite this article

Lee, H., Blumberg, B., Lawrence, M.S. et al. Revisiting the use of structural similarity index in Hi-C. Nat Genet 55, 2049–2052 (2023). https://doi.org/10.1038/s41588-023-01594-6

  • DOI: https://doi.org/10.1038/s41588-023-01594-6
