A flexible cross-platform single-cell data processing pipeline

Battenberg, Kai; Kelly, S. Thomas; Ras, Radu Abu; Hetherington, Nicola A.; Hayashi, Makoto; Minoda, Aki

doi:10.1038/s41467-022-34681-z

Article
Open access
Published: 11 November 2022

Nature Communications volume 13, Article number: 6847 (2022)

Abstract

Single-cell RNA-sequencing analysis, which quantifies the RNA molecules in individual cells, has become popular because it yields a large amount of information from each experiment. We introduce UniverSC (https://github.com/minoda-lab/universc), a universal single-cell RNA-seq data processing tool that supports any unique molecular identifier-based platform. Our command-line tool, docker image, and containerised graphical application enable consistent and comprehensive integration, comparison, and evaluation across data generated from a wide range of platforms. We also provide a cross-platform application to run UniverSC via a graphical user interface, available for macOS, Windows, and Linux Ubuntu. This removes data processing, one of the bottlenecks of single-cell RNA-seq analysis, as an obstacle for researchers who are not bioinformatically proficient.

Introduction

Single-cell genomics technologies have driven a recent surge in studies of cellular heterogeneity. Cell throughput has increased over the years, and current single-cell RNA-seq (scRNA-seq) technologies, some of which are commercially available, can routinely generate data for thousands to hundreds of thousands of cells in a single experiment. This increase in throughput has made it possible for researchers to apply scRNA-seq to a whole range of tissues as well as whole organisms¹,²,³. It is expected that scRNA-seq will become more accurate, more reliable, and cheaper per cell, becoming feasible for a wide range of studies as the technology matures⁴. However, many biologists still find it difficult to process the data once it has been generated. Furthermore, as scRNA-seq datasets generated on different platforms are deposited by labs worldwide, a unified tool is needed to integrate these dispersed, publicly available datasets by processing them in the same manner and with the same parameters.

In this work, we have developed a data processing tool called UniverSC that will aid in democratising single-cell RNA-seq technology by providing the community, especially biologists who are not familiar with bioinformatics, with a user-friendly tool to process scRNA-seq data generated by any platform.

Results

UniverSC runs Cell Ranger on scRNA-seq data of any platform

A common workflow for many of the scRNA-seq technologies involves capturing individual cells, either in gel emulsion with beads or in wells, followed by the addition of a unique molecular identifier (UMI) to RNA molecules, which makes the measurement quantitative. Leveraging the observation that most scRNA-seq technologies utilise the same concept of cell barcodes and UMIs, we developed UniverSC, a shell utility that operates as a wrapper for Cell Ranger (10x Genomics) and can handle datasets generated by a wide range of single-cell technologies. Cell Ranger was chosen as a unifying pipeline for several reasons: 1) it is optimised to run in parallel on a cluster, 2) many labs working on single-cell analysis are likely to already be familiar with the outputs, 3) many tools have already been released for downstream analysis of the output format due to its popularity, 4) the rich summary information and post-processing is useful for further optimisation and troubleshooting if necessary, and 5) the latest open-source release (version 3.0.2) has been optimised further by adapting open-source techniques, such as the third-party EmptyDrops algorithm⁵ for cell calling or filtering, which does not assume thresholds specific for the Chromium platform (10x Genomics).

UniverSC, which is freely available at GitHub and at DockerHub, can be run on any Unix-based system with the command-line interface. It can also be run on Ubuntu, macOS, and Windows with a graphical user interface (GUI), eliminating the need to install or configure separate pipelines for each platform. The GUI comes with a function to show the command used for each run, as well as a function to generate reference files. Conceptually, UniverSC carries out its entire process in seven steps (Fig. 1). Given a set of paired-end sequence files in FASTQ format (R1 and R2), a genome reference (as required by Cell Ranger), and the name of the selected technology, UniverSC reformats the whitelist barcodes and sequence files to fit what is expected by Cell Ranger. Additionally, UniverSC provides a file with summary statistics, including the mapping rate, assigned/mapped read counts and UMI counts for each barcode, and averages for the filtered cells. Sequence trimming based on adapter contamination or sequencing quality is not included in the pipeline and no trimming is required to pass files to UniverSC. However, trimming is highly recommended, particularly on R2 files from Illumina platforms, as this generally improves the mapping quality. Because only Read 2 is trimmed, this requires careful data handling to ensure that all Read 1 and Read 2 records remain strictly paired. For convenience, we provide a script that filters Read 1 and Read 2 by the quality scores of Read 2 and avoids mismatching cell barcodes. In principle, UniverSC can be run on any droplet-based or well-based technology (see the software documentation and Table 1 for more details). Settings can also be restored to run on Chromium samples, as changes made to the Cell Ranger installation by UniverSC are reversible.
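The pairing constraint described above can be illustrated with a minimal Python sketch. This is not the script distributed with UniverSC; the function names, the mean-quality criterion, the threshold of 20, and the Phred+33 encoding are all assumptions made for illustration. The sketch drops a read pair whenever Read 2 fails the quality check and never modifies Read 1, so barcodes and UMIs stay intact and both outputs remain in the same order.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str, str]  # header, sequence, separator, quality


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, separator, quality) from FASTQ lines."""
    it = iter(lines)
    for header in it:
        seq, sep, qual = next(it, ""), next(it, ""), next(it, "")
        yield (header.strip(), seq.strip(), sep.strip(), qual.strip())


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def read_name(header: str) -> str:
    """Read identifier without the mate suffix or trailing comment."""
    name = header.split()[0] if header else ""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def filter_pairs_by_r2(r1_lines: Iterable[str], r2_lines: Iterable[str],
                       min_mean_q: float = 20.0):
    """Keep only pairs whose Read 2 passes the (hypothetical) threshold."""
    kept_r1: List[Record] = []
    kept_r2: List[Record] = []
    for rec1, rec2 in zip(parse_fastq(r1_lines), parse_fastq(r2_lines)):
        # Refuse to continue if the two files have drifted out of sync.
        if read_name(rec1[0]) != read_name(rec2[0]):
            raise ValueError(f"Unpaired records: {rec1[0]!r} vs {rec2[0]!r}")
        # The decision uses Read 2 only; Read 1 is kept or dropped with it.
        if mean_phred(rec2[3]) >= min_mean_q:
            kept_r1.append(rec1)
            kept_r2.append(rec2)
    return kept_r1, kept_r2
```

In practice the two inputs would be open file handles for the R1 and R2 FASTQ files. Filtering whole pairs, rather than trimming each file independently, is what keeps the record order identical across the two outputs.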

Table 1 Technologies currently available and settings used by UniverSC

The current release of UniverSC has pre-set parameters for 40 different technologies (Table 1). Further technologies can be used with custom input parameters for any barcode and UMI lengths or by requesting a feature to be added to the GitHub repository. Testing datasets for the following settings are provided: Chromium version 2 and 3 (default), Drop-seq, ICELL8, inDrops-v3, SCI-RNA-Seq, and SmartSeq3.

UniverSC enables cross-platform single-cell data integration

We demonstrate how our method compares to other data processing pipelines using published datasets. Drop-seq is an example of a droplet-based single-cell technology that does not have known barcodes⁶, thus a whitelist of permutations was generated for compatibility. ICELL8 is a well-based technology that has a known barcode whitelist and allows selecting subsets of wells by known barcodes⁷. SmartSeq3 is also a well-based technology that utilises dual indexing and full-length RNA-sequencing⁸. Together with Chromium, these represent several different classes of technologies with different configurations for processing cell barcodes. To assess the degree of similarity between UniverSC and the pipelines for these four technologies (Chromium, Drop-seq, ICELL8, and SmartSeq3), both UniverSC and the pipeline used in the original publication of the technique were run on datasets of human cell lines. Specifically, the following pipelines were compared to UniverSC: Cell Ranger (version 3.0.2) (10x Genomics) for Chromium data, dropSeqPipe (version 0.6)⁹ for Drop-seq data, CogentAP (version 1.0) (Takara Bio Inc.) for ICELL8 data, and zUMIs (version 2.9.7)¹⁰ for SmartSeq3 data. Our results show high correlation between the gene-barcode matrices (GBMs) generated by UniverSC and the coupled pipelines (r = 1, i.e. identical, with Cell Ranger 3.0.2 and r of 0.94 or higher for the three other sets of GBMs; Fig. 2). Correspondingly, clustering results were highly similar based on the high Adjusted Rand Index (ARI) (1 for Chromium, 0.78 for Drop-seq, 0.87 for ICELL8, and 0.78 for SmartSeq3 data, Fig. 2). In the case of UniverSC compared to zUMIs, we do not see a 1-to-1 relationship in UMI counts, despite a high correlation and a high ARI. This is likely due to differences in data handling between the two pipelines. While UniverSC discards all multi-mapping reads for UMI counting (a function of Cell Ranger), zUMIs includes primary alignments of multi-mapping reads, leading zUMIs to report higher UMI counts than UniverSC. However, the ARI value upon clustering remains high (Fig. 2).
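For a technology without a known barcode list, a whitelist of every possible barcode can be enumerated. The following is a minimal Python sketch of that idea, not the code used by UniverSC; the function names and the output layout (one barcode per line) are assumptions, and the barcode length is left as a parameter.

```python
from itertools import product
from typing import Iterator


def all_barcodes(length: int, alphabet: str = "ACGT") -> Iterator[str]:
    """Lazily yield every possible barcode of the given length."""
    for bases in product(alphabet, repeat=length):
        yield "".join(bases)


def write_whitelist(path: str, length: int) -> int:
    """Write one barcode per line and return how many were written."""
    count = 0
    with open(path, "w") as handle:
        for barcode in all_barcodes(length):
            handle.write(barcode + "\n")
            count += 1
    return count
```

The number of barcodes grows as 4 to the power of the length, so a 12-base barcode (used here purely as an illustration) yields 16,777,216 entries. Generating the sequences lazily avoids holding the whole list in memory.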

**Fig. 2: Similarity assessment of UniverSC against other pipelines.**

We also demonstrate how applying UniverSC to all datasets from different platforms compares to applying separate pipelines for each technology during data integration. We used published mouse primary cell data from a study benchmarking different scRNA-seq platforms¹¹. The Chromium dataset was used as the reference, and the SmartSeq2 dataset integrated generally well regardless of which pipeline was used for processing (Fig. 3A). In comparison, however, processing the SmartSeq2 dataset via UniverSC (and thereby applying a single pipeline to all datasets) resulted in a lower kBET¹² (0.06 compared to 0.11) and a higher Silhouette score¹³ (0.43 compared to 0.36) (Fig. 3B, C). This suggests that, with UniverSC, the batch effect was better removed (based on kBET) and the clusters were more distinct (based on Silhouette score). A drastic impact was not expected, given the high level of correlation between the outputs of UniverSC and the other pipelines tested above, as well as the fact that all pipelines work under a similar framework. Nevertheless, we demonstrate measurable improvements in data integration by applying UniverSC to all samples, compared to applying separate pipelines to datasets generated by different platforms.

**Fig. 3: Assessment of data integration by UniverSC versus multiple pipelines.**

Discussion

With the availability of a Docker image and GUI application for UniverSC, we envision UniverSC will facilitate robust and user-friendly single-cell analysis to democratise scRNA-seq technologies. As single-cell technologies become integral to a wide range of studies, mitigation of technical errors and integration of scRNA-seq data generated across different groups and platforms will be necessary. Processing data that contains various barcode and UMI configurations under a consistent framework will be convenient and essential. While there are pipelines that can be configured for a variety of technologies (dropSeqPipe⁹; zUMIs¹⁰; dropEst¹⁴, Kallisto/BUStools¹⁵), Cell Ranger performs well in a server or cluster environment and generates a rich and informative output summary. It is of note that UniverSC utilises Cell Ranger version 3.0.2 due to licensing. Later versions of Cell Ranger are now available, but their core changes enable analyses other than scRNA-seq, such as scATAC-seq, TCR, and BCR analyses, so these updates do not substantially affect scRNA-seq data processing. As novel single-cell technologies are developed, UniverSC eliminates the need for their developers to build a dedicated data processing pipeline for each new technology. Lastly, it will enable a fair comparison when evaluating the best platform for a specific sample type, which may be especially important with challenging samples, such as those containing large cells or digestive enzymes. We provide this tool for free and open-source to democratise single-cell analysis in a wide range of scientific applications.

Methods

The set of input parameters for UniverSC is similar to that required by Cell Ranger, with a few additions. The UniverSC workflow requires paired-end FASTQ input files and reference data as prepared by Cell Ranger. By default, UniverSC assumes that Read 1 of the FASTQ contains the cell barcode and UMI and that Read 2 contains the transcript sequences that will be mapped to the reference, as is common in 3’ scRNA-seq protocols. Given a known barcode and UMI length, UniverSC will check the file name and barcodes, altering the configurations to match that of Chromium as needed. The chemistry appropriate for each single-cell technology for 3’ scRNA-seq is determined automatically (technologies for 5’ scRNA-seq other than that of Chromium are not supported at the time of writing). Data from multiple lanes are supported, as is a custom set of barcodes specific to a given technology.
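To make the idea of matching a Chromium-like configuration concrete, the sketch below splits Read 1 into barcode and UMI at known lengths and pads each to a fixed target length. The text does not describe how UniverSC performs this conversion, so everything here is an assumption: the function name, the padding scheme (barcode padded on the left, UMI on the right), the padding base and quality character, and the target lengths of 16 and 12 bases.

```python
from typing import Tuple


def reformat_read1(seq: str, qual: str, barcode_len: int, umi_len: int,
                   target_barcode_len: int = 16, target_umi_len: int = 12,
                   pad_base: str = "A", pad_qual: str = "I") -> Tuple[str, str]:
    """Pad the barcode and UMI of one Read 1 record to target lengths.

    Hypothetical illustration only: the padding scheme is an assumption,
    not the documented behaviour of UniverSC.
    """
    if len(qual) != len(seq) or len(seq) < barcode_len + umi_len:
        raise ValueError("Read 1 is shorter than barcode + UMI")
    if barcode_len > target_barcode_len or umi_len > target_umi_len:
        raise ValueError("barcode or UMI exceeds the target length")

    # Split sequence and quality at the same positions.
    end = barcode_len + umi_len
    barcode, umi = seq[:barcode_len], seq[barcode_len:end]
    barcode_q, umi_q = qual[:barcode_len], qual[barcode_len:end]

    # Pad the barcode on the left and the UMI on the right.
    bc_pad = target_barcode_len - barcode_len
    umi_pad = target_umi_len - umi_len
    new_seq = pad_base * bc_pad + barcode + umi + pad_base * umi_pad
    new_qual = pad_qual * bc_pad + barcode_q + umi_q + pad_qual * umi_pad
    return new_seq, new_qual
```

Under any such scheme, the barcode whitelist would need to be padded in exactly the same way so that reformatted barcodes still match it.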

Published datasets of human cell lines were used to test for output similarity between UniverSC and other pipelines. Test datasets were prepared for Chromium¹⁶, Drop-seq⁶, ICELL8⁷, and SmartSeq3⁸,¹⁰ (see section Data availability for repositories and specific accession IDs for each dataset). Chromosome 21 (Chr21) of the human genome GRCh38 (hg38) was used as the reference to process all datasets. For the Chromium dataset, the 10x Genomics bamtofastq tool (https://github.com/10XGenomics/bamtofastq) was used to convert Cell Ranger 1.1.0 output from version 1 chemistry to be compatible with newer versions. Only the reads that mapped to chromosome 21 were kept to reduce output data size. The ICELL8 dataset was further downsampled to 250 K reads using seqtk¹⁷ (sampling with the same random seed for each read file). Documentation and code used to generate each filtered/downsized dataset are provided in the UniverSC GitHub repository (https://github.com/minoda-lab/universc). The output for UniverSC and the respective pipeline for each technology is provided as supplemental data (Supplementary Data 1–8). The full raw output is provided for the Chromium, Drop-seq, and ICELL8 datasets. Only the processed GBM is provided for the SmartSeq3 dataset due to the exceedingly large raw output size.
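Using the same random seed for both read files is what keeps the downsampled pairs matched. The Python sketch below illustrates this with seeded reservoir sampling. It is an illustration of the principle and does not reproduce seqtk's code; the function name and the record representation are assumptions.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(records: Iterable[T], n: int, seed: int) -> List[T]:
    """Sample up to n records in one pass using a seeded reservoir.

    Which slot is overwritten depends only on the record index and the
    random stream, so two files with the same number of records and the
    same seed keep records from identical positions.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for index, record in enumerate(records):
        if index < n:
            reservoir.append(record)
        else:
            slot = rng.randint(0, index)  # inclusive on both ends
            if slot < n:
                reservoir[slot] = record
    return reservoir
```

Running this once on the Read 1 records and once on the Read 2 records with the same seed and sample size returns the same positions from each file, provided both files contain the same number of records in the same order.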

Each pair of raw GBMs, the critical portion of the pipeline output, was processed in parallel. The pair of GBMs was adjusted to have matching sets of barcodes and genes: only barcodes found in both GBMs were kept, and genes found in only one GBM were added to the other with 0 UMIs assigned. The adjusted pair of GBMs was then used to carry out clustering analysis with the R package Seurat (version 4.1.1)¹⁸ within R (version 4.1.2). Finally, the Pearson correlation between the GBMs and the ARI between the two clustering outcomes were calculated using the R packages stats (version 4.1.2) and clues (version 0.6.2.2), respectively. For the scatterplots and for computing the Pearson correlation, a pair of UMI counts for each gene for each cell was considered a single data point unless both counts were zero, e.g., up to 1000 data points would be compared for correlation for a pair of GBMs with 10 cells and 100 genes.
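The comparison described above can be sketched in plain Python. The analysis in this study was carried out in R, so the code below is only an illustration of the definitions: the nested-dictionary representation of a GBM and all function names are assumptions. It pairs counts over shared barcodes and the union of genes, drops pairs where both counts are zero, computes the Pearson correlation, and computes the ARI from two label vectors.

```python
from collections import Counter
from math import comb, sqrt
from typing import Dict, List, Sequence, Tuple

GBM = Dict[str, Dict[str, int]]  # barcode -> gene -> UMI count (missing = 0)


def paired_counts(a: GBM, b: GBM) -> List[Tuple[int, int]]:
    """Pair counts over shared barcodes and the union of genes."""
    shared = sorted(set(a) & set(b))
    genes = sorted({g for gbm in (a, b) for cell in gbm.values() for g in cell})
    pairs = []
    for barcode in shared:
        for gene in genes:
            x, y = a[barcode].get(gene, 0), b[barcode].get(gene, 0)
            if x or y:  # a pair counts unless both values are zero
                pairs.append((x, y))
    return pairs


def pearson(pairs: Sequence[Tuple[float, float]]) -> float:
    """Pearson correlation coefficient of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one side has no variance
    return sxy / sqrt(sxx * syy)


def adjusted_rand_index(labels_a: Sequence, labels_b: Sequence) -> float:
    """Adjusted Rand Index between two clusterings of the same cells."""
    n = len(labels_a)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. every cell in one cluster
    return (sum_ij - expected) / (max_index - expected)
```

Because cluster labels are arbitrary, the ARI is 1 whenever the two partitions agree, even if the label names differ. For the example in the text, 10 shared cells and 100 genes would give at most 1000 pairs before the double-zero pairs are removed.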

To demonstrate improvement in data integration, published datasets of mouse primary cells from an scRNA-seq benchmarking study were used¹¹. From this study, we chose a dataset generated via Chromium as a reference and a dataset generated via SmartSeq2 as a comparison. The full mouse genome (GRCm39) was used as the reference genome and no downsampling was performed for these datasets. The reference Chromium dataset was processed once by UniverSC to generate one GBM, and the SmartSeq2 dataset was processed twice, once by UniverSC and once by zUMIs, to generate a pair of GBMs. The two SmartSeq2 GBMs were formatted as described above to have identical genes and barcodes. All three GBMs were formatted as described above to have identical genes (but not barcodes). Then each version of the SmartSeq2 GBM was integrated with the reference GBM independently using Seurat. To evaluate the quality of integration, kBET¹² and Silhouette score¹³ were calculated for each case using the R packages kBET (version 0.99.6) and cluster (version 2.1.3), respectively. The three output GBMs are provided as supplemental data (Supplementary Data 9–11).
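For readers unfamiliar with the Silhouette score, the sketch below computes the mean silhouette width directly from its definition, s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from point i to the other members of its cluster and b(i) is the smallest mean distance to any other cluster. The study used the R package cluster. This Python version is a hedged illustration with assumed function names and Euclidean distance, and it says nothing about which embedding or labels were used in the paper.

```python
from math import dist
from typing import Dict, Hashable, List, Sequence


def silhouette_score(points: Sequence[Sequence[float]],
                     labels: Sequence[Hashable]) -> float:
    """Mean silhouette width using Euclidean distance."""
    clusters: Dict[Hashable, List[Sequence[float]]] = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    widths = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            widths.append(0.0)  # convention for singleton clusters
            continue
        # Mean distance to the other members (distance to self is 0).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(sum(dist(point, q) for q in other) / len(other)
                for key, other in clusters.items() if key != label)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)
```

Values close to 1 indicate compact, well-separated clusters, while values near 0 or below indicate overlapping or misassigned points.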

We provide documentation for UniverSC as a manual and as a help system in the terminal, along with a user interface that checks file inputs and gives error messages to identify potential problems. UniverSC can be run on any Unix-based system in the shell, and both the source code and a docker image are publicly available (see Code availability). The user can also choose to install a GUI for UniverSC (see Code availability). We recommend installing UniverSC in a local directory (e.g., a home directory) or somewhere appropriate with write access; it can be run on any system with Cell Ranger installed (i.e., added to the PATH environment variable). We also recommend running UniverSC on a server with sufficient memory to run the STAR alignment algorithm. Submission to a cluster in parallel with a job scheduler is supported, but note that UniverSC can only run on one technology at a time due to the different barcode whitelist requirements. See the manual for further details. Note that UniverSC was developed by a third party unrelated to 10x Genomics, and the most recent open-source version of Cell Ranger (version 3.0.2) is used with Cloupe (a portion of Cell Ranger) inactivated to comply with the 10x Genomics End User Software Licence Agreement.
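Whether Cell Ranger is reachable through the PATH environment variable can be checked before launching a run. The snippet below is a small, hypothetical pre-flight check and is not part of UniverSC; it assumes the Cell Ranger executable is named cellranger and uses only the Python standard library.

```python
import shutil
from typing import Optional


def find_tool(name: str = "cellranger") -> Optional[str]:
    """Return the full path of an executable found on PATH, else None."""
    return shutil.which(name)


def require_tool(name: str = "cellranger") -> str:
    """Return the executable path or raise a descriptive error."""
    path = find_tool(name)
    if path is None:
        raise FileNotFoundError(
            f"'{name}' was not found on PATH; add its directory to PATH first"
        )
    return path
```

Calling require_tool() at the start of a wrapper script fails early with a clear message instead of partway through a long job.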

Data availability

The Chromium¹⁶ (HEK293T human kidney cell-lines) dataset used in this study is available from the 10x Genomics website (10x Genomics: https://www.10xgenomics.com). The Drop-seq⁶ dataset used in this study is available from GEO under accession code GSE63473. The ICELL8⁷ dataset used in this study is available from EGA under accession code EGAD00001003443. The SmartSeq3 dataset used in this study is available from EMBL ArrayExpress under accession code E-MTAB-8735. The Chromium¹⁶ and SmartSeq2⁸,¹⁰ datasets used in this study for the data integration test are both available from GEO under accession code GSE133549. All previously published datasets are available without access restrictions except for the ICELL8 dataset. To access the ICELL8 dataset, please consult the Genentech Data Access Committee as described in the aforementioned link. Source data are provided with this paper.

Code availability

The most recent source code for UniverSC is publicly available along with installation instructions at GitHub (https://github.com/minoda-lab/universc) and the specific version of UniverSC used to generate data for this study is available at Zenodo¹⁹. A Docker image with all dependencies installed from source is available at DockerHub (https://hub.docker.com/repository/docker/tomkellygenetics/universc). We also provide a cross-platform application to run UniverSC via a GUI, available for macOS, Windows, and Linux Ubuntu (https://genomec.gsc.riken.jp/gerg/UniverSC). This comes with a step-by-step installation and usage guide (https://genomec.gsc.riken.jp/gerg/UniverSC/UniverSC_app_release/).

References

1. Cao, J. et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661–667 (2017).
2. Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).
3. The Tabula Muris Consortium. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature 562, 367–372 (2018).
4. Kulkarni, A., Anderson, A. G., Merullo, D. P. & Konopka, G. Beyond bulk: a review of single cell transcriptomics methodologies and applications. Curr. Opin. Biotech. 58, 129–136 (2019).
5. Lun, A. T. L. et al. EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Genome Biol. 20, 63 (2019).
6. Macosko, E. Z. et al. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell 161, 1202–1214 (2015).
7. Goldstein, L. D. et al. Massively parallel nanowell-based single-cell gene expression profiling. BMC Genomics 18, 519 (2017).
8. Hagemann-Jensen, M. et al. Single-cell RNA counting at allele and isoform resolution using Smart-seq3. Nat. Biotechnol. 38, 708–714 (2020).
9. Roeilli, P., Mueller, S., Girardot, C. & Kelly, S. T. GitHub repository https://github.com/Hoohm/dropSeqPipe/tree/develop (Accessed 13 January, 2021).
10. Parekh, S., Ziegenhain, C., Vieth, B., Enard, W. & Hellman, I. zUMIs - A fast and flexible pipeline to process RNA sequencing data with UMIs. GigaScience 7, 1–9 (2018).
11. Mereu, E. et al. Benchmarking single-cell RNA-sequencing protocols for cell atlas projects. Nat. Biotechnol. 38, 747–755 (2020).
12. Büttner, M. et al. A test metric for assessing single-cell RNA-seq batch correction. Nat. Methods 16, 43–49 (2019).
13. Rousseeuw, P. J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Computational Appl. Math. 20, 53–65 (1987).
14. Petukhov, V. et al. dropEst: pipeline for accurate estimation of molecular counts in droplet-based single-cell RNA-seq experiments. Genome Biol. 19, 78 (2018).
15. Melsted, P., Ntranos, V. & Pachter, L. The barcode, UMI, set format and BUStools. Bioinformatics 35, 4472–4473 (2019).
16. Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
17. Li, H. GitHub repository https://github.com/lh3/seqtk (Accessed May 24, 2022).
18. Stuart, T. et al. Comprehensive Integration of Single-Cell Data. Cell 177, 1888–1902.e21 (2019).
19. Battenberg, K. et al. A flexible cross-platform single-cell data processing pipeline. Zenodo https://doi.org/10.5281/zenodo.7116956 (2022).
20. Kouno, T. et al. C1 CAGE detects transcription start sites and enhancer activity at single-cell resolution. Nat. Commun. 10, 360 (2019).
21. Hayashi, T. et al. Single-cell full-length total RNA sequencing uncovers dynamics of recursive splicing and enhancer RNAs. Nat. Commun. 9, 619 (2018).
22. Hashimshony, T. et al. CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep. 2, 666–673 (2012).
23. Hashimshony, T. et al. CEL-Seq2: sensitive highly-multiplexed single-cell RNA-Seq. Genome Biol. 17, 77 (2016).
24. Yan, Y. GitHub repository https://github.com/yanailab/celseq2 (Accessed July 10, 2020).
25. Veres, A. & Lee, C. H. GitHub repository https://github.com/indrops/indrops (Accessed July 10, 2020).
26. Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
27. Zilionis, R. et al. Single-cell barcoding and sequencing using droplet microfluidics. Nat. Protoc. 12, 44–73 (2017).
28. Jaitin, D. A. et al. Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science 343, 776–779 (2014).
29. Keren-Shaul, H. et al. MARS-seq2.0: an experimental and analytical pipeline for indexed sorting combined with single-cell RNA sequencing. Nat. Protoc. 14, 1841–1862 (2019).
30. Han, X. et al. Mapping the Mouse Cell Atlas by Microwell-Seq. Cell 172, 1091–1107.e17 (2018).
31. Sasagawa, Y. et al. Quartz-Seq: a highly reproducible and sensitive single-cell RNA sequencing method, reveals non-genetic gene-expression heterogeneity. Genome Biol. 14, 3097 (2013).
32. Sasagawa, Y. et al. Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads. Genome Biol. 19, 29 (2018).
33. Hayashi, T. et al. Single-cell full-length total RNA sequencing uncovers dynamics of recursive splicing and enhancer RNAs. Nat. Commun. 9, 619 (2018).
34. Vitak, S. A. et al. Sequencing thousands of single-cell genomes with combinatorial indexing. Nat. Methods 14, 302–308 (2017).
35. Datlinger, P. et al. Ultra-high throughput single-cell RNA sequencing by combinatorial fluidic indexing. bioRxiv https://doi.org/10.1101/2019.12.17.879304 (2019).
36. Soumillon, M. et al. Characterization of directed differentiation by high-throughput single-cell RNA-Seq. bioRxiv https://doi.org/10.1101/003236 (2014).
37. Bagnoli, J. W. et al. Sensitive and powerful single-cell RNA sequencing using mcSCRB-seq. Nat. Commun. 9, 2937 (2018).
38. Gierahn, T. M. et al. Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput. Nat. Methods 14, 395–398 (2017).
39. Ramskold, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol. 30, 777–782 (2012).
40. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).
41. Rosenberg, A. B. et al. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science 360, 176–182 (2018).
42. Islam, S. et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res. 21, 1160–1167 (2011).
43. Islam, S. et al. Quantitative single-cell RNA-seq with unique molecular identifiers. Nat. Methods 11, 162–166 (2014).
44. Hochgerner, H. et al. STRT-seq-2i: dual-index 5ʹ single cell and nucleus RNA-seq on an addressable microwell array. Sci. Rep. 7, 16237 (2017).
45. Romagnoli, D. et al. ddSeeker: a tool for processing Bio-Rad ddSEQ single cell RNA-seq data. BMC Genomics 19, 960 (2018).
46. Teichmann Group. GitHub repository https://teichlab.github.io/scg_lib_structs/methods_html/SureCell (Accessed July 10, 2020).

Acknowledgements

This work was supported by a JSPS KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas “Principles of pluripotent stem cells underlying plant vitality” (JP17H06470 to AM and 17H06472 to MH) and Center for IMS. We acknowledge contributions from Tommy Terooatea (RIKEN IMS) for testing UniverSC, Jonathon Moody and Chung-Chau Hon (RIKEN IMS) for their insightful discussions. We thank Musa Mhlanga (RIMLS) for encouraging this tool to be published. We also wish to acknowledge Shuwen Chen, Tsuyoshi Okumo, Max Sanchez, and Karthik Swaminathan (Takara Bio) for supporting data analysis from the ICELL8 platform with their CogentAP pipeline. We thank the developers at 10x Genomics of Cell Ranger and dependencies for making their code publicly available. We also thank Marcus Kinsella (CZI) for releasing a docker image of an open-source version of Cell Ranger 2.0.2.

Author information

These authors contributed equally: Kai Battenberg, S. Thomas Kelly.

Authors and Affiliations

Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan
Kai Battenberg, S. Thomas Kelly, Radu Abu Ras & Aki Minoda
Center for Sustainable Resource Science, RIKEN, Yokohama, Japan
Kai Battenberg & Makoto Hayashi
Faculty of Automatics, Computers and Electronics, University of Craiova, Craiova, Romania
Radu Abu Ras
Department of Cell Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Radboud University, Nijmegen, The Netherlands
Nicola A. Hetherington & Aki Minoda

Authors

Kai Battenberg
S. Thomas Kelly
Radu Abu Ras
Nicola A. Hetherington
Makoto Hayashi
Aki Minoda

Contributions

S.T.K. and K.B. conceptualised and wrote the UniverSC script, carried out the comparative analysis, and wrote the manuscript. S.T.K. documented the code and built the Docker image. R.A. developed the UniverSC GUI application and app documentation. N.A.H. generated datasets and tested the script. M.H. and A.M. supervised the project. A.M. edited the manuscript.

Corresponding author

Correspondence to Aki Minoda.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Geng Chen, Bart Deplancke, Yan Wu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Ilse Valtierra Gutierrez. This article has been peer reviewed as part of Springer Nature’s Guided Open Access initiative.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Transparent Peer Review File

Editorial Assessment Report

Description of Additional Supplementary Files

Source data

Source Data Figure 2

Source Data Figure 3

File S1, File S2, File S3, File S4, File S5, File S6, File S7, File S8, File S9, File S10, File S11

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Battenberg, K., Kelly, S.T., Ras, R.A. et al. A flexible cross-platform single-cell data processing pipeline. Nat Commun 13, 6847 (2022). https://doi.org/10.1038/s41467-022-34681-z

Received: 15 February 2021
Accepted: 02 November 2022
Published: 11 November 2022
DOI: https://doi.org/10.1038/s41467-022-34681-z
