Inferring cellular and molecular processes in single-cell data with non-negative matrix factorization using Python, R and GenePattern Notebook implementations of CoGAPS

Brunet, J.-P., Tamayo, P., Golub, T. R. & Mesirov, J. P. Metagenes and molecular pattern discovery using matrix factorization. Proc. Natl Acad. Sci. USA 101, 4164–4169 (2004).

Stein-O’Brien, G. L. et al. Decomposing cell identity for transfer learning across cellular measurements, platforms, tissues, and species. Cell Syst. 8, 395–411.e8 (2019).

Cleary, B., Cong, L., Cheung, A., Lander, E. S. & Regev, A. Efficient generation of transcriptomic profiles by random composite measurements. Cell 171, 1424–1436.e18 (2017).

Gaujoux, R. & Seoighe, C. A flexible R package for nonnegative matrix factorization. BMC Bioinform. 11, 367 (2010).

Ochs, M. F. & Fertig, E. J. Matrix factorization for transcriptional regulatory network inference. IEEE Symp. Comput. Intell. Bioinform. Comput. Biol. Proc. 2012, 387–396 (2012).

Stein-O’Brien, G. L. et al. Enter the matrix: factorization uncovers knowledge from omics. Trends Genet. 34, 790–805 (2018).

Fertig, E. J., Ding, J., Favorov, A. V., Parmigiani, G. & Ochs, M. F. CoGAPS: an R/C++ package to identify patterns and biological process activity in transcriptomic data. Bioinformatics 26, 2792–2793 (2010).

Clark, B. S. et al. Single-cell RNA-seq analysis of retinal development identifies NFI factors as regulating mitotic exit and late-born cell specification. Neuron 102, 1111–1126.e5 (2019).

Sherman, T. D., Gao, T. & Fertig, E. J. CoGAPS 3: Bayesian non-negative matrix factorization for single-cell analysis with asynchronous updates and sparse data structures. BMC Bioinform. 21, 453 (2020).

Peng, J. et al. Author correction: single-cell RNA-seq highlights intra-tumoral heterogeneity and malignant progression in pancreatic ductal adenocarcinoma. Cell Res. 29, 777 (2019).

Kinny-Köster, B. et al. Inflammatory signaling in pancreatic cancer transfers between a single-cell RNA sequencing atlas and co-culture. Preprint at bioRxiv https://doi.org/10.1101/2022.07.14.500096 (2022).

Reich, M. et al. The GenePattern Notebook environment. Cell Syst. 5, 149–151.e1 (2017).

Lee, D. D. & Seung, H. S. Learning the parts of objects by non-negative matrix factorization. Nature 401, 788–791 (1999).

Ochs, M. F., Stoyanova, R. S., Arias-Mendoza, F. & Brown, T. R. A new method for spectral decomposition using a bilinear Bayesian approach. J. Magn. Reson. 137, 161–176 (1999).

Wang, G., Kossenkov, A. V. & Ochs, M. F. LS-NMF: a modified non-negative matrix factorization algorithm utilizing uncertainty estimates. BMC Bioinform. 7, 175 (2006).

Sibisi, S. & Skilling, J. Prior distributions on measure space. J. R. Stat. Soc. B 59, 217–235 (1997).

Woo, J., Aliferis, C. & Wang, J. ccfindR: single-cell RNA-seq analysis using Bayesian non-negative matrix factorization. https://www.bioconductor.org/packages/devel/bioc/vignettes/ccfindR/inst/doc/ccfindR.html (2022).

Kotliar, D. et al. Identifying gene expression programs of cell-type identity and cellular activity with single-cell RNA-Seq. eLife 8, e43803 (2019).

Cemgil, A. T. Bayesian inference for nonnegative matrix factorisation models. Comput. Intell. Neurosci. 2009, 785152 (2009).

Palla, G. & Ferrero, E. Latent factor modeling of scRNA-seq data uncovers dysregulated pathways in autoimmune disease patients. iScience 23, 101451 (2020).

Shao, C. & Höfer, T. Robust classification of single-cell transcriptome data by nonnegative matrix factorization. Bioinformatics 33, 235–242 (2017).

Xie, F., Zhou, M. & Xu, Y. BayCount: a Bayesian decomposition method for inferring tumor heterogeneity using RNA-seq counts. Preprint at bioRxiv https://doi.org/10.1101/218511

Hou, W., Ji, Z., Ji, H. & Hicks, S. C. A systematic evaluation of single-cell RNA-sequencing imputation methods. Genome Biol. 21, 218 (2020).

Elyanow, R., Dumitrascu, B., Engelhardt, B. E. & Raphael, B. J. netNMF-sc: leveraging gene–gene interactions for imputation and dimensionality reduction in single-cell expression analysis. Genome Res. 30, 195–204 (2020).

Hicks, S. C., Townes, F. W., Teng, M. & Irizarry, R. A. Missing data and technical variability in single-cell RNA-sequencing experiments. Biostatistics 19, 562–578 (2018).

Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).

Zhang, Y., Parmigiani, G. & Johnson, W. E. ComBat-seq: batch effect adjustment for RNA-seq count data. NAR Genom. Bioinform. 2, lqaa078 (2020).

Wu, Y., Tamayo, P. & Zhang, K. Visualizing and interpreting single-cell gene expression datasets with similarity weighted nonnegative embedding. Cell Syst. 7, 656–666.e4 (2018).

Luecken, M. D. & Theis, F. J. Current best practices in single-cell RNA-seq analysis: a tutorial. Mol. Syst. Biol. 15, e8746 (2019).

Stein-O’Brien, G. L. et al. PatternMarkers & GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF. Bioinformatics 33, 1892–1894 (2017).

Taylor-Weiner, A. et al. Scaling computational genomics to millions of individuals with GPUs. Genome Biol. 20, 228 (2019).

Stein-O’Brien, G. L. et al. Decomposing cell identity for transfer learning across cellular measurements, platforms, tissues, and species. Cell Syst. 8, 395–411.e8 (2019).

Fertig, E. J. et al. Preferential activation of the hedgehog pathway by epigenetic modulations in HPV negative HNSCC identified with meta-pathway analysis. PLoS ONE 8, e78127 (2013).

Way, G. P., Zietz, M., Rubinetti, V., Himmelstein, D. S. & Greene, C. S. Compressing gene expression data using multiple latent space dimensionalities learns complementary biological representations. Genome Biol. 21, 109 (2020).

Way, G. P. & Greene, C. S. Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders. Pac. Symp. Biocomput. 23, 80–91 (2018).

Bidaut, G. & Ochs, M. F. ClutrFree: cluster tree visualization and interpretation. Bioinformatics 20, 2869–2871 (2004).

Wagner, A., Regev, A. & Yosef, N. Revealing the vectors of cellular identity with single-cell genomics. Nat. Biotechnol. 34, 1145–1160 (2016).

Davis-Marcisak, E. F. et al. From bench to bedside: single-cell analysis for cancer immunotherapy. Cancer Cell 39, 1062–1080 (2021).

Gojo, J. et al. Single-cell RNA-seq reveals cellular hierarchies and impaired developmental trajectories in pediatric ependymoma. Cancer Cell 38, 44–59.e9 (2020).

Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).

Moloshok, T. D. et al. Application of Bayesian decomposition for analysing microarray data. Bioinformatics 18, 566–575 (2002).

Zhu, X., Ching, T., Pan, X., Weissman, S. M. & Garmire, L. Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization. PeerJ 5, e2888 (2017).

Stein-O’Brien, G. et al. Integrated time course omics analysis distinguishes immediate therapeutic response from acquired resistance. Genome Med. 10, 37 (2018).

Liu, J. et al. Jointly defining cell types from multiple single-cell datasets using LIGER. Nat. Protoc. 15, 3632–3662 (2020).

Lê Cao, K.-A. et al. Community-wide hackathons to identify central themes in single-cell multi-omics. Genome Biol. 22, 220 (2021).

Sharma, G., Colantuoni, C., Goff, L. A., Fertig, E. J. & Stein-O’Brien, G. projectR: an R/Bioconductor package for transfer learning via PCA, NMF, correlation and clustering. Bioinformatics 36, 3592–3593 (2020).

Davis-Marcisak, E. F. et al. Transfer learning between preclinical models and human tumors identifies a conserved NK cell activation signature in anti-CTLA-4 responsive tumors. Genome Med. 13, 129 (2021).

Rodriques, S. G. et al. Slide-seq: a scalable technology for measuring genome-wide expression at high spatial resolution. Science 363, 1463–1467 (2019).

Deshpande, A. et al. Uncovering the spatial landscape of molecular interactions within the tumor microenvironment through latent spaces. Cell Syst. 14, 285–301.e4 (2023).

Zenodo: Research. Shared. (CERN and GitHub, 2023).

Anaconda v22.9.0 (Anaconda Software Distribution, 2021).

Virshup, I., Rybakov, S., Theis, F. J., Angerer, P. & Wolf, F. A. anndata: Annotated data. Preprint at bioRxiv https://doi.org/10.1101/2021.12.16.473007 (2021).

Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).

Seabold, S. & Perktold, J. Statsmodels: Econometric and Statistical Modeling with Python. In Proc. 9th Python in Science Conference (SciPy) https://doi.org/10.25080/majora-92bf1922-011 (2010).

Fang, Z., Liu, X. & Peltz, G. GSEApy: a comprehensive package for performing gene set enrichment analysis in Python. Bioinformatics 39, btac757 (2023).

Korotkevich, G. et al. Fast gene set enrichment analysis. Preprint at bioRxiv.
