Analysis and Visualization of Single-Cell Sequencing Data with Scanpy and MetaCell: A Tutorial

Ctenophores

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2757))

Abstract

The emergence and development of single-cell RNA sequencing (scRNA-seq) techniques enable researchers to perform large-scale transcriptomic profiling at single-cell resolution. Unsupervised clustering of scRNA-seq data is central to most studies and is essential for identifying novel cell types and their gene expression logic. Although an increasing number of algorithms and tools are available for scRNA-seq analysis, practical guides that help users navigate this landscape remain scarce. This chapter presents an overview of the scRNA-seq data analysis pipeline: quality control, batch effect correction, data standardization, cell clustering and visualization, cluster correlation analysis, and marker gene identification. Taking two broadly used analysis packages, Scanpy and MetaCell, as examples, we provide hands-on guidelines and a comparison of best practices for these essential analysis steps and for data visualization. Additionally, we compare both packages and their algorithms using a scRNA-seq dataset of the ctenophore Mnemiopsis leidyi, a representative of one of the earliest animal lineages, which is critical to understanding the origin and evolution of animal novelties. This pipeline can also be helpful for analyses of other taxa, especially prebilaterian animals for which these tools are still under development (e.g., Placozoa and Porifera).

Notes

  1. https://github.com/tanaylab/MetaCell/

  2. https://anndata.readthedocs.io/en/stable/anndata.AnnData.html

  3. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html

  4. https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Lists

References

  1. Moroz LL (2015) Biodiversity meets neuroscience: from the sequencing ship (Ship-Seq) to deciphering parallel evolution of neural systems in Omic's era. Integr Comp Biol 55(6):1005–1017

  2. Moroz LL (2018) NeuroSystematics and periodic system of neurons: model vs reference species at single-cell resolution. ACS Chem Neurosci 9(8):1884–1903

  3. Hernandez-Nicaise M-L (1991) Ctenophora. In: Harrison FW, Westfall JA (eds) Microscopic anatomy of invertebrates: Placozoa, Porifera, Cnidaria, and Ctenophora. Wiley, New York, pp 359–418

  4. Nielsen C (2012) Animal evolution: interrelationships of the living phyla. Oxford University Press, Oxford

  5. Nielsen C (2019) Early animal evolution: a morphologist's view. R Soc Open Sci 6(7):190638

  6. Li Y et al (2021) Rooting the animal tree of life. Mol Biol Evol 38(10):4322–4333

  7. Moroz LL (2012) Phylogenomics meets neuroscience: how many times might complex brains have evolved? Acta Biol Hung 63(Suppl 2):3–19

  8. Moroz LL et al (2014) The ctenophore genome and the evolutionary origins of neural systems. Nature 510(7503):109–114

  9. Ryan JF et al (2013) The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342(6164):1242592

  10. Schultz DT et al (2023) Ancient gene linkages support ctenophores as sister to other animals. Nature 618(7963):110–117

  11. Whelan NV et al (2015) Error, signal, and the placement of Ctenophora sister to all other animals. Proc Natl Acad Sci U S A 112(18):5773–5778

  12. Whelan NV et al (2017) Ctenophore relationships and their placement as the sister group to all other animals. Nat Ecol Evol 1(11):1737–1746

  13. Moroz LL, Kohn AB (2015) Unbiased view of synaptic and neuronal gene complement in ctenophores: are there pan-neuronal and pan-synaptic genes across Metazoa? Integr Comp Biol 55(6):1028–1049

  14. Moroz LL, Kohn AB (2016) Independent origins of neurons and synapses: insights from ctenophores. Philos Trans R Soc Lond Ser B Biol Sci 371(1685):20150041

  15. Moroz LL, Romanova DY (2022) Alternative neural systems: what is a neuron? (Ctenophores, sponges and placozoans). Front Cell Dev Biol 10:1071961

  16. Moroz LL, Romanova DY, Kohn AB (2021) Neural versus alternative integrative systems: molecular insights into origins of neurotransmitters. Philos Trans R Soc Lond Ser B Biol Sci 376(1821):20190762

  17. Martindale MQ (2022) Emerging models: the “development” of the ctenophore Mnemiopsis leidyi and the cnidarian Nematostella vectensis as useful experimental models. Curr Top Dev Biol 147:93–120

  18. Martindale MQ, Henry JQ (2015) Ctenophora. In: Wanninger A (ed) Evolutionary developmental biology of invertebrates 1: introduction, non-Bilateria, Acoelomorpha, Xenoturbellida, Chaetognatha. Springer Vienna, Vienna, pp 179–201

  19. Sebe-Pedros A et al (2018) Early metazoan cell type diversity and the evolution of multicellular gene regulation. Nat Ecol Evol 2(7):1176–1188

  20. Baran Y et al (2019) MetaCell: analysis of single-cell RNA-seq data using K-nn graph partitions. Genome Biol 20(1):1–19

  21. Sachkova MY et al (2021) Neuropeptide repertoire and 3D anatomy of the ctenophore nervous system. Curr Biol 31(23):5274–5285 e6

  22. Hayakawa E et al (2022) Mass spectrometry of short peptides reveals common features of metazoan peptidergic neurons. Nat Ecol Evol 6(10):1438–1448

  23. Moroz LL (2009) On the independent origins of complex brains and neurons. Brain Behav Evol 74(3):177–190

  24. Zappia L, Theis FJ (2021) Over 1000 tools reveal trends in the single-cell RNA-seq analysis landscape. Genome Biol 22(1):1–18

  25. Zappia L, Phipson B, Oshlack A (2018) Exploring the single-cell RNA-seq analysis landscape with the scRNA-tools database. PLoS Comput Biol 14(6):e1006245

  26. Svensson V, da Veiga Beltrame E, Pachter L (2020) A curated database reveals trends in single-cell transcriptomics. Database 2020:baaa073

  27. Wolf FA, Angerer P, Theis FJ (2018) SCANPY: large-scale single-cell gene expression data analysis. Genome Biol 19(1):1–5

  28. Shannon P et al (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13(11):2498–2504

  29. Satija R et al (2015) Spatial reconstruction of single-cell gene expression data. Nat Biotechnol 33(5):495–502

  30. McCarthy DJ et al (2017) Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R. Bioinformatics 33(8):1179–1186

  31. Jin S et al (2021) Inference and analysis of cell-cell communication using CellChat. Nat Commun 12(1):1–20

  32. Luecken MD, Theis FJ (2019) Current best practices in single‐cell RNA‐seq analysis: a tutorial. Mol Syst Biol 15(6):e8746

  33. Amezquita RA et al (2020) Orchestrating single-cell analysis with Bioconductor. Nat Methods 17(2):137–145

  34. Andrews TS et al (2021) Tutorial: guidelines for the computational analysis of single-cell RNA sequencing data. Nat Protoc 16(1):1–9

  35. Cao J et al (2019) The single-cell transcriptional landscape of mammalian organogenesis. Nature 566(7745):496–502

  36. Ziegler CG et al (2020) SARS-CoV-2 receptor ACE2 is an interferon-stimulated gene in human airway epithelial cells and is detected in specific cell subsets across tissues. Cell 181(5):1016–1035. e19

  37. Mathys H et al (2019) Single-cell transcriptomic analysis of Alzheimer’s disease. Nature 570(7761):332–337

  38. Bornstein C et al (2018) Single-cell mapping of the thymic stroma identifies IL-25-producing tuft epithelial cells. Nature 559(7715):622–626

  39. Giladi A et al (2018) Single-cell characterization of haematopoietic progenitors and their trajectories in homeostasis and perturbed haematopoiesis. Nat Cell Biol 20(7):836–846

  40. Alpaydin E (2020) Introduction to machine learning. MIT press

  41. Pedregosa F et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830

  42. Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(11):2579

  43. McInnes L, Healy J, Melville J (2018) UMAP: uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426

  44. Lun AT, Bach K, Marioni JC (2016) Pooling across cells to normalize single-cell RNA sequencing data with many zero counts. Genome Biol 17(1):1–14

  45. Bacher R et al (2017) SCnorm: robust normalization of single-cell RNA-seq data. Nat Methods 14(6):584–586

  46. Cole MB et al (2019) Performance assessment and selection of normalization procedures for single-cell RNA-Seq. Cell Syst 8(4):315–328. e8

  47. Lytal N, Ran D, An L (2020) Normalization methods on single-cell RNA-seq data: an empirical survey. Front Genet 11:41

  48. Street K et al (2018) Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19(1):1–16

  49. Johnson WE, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8(1):118–127

  50. Polański K et al (2020) BBKNN: fast batch alignment of single cell transcriptomes. Bioinformatics 36(3):964–965

  51. Korsunsky I et al (2019) Fast, sensitive and accurate integration of single-cell data with harmony. Nat Methods 16(12):1289–1296

  52. Haghverdi L et al (2018) Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat Biotechnol 36(5):421–427

  53. Hie B, Bryson B, Berger B (2019) Efficient integration of heterogeneous single-cell transcriptomes using Scanorama. Nat Biotechnol 37(6):685–691

  54. Tran HTN et al (2020) A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Genome Biol 21(1):1–32

  55. Zheng GX et al (2017) Massively parallel digital transcriptional profiling of single cells. Nat Commun 8(1):1–12

  56. Stuart T et al (2019) Comprehensive integration of single-cell data. Cell 177(7):1888–1902. e21

  57. Grün D et al (2015) Single-cell messenger RNA sequencing reveals rare intestinal cell types. Nature 525(7568):251–255

  58. Wang B et al (2017) Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nat Methods 14(4):414–416

  59. Kiselev VY et al (2017) SC3: consensus clustering of single-cell RNA-seq data. Nat Methods 14(5):483–486

  60. Lin P, Troup M, Ho JW (2017) CIDR: ultrafast and accurate clustering through imputation for single-cell RNA-seq data. Genome Biol 18(1):1–11

  61. Yau C (2016) pcaReduce: hierarchical clustering of single cell transcriptional profiles. BMC Bioinform 17(1):1–11

  62. Zeisel A et al (2015) Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347(6226):1138–1142

  63. Jiang L et al (2016) GiniClust: detecting rare cell types from single-cell gene expression data with Gini index. Genome Biol 17(1):1–13

  64. Qiu X et al (2017) Reversed graph embedding resolves complex single-cell trajectories. Nat Methods 14(10):979–982

  65. Traag VA, Waltman L, Van Eck NJ (2019) From Louvain to Leiden: guaranteeing well-connected communities. Sci Rep 9(1):1–12

  66. Blondel VD et al (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):P10008

  67. Duò A, Robinson MD, Soneson C (2018) A systematic performance evaluation of clustering methods for single-cell RNA-seq data. F1000Research 7:1141

  68. Zhang S et al (2020) Review of single-cell RNA-seq data clustering for cell type identification and characterization. arXiv preprint arXiv:2001.01006

  69. Liu B, Li Y, Zhang L (2022) Analysis and visualization of spatial transcriptomic data. Front Genet 12:785290

  70. Coifman RR et al (2005) Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. Proc Natl Acad Sci 102(21):7426–7431

  71. Haghverdi L et al (2016) Diffusion pseudotime robustly reconstructs lineage branching. Nat Methods 13(10):845–848

  72. Xu C, Su Z (2015) Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31(12):1974–1980

  73. Patterson-Cross RB, Levine AJ, Menon V (2021) Selecting single cell clustering parameter values using subsampling-based robustness metrics. BMC Bioinform 22(1):1–13

  74. Teaching team at the Harvard Chan Bioinformatics Core. Introduction to Single-cell RNA-seq. [cited 2022 04/10]; Available from: https://hbctraining.github.io/scRNA-seq/lessons/07_SC_clustering_cells_SCT.html

  75. Paul Hoffman SL (2022) Seurat - guided clustering tutorial. [cited 2022 04/10]; Available from: https://satijalab.org/seurat/articles/pbmc3k_tutorial.html

  76. Fruchterman TM, Reingold EM (1991) Graph drawing by force‐directed placement. Softw Pract Exp 21(11):1129–1164

  77. Wolf FA et al (2019) PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol 20(1):1–9

  78. Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 18:50–60

  79. Welch BL (1947) The generalization of ‘STUDENT'S’ problem when several different population variances are involved. Biometrika 34(1–2):28–35

  80. Musser JM et al (2021) Profiling cellular diversity in sponges informs animal cell type and nervous system evolution. Science 374(6568):717–723

  81. Varoqueaux F et al (2018) High cell diversity and complex peptidergic signaling underlie placozoan behavior. Curr Biol 28(21):3495–3501 e2

  82. Dries R et al (2021) Advances in spatial transcriptomic data analysis. Genome Res 31(10):1706–1718

  83. Tarashansky AJ et al (2021) Mapping single-cell atlases throughout Metazoa unravels cell type evolution. eLife 10:e66747

  84. Liu X, Shen Q, Zhang S (2023) Cross-species cell-type assignment from single-cell RNA-seq data by a heterogeneous graph neural network. Genome Res 33(1):96–111

  85. Wang R et al (2023) Construction of a cross-species cell landscape at single-cell level. Nucleic Acids Res 51(2):501–516

  86. Wang J et al (2021) Tracing cell-type evolution by cross-species comparison of cell atlases. Cell Rep 34(9):108803

  87. Bishop CM, Nasrabadi NM (2006) Pattern recognition and machine learning, vol 4. Springer

  88. Gan G, Ma C, Wu J (2020) Data clustering: theory, algorithms, and applications. SIAM

  89. Ross SM (2014) Introduction to probability models. Academic press

  90. Zelle JM (2004) Python programming: an introduction to computer science. Franklin, Beedle & Associates, Inc

  91. Chambers JM (2008) Software for data analysis: programming with R, vol 2. Springer

  92. Moroz LL (2023) Brief history of Ctenophora. Methods Mol Biol. in press

  93. Burkhardt P, Jekely G (2021) Evolution of synapses and neurotransmitter systems: the divide-and-conquer model for early neural cell-type evolution. Curr Opin Neurobiol 71:127–138

  94. Moroz LL, Mukherjee K, Romanova DY (2023) Nitric oxide signaling in ctenophores. Front Neurosci 17:1125433

Acknowledgments

This work was supported in part by the Human Frontier Science Program (RGP0060/2017) and National Science Foundation (IOS-1557923) grants to LLM. Research reported in this publication was also supported in part by the National Institute of Neurological Disorders and Stroke of the National Institutes of Health under Award Number R01NS114491 (to LLM). D.R. was supported by the Russian Science Foundation grant (23-14-00050). The content is solely the authors’ responsibility and does not necessarily represent the official views of the National Institutes of Health.

Author information

Corresponding authors

Correspondence to Yanjun Li, Chaoyue Sun, or Leonid L. Moroz.

Appendix

1.1 Recommended Books

  • Pattern Recognition and Machine Learning [87]

  • Data Clustering: Theory, Algorithms, and Applications [88]

  • Introduction to Machine Learning [40]

  • Introduction to Probability Models [89]

  • Python Programming: An Introduction to Computer Science [90]

  • Software for Data Analysis: Programming with R [91]

1.2 PAGA Graph

The PAGA algorithm enables similarity analysis between the partitions (or clusters) generated by community-detection-based clustering methods. The edge weight, or connectivity, in the PAGA graph carries the essential information about this similarity. This section briefly introduces how the edge weights of the PAGA graph are computed. A partitioned directed graph is denoted as G, containing e edges and n nodes, where each node corresponds to one cell. Group i contains \( n_i \) nodes, which have \( e_i \) outgoing edges in total. The target coarse-grained PAGA graph is represented as \( G^{\ast} = \left(V^{\ast}, E^{\ast}\right) \), where \( {V}^{\ast }=\left\{{v}_1^{\ast },\dots, {v}_M^{\ast}\right\} \) is the set of the M cell groups and \( e^{\ast} \in E^{\ast} \) is a PAGA edge estimated by the PAGA algorithm. A random variable \( \epsilon_{ij} \) describes the number of edges from cell group i to cell group j when edges are connected at random. Its probability \( p\left(\epsilon_{ij}\right) \) is calculated as:

$$ \begin{aligned} p\left(\epsilon_{ij} \mid e_i, n_j, n\right) &= \frac{\Omega_{\epsilon_{ij} \mid e_i, n_j, n}}{\Omega_{\mathrm{total} \mid e_i, n}} = \frac{\binom{e_i}{\epsilon_{ij}}\, n_j^{\epsilon_{ij}} \left(n - n_j - 1\right)^{e_i - \epsilon_{ij}}}{\left(n - 1\right)^{e_i}} \\ &= \binom{e_i}{\epsilon_{ij}} \left(\frac{n_j}{n-1}\right)^{\epsilon_{ij}} \left(1 - \frac{n_j}{n-1}\right)^{e_i - \epsilon_{ij}} \end{aligned} $$

The expression for \( p\left(\epsilon_{ij} \mid e_i, n_j, n\right) \) is a binomial distribution with expectation \( \frac{e_i{n}_j}{n-1} \) and variance \( \frac{e_i{n}_j\left(n-{n}_j-1\right)}{{\left(n-1\right)}^2} \). A new variable \( \epsilon = \epsilon_{ij} + \epsilon_{ji} \) is introduced to provide a symmetric measure of the similarity of two clusters. If the clusters are large enough that each binomial distribution can be approximated by a Gaussian, the distribution of \( \epsilon \) can be approximated as:

$$ p\left(\epsilon \mid e_i, e_j, n_i, n_j, n\right) \sim \mathcal{N}\left(\epsilon \,\middle|\, \frac{e_i n_j + e_j n_i}{n-1},\ \frac{e_i n_j \left(n - n_j - 1\right) + e_j n_i \left(n - n_i - 1\right)}{\left(n-1\right)^2}\right) $$
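The binomial null model and its Gaussian approximation can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the toy values of \( e_i, e_j, n_i, n_j, n \) are hypothetical and chosen for illustration, not taken from the chapter's dataset.

```python
from math import comb

def binomial_pmf(k, e_i, n_j, n):
    """P(eps_ij = k): each of the e_i outgoing edges of group i lands in
    group j with probability n_j / (n - 1) under random connection."""
    p = n_j / (n - 1)
    return comb(e_i, k) * p**k * (1 - p) ** (e_i - k)

def binomial_moments(e_i, n_j, n):
    """Closed-form expectation and variance of eps_ij."""
    mean = e_i * n_j / (n - 1)
    var = e_i * n_j * (n - n_j - 1) / (n - 1) ** 2
    return mean, var

def symmetric_gaussian_params(e_i, e_j, n_i, n_j, n):
    """Mean and variance of eps = eps_ij + eps_ji under the Gaussian
    approximation (the two directions are treated as independent)."""
    mean_ij, var_ij = binomial_moments(e_i, n_j, n)
    mean_ji, var_ji = binomial_moments(e_j, n_i, n)
    return mean_ij + mean_ji, var_ij + var_ji

# Toy values (assumed for illustration only).
e_i, e_j, n_i, n_j, n = 40, 30, 10, 20, 101

# Moments computed directly from the probability mass function.
pmf = [binomial_pmf(k, e_i, n_j, n) for k in range(e_i + 1)]
pmf_total = sum(pmf)
pmf_mean = sum(k * p for k, p in enumerate(pmf))
pmf_var = sum((k - pmf_mean) ** 2 * p for k, p in enumerate(pmf))

closed_mean, closed_var = binomial_moments(e_i, n_j, n)
sym_mean, sym_var = symmetric_gaussian_params(e_i, e_j, n_i, n_j, n)
```

With these toy values, the moments summed from the probability mass function should agree with the closed-form expectation and variance, which is a quick sanity check on the formulas before they are used to define edge weights.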

Suppose the actual number of edges between cluster i and cluster j is \( {\epsilon}_{ij}^{\mathrm{sym}} \) and the expected number of edges is \( {\hat{\epsilon}}_{ij}=\frac{e_i{n}_j+{e}_j{n}_i}{n-1} \); the edge weight \( w_{ij} \) is then defined as:

$$ w_{ij} = \begin{cases} \dfrac{\epsilon_{ij}^{\mathrm{sym}}}{\hat{\epsilon}_{ij}}, & \text{if } \epsilon_{ij}^{\mathrm{sym}} < \hat{\epsilon}_{ij} \\ 1, & \text{otherwise} \end{cases} $$

If the actual number of edges is larger than or equal to the expected value, the connectivity is set to 1, the upper bound of the connectivity value. If the given partitioned graph is undirected, one can convert it to a bidirected graph by replacing each edge with two independent directed edges, one pointing to each of the two linked nodes. The same PAGA calculation strategy can then be applied.
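The edge-weight definition above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the Scanpy implementation: the function name `paga_connectivities`, the edge-list input format, and the toy graph are assumptions, and \( e_i \) is taken here to count every outgoing edge of group i, including edges that stay inside the group.

```python
from collections import Counter
from itertools import combinations

def paga_connectivities(edges, labels):
    """Compute w_ij for every pair of groups.

    edges  : list of (source, target) pairs of a directed cell graph
    labels : dict mapping each node to its group (cluster) label
    """
    n = len(labels)                        # total number of nodes
    group_size = Counter(labels.values())  # n_i
    out_edges = Counter()                  # e_i (assumed to include intra-group edges)
    between = Counter()                    # actual symmetric edge count per group pair

    for src, dst in edges:
        g_src, g_dst = labels[src], labels[dst]
        out_edges[g_src] += 1
        if g_src != g_dst:
            between[frozenset((g_src, g_dst))] += 1

    weights = {}
    for g_i, g_j in combinations(sorted(group_size), 2):
        expected = (out_edges[g_i] * group_size[g_j]
                    + out_edges[g_j] * group_size[g_i]) / (n - 1)
        actual = between[frozenset((g_i, g_j))]
        # Ratio of actual to expected edges, capped at the upper bound 1.
        weights[(g_i, g_j)] = min(actual / expected, 1.0) if expected > 0 else 0.0
    return weights

# Toy graph (hypothetical): six cells in three groups.
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "C"}
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3), (4, 2), (5, 3)]
weights = paga_connectivities(edges, labels)
```

In this toy graph, groups A and B share two edges against an expectation of 3.4, so their weight falls below 1; groups B and C reach the expected count and are capped at 1; groups A and C share no edges and get weight 0.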

Copyright information

© 2024 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Li, Y., Sun, C., Romanova, D.Y., Wu, D.O., Fang, R., Moroz, L.L. (2024). Analysis and Visualization of Single-Cell Sequencing Data with Scanpy and MetaCell: A Tutorial. In: Moroz, L.L. (eds) Ctenophores. Methods in Molecular Biology, vol 2757. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3642-8_17

  • DOI: https://doi.org/10.1007/978-1-0716-3642-8_17

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-3641-1

  • Online ISBN: 978-1-0716-3642-8

  • eBook Packages: Springer Protocols
