A Bioinformatic Pipeline to Estimate Ploidy Level from Target Capture Sequence Data Obtained from Herbarium Specimens

Viruel, Juan; Hidalgo, Oriane; Pokorny, Lisa; Forest, Félix; Gravendeel, Barbara; Wilkin, Paul; Leitch, Ilia J.

doi:10.1007/978-1-0716-3226-0_5

Part of the book series: Methods in Molecular Biology (MIMB, volume 2672)

Abstract

Whole genome duplications (WGD) are frequent in many plant lineages; however, ploidy level variation remains unknown in most species. The most widely used methods to estimate ploidy level in plants are chromosome counts, which require living specimens, and flow cytometry, which requires living or relatively recently collected samples. Bioinformatic methods have recently been developed to estimate ploidy level from high-throughput sequencing data, and they have been optimized for plants by calculating allelic ratios from target capture data. The approach relies on the allelic ratios of the genome being preserved in the sequence data. For example, a diploid individual yields allelic data in a 1:1 proportion at heterozygous sites, whereas individuals with higher ploidy levels show an increasing number of possible allelic ratio combinations. In this chapter, we explain this bioinformatic approach to ploidy level estimation step by step.
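The allelic ratio idea described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's pipeline: the function names `expected_allele_fractions` and `allele_fraction` are hypothetical, and the sketch assumes biallelic heterozygous sites with no sequencing error or capture bias. Under those assumptions, an individual of ploidy n can carry 1 to n-1 copies of a given allele, so the expected read fractions are k/n.

```python
from fractions import Fraction


def expected_allele_fractions(ploidy):
    """Expected read fractions of one allele at a heterozygous biallelic site.

    Illustrative assumption: the allele is present in k of `ploidy` copies
    (k = 1 .. ploidy - 1) and reads sample the copies without bias.
    """
    if ploidy < 2:
        raise ValueError("ploidy must be at least 2")
    return [Fraction(k, ploidy) for k in range(1, ploidy)]


def allele_fraction(ref_reads, alt_reads):
    """Observed fraction of reads supporting the alternative allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("site has no read coverage")
    return Fraction(alt_reads, total)


if __name__ == "__main__":
    # Diploid: a single expected fraction of 1/2 (the 1:1 proportion).
    # Triploid: 1/3 and 2/3. Tetraploid: 1/4, 1/2 and 3/4.
    for ploidy in (2, 3, 4):
        fractions = ", ".join(str(f) for f in expected_allele_fractions(ploidy))
        print(f"ploidy {ploidy}: {fractions}")
```

In real data the observed fractions scatter around these expected values, so ploidy is inferred from the distribution of allele fractions across many sites rather than from any single site.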

Author information

Authors and Affiliations

Royal Botanic Gardens, Kew, Richmond, UK
Juan Viruel, Oriane Hidalgo, Lisa Pokorny, Félix Forest, Paul Wilkin & Ilia J. Leitch
Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Catalonia, Spain
Oriane Hidalgo & Lisa Pokorny
Naturalis Biodiversity Center, Evolutionary Ecology, Leiden, Netherlands
Barbara Gravendeel
Radboud Institute for Biological and Environmental Sciences, Leiden University, Leiden, Netherlands
Barbara Gravendeel
Real Jardín Botánico (RJB-CSIC), Madrid, Spain
Lisa Pokorny

Corresponding author

Correspondence to Juan Viruel.

Editor information

Editors and Affiliations

Institute of Botany, TU Dresden, Dresden, Germany
Tony Heitkam
Botanical Institute of Barcelona, Barcelona, Spain
Sònia Garcia

About this protocol

Cite this protocol

Viruel, J. et al. (2023). A Bioinformatic Pipeline to Estimate Ploidy Level from Target Capture Sequence Data Obtained from Herbarium Specimens. In: Heitkam, T., Garcia, S. (eds) Plant Cytogenetics and Cytogenomics. Methods in Molecular Biology, vol 2672. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3226-0_5

DOI: https://doi.org/10.1007/978-1-0716-3226-0_5
Published: 20 June 2023
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3225-3
Online ISBN: 978-1-0716-3226-0
eBook Packages: Springer Protocols
