Abstract
DNA barcoding has largely established itself as a mainstay for rapid molecular taxonomic identification in both academic and applied research. Its use as a molecular identification method depends on a “DNA barcode gap”: the separation between the maximum within-species difference and the minimum between-species difference. Previous work indicates that the presence of a gap hinges on sampling effort for focal taxa and their close relatives. Furthermore, both theory and empirical work indicate that a gap may not occur for related pairs of biological species. Here, we present a novel evaluation approach in the form of an easily calculated set of nonparametric metrics to quantify the extent of proportional overlap in inter- and intraspecific distributions of pairwise differences among target species and their conspecifics. The metrics are based on a simple count of the number of overlapping records for a species falling within the bounds of maximum intraspecific distance and minimum interspecific distance. Our approach takes advantage of the asymmetric directionality inherent in pairwise genetic distance distributions, which has not previously been done in the DNA barcoding literature. We apply the metrics to the predatory diving beetle genus Agabus as a case study because this group poses significant identification challenges due to its morphological uniformity, despite both relative sampling ease and well-established taxonomy. Results show that target species and their nearest neighbor species are tightly clustered and therefore difficult to distinguish. Such findings demonstrate that DNA barcoding can fail to fully resolve species in certain cases. Moving forward, we suggest that the proposed metrics be integrated into a common framework and reported in any study that uses DNA barcoding for identification. In so doing, the importance of the DNA barcode gap and its components for the success of DNA-based identification can be better appreciated.
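The count-based idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their R code is on GitHub), and it rests on assumptions: the function name `overlap_proportions`, the symbols `p` and `q`, the inclusive comparisons, and the toy distances are all hypothetical. Here `p` is taken to be the share of a species' intraspecific distances that reach or exceed its minimum interspecific distance, and `q` the share of its interspecific distances that fall at or below its maximum intraspecific distance. The two values can differ, which is one plausible reading of the directional asymmetry the abstract mentions.

```python
from typing import Sequence, Tuple


def overlap_proportions(
    intra: Sequence[float], inter: Sequence[float]
) -> Tuple[float, float]:
    """Return (p, q) overlap proportions for one target species.

    intra: pairwise distances among records of the target species.
    inter: pairwise distances between target records and other species.
    Assumed definitions (illustrative, not taken from the chapter):
      p = fraction of intra distances >= min(inter)
      q = fraction of inter distances <= max(intra)
    """
    if len(intra) == 0 or len(inter) == 0:
        raise ValueError("both distance lists must be non-empty")
    max_intra = max(intra)  # upper bound of within-species variation
    min_inter = min(inter)  # lower bound of between-species divergence
    p = sum(d >= min_inter for d in intra) / len(intra)
    q = sum(d <= max_intra for d in inter) / len(inter)
    return p, q


# Hypothetical toy distances, not data from the chapter.
intra = [0.00, 0.01, 0.03, 0.04]
inter = [0.02, 0.05, 0.08, 0.10, 0.12]
p, q = overlap_proportions(intra, inter)
print(f"p = {p:.2f}, q = {q:.2f}")
```

Under these assumed definitions, values of zero for both proportions would correspond to a clean gap for the species, while values approaching one would indicate heavy overlap between the two distributions.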
Acknowledgments
We acknowledge that the University of Guelph resides on the ancestral lands of the Attawandaron people and the treaty lands and territory of the Mississaugas of the Credit. We recognize the significance of the Dish with One Spoon Covenant to this land and offer our respect to our Anishinaabe, Haudenosaunee, and Métis neighbors as we strive to strengthen our relationships with them.
Author Contributions
J.D.P., C.K.G., and R.G.Y. derived and coded the DNA barcode gap coalescent metrics, analyzed and interpreted the data, generated figures, and wrote the manuscript. N.H. and R.H.H. provided insight on DNA barcoding and other applications of the present work. All authors commented on and approved the final version.
Data Accessibility
FASTA files and R code can be found on GitHub at https://github.com/jphill01/DNA-Barcode-Gap-Coalescent.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Phillips, J.D., Griswold, C.K., Young, R.G., Hubert, N., Hanner, R.H. (2024). A Measure of the DNA Barcode Gap for Applied and Basic Research. In: DeSalle, R. (eds) DNA Barcoding. Methods in Molecular Biology, vol 2744. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3581-0_24
DOI: https://doi.org/10.1007/978-1-0716-3581-0_24
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3580-3
Online ISBN: 978-1-0716-3581-0
eBook Packages: Springer Protocols