A Measure of the DNA Barcode Gap for Applied and Basic Research

  • Protocol
DNA Barcoding

Part of the book series: Methods in Molecular Biology (MIMB, volume 2744)

Abstract

DNA barcoding has largely established itself as a mainstay for rapid molecular taxonomic identification in both academic and applied research. The use of DNA barcoding as a molecular identification method depends on a “DNA barcode gap”: the separation between the maximum within-species difference and the minimum between-species difference. Previous work indicates that the presence of a gap hinges on sampling effort for focal taxa and their close relatives. Furthermore, both theory and empirical work indicate that a gap may not occur for related pairs of biological species. Here, we present a novel evaluation approach in the form of an easily calculated set of nonparametric metrics to quantify the extent of proportional overlap in inter- and intraspecific distributions of pairwise differences among target species and their conspecifics. The metrics are based on a simple count of the number of overlapping records for a species falling within the bounds of maximum intraspecific distance and minimum interspecific distance. Our approach takes advantage of the asymmetric directionality inherent in pairwise genetic distance distributions, which has not previously been done in the DNA barcoding literature. We apply the metrics to the predatory diving beetle genus Agabus as a case study because this group poses significant identification challenges due to its morphological uniformity despite both relative sampling ease and well-established taxonomy. Results show that target species and their nearest neighbor species are tightly clustered and therefore difficult to distinguish. Such findings demonstrate that DNA barcoding can fail to fully resolve species in certain cases. Moving forward, we suggest that the proposed metrics be integrated into a common framework and reported in any study that uses DNA barcoding for identification. In so doing, the importance of the DNA barcode gap and its components for the success of DNA-based identification using DNA barcodes can be better appreciated.
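
The counting idea described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation (the authors' R code is in the GitHub repository listed under Data Accessibility), and it rests on several assumptions: distances are uncorrected p-distances on pre-aligned sequences, the toy records are invented for illustration, and the names `p_distance`, `barcode_gap_overlap`, `p`, and `q` are hypothetical. Here `p` is taken to be the share of intraspecific distances that reach or exceed the minimum interspecific distance, and `q` the share of interspecific distances that fall at or below the maximum intraspecific distance, which is one plausible reading of "proportional overlap".

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def barcode_gap_overlap(records, target):
    """Summarize the barcode gap for one target species.

    records: list of (species_label, aligned_sequence) tuples.
    target:  label of the focal species.
    The definitions of p and q below are assumptions made for this sketch.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        if sp1 == target and sp2 == target:
            intra.append(p_distance(s1, s2))   # within the target species
        elif (sp1 == target) != (sp2 == target):
            inter.append(p_distance(s1, s2))   # target versus any other species
    if not intra or not inter:
        raise ValueError("need at least two target records and one non-target record")

    max_intra = max(intra)
    min_inter = min(inter)
    return {
        "max_intra": max_intra,
        "min_inter": min_inter,
        # A positive value means a gap exists; zero or negative means overlap.
        "gap": min_inter - max_intra,
        # Share of intraspecific distances at or above the minimum interspecific distance.
        "p": sum(d >= min_inter for d in intra) / len(intra),
        # Share of interspecific distances at or below the maximum intraspecific distance.
        "q": sum(d <= max_intra for d in inter) / len(inter),
    }


# Invented toy alignment: three records of species "A", two of species "B".
records = [
    ("A", "ACGTACGTAC"),
    ("A", "ACGTACGTAT"),
    ("A", "ACGTACGTTT"),
    ("B", "ACGTACGATT"),
    ("B", "TCGAACGATT"),
]

result = barcode_gap_overlap(records, "A")
print(result)
```

Reporting the two proportions separately, rather than a single overlap figure, keeps the direction of the overlap visible: values near zero for both would suggest a clean gap for the target species, while larger values indicate how much of each distribution intrudes on the other.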

Acknowledgments

We acknowledge that the University of Guelph resides on the ancestral lands of the Attawandaron people and the treaty lands and territory of the Mississaugas of the Credit. We recognize the significance of the Dish with One Spoon Covenant to this land and offer our respect to our Anishinaabe, Haudenosaunee, and Métis neighbors as we strive to strengthen our relationships with them.

Author Contributions

J.D.P., C.K.G., and R.G.Y. derived and coded the DNA barcode gap coalescent metrics, analyzed and interpreted the data, generated figures, and wrote the manuscript. N.H. and R.H.H. provided insight on DNA barcoding and other applications of the present work. All authors commented on and approved the final version.

Data Accessibility

FASTA files and R code can be found on GitHub at https://github.com/jphill01/DNA-Barcode-Gap-Coalescent.

Author information

Corresponding author

Correspondence to Jarrett D. Phillips.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Phillips, J.D., Griswold, C.K., Young, R.G., Hubert, N., Hanner, R.H. (2024). A Measure of the DNA Barcode Gap for Applied and Basic Research. In: DeSalle, R. (ed) DNA Barcoding. Methods in Molecular Biology, vol 2744. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3581-0_24

  • DOI: https://doi.org/10.1007/978-1-0716-3581-0_24

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-3580-3

  • Online ISBN: 978-1-0716-3581-0

  • eBook Packages: Springer Protocols
