Assembly of Phytoplasma Genome Drafts from Illumina Reads Using Phytoassembly

Phytoplasmas

Part of the book series: Methods in Molecular Biology (MIMB, volume 1875)

Abstract

Genome drafts of phytoplasmas can be assembled rapidly and efficiently from NGS sequence data alone, provided that the proper bioinformatic tools are used and the samples are properly collected. Here, we describe the use of the Phytoassembly pipeline (https://github.com/cpolano/phytoassembly), a fully automated tool that accepts as input raw Illumina data from two samples (a phytoplasma-infected sample and a healthy reference sample) and produces a phytoplasma genome draft, using the healthy plant host genome as a filter and exploiting the difference in read coverage between the genome of the pathogen and that of the host. For phytoplasma-infected samples containing >2% pathogen DNA and an isogenic healthy reference sequence, the resulting assemblies span almost the entire genome.

References

  1. Tran-Nguyen LTT, Gibb KS (2007) Optimizing phytoplasma DNA purification for genome analysis. J Biomol Tech 18:104–112

  2. Saeed E, Seemüller E, Schneider B et al (1994) Molecular cloning, detection of chromosomal DNA of the Mycoplasmalike organism (MLO) associated with Faba bean (Vicia faba L.) phyllody by southern blot hybridization and the polymerase chain reaction (PCR). J Phytopathol 142:97–106. https://doi.org/10.1111/j.1439-0434.1994.tb04519.x

  3. Oshima K, Kakizawa S, Nishigawa H et al (2004) Reductive evolution suggested from the complete genome sequence of a plant-pathogenic phytoplasma. Nat Genet 36:27–29. https://doi.org/10.1038/ng1277

  4. Bai X, Zhang J, Ewing A et al (2006) Living with genome instability: the adaptation of phytoplasmas to diverse environments of their insect and plant hosts. J Bacteriol 188:3682–3696. https://doi.org/10.1128/JB.188.10.3682-3696.2006

  5. Kube M, Schneider B, Kuhl H et al (2008) The linear chromosome of the plant-pathogenic mycoplasma “Candidatus Phytoplasma mali”. BMC Genomics 9:306. https://doi.org/10.1186/1471-2164-9-306

  6. Tran-Nguyen LTT, Kube M, Schneider B et al (2008) Comparative genome analysis of “Candidatus Phytoplasma australiense” (subgroup tuf-Australia I; rp-a) and “Ca. Phytoplasma asteris” strains OY-M and AY-WB. J Bacteriol 190:3979–3991. https://doi.org/10.1128/JB.01301-07

  7. Andersen MT, Liefting LW, Havukkala I, Beever RE (2013) Comparison of the complete genome sequence of two closely related isolates of “Candidatus Phytoplasma australiense” reveals genome plasticity. BMC Genomics 14:529. https://doi.org/10.1186/1471-2164-14-529

  8. Mitrovic J, Siewert C, Duduk B et al (2014) Generation and analysis of draft sequences of “Stolbur” phytoplasma from multiple displacement amplification templates. J Mol Microbiol Biotechnol 24:1–11. https://doi.org/10.1159/000353904

  9. Lee I-M, Shao J, Bottner-Parker KD et al (2015) Draft genome sequence of “Candidatus Phytoplasma pruni” strain CX, a plant-pathogenic bacterium. Genome Announc 3:e01117-15. https://doi.org/10.1128/genomeA.01117-15

  10. Kakizawa S, Makino A, Ishii Y et al (2014) Draft genome sequence of “Candidatus Phytoplasma asteris” strain OY-V, an unculturable plant-pathogenic bacterium. Genome Announc 2:e00944-14. https://doi.org/10.1128/genomeA.00944-14

  11. Fischer A, Santana-Cruz I, Wambua L et al (2016) Draft genome sequence of “Candidatus Phytoplasma oryzae” strain Mbita1, the causative agent of Napier grass stunt disease in Kenya. Genome Announc 4:e00297-16. https://doi.org/10.1128/genomeA.00297-16

  12. Saccardo F, Martini M, Palmano S et al (2012) Genome drafts of four phytoplasma strains of the ribosomal group 16SrIII. Microbiology 158:2805–2814. https://doi.org/10.1099/mic.0.061432-0

  13. Chung W-C, Chen L-L, Lo W-S et al (2013) Comparative analysis of the peanut witches’-broom phytoplasma genome reveals horizontal transfer of potential mobile units and effectors. PLoS One 8:e62770. https://doi.org/10.1371/journal.pone.0062770

  14. Davis RE, Zhao Y, Dally EL et al (2013) “Candidatus Phytoplasma pruni”, a novel taxon associated with X-disease of stone fruits, Prunus spp.: multilocus characterization based on 16S rRNA, secY, and ribosomal protein genes. Int J Syst Evol Microbiol 63:766–776. https://doi.org/10.1099/ijs.0.041202-0

  15. Quaglino F, Zhao Y, Casati P et al (2013) “Candidatus Phytoplasma solani”, a novel taxon associated with stolbur- and bois noir-related diseases of plants. Int J Syst Evol Microbiol 63:2879–2894. https://doi.org/10.1099/ijs.0.044750-0

  16. Chen W, Li Y, Wang Q et al (2014) Comparative genome analysis of wheat blue dwarf phytoplasma, an obligate pathogen that causes wheat blue dwarf disease in China. PLoS One 9:e96436. https://doi.org/10.1371/journal.pone.0096436

  17. Quaglino F, Kube M, Jawhari M et al (2015) “Candidatus Phytoplasma phoenicium” associated with almond witches’-broom disease: from draft genome to genetic diversity among strain populations. BMC Microbiol 15:148. https://doi.org/10.1186/s12866-015-0487-4

  18. Tritt A, Eisen JA, Facciotti MT, Darling AE (2012) An integrated pipeline for de novo assembly of microbial genomes. PLoS One 7:e42304. https://doi.org/10.1371/journal.pone.0042304

  19. Aziz RK, Bartels D, Best AA et al (2008) The RAST server: rapid annotations using subsystems technology. BMC Genomics 9:75. https://doi.org/10.1186/1471-2164-9-75

  20. Glass EM, Meyer F (2011) The Metagenomics RAST server: a public resource for the automatic phylogenetic and functional analysis of metagenomes. In: Metagenomics complement approaches, Handbook of molecular microbial ecology, vol 8, pp 325–331. https://doi.org/10.1002/9781118010518.ch37

  21. Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760. https://doi.org/10.1093/bioinformatics/btp324

Correspondence to Giuseppe Firrao.

Appendix: How Phytoassembly Works

  1. If no assembly of the healthy plant was provided, the procedure uses the A5 pipeline [18] to assemble the healthy plant sequence reads (Healthy.contigs.fasta). The remaining files are archived. Next, the diseased plant reads are assembled (producing the file Diseased.contigs.fasta). A step in the A5 pipeline produces error-corrected reads (Diseased.ec.fastq), which are used in all subsequent steps.

  2. The assembled reference sequence file is then indexed, and the error-corrected reads are aligned to it, with the BWA tool [21] using the index and mem commands. Using the samtools (http://www.htslib.org/doc/samtools.html) commands view, sort, index, and idxstats, a summary of statistics is produced (Diseased.sorted.csv), consisting of the reference sequence name, the sequence length, and the numbers of mapped and unmapped reads.

  3. This summary is passed to phytocount.pl to estimate a cutoff value: the pipeline runs once with cutoff 0, then uses a fraction of the ratio between the sum of the lengths of the non-mapping reads at cutoff 0 (Stage2.0.nonmatch.fastq, see below) and the sum of the lengths of the error-corrected reads (Diseased.ec.fastq) of the diseased plant. Alternatively, if the user supplies a range of fixed cutoff values, the pipeline repeats the following steps from the lowest to the highest value provided (represented here as $cutoffval).

  4. From the summary of statistical data (Diseased.sorted.csv), per-contig coverages are calculated and saved in a text file (Diseased.sorted.cov.csv).

  5. The contigs with a coverage higher than $cutoffval are exported (Diseased.cutoff.$cutoffval.fasta, where $cutoffval is, e.g., “10”). The error-corrected reads from the diseased plant (Assembly.ec.fastq) are then aligned to the contigs in that file using BWA (Stage1.$cutoffval.match.sam).

  6. Using phytofilter.pl, the reads that align to the above-cutoff contigs are extracted from the alignment file and exported (Stage1.$cutoffval.match.fastq), using the SAM flag 4 (“the query sequence itself is unmapped”) as the filter.

  7. These reads are then aligned with BWA against the healthy plant reference (Healthy.contigs.fasta), and the reads that do not align are exported (Stage2.$cutoffval.nonmatch.fastq). These non-aligned reads are assembled with the A5 pipeline (Stage3.$cutoffval.contigs.fasta).

  8. Using phytoblast.pl, a BLAST nucleotide database is created from the reference healthy plant file (Healthy.contigs.fasta, which could also be a combination of different references), and the contigs produced by the previous stage (Stage3.$cutoffval.contigs.fasta) are searched against it using tblastx. The results are saved in a table (Stage3.$cutoffval.contigs.csv), which is then filtered according to the identity percentage (IP): entries with an IP greater than 95% are attributed to the plant (Stage3.$cutoffval.contigs.plant.csv), while those with a lower IP are attributed to the phytoplasma (Stage3.$cutoffval.contigs.phyto.csv). Using this last file, the contigs pertaining to the phytoplasma are extracted from the query (Stage3.$cutoffval.phyto.fasta).

  9. Lastly, the main outputs are archived and moved to a folder (Results_$timestamp), statistics such as contig size and number are calculated, and intermediate files are moved to a subfolder (Other_files).
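The coverage filter described in steps 2, 4, and 5 can be illustrated with a short Python sketch. It is not taken from the Phytoassembly scripts. It assumes tab-separated idxstats-style rows (reference name, sequence length, mapped reads, unmapped reads) and approximates coverage as mapped reads × read length / contig length; the function names and the read_length parameter are hypothetical, and the coverage formula used by the pipeline itself may differ.

```python
import csv
import io


def contig_coverage(idxstats_text, read_length):
    """Return {contig name: approximate mean coverage}.

    Assumption: coverage ~ mapped reads * read length / contig length.
    """
    coverage = {}
    for row in csv.reader(io.StringIO(idxstats_text), delimiter="\t"):
        # Skip malformed rows and the "*" row that collects unplaced reads.
        if len(row) < 4 or row[0] == "*":
            continue
        name, length, mapped = row[0], int(row[1]), int(row[2])
        if length > 0:
            coverage[name] = mapped * read_length / length
    return coverage


def contigs_above_cutoff(coverage, cutoff):
    """Names of the contigs whose coverage is strictly above the cutoff."""
    return sorted(name for name, value in coverage.items() if value > cutoff)


# Toy table: one high-coverage contig and one low-coverage contig.
example = "contig_1\t10000\t4000\t0\ncontig_2\t50000\t1000\t0\n*\t0\t0\t250\n"
coverage = contig_coverage(example, read_length=100)
print(coverage)
print(contigs_above_cutoff(coverage, cutoff=10))
```

In this toy table, the short contig with many mapped reads passes a cutoff of 10, while the long contig with few mapped reads does not, which is the behavior expected when the pathogen genome is covered more deeply than the host genome.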
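The ratio that step 3 uses for the automatic cutoff can be sketched in the same way. The Python example below only computes the ratio between the summed lengths of the non-mapping reads and the summed lengths of all error-corrected reads, assuming plain 4-line FASTQ records. The text does not state which fraction of this ratio phytocount.pl applies, or how the result is scaled, so that part is deliberately left out; the function names are hypothetical.

```python
def total_read_length(fastq_lines):
    """Sum the sequence lengths of 4-line FASTQ records."""
    return sum(
        len(line.strip())
        for index, line in enumerate(fastq_lines)
        if index % 4 == 1  # the second line of each record holds the bases
    )


def nonmapping_ratio(nonmatch_fastq_lines, all_fastq_lines):
    """Ratio of non-mapping bases to all error-corrected bases."""
    total = total_read_length(all_fastq_lines)
    if total == 0:
        return 0.0
    return total_read_length(nonmatch_fastq_lines) / total


# Toy data standing in for Stage2.0.nonmatch.fastq and Diseased.ec.fastq.
nonmatch = ["@r2", "ACGT", "+", "IIII"]
all_reads = [
    "@r1", "ACGTACGT", "+", "IIIIIIII",
    "@r2", "ACGT", "+", "IIII",
    "@r3", "ACGT", "+", "IIII",
]
print(nonmapping_ratio(nonmatch, all_reads))
```

Both arguments can be any iterable of lines, so open file handles for the two FASTQ files could be passed directly.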
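Steps 6 and 7 both separate reads according to whether they aligned. A minimal Python sketch of that separation, based on bit 0x4 of the SAM FLAG field, is given below. It is an illustration under simplifying assumptions rather than a reimplementation of phytofilter.pl: it reads uncompressed SAM text, does not restore the original orientation of reverse-strand reads, and does not remove secondary or supplementary alignment lines. The function name is hypothetical.

```python
UNMAPPED = 0x4  # SAM FLAG bit: the read itself is unmapped


def split_sam_reads(sam_lines):
    """Split SAM alignment lines into (mapped, unmapped) FASTQ records."""
    mapped, unmapped = [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment line
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        # Simplification: SEQ is written as stored in the SAM line, so
        # reverse-strand reads stay reverse-complemented.
        record = f"@{name}\n{seq}\n+\n{qual}\n"
        if flag & UNMAPPED:
            unmapped.append(record)
        else:
            mapped.append(record)
    return mapped, unmapped


example = [
    "@SQ\tSN:contig_1\tLN:10000",
    "r1\t0\tcontig_1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
]
mapped, unmapped = split_sam_reads(example)
print(len(mapped), len(unmapped))
```

The mapped list corresponds to the kind of output described in step 6 (reads that match the selected contigs), and the unmapped list to the kind described in step 7 (reads that do not match the healthy reference).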
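The identity-percentage filter of step 8 can be sketched as follows. The Python example assumes a comma-separated hit table with the query contig in the first column and the percent identity in the third column, which is an assumption about the layout of Stage3.$cutoffval.contigs.csv. Excluding a contig that has both a high-identity and a low-identity hit is also an assumption of this sketch; the text does not say how phytoblast.pl treats such contigs or contigs without any hit. The function names are hypothetical.

```python
import csv
import io


def split_by_identity(table_text, threshold=95.0):
    """Split hit rows into (plant, phytoplasma) by percent identity."""
    plant, phyto = [], []
    for row in csv.reader(io.StringIO(table_text)):
        if len(row) < 3:
            continue  # skip blank or incomplete rows
        if float(row[2]) > threshold:
            plant.append(row)
        else:
            phyto.append(row)
    return plant, phyto


def phytoplasma_contigs(plant_rows, phyto_rows):
    """Contigs with low-identity hits only (assumption of this sketch)."""
    plant_ids = {row[0] for row in plant_rows}
    return sorted({row[0] for row in phyto_rows} - plant_ids)


example = "c1,h1,99.1\nc2,h2,62.5\nc3,h3,80.0\nc3,h4,97.0\n"
plant_rows, phyto_rows = split_by_identity(example)
print(phytoplasma_contigs(plant_rows, phyto_rows))
```

The 95% threshold is the value given in step 8; it is exposed as a parameter here only so that the effect of a different threshold can be explored.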

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Polano, C., Firrao, G. (2019). Assembly of Phytoplasma Genome Drafts from Illumina Reads Using Phytoassembly. In: Musetti, R., Pagliari, L. (eds) Phytoplasmas. Methods in Molecular Biology, vol 1875. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8837-2_16

  • DOI: https://doi.org/10.1007/978-1-4939-8837-2_16

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-8836-5

  • Online ISBN: 978-1-4939-8837-2

  • eBook Packages: Springer Protocols
