Reference Genome Sequence of Flax

The Flax Genome

Part of the book series: Compendium of Plant Genomes (CPG)

Abstract

Flax (Linum usitatissimum L.) is an economically important fibre and oilseed crop with a relatively small genome, estimated at 370–455 Mb depending on the genotype. The first and second versions of the flax reference genome sequence, both for the Canadian linseed cultivar CDC Bethune, were released in 2012 and 2018, respectively. Since then, several other representative flax genotypes have been sequenced using second- and third-generation sequencing technologies: the Chinese linseed cultivar Longya-10, the Chinese fibre cultivars Heiya-14 and Yiya-5, the Russian fibre cultivar Atlant, and a pale flax (Linum bienne) accession. These genome sequences provide a wealth of genomic information that supports research toward a better understanding of the flax genome and facilitates genetic studies in flax. This chapter briefly reviews the major advances in flax genome sequencing, assembly, and annotation, and in the comparative analysis of these genome sequences.

Acknowledgements

The authors thank Dr. Bourlaye Fofana for reviewing and editing and Tara Edwards for English editing.

Author information

Corresponding author

Correspondence to Frank M. You.

Copyright information

© 2023 His Majesty the King in Right of Canada, as represented by the Minister of Agriculture and Agri-Food

About this chapter

Cite this chapter

You, F.M., Moumen, I., Khan, N., Cloutier, S. (2023). Reference Genome Sequence of Flax. In: You, F.M., Fofana, B. (eds) The Flax Genome. Compendium of Plant Genomes. Springer, Cham. https://doi.org/10.1007/978-3-031-16061-5_1
