Abstract
Flax (Linum usitatissimum L.) is an economically important fibre and oilseed crop with a relatively small genome, estimated at 370–455 Mb depending on the genotype. The first and second versions of the flax reference genome sequence for the Canadian linseed cultivar CDC Bethune were released in 2012 and 2018, respectively. Since then, a few other representative flax genotypes have been sequenced using second- and third-generation sequencing technologies: the Chinese linseed cultivar Longya-10, the Chinese fibre cultivars Heiya-14 and Yiya-5, the Russian fibre cultivar Atlant, and a pale flax (Linum bienne) accession. These genome sequences provide a wealth of genomic information that supports research toward a better understanding of the flax genome and facilitates genetic studies in flax. This chapter briefly reviews the major advances in flax genome sequencing, assembly, and annotation, and in comparative analysis of these genome sequences.
Acknowledgements
The authors thank Dr. Bourlaye Fofana for reviewing and editing and Tara Edwards for English editing.
Copyright information
© 2023 His Majesty the King in Right of Canada, as represented by the Minister of Agriculture and Agri-Food
About this chapter
Cite this chapter
You, F.M., Moumen, I., Khan, N., Cloutier, S. (2023). Reference Genome Sequence of Flax. In: You, F.M., Fofana, B. (eds) The Flax Genome. Compendium of Plant Genomes. Springer, Cham. https://doi.org/10.1007/978-3-031-16061-5_1
DOI: https://doi.org/10.1007/978-3-031-16061-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16060-8
Online ISBN: 978-3-031-16061-5
eBook Packages: Biomedical and Life Sciences (R0)