Lampshade web spider Ectatosticta davidi chromosome-level genome assembly provides evidence for its phylogenetic position

Fan, Zheng; Wang, Lu-Yu; **ao, Lin; Tan, Bing; Luo, Bin; Ren, Tian-Yu; Liu, Ning; Zhang, Zhi-Sheng; Bai, Ming

doi:10.1038/s42003-023-05129-x

Lampshade web spider Ectatosticta davidi chromosome-level genome assembly provides evidence for its phylogenetic position

Article
Open access
Published: 18 July 2023

Volume 6, article number 748, (2023)
Cite this article

Download PDF

You have full access to this open access article

Communications Biology

Lampshade web spider Ectatosticta davidi chromosome-level genome assembly provides evidence for its phylogenetic position

Download PDF

Zheng Fan^1,2^na1,
Lu-Yu Wang²^na1,
Lin **ao²^na1,
Bing Tan²,
Bin Luo²,
Tian-Yu Ren²,
Ning Liu ORCID: orcid.org/0000-0002-4601-7532¹^na2,
Zhi-Sheng Zhang ORCID: orcid.org/0000-0002-9304-1789²^na2 &
…
Ming Bai ORCID: orcid.org/0000-0001-9197-5900^1,3,4^na2

1613 Accesses
2 Citations
1 Altmetric
Explore all metrics

Abstract

The spider of Ectatosticta davidi, belonging to the lamp-shade web spider family, Hypochilidae, which is closely related to Hypochilidae and Filistatidae and recovered as sister of the rest Araneomorphs spiders. Here we show the final assembled genome of E. davidi with 2.16 Gb in 15 chromosomes. Then we confirm the evolutionary position of Hypochilidae. Moreover, we find that the GMC gene family exhibit high conservation throughout the evolution of true spiders. We also find that the MaSp genes of E. davidi may represent an early stage of MaSp and MiSp genes in other true spiders, while CrSp shares a common origin with AgSp and PySp but differ from MaSp. Altogether, this study contributes to addressing the limited availability of genomic sequences from Hypochilidae spiders, and provides a valuable resource for investigating the genomic evolution of spiders.

Insights into the karyotype and genome evolution of haplogyne spiders indicate a polyploid origin of lineage with holokinetic chromosomes

Article Open access 28 February 2019

The house spider genome reveals an ancient whole-genome duplication during arachnid evolution

Article Open access 31 July 2017

Chromosome-level genome assembly of the snakefly Mongoloraphidia duomilia (Raphidioptera: Raphidiidae)

Article Open access 04 June 2024

Introduction

Spiders (Araneae) are one of the most successful terrestrial arthropod groups, with high diversity (>51,000 described species) worldwide¹. The vast majority of spiders (>93%) belong to the infraorder Araneomorphae (suborder Opisthothelae), also known as true or modern spiders. The lampshade web spider family Hypochilidae had ever been thought as the sister group of all other true spiders^2,3,4,5. However, recent phylogenomic analysis confirmed that it was the sister group of the crevice weaver spider family Filistatidae and the sistership of (Hypochilidae + Filistatidae) with Haplogynae or Synspermiata^6,7,8, a true spider clade with relatively simple genitalia.

Morphologically, several primitive characters of Araneomorphae have been presented in Hypochilidae (Ectatosticta and Hypochilus), such as two pairs of booklungs, a wide and short, undivided cribellum, and simple genitalia^3,9. However, there is little molecular evidence representing this primitive spider group, which include the mitochondrial genome of Hypochilus thorelli and some conserved genes of both genera for phylogenetic analysis or species delimitation^{8,10,11,12,13,14,15,16}.

Genomic data offers a large amount of genetic information for species, enabling a deeper understanding of their evolution, adaptation, and serving as a basis for further investigations into their biological mechanisms and practical applications. Currently, there are a total of 30 publicly accessible spider genome sequences by April, 2023 (Supplementary Table 1). These resources have made important contributions to research on adaptive evolution^{17,18,19,20,21,22,23,24}, behavior²⁵, and unique spider traits like silk production^26,27,28,29 and venom composition^30,31,32. However, it is important to note that the available spider genome data represent only a fraction of the genetic diversity found within the vast number of spider species, amounting to less than 1000th of the total species. This highlights the pressing need for further genomic studies to encompass a broader range of spiders and enhance our understanding of their genetic landscape.

The spider Ectatosticta davidi (Supplementary Fig. 1), belongs to the hypochilid genus, Ectatosticta from China, which can be usually found in valleys above 1000 m of altitude, building a large sheet web under/inside stones, caves, earth crevices, and tree cavities near rivers or in humid habitat¹⁴. The Ectatosticta spiders often hang themselves under their web, like spiders of Pimoidae and Psechridae. Here, we obtained a high-quality genome sequence of E. davidi, which is helpful to get more genetic characteristics, refine the phylogenetic position of this group, and further our understanding of their environmental adaptative evolution.

Results

Chromosome-level genome sequencing, assembly, and annotation

We obtained ~115 Gb of data via Illumina short-read sequencing, 193 Gb via PacBio long-read sequencing, and 278 Gb via Hi-C read sequencing, corresponding to 53×, 89×, and 128× genome coverage, respectively.

Evaluation of genome characteristics indicated that the genome was ~1.9 Gb, and the heterozygosity was 1.19–1.29% (Supplementary Fig. 2), thus suggesting a complex genome of E. davidi. We obtained a draft genome assembly of 2.16 Gb in length with a scaffold N50 value of 146.18 Mb (Supplementary Table 2), and the complete BUSCO analysis was 95.4% (of which 90.8% was single-copy), which ensured its suitability for downstream analysis. Each step of genome assembly is shown in Supplementary Table 2. The de novo genome assembly of E. davidi mainly comprised 15 chromosomes (Fig. 1a).

**Fig. 1: The genome fetures of *Ectatosticta davidi* .**

A total of 1.44 Gb of repeat sequences, accounting for 66.73% of the E. davidi genome, were identified (Supplementary Table 3 and Fig. 1b). Specifically, 2.54% of repeat sequences were short interspersed nuclear elements (SINEs), 10.69% long interspersed nuclear elements (LINEs), 2.16% long terminal repeats (LTRs), 10.83% DNA transposons, and 36.21% unclassified. In addition, we identified 11.66 Mb of small RNA sequences, 3.73 Mb of satellites, 21.50 Mb of simple repeats, and 1.79 Mb of low complexity.

Three methods were used for gene prediction, and 15,651 genes were annotated. A total of 15,392 genes (98.34%) were anchored to 15 chromosomes. The average gene length was 33,888.9 bp, and the average intron length was 3,735.67 bp. At the protein level, BUSCO completeness score was 90.4% (n = 1013), including 832 (82.1%) single-copy genes and 84 (8.3%) duplicated genes. Approximately 15,600 (~99.67%) genes were functionally annotated using the SwissProt or TrEMBL databases. InterProScan and EggnOG analyses identified protein domains for 13,670 (87.34%) genes, 11,296 GO terms, 10,259 KEGG ko terms, 6357 KEGG pathways, 14,090 COG categories, and 3457 enzyme codes. We also identified 10,866 noncoding RNAs, including 270 miRNAs, 81 rRNAs, 305 snRNAs, 6 ribozymes, and 753 other RNAs. A total of 9451 tRNAs were identified, which accounted for the majority of noncoding RNAs.

We noticed that the piggyBac transposases are greatly expanded in the E. davidi genome (Supplementary Tables 3 and 5). We identified 58 piggyBac genes in the E. davidi genome, including seven PGBD1, three PGBD2, 19 PGBD3, and 29 PGBD4 genes (Supplementary Table 6). And the piggyBac genes were distributed among all chromosomes in the E. davidi genome (Supplementary Fig. 3).

Synteny analysis among E. davidi, Trichonephila antipodiana (Araneidae), and Latrodectus elegans (Theridiidae)

Synteny analysis for E. davidi and T. antipodiana showed that the genomes of the two species had 292 syntenic blocks with 4547 collinear genes (Fig. 1c), that for E. davidi and L. elegans showed 140 syntenic blocks with 2056 collinear genes (Fig. 1d), and that for L. elegans and T. antipodiana showed 327 syntenic blocks with 5857 collinear genes (Fig. 1e).

Phylogenetic analysis

A total of 347 single-copy genes were used to construct phylogenetic relationships (Fig. 2). The phylogenetic tree revealed that the divergence time of true spiders is 288.20 Ma, whereas the lampshade web spider emerged in 240.96 Ma.

**Fig. 2: Phylogenetic and expansion gene family analyses of *E. davidi*.**

A total of 1065 expanded gene families and 4390 contracted gene families were identified in E. davidi. Among them, 110 gene families underwent rapid evolution (P < 0.05), with 55 rapidly evolving expanding families and 55 rapidly evolving contracting families (Fig. 2).

GMC gene family

The Gld genes, which belong to the enzymes of the glucose-methanol-choline (GMC) oxidoreductase family, were greatly expanded in the E. davidi genome, compared with other seven representative spider species. This is the first report of GMC gene family in spiders. In the E. davidi genome, the GMC gene showed an expansion of 44 copies. We also identified GMC genes in other spiders, including 27 in the genome of Argiope bruennichi (Araneidae), 19 in Caerostris darwini (Araneidae), 30 in Caerostris extrusa (Araneidae), 34 in Nephila pilipes (Araneidae), 37 in Parasteatoda tepidariorum (Theridiidae), 14 in Stegodyphus dumicola (Eresidae), 13 in Stegodyphus mimosarum (Eresidae), 16 in Trichonephila antipodiana (Araneidae), and 25 in Trichonephila clavipes (Supplementary Fig. 4b and Supplementary Table 7).

We build a phylogenetic tree of the GMC genes between the E. davidi and some representative insects such as fruit fly D. melanogaster, mosquito Anopheles gambiae, the honeybee A. mellifera, and the flour beetle Tribolium castaneum (Fig. 3a). The GMC genes of E. davidi were separated into two subfamilies: NinaG, which is also found in insects, and an unknown spider-specific subfamily.

**Fig. 3: GMC gene family analysis in *E. davidi*.**

To analyze the spider-specific GMC genes, we build an ML tree with eight spiders and the Arizona bark scorpion C. sculpturatus as the outgroup (Fig. 3b). The tree showed four major clades (excluding outgroup sequences), and the sequences clustered in each clade were classified as subfamilies. Bootstrap resampling analysis indicated that the clustering of these subfamilies was reliable. We found that the GMC genes of E. davidi in most subfamilies were at the position of the sister to the rest genes, which is the same with its phylogenetic position. In this study, we did not name these spider-specific subfamilies.

To investigate the function of GMC genes in spiders, we examined the expression of these genes. Because of insufficient tissue from E. davidi for RNA sequencing, we downloaded the P. tepidariorum transcriptome at different stages (stages 1–10) (Supplementary Fig. 4c). In P. tepidariorum, some GMC genes, such as LOC107453087, were expressed at all stages (Supplementary Fig. 4c). Some genes were expressed during the early stages (stages 1 and 2), such as LOC107443921 and LOC107453228, and some genes were expressed in late stages (stages 6, 7, 8, and 10), such as LOC107438235 and LOC107449348 (Supplementary Fig. 4c). In addition, the distribution of GMC genes in the E. davidi genome was on chr1, chr4, and chr6 (Supplementary Fig. 4a).

Ir/iGluR and cytochrome P450 gene family

We identified 101 IR/iGluR genes in the E. davidi genome, which include 82 complete genes: 59 exhibiting the specific domain signature of the ionotropic glutamate receptors (IPR001320) and 8 with all three characteristic domains (ATD domain, PF01094; LBD-domain, PF10613; and LCD-domain, PF00060). We used the complete IR/iGluR genes in E. davidi to perform a phylogenetic analysis, with D. melanogaster as the outgroup. The phylogenetic tree showed that the IR/iGluR genes belonged to some gene groups, including NMDA, non-NMDA iGluR, Divergent IR, Antennal IR, IR25a/IR8a, and one special E. davidi expansion group, which was a sister group to the Antennal IR group (Fig. 4a). In the E. davidi genome, the IR/iGluR genes were distributed among all chromosomes, except chr10 (Supplementary Fig. 5).

**Fig. 4: Phylogenetic analysis of IR/iGluR and P450 gene families in *E. davidi*.**

We identified 68 P450 genes comprising four major classes: the CYP2 clade (28 genes), mitochondrial P450 clade (9), CYP3 clade (22), and CYP4 clade (9). We reconstructed an ML tree with P450 genes from E. davidi, with D. melanogaster as the outgroup (Fig. 4b). The CYP2 and CYP3 clade genes showed expansion when compared to D. melanogaster.

Silk and venom genes in E. davidi

Silk is an important tool for spider to forage, locomote, nest, mate, egg protect, and communication³³. The venom is utilized by spiders in defensive and predatory interactions³⁴. We identified the silk and toxin genes in E. davidi.

In E. davidi, four silk genes were identified: TuSp, MaSp, AcSp, and CrSp (Supplementary Table 8). Phylogenetic analysis of the N-terminal sequence revealed that Ectatosticta_davidi_00014541 was at sister group of MaSp clade, and the gene Ectatosticta_davidi_00004156 was at sister group of the TuSp clade (Fig. 5a). The repeat regions of the four silk genes are shown in Fig. 5b. We also compared the N-terminal domain of the CrSp gene of E. davidi with the “primitive” spider species Heptathela kimurai (Liphistiidae), Heptathela yanbaruensis (Liphistiidae), Ryuthela nishihirai (Liphistiidae), and the diverse RTA clade Stegodyphus sp. (Eresidae) and Octonoba sybotides (Uloboridae). We found that these sequences bear a close similarity (Fig. 5c). The amino acid composition of the spider silk protein gene was also identified, and the top three amino acids were Gly, Ser, and Ala (Supplementary Fig. 6).

**Fig. 5: Spider silk gene analysis in *E. davidi*.**

In total, 45 toxin genes were identified in the E. davidi genome (Supplementary Table 9) and classified in seven types: angiotensin-converting enzyme (ACE), sphingomyelin phosphodiesterase D (Smase-4), group 7 allergen (ALL7), cysteine-rich secretory proteins (CRISPs), and arginine kinase (AK). The phylogenetic analyses of ACE, AK, ALL7, SMase-4, and CRISPs toxin gene families and the protein domain structures of E. davidi, H. graminicola, and T. antipodiana are shown in Fig. 6. Phylogenetic analysis showed that the toxin genes in E. davidi were correctly identified (Fig. 6). The toxin genes in the E. davidi genome were distributed on all chromosomes (Supplementary Fig. 7).

**Fig. 6: Phylogenetic analysis and protein domain structure of toxin gene families in *E. davidi*, *H. graminicola*, and *T. antipodiana*.**

Discussion

The high-quality genome sequence of E. davidi provides a valuable resource for studying spiders’ evolution and adaptability

To date, the majority of whole genome-sequenced spiders come from well-studied spider groups such as Araneoidea (Araneidae, Tetragnathidae, Theriidae, Linyphiidae)^{19,23,25,28,29,31,32,35,36,37,38,39,40,41} and the marronoid clade (Lycosidae, Pisauridae)⁴². A few genomes have been obtained from Synspermiata (Drymusidae, Dysderidae)¹⁸ and Mygalomorphae (Theraphosidae)⁴³ (Supplementary Table 1). Notably, the genome sequence of E. davidi represents the first high-quality genome from the Hypochilidae family. It provides crucial genetic data to advance our understanding of spider evolution, adaptability, and biology. The genome of E. davidi measures 2.16 Gb in size, with a BUSCO quality evaluation of 95.4%. Furthermore, it was assembled into 15 chromosomes. These findings demonstrate that this genome is of moderate size, exhibits high-quality sequencing, and possesses a moderate number of chromosomes compared with other spiders (Supplementary Table 1).

The genome of E. davidi supports the previous phylogenomics hypothesis

Phylogenetic analysis was performed to determine the phylogenetic position of E. davidi (Hypochiidae) based on available genome data of spiders, including two species of Synspermiata (Dysdera silvatica and Loxosceles reclusa) and five species of Entelegynae (C. darwini, A. bruennichi, Trichonephila antipodiana, Parasteatoda tepidariorum, and Stegodyphus mimosarum). Theoretically, genomic data of representatives of the suborder Mesothelae, the infraorder Mygalomorphae, and the family Filistatidae should be included. However, the genomes of Mesothelae and Filistatidae are presently unavailable, and the genome contiguity quality of the Mygalomorphae (Theraphosidae, A. geniculata) was low with Contig N50 of 0.54 kbp (Supplementary Table 1). The result (Fig. 2a) showed that the lampshade web spider is a sister group of Synspermiata, in accordance with several phylogenetic or phylogenomic results recently^6,7,8. The phylogenomic results showed that the divergence time of Araneomorphae from their common ancestor might be Early Permian (288.20 Ma) while the lampshade web spider should be Early Triassic (240.96 Ma).

The evolutionary trajectory of diverging populations and likelihood of speciation can be heavily influenced by recombination⁴⁴. Genomic rearrangements in animals have been broadly studied, and it has been suggested that synteny blocks and their composition (number of genes and their maximum and average size) correspond to phylogenetic distribution⁴⁵. Synteny analysis was performed for E. davidi with two representative true spiders (T. antipodiana and L. elegans) (Fig. 1c–e). Compared to the number of collinear genes between E. davidi and the two spiders (T. antipodiana and L. elegans), there were more collinear genes between E. davidi and T. antipodiana than L. elegans. It seems most genes of E. davidi was “inherited” by other true spiders, although T. antipodiana (and maybe other true spiders) undergoes a long history and variety of interchromosomal rearrangements. Using the E. davidi chr1 as an example, most of the synteny blocks of E. davidi chr1 matched T. antipodiana chr4 (Fig. 1c) and L. elegans chr3 (Fig. 1d). T. antipodiana chr4 had a good genome synteny relationship with L. elegans chr3 (Fig. 1e). However, the number of synteny blocks between E. davidi chr1 and T. antipodiana chr4 was greater than L. elegans chr3, which may be related to the divergence time of these two species and their adaptation to the environment.

The phylogenetic tree of the GMC gene family among spiders showed that most of the related genes of E. davidi were located in the basal lineage of the phylogenetic tree of the four GMC subfamilies among spiders, indicating their highly conserved characteristics (Fig. 3b). In insects, four core genes (MCδ, ε, ζ, and θ) in the middle of the GMC cluster have remained in tandem and in the same orientation for hundreds of millions of years, strongly suggesting that this cluster is conserved⁴⁶. Although the types of core genes among spiders and insects were different, GMC genes were partially or entirely conserved.

As spiders evolved, the types of silk refined and increased⁴⁷. Mygalomorphae spiders are known to retain a higher number of ancestral states and are more primitive than the Araneomorphae. Spiders from this clade possess a simpler undifferentiated spinning apparatus consisting of uniform spigots that lead to 1–3 types of globular silk glands⁴⁸. The most architecturally complex spider webs have evolved within a group of Araneoidea. For example, spiders of Araneidae have up to six morphologically distinct spinning glands⁴⁹. If we consider the ecological functions of these silk proteins, the evolutionary relationships between these spiders can be determined. MaSps and MiSps are structural silks, AgSps and PySps form gluey silks, and AcSps and TuSps are both used to produce protective sacs for prey and eggs. Previous studies showed the presence of spidroin paralogs prior to the divergence of Mygalomorph and Araneomorph spiders, for Mygalomorph Spidroin 2 from Ancylometes juruensis (Ctenidae) clustered together within orbicularian MaSp2 sequences^50,51,52,53. From the phylogenetic tree of spidroin genes (Fig. 5a), we found that TuSp, AcSp, MaSp, and CrSp of E. davidi were all located in the basal lineage of each clade. If we consider E. davidi as primitive, MaSp and MiSp may have the same origin from similar MaSp genes of E. davidi (Ectatosticta_davidi_00014541-RA), AgSp and PySp from similar CrSp genes of E. davidi (Ectatosticta_davidi_00014990-RA), and TuSp and AcSp from similar AcSp genes of E. davidi (Ectatosticta_davidi_00014568-RA). In addition, MaSp+Misp has a different origin from that of AcSp+TuSp+AgSp+PySp+Crsp. Our study supports the previously validated hypothesis.

Gene family analysis suggests the unique adaptation evolution of E. davidi

The piggyBac transposable element is currently the vector of choice for transgenesis, enhancer trap**, gene discovery, and determination of gene function in both insects and mammals^54,55,56. Genome sequence analysis of various species, such as silkworms (Bombyx mori), ants (Camponotus floridanus and Harpegnathos saltator), moths (Macdunnoughia crassisigna), and bats (Myotis lucifugus) shows that a number of previously unrecognized genes were derived from piggyBac transposases and other transposable elements^{57,58,59,60,61,62}. The piggyBac transposases showed great expansion in the E. davidi genome (Supplementary Tables 3, 5, and 6), and is distributed on every chromosome (Supplementary Fig. 3). The expansion of piggyBac gene family in the E. davidi genome suggests that it may be helpful in creating new genes to adapt to the environment.

Compared to other spiders, there were more GMC genes in the E. davidi genome (Supplementary Table 7). The GMC genes of insects may have different roles in basic physiological processes and diverse metabolic processes, such as glucose metabolism, immunity, suppression of host plant defense responses, and basic physiological processes^46,63,64,65. In spiders, there is little information on GMC genes. Phylogenetic analyses of spiders show that only the NinaG gene subfamily was similar to that of insects, whereas other genes belonged to the spider-specific GMC gene subfamily. Therefore, we conjecture that the spider’s NinaG gene may have the same function as that of an insect in the biogenesis of the rhodopsin chromophore, (3 S)-3-hydroxyretinal^66,67. Analysis of different stages of P. tepidariorum transcriptome suggested that the spider GMC genes may be related to development (Supplementary Fig. 4c). The GMC genes were arranged in clusters in the E. davidi genome (Supplementary Fig. 4a), similar to that observed in insects⁴⁶.

Chemoreception is important for animals to experience changes in nature. The iGluR superfamily is a large and ancient gene family, and the IR family is a variant lineage of the iGluR superfamily of ligand-gated ions⁶⁸. The functional roles of IR/iGluRs are related to the sensing of hearing, olfaction, taste, temperature, and humidity^{69,70,71,72,73}. Phylogenetic analysis of E. davidi confirmed that some genes may play the same role as in insects. For example, Ectatosticta_davidi_00009759 was homologous to IR76b (Fig. 4a), which was reported to be broadly expressed in both olfactory gustatory neurons with diverse chemical specificities in insects⁷⁴. The Ectatosticta_davidi_00009363 gene was homologous to IR93a (Fig. 4a), which has been reported to play an important role in both temperature and humidity sensing^74,75. The IR/iGluR genes in E. davidi showed a special expansion clade, which was the sister clade with the Antennal IR clade (Fig. 4a). Evidence from D. melanogaster research has shown that changes in IRs may contribute to changes in preferred food and habitat⁷⁶. Therefore, the special expansion clade may be related to spider adaptation to changes in food preferences and living habits. In E. davidi, 101 IR/iGluR genes were identified, whereas 435 are found in the spider Dysdera sylvatica¹⁸. We believe that the difference in IR/iGluR gene numbers between these two species may be related to their lifestyle. The spider E. davidi prefers living in stony debris in open, semi-open, and forest-covered habitats and obtains food through the web⁷⁷, whereas D. sylvatica is an active nocturnal hunter of woodlice⁷⁸.

For toxin genes, we identified 15 ALL7 genes in E. davidi, which were the most abundant in comparison to other species (Supplementary Table 10). ALL7 was first reported in the spider venom of Hylyphantes graminicola³². There are six ALL7 coding genes on chr4 and five genes on chr8 of E. davidi (Supplementary Fig. 7). These repeats may have been caused by gene duplication. Phylogenetic analysis of the toxin genes showed that those found in E. davidi were correctly identified (Fig. 6).

In conclusion, the assembly of the E. davidi genomic sequence is the first high-quality chromosome-level genome of Hypochilidae. Phylogenetic results based on genome and gene family (GMC and spidroin) of E. davidi and chromosomal synteny analyses confirm the position of Hypochilidae as recovered in the previous analysis. Our study supports the previously validated hypothesis that MaSp+Misp has a different origin from that of AcSp+TuSp+AgSp+PySp+Crsp. And the silk genes in E. davidi might be the most primitive spider silk genes of the true spiders. The expansion of gene families such as GMC (oxidoreductase enzymes, related to metabolism), piggyBac (one type of transposable element), Ir/iGluR (related to chemoreception), cytochrome P450 (related to metabolic detoxification) and spider venom ALL7 (related to prey) gene family, which is helpful for E. davidi’s to adaptation to the environment. In summary, this work provides a valuable genomic resource for further biological and genetic studies on spiders.

Methods

Sample collection and sequencing

Female specimens of E. davidi were collected from the Qinling Mountains, Chang’an District of ** paired reads.

To estimate the genome size and other characteristics, all filtered reads were used for the survey analysis. The k-mer distribution was estimated using “khist.sh”, and the 17-mer, 19-mer and 21-mer were all selected to investigate the genome size. The genome size was calculated using GenomeScope v1.0.0⁸⁰, and the maximum k-mer coverage cutoff was set to 10,000. And we selected the results of 19-mer for its models fits best (Supplementary Fig. 2).

Genome assembly and annotation

To obtain the high-quality E. davidi genome sequence, PacBio long reads were assembled into contigs using raven v1.6.1⁸¹. The heterozygous regions were reduced using Purge Haplotigs v1.1.0, with a 50% cutoff for identifying contigs as haplotigs⁸². Single-base errors in the genome assembly were corrected using the filtered Illumina reads by NextPolish (v1.3.1) over two rounds⁸³. Minimap2 v2.12 was used as the read aligner⁸⁴. The Hi-C sequencing reads generated a chromosome-level assembly of the genome using 3d-DNA and Juicer v1.6.2⁸⁵.

Potential contaminant sequences were inspected using HS-BLASTN and BLAST+ (blastn) v2.7.1 against the NCBI nucleotide (nt) and UniVec databases⁸⁶.

Repetitive element annotation of the E. davidi genome sequence was performed using a combination of ab initio and homology-based searching. The an-initio database was constructed using RepeatModeler v2.0.2⁸⁷. We combined the an-initio database and repeat library (Repbase) as the reference repeat database. Repetitive elements were finally identified using RepeatMasker v4.1.2⁸⁸.

Protein-coding gene annotation was performed using Maker pipline v3.01.03 by integrating ab initio, transcriptome-based, and protein homology-based evidence⁸⁹. Previously, RNA-seq data were mapped to the E. davidi assembled genome sequence using HISAT2 v2.2.1⁹⁰, and then assembled into transcripts using Stringtie v2.1.6⁹¹. For ab initio gene prediction, we used Augustus v3.4.0⁹² and GeneMark-ES/ET/EP v4.68_lic⁹³. To accurately model the sequence properties, both gene finders were initially trained using the BRAKER v2.1.6 pipeline⁹⁴, which uses the mapped transcriptome sequence data. For protein homology-based evidence, we downloaded the protein sequences of Araneus ventricosus (GCA_013235015.1), Argiope bruennichi (GCA_015342795.1), Trichonephila inaurata madagascariensis (GCA_019973955.1), Trichonephila clavipes (GCA_002102615.1), Parasteatoda tepidariorum (GCA_000365465.3), Stegodyphus mimosarum (GCA_000611955.2), Caerostris extrusa (GCA_021605095.1), Caerostris darwini (GCA_021605075.1), Oedothorax gibbosus (GCA_019343175.1), Nephila pilipes (GCA_019974015.1), Drosophila melanogaster (GCA_000001215.4), Ixodes scapularis (GCA_002892825.2), Strigamia maritima (GCA_000239455.1), Daphnia pulex (GCA_900092285.2) from NCBI, and Trichonephila antipodiana from GigaDB¹⁹. For the Maker pipeline, the transcripts were provided as input via the “est” option and protein homology-based evidence as input via the “protein” option. And then removed redundant isoforms, kept the longest isoforms, and checked the possible errors for “two mRNAs extracted for single redundant seq”, and deleted proteins of length smaller than 50.

The predicted genes were functionally annotated using the following three ways: EggNOG-mapper v2.1.5⁹⁵ was used to identify GO, EC (expression coherence), KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, KEGG orthologous groups (KOs), and COG (clusters of orthologous groups) with eggNOG v5.0 database. Diamond v2.0 was used to annotate homology-based gene functions with the SWISS-PROT and TrEMBL databases^96,97. InterProScan v5.48-83.0 was used to screen protein sequences using the Pfam, Panther, Gene3D, Superfamily, and CDD databases^{98,99,100,101,102,103}.

Noncoding RNA annotation was performed using infernal v1.1.4, and tRNAscan-SE v2.0.9^104,105.

To assess the completeness of the genome or protein sequences of E. davidi, we used the BUSCO v5.2.2 pipeline¹⁰⁶ and the arthropod reference set of arthropoda_odb 10 (n = 1013).

Phylogenetic analyses and divergence time estimation

Single-copy orthologous gene families were identified by gene orthology analysis and then used for comparative genome analysis. For gene orthology analysis, we compared the protein-coding genes of E. davidi and other seven representative spider species, including Araneidae (A. bruennichi, C. darwini, and T. antipodiana), Theridiidae (P. tepidariorum), Eresidae (S. mimosarum), Sicariidae (Loxosceles reclusa), and Dysderidae (Dysdera silvatica), with Scorpiones (C. sculpturatus), and **phosura (T. tridentatus) as outgroups. Orthologous gene clusters were classified using OrthoFinder v2.5.4¹⁰⁷.

Phylogenetic analysis was performed using previously identified single-copy genes. First, the protein sequences of single-copy genes were separately aligned using MAFFT v7.487, based on the L-INS-I strategy¹⁰⁸. The resulting alignments were then fed to trimAl v1.4, to remove sites of unclear homology, using the heuristic method “automated1”¹⁰⁹. All the well-trimmed single-copy genes in each species were concatenated to one super gene for each species using FASconCAT-G v1.04¹¹⁰. Finally, maximum-likelihood-based phylogenetic analysis was performed using IQ-TREE v2.1.3, with extended model selection followed by tree inference, model set by LG, with the number of partition pairs for the cluster algorithm, replicates for ultrafast bootstrap, and Shimodaira-Hasegawa (SH) approximate likelihood ratio tests of 1000, 10, and 1000, respectively¹¹¹.

Fossil records were downloaded from the paleobiodb database (https://paleobiodb.org/) and TimeTree database (http://www.timetree.org/), with Nephilinae stem (43–47.8 Mya), Palpimanoidea stem (173.1–183.4 Mya), and split between scorpions and spiders (435–439 Mya). The divergence time was estimated using the MCMC Tree program in the PAML package v4.9j¹¹² with the following parameters: independent clock rates; BD paras-related birth, death, and sampling rates of 1, 1, and 0.1, respectively; and Burnin, sampfreq, and nsample of 2000, 5, and 10000, respectively.

Gene family evolution analysis

Café v4.2.1 and v5.0.0 were used to identify the likelihood of gene family expansion and contraction^113,114. CAFE5 was used to predict the birth-death parameter lambda. The results were fed to CAFE4 and run with a P-value threshold of 0.01. And the conditional P value for each gene family was calculated. If the P values <0.05, the gene family was treated as having a significantly accelerated rate of expansion or contraction. And Gene families with >200 copies in one of the species were removed.

Annotation of gene families

To manually annotate the genes of glucose-methanol-choline (GMC), piggyBac, ionotropic receptors and ionotropic glutamate receptors (Ir/iGluR) and P450 gene families, we initially downloaded the amino acid sequences of related species from the GenBank database, or related articles were used as the reference query. The reference GMC homologous protein sequences for Drosophila melanogaster, Anopheles gambiae, Apis mellifera, Tribolium castaneum, Escherichia coli, Caenorhabditis elegans, Aspergillus niger, Aspergillus oryzae, and Penicillium amagasakiense were downloaded from a previous study⁴⁶. The reference piggyBac sequence accession number is shown in Supplementary Table 4. The reference for chemosensory sequence accession was downloaded from the dataset by Vizueta¹¹⁵.

We used the BITACORA pipeline to identify Ir/iGluR genes¹¹⁶. The “incomplete” (or “partial”) genes were checked for the length of the encoded protein, which contained less than 80% of the protein domain length characteristic of the family.

To identify GMC, piggyBac, and P450 genes, we performed gene family analysis in three ways. First, a blastp-like search was performed by MMseqs2 v11 with four rounds of iteration¹¹⁷. Interproscan v5.48-83.0 was used to confirm specific conserved domains using the Pfam database⁹⁸. Candidate proteins were filtered using MMseqs2 with a TBLATN-like search to delete invalid matches. And the method for identification P450 gene families was same with Fan¹⁹.

For the spidroin gene set, we downloaded protein sequences of the seven spidroin gene classes from the dataset by Arakawa²⁶, and Latrodectus elegans data were downloaded from the dataset by Wang³¹. The reference CrSp gene was downloaded from the dataset by Arakawa²⁶.

The reference toxin gene set was downloaded from the dataset by Zhu³².

Phylogenetic analyses of the gene families

Multiple alignments of protein sequences were generated using MAFFT v7.487¹⁰⁸, with the default parameters and necessary manual adjustments. The tree was constructed using IQ-TREE v2.1.3¹¹¹. The tree was viewed and edited using FigTree v1.4.3 and the Evolview v3 webserver¹¹⁸. The position of the genes on the chromosome is shown using the online tool MG2C¹¹⁹.

Synteny analysis

To look for changes in chromosomes among the ancient Araneomorphae spider and other true spiders, the synteny analysis between E. davidi and other spiders (including T. antipodiana and L. elegans) was carried out by MCScanX¹²⁰, and the results are shown in TBtools¹²¹.

GMC gene expression analysis

The RNA sequencing data¹²² of P. tepidariorum at different stages (stages 1–10) was downloaded from NCBI with the accession number of GSE112712 by SRA Toolkit v3.0.1 (http://www.ncbi.nlm.nih.gov/books/NBK158900/). The clean data was mapped to the reference genome by the software of HISAT2 v2.2.1⁹⁰. The featureCounts v1.6.4 software was used to calculate the fragments per kilobase million (FPKM) values¹²³. The R packages of DESeq2 were used to analyze the gene expression differences.

Statistics and reproducibility

The genome assembly reported here was derived from the female of E. davidi. Our annotation pipeline was performed by integrating three evidence, such as ab initio, transcriptome-based, and protein homology-based evidence.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The sequencing data sets supporting the results of this article are available in NCBI (BioProjectID PRJNA853523). Illumina paired-end reads have been uploaded with SRA accession SRR19913594, Pacific Biosciences long-read data are associated with SRA accession SRR20336950, Hi-C data are available at SRR19905029, and RNA sequencing data generated for annotation are available with SRA accession SRR19913735. The genome assembly of E. davidi were deposited in ScienceDB Digital Repository with https://doi.org/10.57760/sciencedb.06872¹²⁴. All other relevant data are available upon request.

References

WSC. World Spider Catalog. Version 24.0 Natural History Museum Bern. http://wsc.nmbe.ch (2023).
Lehtinen, P. T. Classification of the cribellate spiders and some allied families, with notes on the evolution of the suborder Araneomorpha. Ann. Zool. Fenn. 4, 199–468 (1967).
Google Scholar
Platnick, N. I. The hypochiloid spiders: a cladistic analysis, with notes on the Atypoidea (Arachnida, Araneae). Am. Mus. Novit. 2627, 1–23 (1977).
Google Scholar
Coddington, J. A. Ontogeny and homology in the male palpus of orb-weaving spiders and their relatives, with comments on phylogeny (Araneoclada: Araneoidea, Deinopoidea). Smithson. Contrib. Zool. 496, 1–52 (1990).
Article Google Scholar
Coddington J. A. Phylogeny and classification of spiders. in Spiders of North America: An identification manual. (eds Ubick, D. et al.) 18–24 (American Arachnological Society, 2005).
Bond, J. E. et al. Phylogenomics resolves a spider backbone phylogeny and rejects a prevailing paradigm for orb web evolution. Curr. Biol: CB, 24, 1765–1771 (2014).
Article CAS PubMed Google Scholar
Garrison, N. L. et al. Spider phylogenomics: untangling the spider tree of life. PeerJ 4, https://doi.org/10.7717/peerj.1719 (2016).
Wheeler, W. C. et al. The spider tree of life: phylogeny of Araneae based on target-gene analyses from an extensive taxon sampling. Cladistics 33, 574–616 (2017).
Article PubMed Google Scholar
Jocqué, R. & Dippenaar-Schoeman, A. S. In Spider Families of the World. (eds Jocqué, R. & Dippenaar-Schoeman, A. S) 144–145 (Royal Museum for Central Africa, Tervuren, 2006).
Masta, S. E. & Boore, J. L. Parallel evolution of truncated transfer RNA genes in arachnid mitochondrial genomes. Mol. Biol. Evol. 25, 949–959 (2008).
Article CAS PubMed Google Scholar
Li, J. N., Yan, X. Y., Lin, Y. J., Li, S. Q. & Chen, H. F. Challenging Wallacean and Linnean shortfalls: Ectatosticta spiders (Araneae, Hypochilidae) from China. Zool. Res. 42, 792–795 (2021).
Article PubMed PubMed Central Google Scholar
Li, M. et al. Mitochondrial phylogenomics provides insights into the phylogeny and evolution of spiders (Arthropoda: Araneae). Zool. Res. 43, 566–584 (2022).
PubMed PubMed Central Google Scholar
Lin, Y. & Li, S. Taxonomic studies on the genus Ectatosticta (Araneae, Hypochilidae) from China, with descriptions of two new species. Zookeys 954, 17–29 (2020).
Article PubMed PubMed Central Google Scholar
Wang, L. Y., Zhao, J. X., Irfan, M. & Zhang, Z. S. Review of the spider genus Ectatosticta Simon, 1892 (Araneae: Hypochilidae) with description of four new species from China. Zootaxa 5016, 523–542 (2021).
Article PubMed Google Scholar
Wang, L. Y., Zhao, J. X., Irfan, M. & Zhang, Z. S. Further revision of the spider genus Ectatosticta Simon, 1892 (Hypochilidae), with the description of three new species. Acta Arachnologica Sin. 30, 91–98 (2021).
Google Scholar
Ciaccio, E., Debray, A. & Hedin, M. Phylogenomics of paleoendemic lampshade spiders (Araneae, Hypochilidae, Hypochilus), with the description of a new species from montane California. Zookeys 1086, 163–204 (2022).
Article PubMed PubMed Central Google Scholar
Bechsgaard, J. et al. Comparative genomic study of arachnid immune systems indicates loss of beta-1,3-glucanase-related proteins and the immune deficiency pathway. J. Evol. Biol. 29, 277–291 (2016).
Article CAS PubMed Google Scholar
Escuer, P. et al. The chromosome-scale assembly of the Canary Islands endemic spider Dysdera silvatica (Arachnida, Araneae) sheds light on the origin and genome structure of chemoreceptor gene families in chelicerates. Mol. Ecol. Resour. 22, 375–390 (2022).
Article CAS PubMed Google Scholar
Fan, Z. et al. A chromosome-level genome of the spider Trichonephila antipodiana reveals the genetic basis of its polyphagy and evidence of an ancient whole-genome duplication event. GigaScience 10, https://doi.org/10.1093/gigascience/giab016 (2021).
Gendreau, K. L. et al. House spider genome uncovers evolutionary shifts in the diversity and expression of black widow venom proteins associated with extreme toxicity. BMC Genomics 18, 178 (2017).
Article PubMed PubMed Central Google Scholar
Liu, S., Aageaard, A., Bechsgaard, J. & Bilde, T. DNA Methylation Patterns in the Social Spider, Stegodyphus dumicola. Genes 10, https://doi.org/10.3390/genes10020137 (2019).
Sanchez-Herrero, J. F. et al. The draft genome sequence of the spider Dysdera silvatica (Araneae, Dysderidae): a valuable resource for functional and evolutionary genomic studies in chelicerates. GigaScience 8, https://doi.org/10.1093/gigascience/giz099 (2019).
Sheffer, M. M. et al. Chromosome-level reference genome of the European wasp spider Argiope bruennichi: a resource for studies on range expansion and evolutionary adaptation. GigaScience 10, https://doi.org/10.1093/gigascience/giaa148 (2021).
Yu, N. et al. Classification and functional characterization of spidroin genes in a wandering spider, Pardosa pseudoannulata. Insect Biochem Mol Biol. 151, 103862 (2022)
Purcell, J. & Pruitt, J. N. Are personalities genetically determined? Inferences from subsocial spiders. BMC Genomics 20, 867 (2019).
Article PubMed PubMed Central Google Scholar
Arakawa, K. et al. 1000 spider silkomes: linking sequences to silk physical properties. Sci. Adv. 8, https://doi.org/10.1126/sciadv.abo6043 (2022).
Babb, P. L. et al. The Nephila clavipes genome highlights the diversity of spider silk genes and their complex expression. Nat. Genet. 49, 895–903 (2017).
Article CAS PubMed Google Scholar
Kono, N. et al. Orb-weaving spider Araneus ventricosus genome elucidates the spidroin gene catalogue. Sci. Rep. 9, https://doi.org/10.1038/s41598-019-44775-2 (2019).
Kono, N. et al. Multicomponent nature underlies the extraordinary mechanical properties of spider dragline silk. Proc. Natl Acad. Sci. USA 118, https://doi.org/10.1073/pnas.2107065118 (2021).
Sanggaard, K. W. et al. Spider genomes provide insight into composition and evolution of venom and silk. Nat. Commun. 5, https://doi.org/10.1038/ncomms4765 (2014).
Wang, Z. et al. Chromosome-level genome assembly of the black widow spider Latrodectus elegans illuminates composition and evolution of venom and silk proteins. GigaScience 11, https://doi.org/10.1093/gigascience/giac049 (2022).
Zhu, B. et al. Chromosomal-level genome of a sheet-web spider provides insight into the composition and evolution of venom. Mol. Ecol. Resour. 22, 2333–2348 (2022).
Article CAS PubMed Google Scholar
Asakura, T. Structure and dynamics of spider silk studied with solid-state nuclear magnetic resonance and molecular dynamics simulation. Molecules 25, https://doi.org/10.3390/molecules25112634 (2020).
Michalek, O., Kuhn-Nentwig, L. & Pekar, S. High specific efficiency of venom of two prey-specialized spiders. Toxins 11, 687 (2019).
Baker, R. H., Corvelo, A. & Hayashi, C. Y. Rapid molecular diversification and homogenization of clustered major ampullate silk genes in Argiope garden spiders. PLoS Genet. 18, e1010537 (2022).
Article CAS PubMed PubMed Central Google Scholar
Babb, P. L. et al. Characterization of the genome and silk-gland transcriptomes of Darwin’s bark spider (Caerostris darwini). PLoS ONE 17, e0268660 (2022).
Henriques, S. et al. The genome sequence of the cave orb-weaver, Meta bourneti (Simon, 1922). Wellcome Open Res. 7, 311 (2022).
Article PubMed PubMed Central Google Scholar
Cerca, J. et al. The Tetragnatha kauaiensis genome sheds light on the origins of genomic novelty in spiders. Genome Biol. Evol. 13, https://doi.org/10.1093/gbe/evab262 (2021).
Adams, S. A. et al. Reference genome of the long-jawed orb-weaver, Tetragnatha versicolor (Araneae: Tetragnathidae). J. Hered. https://doi.org/10.1093/jhered/esad013 (2023).
Hilbrant, M., Damen, W. G. & McGregor, A. P. Evolutionary crossroads in developmental biology: the spider Parasteatoda tepidariorum. Development 139, 2655–2662 (2012).
Article CAS PubMed Google Scholar
Hendrickx, F. et al. A masculinizing supergene underlies an exaggerated male reproductive morph in a spider. Nat. Ecol. Evol. 6, 195 (2021).
Article PubMed Google Scholar
Zhong, W., Tan, Z., Wang, B. & Yan, H. Next-generation sequencing analysis of Pardosa pseudoannulata’s diet composition in different habitats. Saudi J. Biol. Sci. 26, 165–172 (2019).
Article CAS PubMed Google Scholar
Veenstra, J. A. Neuropeptide evolution: chelicerate neurohormone and neuropeptide genes may reflect one or more whole genome duplications. Gen. Comp. Endocrinol. 229, 41–55 (2016).
Article CAS PubMed Google Scholar
Blankers, T., Oh, K. P., Bombarely, A. & Shaw, K. L. The genomic architecture of a rapid island radiation: recombination rate variation, chromosome structure, and genome assembly of the Hawaiian Cricket. Laupala. Genet. 209, 1329–1344 (2018).
Article CAS Google Scholar
Yoshizawa, K. et al. Mitochondrial phylogenomics and genome rearrangements in the barklice (Insecta: Psocodea). Mol. Phylogenet. Evol. 119, 118–127 (2018).
Article CAS PubMed Google Scholar
Iida, K., Cox-Foster, D. L., Yang, X., Ko, W. Y. & Cavener, D. R. Expansion and evolution of insect GMC oxidoreductases. BMC Evol. Biol. 7, https://doi.org/10.1186/1471-2148-7-75 (2007).
Starrett, J., Garb, J. E., Kuelbs, A., Azubuike, U. O. & Hayashi, C. Y. Early events in the evolution of spider silk genes. PLoS ONE 7, e38084 (2012).
Article CAS PubMed PubMed Central Google Scholar
Palmer, J. M. The silk and silk production system of the funnel-web mygalomorph spider Euagrus (Araneae, Dipluridae). J. Morphol. 186, 195–207 (1985).
Article PubMed Google Scholar
Correa-Garhwal, S. M., Babb, P. L., Voight, B. F. & Hayashi, C. Y. Golden orb-weaving spider (Trichonephila clavipes) silk genes with sex-biased expression and atypical architectures. G3 11, https://doi.org/10.1093/g3journal/jkaa039 (2021).
Gatesy, J., Hayashi, C., Motriuk, D., Woods, J. & Lewis, R. Extreme diversity, conservation, and convergence of spider silk fibroin sequences. Science 291, 2603–2605 (2001).
Article CAS PubMed Google Scholar
Garb, J. E., DiMauro, T., Lewis, R. V. & Hayashi, C. Y. Expansion and intragenic homogenization of spider silk genes since the triassic: evidence from mygalomorphae (Tarantulas and their kin) spidroins. Mol. Biol. Evol. 24, 2454–2464 (2007).
Article CAS PubMed Google Scholar
Bittencourt, D., Dittmar, K., Lewis, R. V. & Rech, E. L. A MaSp2-like gene found in the Amazon mygalomorph spider Avicularia juruensis. Comp. Biochem. Physiol. B Biochem. Mol. Biol. 155, 419–426 (2010).
Article PubMed Google Scholar
Bittencourt, D., Oliveira, P. F., Prosdocimi, F. & Rech, E. L. Protein families, natural history and biotechnological aspects of spider silk. Genet Mol. Res. 11, 2360–2380 (2012).
Article CAS PubMed Google Scholar
Sarkar, A. et al. Molecular evolutionary analysis of the widespread piggyBac transposon family and related "domesticated" sequences. Mol. Genet. Genomics 270, 173–180 (2003).
Article CAS PubMed Google Scholar
Yusa, K. piggyBac Transposon. Microbiol. Spectr. 3, https://doi.org/10.1128/microbiolspec.MDNA3-0028-2014 (2015).
Bonizzoni, M., Gomulski, L. M., Malacrida, A. R., Capy, P. & Gasperi, G. Highly similar piggyBac transposase-like sequences from various Bactrocera (Diptera, Tephritidae) species. Insect Mol. Biol. 16, 645–650 (2007).
CAS PubMed Google Scholar
Bonasio, R. et al. Genomic comparison of the ants Camponotus floridanus and Harpegnathos saltator. Science 329, 1068–1071 (2010).
Article CAS PubMed PubMed Central Google Scholar
Daimon, T. et al. Recent transposition of yabusame, a novel piggyBac-like transposable element in the genome of the silkworm, Bombyx mori. Genome 53, 585–593 (2010).
Article CAS PubMed Google Scholar
Hikosaka, A., Kobayashi, T., Saito, Y. & Kawahara, A. Evolution of the Xenopus piggyBac transposon family TxpB: Domesticated and untamed strategies of transposon subfamilies. Mol. Biol. Evol. 24, 2648–2656 (2007).
Article CAS PubMed Google Scholar
Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).
Article CAS PubMed Google Scholar
Mitra, R. et al. Functional characterization of piggyBat from the bat Myotis lucifugus unveils an active mammalian DNA transposon. Proc. Natl Acad. Sci. USA 110, 234–239 (2013).
Article CAS PubMed Google Scholar
Wu, M., Sun, Z. C., Hu, C. L., Zhang, G. F. & Han, Z. J. An active piggyBac-like element in Macdunnoughia crassisigna. Insect Sci. 15, 521–528 (2008).
Article CAS Google Scholar
Bede, J. C., Musser, R. O., Felton, G. W. & Korth, K. L. Caterpillar herbivory and salivary enzymes decrease transcript levels of Medicago truncatula genes encoding early enzymes in terpenoid biosynthesis. Plant Mol. Biol. 60, 519–531 (2006).
Article CAS PubMed Google Scholar
Diezel, C., von Dahl, C. C., Gaquerel, E. & Baldwin, I. T. Different lepidopteran elicitors account for cross-talk in herbivory-induced phytohormone signaling. Plant Physiol. 150, 1576–1586 (2009).
Article CAS PubMed PubMed Central Google Scholar
Sun, W. et al. Expansion of the silkworm GMC oxidoreductase genes is associated with immunity. Insect Biochem. Mol. Biol. 42, 935–945 (2012).
Article CAS PubMed Google Scholar
Ahmad, S. T., Joyce, M. V., Boggess, B. & O'Tousa, J. E. The role of Drosophila ninaG oxidoreductase in visual pigment chromophore biogenesis. J. Biol. Chem. 281, 9205–9209 (2006).
Article CAS PubMed Google Scholar
Sarfare, S., Ahmad, S. T., Joyce, M. V., Boggess, B. & O'Tousa, J. E. The Drosophila ninaG oxidoreductase acts in visual pigment chromophore production. J. Biol. Chem. 280, 11895–11901 (2005).
Article CAS PubMed Google Scholar
Benton, R., Vannice, K. S., Gomez-Diaz, C. & Vosshall, L. B. Variant ionotropic glutamate receptors as chemosensory receptors in Drosophila. Cell 136, 149–162 (2009).
Article CAS PubMed PubMed Central Google Scholar
Abuin, L. et al. Functional architecture of olfactory ionotropic glutamate receptors. Neuron 69, 44–60 (2011).
Article CAS PubMed PubMed Central Google Scholar
Ganguly, A. et al. A molecular and cellular context-dependent role for Ir76b in detection of amino acid taste. Cell Rep. 18, 737–750 (2017).
Article CAS PubMed PubMed Central Google Scholar
Hussain, A. et al. Ionotropic chemosensory receptors mediate the taste and smell of polyamines. PLoS Biol. 14, https://doi.org/10.1371/journal.pbio.1002454 (2016).
van Giesen, L. & Garrity, P. A. More than meets the IR: the expanding roles of variant ionotropic glutamate receptors in sensing odor, taste, temperature and moisture. F1000Res. 6, https://doi.org/10.12688/f1000research.12013.1 (2017).
Vieira, F. G. & Rozas, J. Comparative genomics of the odorant-binding and chemosensory protein gene families across the Arthropoda: origin and evolutionary history of the chemosensory system. Genome Biol. Evol. 3, 476–490 (2011).
Article CAS PubMed PubMed Central Google Scholar
Zhang, X. X. & Wang, G. R. Advances in research on the identification and function of ionotropic receptors in insects. Insect Sci. 57, 1046–1055 (2020).
Google Scholar
Greppi, C. et al. Mosquito heat seeking is driven by an ancestral cooling receptor. Science 367, 681–684 (2020).
Article CAS PubMed PubMed Central Google Scholar
Shiao, M. S. et al. Expression divergence of chemosensory genes between Drosophila sechellia and its sibling species and its implications for host shift. Genome Biol. Evol. 7, 2843–2858 (2015).
Article CAS PubMed PubMed Central Google Scholar
Platnick, N. & Jaeger, P. A new species of the basal araneomorph spider genus Ectatosticta. ZooKeys 16, 209–215 (2009).
Article Google Scholar
Toft, S. & Macias-Hernandez, N. Prey acceptance and metabolic specialisations in some Canarian Dysdera spiders. J. Insect Physiol. 131, https://doi.org/10.1016/j.**sphys.2021.104227 (2021).
Bushnell, B. BBMap. https://sourceforge.net/projects/bbmap/ (2014).
Vurture, G. W. et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics 33, 2202–2204 (2017).
Article CAS PubMed PubMed Central Google Scholar
Vaser, R. & Šikić, M. Time- and memory-efficient genome assembly with Raven. Nat. Comput. Sci. 1, 332–336 (2021).
Article Google Scholar
Roach, M. J., Schmidt, S. A. & Borneman, A. R. Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies. BMC Bioinforma. 19, 460 (2018).
Article CAS Google Scholar
Hu, L., Chen, Q., Yao, J., Shao, Z. & Chen, X. Structural changes in spider dragline silk after repeated supercontraction-stretching processes. Biomacromolecules 21, 5306–5314 (2020).
Article CAS PubMed Google Scholar
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Article CAS PubMed PubMed Central Google Scholar
Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).
Article CAS PubMed PubMed Central Google Scholar
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, https://doi.org/10.1186/1471-2105-10-421 (2009).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020).
Article CAS PubMed PubMed Central Google Scholar
Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinforma. 4, 4.10.1–4.10.14 (2009).
Google Scholar
Campbell, M. S., Holt, C., Moore, B. & Yandell, M. Genome annotation and curation using MAKER and MAKER-P. Curr. Protoc. Bioinforma. 48, 4.11.1–4.11.39 (2014).
Article Google Scholar
Kim, D., Landmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
Article CAS PubMed PubMed Central Google Scholar
Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 20, https://doi.org/10.1186/s13059-019-1910-1 (2019).
Bruna, T., Lomsadze, A. & Borodovsky, M. GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins. NAR Genom. Bioinforma. 2, https://doi.org/10.1093/nargab/lqaa026 (2020).
Stanke, M., Steinkamp, R., Waack, S. & Morgenstern, B. AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res. 32, W309–W312 (2004).
Article CAS PubMed PubMed Central Google Scholar
Bruna, T., Hoff, K. J., Lomsadze, A., Stanke, M. & Borodovsky, M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genom. Bioinforma. 3, https://doi.org/10.1093/nargab/lqaa108 (2021).
Cantalapiedra, C. P., Hernandez-Plaza, A., Letunic, I., Bork, P. & Huerta-Cepas, J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829 (2021).
Article CAS PubMed PubMed Central Google Scholar
Buchfink, B., **e, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Article CAS PubMed Google Scholar
Apweiler, R. et al. The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Res. 29, 37–40 (2001).
Article CAS PubMed PubMed Central Google Scholar
Mulder, N. & Apweiler, R. InterPro and InterProScan: tools for protein sequence classification and comparison. Methods Mol. Biol. 396, 59–70 (2007).
Article CAS PubMed Google Scholar
Mistry, J. et al. Pfam: The protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2021).
Article CAS PubMed Google Scholar
Mi, H. Y. et al. The PANTHER database of protein families, subfamilies, functions and pathways. Nucleic Acids Res. 33, D284–D288 (2005).
Article CAS PubMed Google Scholar
Lewis, T. E. et al. Gene3D: extensive prediction of globular domains in proteins. Nucleic Acids Res. 46, D435–D439 (2018).
Article CAS PubMed Google Scholar
Pandurangan, A. P., Stahlhacke, J., Oates, M. E., Smithers, B. & Gough, J. The SUPERFAMILY 2.0 database: a significant proteome update and a new webserver. Nucleic Acids Res. 47, D490–D494 (2019).
Article CAS PubMed Google Scholar
Marchler-Bauer, A. et al. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 45, D200–D203 (2017).
Article CAS PubMed Google Scholar
Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
Article CAS PubMed PubMed Central Google Scholar
Chan, P. P., Lin, B. Y., Mak, A. J. & Lowe, T. M. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 49, 9077–9096 (2021).
Article CAS PubMed PubMed Central Google Scholar
Manni, M., Berkeley, M. R., Seppey, M. & Zdobnov, E. M. BUSCO: assessing genomic data quality and beyond. Curr. Protoc. 1, https://doi.org/10.1002/cpz1.323 (2021).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, https://doi.org/10.1186/s13059-019-1832-y (2019).
Nakamura, T., Yamada, K. D., Tomii, K. & Katoh, K. Parallelization of MAFFT for large-scale multiple sequence alignments. Bioinformatics 34, 2490–2492 (2018).
Article CAS PubMed PubMed Central Google Scholar
Gutierrez, S., Silla Martinez, J. M. & Gabaldon, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Article Google Scholar
Kueck, P. & Longo, G. C. FASconCAT-G: extensive functions for multiple sequence alignment preparations concerning phylogenetic studies. Front. Zool. 11, https://doi.org/10.1186/s12983-014-0081-x (2014).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Article CAS PubMed PubMed Central Google Scholar
Puttick, M. N. MCMCtreeR: functions to prepare MCMCtree analyses and visualize posterior ages on trees. Bioinformatics 35, 5321–5322 (2019).
Article CAS PubMed Google Scholar
Mendes, F. K., Vanderpool, D., Fulton, B. & Hahn, M. W. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics 36, 5516–5518 (2020).
Article CAS Google Scholar
De Bie, T., Cristianini, N., Demuth, J. P. & Hahn, M. W. CAFE: a computational tool for the study of gene family evolution. Bioinformatics 22, 1269–1271 (2006).
Article PubMed Google Scholar
Vizueta, J., Rozas, J. & Sanchez-Gracia, A. Comparative genomics reveals thousands of novel chemosensory genes and massive changes in chemoreceptor repertories across chelicerates. Genome Biol. Evol. 10, 1221–1236 (2018).
Article CAS PubMed PubMed Central Google Scholar
Vizueta, J., Sanchez-Gracia, A. & Rozas, J. bitacora: a comprehensive tool for the identification and annotation of gene families in genome assemblies. Mol. Ecol. Resour. 20, 1445–1452 (2020).
Article CAS PubMed Google Scholar
Steinegger, M. & Soding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028 (2017).
Article CAS PubMed Google Scholar
Subramanian, B., Gao, S. H., Lercher, M. J., Hu, S. N. & Chen, W. H. Evolview v3: a webserver for visualization, annotation, and management of phylogenetic trees. Nucleic Acids Res. 47, W270–W275 (2019).
Article CAS PubMed PubMed Central Google Scholar
Chao, J. et al. MG2C: a user-friendly online tool for drawing genetic maps. Mol. Horticulture 1, https://doi.org/10.1186/s43897-021-00020-x (2021).
Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49 (2012).
Article CAS PubMed PubMed Central Google Scholar
Chen, C. et al. TBtools: an integrative toolkit developed for interactive analyses of big biological data. Mol. Plant 13, 1194–1202 (2020).
Article CAS PubMed Google Scholar
Iwasaki-Yokozawa, S., Akiyama-Oda, Y. & Oda, H. Genome-scale embryonic developmental profile of gene expression in the common house spider Parasteatoda tepidariorum. Data Brief. 19, 865–867 (2018).
Article PubMed PubMed Central Google Scholar
Liao, Y., Smyth, G. K. & Shi, W. FeatureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Article CAS PubMed Google Scholar
Fan, Z., Wang, L. Y. **ao, L., & Zhang, Z. S. Chromosome-level genome assembly of lampshade web spider Ectatosticta davidi. Sci. Data Bank https://doi.org/10.57760/sciencedb.06872. (2022).

Download references

Acknowledgements

Many thanks to two anonymous reviewers and Dr. Chao Tong (University of Pennsylvania, USA) for their constructive suggestions. This research supported by National Key R&D Program of China (Grant No. 2022YFC2601200); the National Science & Technology Fundamental Resources Investigation Program of China (Nos. 2022FY100500, 2019FY100400, and 2019FY101800); the project of the Northeast Asia Biodiversity Research Center (2572022DS09); the Bureau of International Cooperation, Chinese Academy of Sciences; the Second Tibetan Plateau Scientific Expedition and Research Program (STEP), Grant No. 2019QZKK05010101 to Ming Bai. This research is also funded by the Natural Science Foundation of Chongqing (No. cstc2019jcyj-zdxmX0006) to Zhisheng Zhang.

Author information

These authors contributed equally: Zheng Fan, Lu-Yu Wang, Lin **ao.
These authors jointly supervised this work: Ning Liu, Zhi-Sheng Zhang, Ming Bai.

Authors and Affiliations

Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, 100101, Bei**g, China
Zheng Fan, Ning Liu & Ming Bai
School of Life Sciences, Southwest University, 400700, Chongqing, China
Zheng Fan, Lu-Yu Wang, Lin **ao, Bing Tan, Bin Luo, Tian-Yu Ren & Zhi-Sheng Zhang
Northeast Asia Biodiversity Research Center, Northeast Forestry University, 150040, Harbin, China
Ming Bai
University of Chinese Academy of Sciences, 100049, Bei**g, China
Ming Bai

Authors

Zheng Fan
View author publications
You can also search for this author in PubMed Google Scholar
Lu-Yu Wang
View author publications
You can also search for this author in PubMed Google Scholar
Lin **ao
View author publications
You can also search for this author in PubMed Google Scholar
Bing Tan
View author publications
You can also search for this author in PubMed Google Scholar
Bin Luo
View author publications
You can also search for this author in PubMed Google Scholar
Tian-Yu Ren
View author publications
You can also search for this author in PubMed Google Scholar
Ning Liu
View author publications
You can also search for this author in PubMed Google Scholar
Zhi-Sheng Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Ming Bai
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

N.L., Z.Z. and M.B. designed of the original research. Z.F., L.Y.W. and L.X. performed data analysis and drafted the manuscript. B.T., B.L. and T.Y.R. performed sample preparation and data analysis. All authors revised and approved the final manuscript.

Corresponding authors

Correspondence to Ning Liu, Zhi-Sheng Zhang or Ming Bai.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks Marshal Hedin and the other, anonymous, reviewers for their contribution to the peer review of this work. Primary Handling Editors: Joao Valente.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Fan, Z., Wang, LY., **ao, L. et al. Lampshade web spider Ectatosticta davidi chromosome-level genome assembly provides evidence for its phylogenetic position. Commun Biol 6, 748 (2023). https://doi.org/10.1038/s42003-023-05129-x

Download citation

Received: 31 December 2022
Accepted: 10 July 2023
Published: 18 July 2023
DOI: https://doi.org/10.1038/s42003-023-05129-x
Springer Nature Limited

Lampshade web spider Ectatosticta davidi chromosome-level genome assembly provides evidence for its phylogenetic position

Abstract

Similar content being viewed by others

Insights into the karyotype and genome evolution of haplogyne spiders indicate a polyploid origin of lineage with holokinetic chromosomes

The house spider genome reveals an ancient whole-genome duplication during arachnid evolution

Chromosome-level genome assembly of the snakefly Mongoloraphidia duomilia (Raphidioptera: Raphidiidae)

Introduction

Results

Chromosome-level genome sequencing, assembly, and annotation

Synteny analysis among E. davidi, Trichonephila antipodiana (Araneidae), and Latrodectus elegans (Theridiidae)

Phylogenetic analysis

GMC gene family

Ir/iGluR and cytochrome P450 gene family

Silk and venom genes in E. davidi

Discussion

The high-quality genome sequence of E. davidi provides a valuable resource for studying spiders’ evolution and adaptability

The genome of E. davidi supports the previous phylogenomics hypothesis

Gene family analysis suggests the unique adaptation evolution of E. davidi

Methods

Sample collection and sequencing

Genome assembly and annotation

Phylogenetic analyses and divergence time estimation

Gene family evolution analysis

Annotation of gene families

Phylogenetic analyses of the gene families

Synteny analysis

GMC gene expression analysis

Statistics and reproducibility

Reporting summary

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Supplementary information

Supplementary information

Reporting Summary

Rights and permissions

About this article

Cite this article

Share this article

Search

Navigation