Background

The silkworm, a model for Lepidoptera, is a holometabolous insect whose developmental stages include egg, five larval instars, pupa, and adult. During molting and metamorphosis, conspicuous and relatively abrupt changes occur in its cuticle. Insect cuticle is composed mainly of chitin nanofibres embedded in a matrix of cuticular proteins. In the procuticle, a grouping of what have been called the exo- and endocuticle, cuticular proteins bound to chitin and cross-linked by sclerotizing agents form one of the most infrangible known biological coverings [1]. More generally, the cuticle protects the insect's body from dehydration, invasion by pathogens, penetration of insecticides, and physical injury [2-5].

Cuticular proteins are an important component of cuticle, and hundreds of cuticular protein sequences have been identified in over 20 insect species [6]. Many conserved motifs have been identified in these data, including the R&R Consensus [7], CPF&CPFL [8], Tweedle [9], and others. Among them, cuticular proteins containing the R&R Consensus (CPR) have been extensively studied in Anopheles gambiae, Drosophila melanogaster, Bombyx mori, and Apis mellifera through the annotation of genomic data [10-13]. Togawa and coworkers subsequently examined the expression profiles of 156 CPR genes in A. gambiae by real-time RT-PCR and found that most of them were expressed during single or multiple periods associated with molting [14].

Our bioinformatic analysis and previous work of others have identified more than two hundred cuticular protein genes in the silkworm genome [12], indicating that the silkworm employs more than 1.5% of its estimated protein-coding genes to encode cuticular proteins. These observations led us to focus on the following three questions: 1) How many genes including cuticular protein genes are expressed in silkworm epidermal tissues? 2) Is the expression of a special set of cuticular protein genes metamorphic stage-specific? and 3) Are cuticular protein genes coordinately regulated? The sequencing of the silkworm genome along with microarray technology offered us an opportunity to investigate gene expression profiles on a large scale to answer these questions. Eleven developmental stages were selected, which ranged from day 4 of the fourth instar larva to day 8 of pupa, and microarray-based expression profile analysis of all detectable genes in silkworm epidermal tissues was performed. Our data showed that a total of 6676 genes including the vast majority of silkworm cuticular protein genes were activated in selected stages, with no correlation between expression patterns and the presence of conserved motifs. In addition, twenty-six CPR protein genes distributed on chromosome 22 were co-expressed in larval and wandering stages and three common elements were identified in the 2 kb upstream region of these co-expressed CPR genes.

Results

Developmental expression profile of genes in epidermal tissues

In silkworm, oligonucleotide microarrays were employed to examine gene expression profiles of ten tissues on day 3 of the fifth instar larval stage, as reported by ** genes and eight yeast intergenic sequences were dotted in one block as positive and external controls, respectively. Dual-channel microarray hybridization was performed with a Cy3-labeled control sample and a Cy5-labeled test sample. Total RNA extracted from whole silkworm bodies on day 3 of the fifth instar served as the normalization control for data analysis.

Silkworm strain and reagents

Silkworm larvae (p50 strain) maintained at the Institute of Sericulture and System Biology (Southwest University, China) were reared on mulberry leaves at 25-26°C. Silkworms grow through five instars until cocoon spinning, which begins at the end of day 7 of the fifth instar. After spinning for three days, silkworms enter the pupal stage, which lasts about 10 days and is followed by emergence from the cocoon, mating, and egg laying. We selected sixteen time points around the molting phases, ranging from day 3 of the first instar to day 8 of the pupal stage. Because of their small body size, whole larval bodies were used to isolate total RNA from first to third instar larvae. From day 4 of the fourth instar to day 8 of the pupal stage, epidermal tissues were collected to isolate total RNA. TRIzol reagent was obtained from Invitrogen (Carlsbad, CA, USA). Reverse transcriptase was obtained from Promega (Madison, WI, USA). The ECL direct nucleic acid labeling and detection system was from GE Healthcare (Buckinghamshire, UK).

RNA isolation, amplification, labeling and array hybridizations

Total RNAs were isolated using TRIzol reagent and further purified using a NucleoSpin RNA clean-up kit (Macherey-Nagel, Germany). The amplification and labeling of mRNA were performed as described in previous studies [15, 54]. Five micrograms of total RNA were primed with 1 μl of 100 μM primer containing a T7 RNA polymerase promoter sequence at 70°C for 10 min, then reverse transcribed at 42°C for 2 h in the presence of 200 U CbcScript (CapitalBio Corp, China). The second strand of cDNA was synthesized at 16°C for 2 h in the presence of RNaseH and DNA polymerase. cRNA was synthesized by T7 Enzyme Mix (CapitalBio Corp, China) using the cDNA template. Two microliters of cRNA were primed with 1 μl random primer at 65°C for 10 min, then reverse transcribed at 25°C for 10 min and 37°C for 1.5 h in the presence of CbcScript II (CapitalBio Corp, China). The double-stranded cDNA was labeled with Cy3- or Cy5-dCTP using a CapitalBio cRNA Amplification and Labeling Kit (CapitalBio, Beijing, China). For test samples, the labeling reaction contained final concentrations of 120 μM each of dATP, dGTP, and dTTP, 60 μM dCTP, and 40 μM Cy5-dCTP. For reference samples, Cy3-dCTP was used in place of Cy5-dCTP. The Cy3- and Cy5-labeled double-stranded cDNAs were dissolved in 80 μl hybridization solution containing 3 × SSC, 0.2% SDS, 5 × Denhardt's solution, and 25% formamide. The slides were covered with a LifterSlip™ coverslip (Erie Company, Portsmouth, NH, USA) and hybridized in a closed chamber at 42°C overnight. After hybridization, slides were washed three times in 0.2% SDS, 2 × SSC at 42°C for 5 min and three times in 0.2 × SSC at room temperature for 5 min before signal scanning.
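The composition of the 80 μl hybridization solution can be checked with the usual dilution relation C1 × V1 = C2 × V2. The following is only an illustrative sketch: the stock concentrations (20 × SSC, 10% SDS, 50 × Denhardt's, 100% formamide) and the function name `stock_volumes` are assumptions made for the example and are not stated in the protocol.

```python
def stock_volumes(final_volume_ul, components):
    """Return the stock volume (in microliters) needed per component.

    components maps a name to (stock_concentration, final_concentration),
    both in the same unit. Uses C1 * V1 = C2 * V2 and tops up with water.
    """
    volumes = {}
    for name, (stock, final) in components.items():
        if final > stock:
            raise ValueError(f"{name}: final concentration exceeds stock")
        volumes[name] = final_volume_ul * final / stock
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested volume")
    volumes["water"] = water
    return volumes


# Final concentrations are taken from the text; the stock concentrations
# are ASSUMED typical laboratory stocks, not values given by the authors.
HYB_COMPONENTS = {
    "SSC (x)": (20.0, 3.0),
    "SDS (%)": (10.0, 0.2),
    "Denhardt's (x)": (50.0, 5.0),
    "formamide (%)": (100.0, 25.0),
}

mix = stock_volumes(80.0, HYB_COMPONENTS)
for name, volume in mix.items():
    print(f"{name:>15}: {volume:5.1f} ul")
```

With different stock solutions, only the first value of each tuple in `HYB_COMPONENTS` needs to change.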

Microarray data processing and analysis

The slides were scanned with a confocal LuxScan scanner (CapitalBio Corp.) and the raw data were extracted using LuxScan™ 3.0 software (CapitalBio Corp.). For dual-channel microarray data, the scanning settings for the Cy3 and Cy5 channels were balanced by visual inspection of the external control spots. The LOWESS (Locally Weighted Scatterplot Smoothing) method was used to normalize the dual-channel data using all the signals from the Cy3-labeled sample. The ratios of signal intensity between test and control samples were used to perform clustering analysis. A gene with a fluorescence intensity higher than 800 after background subtraction was considered expressed, since signals above that detection level were more reliable. The expression level of a cuticular protein gene was defined as the original signal intensity divided by 800. These X-fold values were used in the subsequent clustering analysis to display the expression of cuticular protein genes at different developmental stages. HCL (Hierarchical Clustering) analysis was carried out using both Cluster 3.0 software and MeV software (version 4.2.01) [55, 56]. In addition, Cluster 3.0 software was used for K-means clustering analysis, and MeV software was used for QT (quality threshold) clustering. Clustering analysis used the Pearson correlation as the distance metric and the average linkage method. TreeView software was used to display heat maps of the clustering results. Gene ontology analysis was performed at the BGI WEGO website [57].
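The expression call, the X-fold value, and the clustering settings described above can be illustrated with a small pure-Python sketch. It is not the Cluster 3.0 or MeV implementation: the function names, the gene names, and the example profiles are hypothetical, the inputs are assumed to be background-subtracted intensities, and the Pearson distance is taken as 1 - r, which is a common convention assumed here.

```python
import math

DETECTION_THRESHOLD = 800.0  # detection level stated in the text


def is_expressed(intensity):
    """Call a gene expressed if its background-subtracted signal exceeds 800."""
    return intensity > DETECTION_THRESHOLD


def x_fold(intensity):
    """Expression level as signal intensity divided by the detection level."""
    return intensity / DETECTION_THRESHOLD


def pearson_distance(x, y):
    """Distance 1 - r between two profiles (ASSUMED convention)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant profile has no defined correlation")
    return 1.0 - sxy / math.sqrt(sxx * syy)


def average_linkage(profiles):
    """Naive agglomerative clustering with average linkage.

    Returns the merges in order as (members_a, members_b, distance).
    """
    clusters = [[name] for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [pearson_distance(profiles[a], profiles[b])
                         for a in clusters[i] for b in clusters[j]]
                dist = sum(pairs) / len(pairs)  # mean pairwise distance
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        dist, i, j = best
        merges.append((tuple(clusters[i]), tuple(clusters[j]), dist))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


# Hypothetical X-fold profiles across four developmental stages.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.1],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
merges = average_linkage(profiles)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

In this toy input, geneA and geneB rise together and are merged first, while geneC, which falls across the stages, joins last at a distance close to 2.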

Computational identification of putative regulatory elements

The MEME algorithm (Multiple Expectation maximization for Motif Elicitation) was used to identify common elements present in the 2 kb promoter regions upstream of the transcription start sites of cuticular protein genes [58]. The TOMTOM motif comparison tool was used to compare the elements identified in this study with known motifs [59]. In the TOMTOM analysis, the TRANSFAC database was selected and the Pearson correlation coefficient was used as the motif column comparison function. FIMO (Find Individual Motif Occurrences) was applied to determine whether the identified regulatory elements were present upstream of other genes [19]. In the FIMO analysis, the Anopheles_gambiae_EnsEMBL_upstream database was selected as the reference and the p-value output threshold was set at 1 × 10^-5. TESS (Transcription Element Search System) was applied to search for binding sites of known insect transcription factors from the TRANSFAC database [20].
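Motif discovery of this kind starts from the promoter sequences themselves. As a minimal sketch of how a 2 kb upstream region might be extracted before being passed to a motif-finding tool, the function below assumes a chromosome sequence held as a string, a 0-based transcription start site index, and a strand symbol; the function names and the coordinate convention are assumptions for illustration and do not describe the authors' pipeline.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, tss, strand, length=2000):
    """Return up to `length` bases upstream of a transcription start site.

    tss is the 0-based index of the first transcribed base (ASSUMED
    convention). The result is given in the orientation of the gene, so
    minus-strand regions are reverse complemented. Regions that would run
    past the end of the sequence are truncated.
    """
    if not 0 <= tss < len(chrom_seq):
        raise ValueError("tss lies outside the sequence")
    if strand == "+":
        start = max(0, tss - length)
        return chrom_seq[start:tss]
    if strand == "-":
        end = min(len(chrom_seq), tss + 1 + length)
        return reverse_complement(chrom_seq[tss + 1:end])
    raise ValueError("strand must be '+' or '-'")


# Toy example with a 3-base window instead of 2000 bases.
toy = "ACGTTGCAAGCT"
print(upstream_region(toy, 6, "+", length=3))
print(upstream_region(toy, 5, "-", length=3))
```

Returning minus-strand regions in gene orientation keeps motif positions comparable across genes on both strands.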

Northern hybridization

Northern hybridization was performed to confirm the microarray data. The sequences of cuticular protein genes used to design the hybridization probes were obtained from the Silkworm Genome Database [17]. DEPC-treated water was used to prepare the related solutions and to clean the associated equipment. Five micrograms of total RNA per sample were loaded for denaturing formaldehyde gel electrophoresis. The transfer of RNA from the gel to a Hybond+ (GE) membrane was completed in 2 h using Transfer Equipment (Amersham Biosciences). All reagents used in prehybridization, probe labeling, hybridization, and signal detection were provided by the Amersham ECL Direct Nucleic Acid Labeling and Detection Systems (GE Healthcare, Cat No: PRN 3001), which is based on enhanced chemiluminescence. Hybridization was carried out at the optimized temperature of 42°C to protect the activity of the horseradish peroxidase. The cDNA probes, which were labeled with horseradish peroxidase, were completely denatured to single-stranded form to hybridize with the target RNA on the Hybond+ membrane. Membranes were washed to a stringency of 0.1 × SSC, and labeling and detection were carried out according to the manufacturer's instructions.