Background

Malaria remains one of the most important mosquito-borne infectious parasitic diseases in tropical and subtropical areas [1, 2]. According to the World Health Organization (WHO), an estimated 212 million new cases and 429,000 deaths associated with malaria occurred in 2015 [3]. Malaria has been extensively endemic in China for more than 4000 years, and several large-scale outbreaks of malaria have occurred in Chinese history [4, 1.

The website has been tested on multiple platforms (Linux, Windows and Mac OS) with different web browsers (Firefox, Chrome, IE and Safari). The resistance surveillance map is not compatible with Firefox; we recommend using IE, Chrome, Opera or Safari to open the insecticide resistance map page.

Fig. 1
Overview of the ASGDB architecture

Results

Data content

There are three types of data included in ASGDB. Basic data: 19,317 putative protein-coding genes and 2425 non-coding RNAs predicted from 9597 assembled An. sinensis scaffolds. For the protein-coding genes, 1424 GO terms, 221 known pathways and 58,207 orthologs are displayed in ASGDB [36]. ASGDB also contains 4047 genes that are differentially expressed under different conditions in transcriptomic studies, including deltamethrin-resistant/deltamethrin-susceptible and female/male comparisons [36]. Insecticide resistance-related gene data: ASGDB provides the identification and classification information of 93 cytochrome P450s (P450s), 31 glutathione S-transferases (GSTs), 50 choline/carboxylesterases (CCEs) and 238 cuticular proteins (CPs) from An. sinensis. Related information on these gene families in other species is also provided for comparison. The database also contains 13 unpublished datasets provided by Dr Chang (Bengbu Medical College). Altogether, 551 insecticide resistance-related events are stored and managed in ASGDB, and a user-friendly web interface was developed to help users search and use these data. A summary of the data content of ASGDB is shown in Table 1.

Table 1 Summary of ASGDB content

Web interface and homepage

The ASGDB interface provides direct links to eight individual pages: Home, JBrowse, Search, Download, Resistance-related gene, Resistance surveillance, Contact and Tutorial. All the links are clickable icons; a left mouse click on an icon leads to the respective page.

A brief introduction to ASGDB and photographs of An. sinensis taken at different life-cycle stages (egg, larva, pupa and adult) are included at the top of the homepage. Below these, the latest publications related to An. sinensis are displayed. A link to the list of all relevant publications is also provided, allowing users to acquire information of interest quickly and easily. “Resistance surveillance map” provides a map-based interface to access insecticide resistance-related data. All the collection sites are marked with red pompons on the map. Users interested in this content can click on the map and will be taken to a new page with more detailed information. “Visitor Statistics” shows the number of visitors to the website every week.

JBrowse

Here, we use the JBrowse genome browser to visualize the whole draft genome sequence of An. sinensis. The page is divided into two parts. The relatively small panel on the left lists the available track types, which are, from top to bottom: “GC content”, “gene”, “exon”, “miRNA”, “rRNA”, “tRNA” and “reference sequence”. Selecting any of these for a scaffold displays the corresponding track in the detailed view. The larger window on the right is the track display region. Each track can be turned off by clicking the “cross” in front of its title, which allows users to hide unwanted information. Dragging tracks up or down changes their positions, so that datasets of interest can be displayed at the top. JBrowse also provides efficient panning and zooming of a genomic region via embedded navigation buttons. With the help of these visualization modules, users can easily browse and search the genome on a large scale in a graphical interface.

Users can search for scaffolds to locate regions on the An. sinensis genome. Scaffolds can be selected freely from the drop-down menu. Users interested in a particular region of a scaffold can also enter the start and end positions of the region to retrieve detailed information. In the genomic view, rectangular frames with directionality represent the corresponding genes or ncRNAs on the positive or negative strand. A single click on a frame opens an information table, which provides details such as annotation, location, GO, KEGG and sequences. The sequence data for the selected gene can be downloaded on the same page as FASTA files. When the visible URL (accessible either via the browser address bar or the “Share” button) is shared, other people see the same region of the An. sinensis genome and the same collection of open tracks on their screen.
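A shareable link of the kind described above can be sketched in a few lines of Python. This is only an illustration: it assumes the ASGDB browser follows the standard JBrowse 1 query parameters `loc` and `tracks`, and the base URL, the scaffold name and the track labels are hypothetical placeholders rather than values taken from ASGDB.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the real address of the ASGDB JBrowse page is not assumed here.
JBROWSE_BASE = "http://example.org/jbrowse/index.html"


def jbrowse_link(scaffold, start, end, tracks=("gene", "exon")):
    """Build a shareable link to one region of a scaffold.

    Assumes JBrowse 1 style parameters: loc=<scaffold>:<start>..<end>
    and tracks=<comma-separated track labels>. The track labels used
    as defaults are assumptions, not confirmed ASGDB track names.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    params = {
        "loc": f"{scaffold}:{start}..{end}",
        "tracks": ",".join(tracks),
    }
    # Keep ':' and ',' readable instead of percent-encoding them.
    return f"{JBROWSE_BASE}?{urlencode(params, safe=':,')}"


if __name__ == "__main__":
    print(jbrowse_link("scaffold1", 100, 5000))
```

Anyone opening such a link would land on the same region with the same tracks switched on, which is the behaviour the “Share” button provides.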

Search

ASGDB offers a user-friendly search engine that makes it easy to reach specific genes of interest. The search part has two sub-categories: simple search and BLAST search. To browse different types of genetic factors, a simple search can be performed using the following parameters: (i) NCBI or ASGDB accession numbers; (ii) gene name or symbol; (iii) GO ID or GO term; and (iv) KEGG ID or KEGG annotation. Users can enter these parameters to obtain specific gene information from ASGDB, and fuzzy queries are supported. When more than one gene matches the input keyword, all the matched genes are listed as links in the search results. The BLAST search allows genes to be searched using ViroBLAST [48]. Users perform similarity searches against each type of sequence using the various BLAST programs (BLASTn, BLASTp, BLASTx, tBLASTn and tBLASTx). The reference database used for BLAST comprises all nucleic acid and amino acid sequences of An. sinensis. Users can enter nucleotide or protein query sequences, or upload a local sequence file in FASTA format, to search against the reference database. In the advanced search, the BLAST tool allows users to set their preferred parameters, such as the threshold and word size.
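Which of the five BLAST programs applies depends on whether the query and the database are nucleotide or protein sequences. The following minimal Python sketch encodes that standard mapping; the function and variable names are illustrative and do not reflect how ASGDB or ViroBLAST implement the choice.

```python
# (query type, database type) -> BLAST program, following the standard BLAST conventions.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # query is translated in six frames
    ("protein", "nucleotide"): "tblastn",  # database is translated in six frames
}


def choose_blast_program(query_type, db_type, translate_both=False):
    """Return the BLAST program suited to a query/database combination.

    translate_both=True selects tblastx, which compares the translated
    query against the translated database (both must be nucleotide).
    """
    if translate_both:
        if (query_type, db_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and a nucleotide database")
        return "tblastx"
    try:
        return BLAST_PROGRAMS[(query_type, db_type)]
    except KeyError:
        raise ValueError(f"unsupported combination: {query_type!r} vs {db_type!r}")


if __name__ == "__main__":
    print(choose_blast_program("protein", "nucleotide"))
```

For instance, a protein query searched against the genome scaffolds would use tBLASTn, whereas a transcript searched against the predicted proteins would use BLASTx.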

This multi-functional search module makes it easier to obtain a comprehensive view of each gene. Taking CYP9J53 as an example (Fig. 2), users can enter many types of keywords, e.g. “KFB49800.1”, “CYP9J53” or “9J53”, as search content. Pressing the “GeneID” button displays the detailed information for this gene. At the top of the gene information page, users can view basic information about CYP9J53, such as description, length and location. Clicking the “JBrowse” button on the right lets users visualize CYP9J53 in the context of its scaffold. Below these, the sequence information of CYP9J53 is presented. The exon regions are highlighted in red, and the remaining sequences are introns. Clicking the “show pep” button displays the amino acid sequence. The lower portion of the information page describes the functional features of CYP9J53, including orthologs, GO terms and KEGG pathways. In the ortholog part, genes in other mosquito species and the fruit fly that are many-to-one or many-to-many orthologs of An. sinensis CYP9J53 are displayed, with links to retrieve their sequences from VectorBase and FlyBase. The predicted GO terms assign CYP9J53 to the “iron ion binding” (0005506), “electron carrier activity” (0009055), “heme binding” (0020037) and “oxidation-reduction process” (0055114) categories. Users can click a GO term for detailed term information. CYP9J53 participates in the “Linoleic acid metabolism” pathway. Clicking the KO (ko00591) opens a new page showing the reference pathway map. Transcriptional results are shown at the bottom of the results page and include technology, comparison, regulation, fold change, published articles and source. Moreover, users can run BLAST to find the best hit for the gene of interest by pasting sequences or uploading a sequence file in FASTA format. Parameters can be reset to filter and parse the results again. Clicking one of the links in the Score field locates the pairwise alignment between the query sequence and the subject sequence.
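The fuzzy keyword matching used in the example above can be approximated with a case-insensitive substring search across the searchable fields. The sketch below is an assumption about the general behaviour, not ASGDB's actual query code; the record layout is hypothetical, and the second record is a made-up placeholder included only so that the search has something to exclude.

```python
# Hypothetical in-memory records; field names are illustrative only.
GENES = [
    {
        "accession": "KFB49800.1",
        "name": "CYP9J53",
        "go": ["GO:0005506", "GO:0009055", "GO:0020037", "GO:0055114"],
        "kegg": ["ko00591"],
    },
    # Made-up placeholder record, not a real ASGDB entry.
    {"accession": "EXAMPLE0001.1", "name": "GSTexample", "go": [], "kegg": []},
]


def fuzzy_search(keyword, genes=GENES):
    """Return every gene with a field that contains the keyword (case-insensitive)."""
    needle = keyword.strip().lower()
    if not needle:
        return []
    hits = []
    for gene in genes:
        fields = [gene["accession"], gene["name"], *gene["go"], *gene["kegg"]]
        if any(needle in value.lower() for value in fields):
            hits.append(gene)
    return hits


if __name__ == "__main__":
    for keyword in ("KFB49800.1", "CYP9J53", "9J53"):
        print(keyword, "->", [gene["name"] for gene in fuzzy_search(keyword)])
```

Under this scheme a full accession, the full gene name and the partial name all resolve to the same record, and a keyword that matches several records returns all of them.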

Fig. 2
Screenshot showing the application of ASGDB for searching information

Download

ASGDB provides bulk data downloads, including assembled genome sequences, nucleotide sequences of putative genes, gene annotation and amino acid sequences. All data are hosted publicly and can be accessed directly via a browser. In addition, literature related to An. sinensis is manually curated to meet increasing research demands. We have collected An. sinensis-related English literature from PubMed (http://www.ncbi.nlm.nih.gov/pubmed/) and Chinese literature from CNKI (http://www.cnki.net/), Wanfang (http://www.wanfangdata.com.cn/) and VIP (http://www.cqvip.com/). The general information is organized as formatted lists, including title, authors, journal, year and volume. ASGDB also offers full-text links for some of the English articles.

Insecticide resistance-related genes

ASGDB provides the identification and classification information of detoxification enzyme superfamilies and cuticular proteins from An. sinensis. Three detoxification enzyme superfamilies (P450s, GSTs and CCEs) are primarily responsible for metabolic resistance in mosquitoes [49]. The cuticle is a major route of insecticide penetration, and thickening of the cuticle or changes in its chemical composition serve as another resistance mechanism [50]. Although cuticular resistance has not yet been fully characterized at the molecular level, several putative cuticular proteins (CPs), the primary components of the insect cuticle, have been identified as potential players in insecticide resistance [51,52,53,54].

On the basis of a literature review and in-depth data analysis, ASGDB provides detoxification enzyme and cuticular protein (CP) information from several mosquito species and the fruit fly Drosophila melanogaster, allowing users to perform rapid and convenient comparative analyses among different dipteran species. To aid insecticide resistance research, these genes are further classified into four P450 clans, seven GST classes, ten CCE clades and eleven CP families.

Resistance surveillance

The current release of ASGDB has recorded 551 insecticide-resistant phenotypic and genotypic events in An. sinensis in China. So far, the geographical distributions of the data cover the majority of An. sinensis distribution areas in China.

There are two ways to retrieve insecticide resistance information from the database (Fig. 3). First, ASGDB provides an OpenLayers map-based interface for users to obtain data. All the mosquito-sampling sites are marked with small pompons. Clicking on a pompon opens a pop-up text box on the map with the most recent insecticide resistance record for that site. Different pompon colours indicate insecticide resistance levels (Grey: Uncertain; Green: Susceptible; Yellow: Probable resistant; Red: Resistant). Users can also browse all the relevant information for this region on the same page, just below the map. It should be noted that when different sampling sites are in proximity, the pompons might be very close to or even superimposed on each other. To avoid clicking on a non-target pompon, users can zoom in by clicking the “plus” button in the upper left corner or by scrolling up with the mouse scroll wheel. A double left click on the map simultaneously centres the map on the position clicked and zooms in. To zoom out, users can click the “minus” button or scroll down with the mouse scroll wheel. To move the map, users can click and drag at any point on the map. Second, ASGDB provides a search engine to help users obtain information of interest by collection site, collection year, insecticide or resistance mechanism. For example, to perform an insecticide-based search for “deltamethrin”, users can type the two letters “de” and a list of all possible matching terms will appear. After the user chooses the required term and clicks the “Search” button, only the matching pompons appear on the map, and all records related to deltamethrin resistance appear below the map. The retrieved information can be downloaded in bulk.
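The colour coding and the type-ahead search described above can be illustrated with a short Python sketch. Only the four resistance levels and their colours come from the text; the sample records, site names, field names and function names are made up for illustration and do not represent ASGDB data or code.

```python
# Colour used for each resistance level, as described for the map.
STATUS_COLOURS = {
    "Uncertain": "grey",
    "Susceptible": "green",
    "Probable resistant": "yellow",
    "Resistant": "red",
}

# Made-up sample records; sites, years and statuses are placeholders.
RECORDS = [
    {"site": "Site A", "year": 2010, "insecticide": "deltamethrin", "status": "Resistant"},
    {"site": "Site B", "year": 2012, "insecticide": "DDT", "status": "Susceptible"},
    {"site": "Site C", "year": 2014, "insecticide": "deltamethrin", "status": "Probable resistant"},
]


def suggest_terms(prefix, records=RECORDS):
    """Return insecticide names starting with the typed prefix (case-insensitive)."""
    typed = prefix.lower()
    names = {r["insecticide"] for r in records if r["insecticide"].lower().startswith(typed)}
    return sorted(names)


def search_by_insecticide(term, records=RECORDS):
    """Return matching records, each annotated with its pompon colour."""
    wanted = term.lower()
    return [
        dict(record, colour=STATUS_COLOURS[record["status"]])
        for record in records
        if record["insecticide"].lower() == wanted
    ]


if __name__ == "__main__":
    print(suggest_terms("de"))
    for hit in search_by_insecticide("deltamethrin"):
        print(hit["site"], hit["year"], hit["colour"])
```

Typing “de” narrows the suggestions to deltamethrin in this toy dataset, and the subsequent search keeps only the matching sites together with the colour that each pompon would take.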

Fig. 3
Screenshot showing how to search and retrieve insecticide-resistant related data on “Resistance surveillance” page

The data were manually extracted and concisely presented in three aspects: “Collection details”, “Insecticide resistance monitoring” and “Insecticide resistance mechanisms”. “Collection details” provides temporal and spatial information regarding sample collection. “Insecticide resistance monitoring” surveys the insecticide resistance status in the field population under different insecticide selection pressures. “Insecticide resistance mechanisms” provides further investigations of the resistance mechanisms, including the analysis of gene expression changes, kdr or ace-1 mutations, or elevated enzyme activities of P450s, GSTs and CCEs. We also provided the citation of sources.
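One way to picture a record organized along these three aspects is as a small nested data structure. The Python sketch below is a hypothetical layout: every class and field name is an assumption chosen for illustration, and it does not describe the actual ASGDB schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CollectionDetails:
    """Temporal and spatial information about the sample collection."""
    site: str
    year: int


@dataclass
class Monitoring:
    """Resistance status of the field population for one insecticide."""
    insecticide: str
    status: str  # e.g. "Susceptible", "Probable resistant", "Resistant", "Uncertain"


@dataclass
class Mechanisms:
    """Optional follow-up findings on resistance mechanisms."""
    kdr_mutations: List[str] = field(default_factory=list)
    ace1_mutations: List[str] = field(default_factory=list)
    elevated_enzymes: List[str] = field(default_factory=list)  # e.g. "P450s", "GSTs", "CCEs"


@dataclass
class ResistanceRecord:
    collection: CollectionDetails
    monitoring: Monitoring
    citation: str
    mechanisms: Optional[Mechanisms] = None

    def has_mechanism_data(self):
        """True if any mechanism-level finding is attached to this record."""
        m = self.mechanisms
        return m is not None and bool(m.kdr_mutations or m.ace1_mutations or m.elevated_enzymes)


# Placeholder values only; not a real ASGDB record.
EXAMPLE = ResistanceRecord(
    collection=CollectionDetails(site="Site A", year=2010),
    monitoring=Monitoring(insecticide="deltamethrin", status="Resistant"),
    citation="[placeholder citation]",
    mechanisms=Mechanisms(elevated_enzymes=["P450s"]),
)
```

Keeping the mechanism block optional reflects the fact that a bioassay result can be recorded even when no mechanistic follow-up was reported for that sample.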

Users are encouraged to share their own insecticide resistance data through the submission procedure, which eases the process of data collection and sharing and benefits the dissemination of knowledge. In addition to the relevant data, participants also need to provide valid and open personal information, including institution and e-mail address. Although time consuming, this will not only improve content reliability but also increase collaboration and communication among users.

Contact

To establish a platform for community integration of An. sinensis data and to aid efficient management of knowledge on An. sinensis, ASGDB requires more participation in knowledge curation. Researchers can submit any comments, suggestions, or questions regarding all aspects of ASGDB. To promote researchers’ sharing and exchange of knowledge and ideas, the submission process is simple. No registration requirement is imposed, although users need to provide a valid email address so that our team can contact them in case of any queries.

Tutorial

To help users access and use these data, a general tutorial is also available in ASGDB. It provides a schematic overview and demonstrates how to get started and navigate through the main features of ASGDB.

Discussion

The basic objective of ASGDB is to provide an integrative and comparative genomic resource specific to An. sinensis. It gathers a wide variety of genetic information, such as the whole draft genome sequence, annotation, pathways, GO terms, orthologous relationships and differentially expressed genes. This diverse data integration makes it possible to display correlations among various genetic factors and thus helps users obtain genetic information faster and more accurately, which is a fundamental step in exploring valuable information for further study. As a one-stop resource platform, ASGDB offers a user-friendly interface, convenient search options and enhanced visualization tools, all of which make it easy for researchers, even those with little knowledge of bioinformatics, to access and analyse whole-genome data and information for An. sinensis.

Another aim of ASGDB is to share insecticide resistance information for An. sinensis. We collect resistance-related gene information for An. sinensis and other species. This genome-level data-mining strategy for resistance-related genes will also be useful for follow-up functional studies of resistance. For example, researchers could conveniently choose one specific gene of interest or a group of genes (e.g. by classification or gene-expanded clusters) to investigate how resistance phenotypes are generated. The information on genes differentially expressed between deltamethrin-susceptible and deltamethrin-resistant strains should also help identify candidate genes of interest. On the basis of data from the published literature, ASGDB provides a panoramic view of current insecticide resistance studies of An. sinensis in China. These data cover 19 provinces and municipalities in China, which vary substantially in geography and in economic and social environment. The records begin in the mid-1980s and continue to this day. They systematically and continuously track the dynamics of insecticide resistance phenotypes and genotypes, providing invaluable information to help us understand how insecticide resistance arises and spreads in An. sinensis over time and space. The excessive use of agricultural insecticides should be curbed before high resistance emerges; therefore, the phenotypic information will be useful for adjusting the types and concentrations of insecticides in a rotation scheme to fit local environmental conditions. This could strengthen existing resistance management strategies to prolong the effectiveness of insecticides and prevent the occurrence of resistance. The evolution of insecticide resistance is conferred through complex mechanisms, typically requiring the interaction of multiple genes [55,56,57,58]. The genotypic data in ASGDB could provide clues to explain the molecular mechanisms of insecticide resistance systematically. For example, we could observe whether target-site and metabolic resistance mechanisms occur singly or simultaneously, or judge which mechanism plays a more important role in An. sinensis at different insecticide resistance levels. Information on insecticide resistance at the mechanistic level, combined with the results of bioassays, could also help provide powerful molecular diagnostic tools for monitoring insecticide resistance in An. sinensis at an early stage.

With increasing research on An. sinensis, genome re-sequencing, transcriptomic, proteomic and other omics studies are expected to grow continuously, especially with the development of sequencing technologies in the next few years. ASGDB will integrate more types of data and will be updated periodically to fulfil the growing research needs in addressing the genetic complexity of An. sinensis. Meanwhile, more insecticide resistance information from neighbouring countries will be integrated into ASGDB in the next step. Also, we encourage researchers to share insecticide-resistant data with the whole scientific community. To attract more participation from the scientific community for ASGDB and to make it an important platform for the insecticide resistance studies of An. sinensis, we will develop an incentive system to reward participants according to their contributions.

Conclusions

To further extend our understanding of resistance mechanisms and to facilitate the implementation of resistance management strategies in mosquito vector control programs, a bioinformatics database named ASGDB was developed. Integrating a high-quality draft genome sequence with insecticide resistance-related literature, ASGDB provides (i) an An. sinensis genome database; (ii) insecticide resistance-related gene data; and (iii) insecticide resistance phenotyping and genotyping data for An. sinensis in China. ASGDB was built to help users mine data from the genome sequence of An. sinensis easily and effectively, with particular advantages for insecticide resistance surveillance and control. The resistance surveillance information collected in ASGDB provides dynamic and evolutionary resistance records across China. Detailed records of the insecticide resistance status of An. sinensis are useful for adjusting the types of insecticides in a rotation scheme to fit local environmental conditions.