Introduction

Parvoviruses are widespread pathogens that cause a variety of diseases in animals, including gastroenteritis, panleukopenia, cerebellar hypoplasia and myocarditis [1]. The family Parvoviridae is divided into two subfamilies: Parvovirinae and Densovirinae. The subfamily Parvovirinae is divided into five genera: Parvovirus, Erythrovirus, Dependovirus, Amdovirus and Bocavirus [2]. Bocaviruses have a linear single-stranded DNA genome of around 5 kb of either plus or minus polarity. The virus particles are non-enveloped, exhibit icosahedral symmetry, and have a diameter of 26 nm [3]. The genome contains three open reading frames (ORFs) encoding two nonstructural proteins, NS1 and NP1, and two structural proteins, the viral capsid proteins VP1 and VP2 [4-6].

Since 1970, bocaviruses have been detected in dogs (MVC) [5], cows (BPV) [6], humans (HBoV) [4], swine (PBoV) [7, 8], gorillas (GBoV) [9] and California sea lions (CslBoV) [10]. HBoV was first identified in respiratory samples from Swedish children in 2005 [4]. Thus far, four species of human bocavirus have been identified [4, 11-13]. Human bocaviruses have been associated with respiratory and enteric diseases in humans, though more evidence is needed [4, 11, 12, 14-16]. The first PBoV, referred to as porcine boca-like virus (PBo-likeV; 1879 bp), was identified in porcine lymph nodes [7, 17]. Subsequently, two novel porcine bocaviruses (PBoV1-CHN and PBoV2-CHN) were reported in China [8]. In 2010, two novel bocavirus species (PBoV3-UK and PBoV4-UK) were isolated in cell culture from pigs with clinical post-weaning multisystemic wasting syndrome (PMWS) in Northern Ireland [18]. Recently, another two bocaviruses (PBoV3/4-HK) have been identified [19].

To elucidate the biological characteristics and the possible cross-species transmission of bocaviruses, further study of diverse porcine bocaviruses is necessary. Furthermore, the present nomenclature of PBoV is inconsistent, and a uniform nomenclature system is needed. In this study, a novel porcine bocavirus (PBoV3C) was identified in healthy piglets with a high detection rate, and its genome structure was characterized. We also propose a uniform PBoV nomenclature based on the VP1 gene.

Materials and methods

Specimen collection and DNA extraction

All fecal specimens were collected from healthy piglets (<15 days of age) from three different farms in Lulong County, Hebei Province, China, from 2006 to 2009. They were transported to the laboratory on dry ice and stored at −80 °C until analysis. Samples were screened for rotavirus, calicivirus, and astrovirus by ELISA or routine PCR. Nine of the samples were negative for these viruses, and two of them were positive for PBoV 6 V/7 V (by routine PCR). These nine samples were pooled to extract total viral nucleic acids as follows: the specimen was suspended in Hanks’ balanced salt solution (HBSS), vortexed vigorously, and centrifuged at 15,000×g, and the supernatant was then filtered through 0.45-μm and 0.22-μm membrane filters (Millipore). Total nucleic acid was extracted from filtrates using a Viral Nucleic Acid Extraction Kit (Geneaid Biotech, Sijhih, Taipei, Taiwan) according to the manufacturer’s protocol.

High-throughput DNA sequencing

The whole cDNA library, generated by random-primer PCR, was subjected to high-throughput DNA sequencing using Genome Sequencer FLX Titanium pyrosequencing technology and reagents (Roche). All generated sequences were analyzed with a customized informatics pipeline [20], and all filtered viral sequences were then compared to those in the GenBank database using BLASTn and BLASTx.

Detection of a putative porcine bocavirus

The 92 fecal specimens collected from September to November of 2006 were screened for a putative PBoV. Using Primer Premier 5, we designed two degenerate primers targeting contig01006 (67.3 % identity at the nucleotide level with HBoV3), PBoV1-CHN and PBoV2-CHN (forward primer, 5′-TCAGACTCHTRAACWTCCAGG-3′; reverse primer, 5′-GCACAATGACTGGGTGGA-3′) to amplify a 268-nt region within the NS1 ORF. The reaction conditions were determined by preliminary gradient PCR studies: after initial denaturation for 5 min at 95 °C, 35 cycles of amplification (94 °C for 1 min, 58 °C for 1 min and 72 °C for 1 min) were performed, followed by a final extension for 10 min at 72 °C. PCR products were purified and sequenced. The sequences of the PCR products were compared with all known bocavirus NS1 sequences in the GenBank database.
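The degeneracy of the primer pair can be made explicit by expanding the IUPAC ambiguity codes. The following Python sketch is illustrative only and was not part of the workflow described here; the function name expand_degenerate is hypothetical, and the code table follows the standard IUPAC nucleotide nomenclature.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as given in the text (5' to 3').
forward = "TCAGACTCHTRAACWTCCAGG"
reverse = "GCACAATGACTGGGTGGA"

forward_variants = expand_degenerate(forward)
reverse_variants = expand_degenerate(reverse)
print(len(forward_variants), len(reverse_variants))
```

Under these definitions, the forward primer expands to 12 sequences (H = A/C/T, R = A/G, W = A/T), whereas the reverse primer as printed contains no ambiguity codes.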

Genome sequencing and assembly of PBoV3C

We used a specific primer PCR and genome-walking kit (TaKaRa) to obtain complete sequences of the potential novel PBoV. PCR products were run on a 1.5 % agarose gel. Positive fragments (200-1500 bp) were gel-purified using a QIAquick Gel Extraction kit (QIAGEN) and cloned into pGEM®-T Easy Vector (Promega). The plasmid inserts were sequenced and then compared with entries in the GenBank database. The final nearly full-length sequence, except the termini, was assembled using DNAMAN6 and confirmed by independent PCR using specific primers across overlapping regions. The sequence was submitted to the GenBank database (accession no. JN681175).

Genomic and phylogenetic analysis

The sequences were aligned and adjusted using ClustalW and BioEdit. To identify the open reading frames (ORFs), NCBI’s ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) was used. Phylogenetic trees were obtained using the neighbor-joining (NJ) method with 1,000 bootstrap replicates using the MEGA ver. 4 software package. Analysis of recombination between PBoV3C and other parvoviruses was conducted using SimPlot. The secondary structure of the VP2 protein was predicted using tools available through ExPASy (http://www.expasy.ch/tools/), including SOPMA (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html) and GOR (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html), as well as PredictProtein (http://www.predictprotein.org/).

Results

High-throughput DNA sequencing

Based on sequence analysis, 29 contigs (147-1256 bp) exhibited 35.6 % to 98.1 % sequence identity with bocaviruses. However, most were highly divergent from all known bocaviruses. Eleven of these contigs showed a close relationship to human bocavirus (61.1 % to 72.0 % identity). One of these, contig01006 (774 bp), located in a relatively conserved region of the NS1 gene, was selected for further analysis.

Detection of PBoV in fecal samples

Using degenerate primers targeted to contig01006, PBoV1-CHN and PBoV2-CHN, 53 of the 92 fecal samples were confirmed to be positive by PCR and sequencing. All of these NS1 sequences showed more than 70 % sequence identity to PBoV1-CHN and PBoV2-CHN. Among these positive samples, 18 sequences (19.6 % of all 92 specimens) shared 90 %-99 % identity with contig01006 and were therefore suspected to belong to a new species. Phylogenetic analysis revealed that these 18 sequences clustered together, separate from other bocaviruses (data not shown).
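The detection rates follow directly from the counts given above. The short Python sketch below is illustrative; the variable and function names are hypothetical, and only the three counts are taken from the text.

```python
total_specimens = 92   # fecal specimens screened
pbov_positive = 53     # positive by PCR and sequencing
contig_like = 18       # sequences sharing 90 %-99 % identity with contig01006

def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

overall_rate = percent(pbov_positive, total_specimens)
contig_like_of_all = percent(contig_like, total_specimens)
contig_like_of_positive = percent(contig_like, pbov_positive)
print(overall_rate, contig_like_of_all, contig_like_of_positive)
```

The value of 19.6 % corresponds to 18 of all 92 specimens; expressed as a share of the 53 positive samples, the same 18 sequences would account for about 34 %.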

PBoV3C genome analysis

On the basis of one of the 18 sequences (Z557), specific primer PCR and genome-walking PCR were used to amplify the complete genomic sequence. A nearly full-length genomic sequence, lacking only the two termini, was obtained: PBoV3C (5235 bp; accession no. JN681175). The base composition of PBoV3C was 26.7 % A, 19.9 % T, 25.5 % C and 27.9 % G, with 53.4 % GC. Using NCBI’s ORF Finder, three ORFs of PBoV3C were predicted. The left ORF encodes NS1 (336-2393 bp, 685 amino acids), the middle ORF encodes NP1 (2501-3175 bp, 224 amino acids), and the right ORF encodes VP1 (3165-5228 bp, 687 amino acids) and VP2 (3573-5228 bp, 551 amino acids).
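The reported protein lengths and GC content can be cross-checked against the ORF coordinates. The Python sketch below assumes that the coordinates are 1-based and inclusive and that each range includes the stop codon; these conventions are assumptions, and the function name protein_length is hypothetical.

```python
# ORF coordinates as reported in the text (assumed 1-based, inclusive,
# with the stop codon counted inside each range).
ORFS = {
    "NS1": (336, 2393),
    "NP1": (2501, 3175),
    "VP1": (3165, 5228),
    "VP2": (3573, 5228),
}

def protein_length(start, end):
    """Number of amino acids encoded by an ORF that includes its stop codon."""
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return nucleotides // 3 - 1  # subtract the stop codon

lengths = {name: protein_length(*coords) for name, coords in ORFS.items()}

# Base composition in percent, as reported in the text.
composition = {"A": 26.7, "T": 19.9, "C": 25.5, "G": 27.9}
gc_content = round(composition["G"] + composition["C"], 1)
total = round(sum(composition.values()), 1)
print(lengths, gc_content, total)
```

Under these assumptions, the coordinates reproduce the reported lengths of 685, 224, 687 and 551 amino acids, and the C and G percentages sum to the reported GC content of 53.4 %.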

Multiple alignments of the NS1 region of all bocaviruses revealed that the most conserved region was located at sites 427-592 (corresponding to base pairs 406-571 of the reference NS1 sequence of PBoV3C). NP1 of PBoV3C shared a short overlapping sequence with the VP1 gene. It is noteworthy that this phenomenon occurs in all known bocaviruses except PBo-likeV. Like those of other bocaviruses, VP1 and VP2 of PBoV3C are overlapping proteins that differ by 138 amino acids at the N-terminus. We also found a calcium-dependent phospholipase A2 (PLA2) enzymatic activity motif in the VP1 unique region (VP1u), which could enhance capsid release into the cytoplasm. The HDXXY motif is the catalytic center, and the conserved YLGPF motif is the Ca2+-binding loop of PLA2. However, neither was found in PBo-likeV.
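A search for the two PLA2 motifs named above can be sketched with regular expressions. The Python example below is a minimal illustration, reading X in HDXXY as any residue; the function name and the toy peptide are hypothetical, and the peptide is not the PBoV3C VP1u sequence.

```python
import re

# Motif patterns named in the text; "X" in HDXXY is read as any residue.
PLA2_MOTIFS = {
    "catalytic_center_HDXXY": re.compile(r"HD..Y"),
    "calcium_binding_YLGPF": re.compile(r"YLGPF"),
}

def find_pla2_motifs(protein):
    """Return 0-based start positions of each motif in a protein sequence."""
    return {name: [m.start() for m in pattern.finditer(protein)]
            for name, pattern in PLA2_MOTIFS.items()}

# Hypothetical toy peptide, NOT the PBoV3C VP1u sequence.
toy_vp1u = "MSAYLGPFNGTLHDKAYDE"
hits = find_pla2_motifs(toy_vp1u)
print(hits)
```

A sequence lacking both motifs, as reported for PBo-likeV, would return two empty lists.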

Prediction of the secondary structure of the viral capsid protein

Referring to the current secondary structure of VP2 of other parvoviruses such as B19 [21], FPV [22], AAV2 [23], MVM [24], and HBoV [25], a putative VP2 protein structure map (Fig. 1) was constructed. Conservation of core structural features in the VP2 protein, including the eight-stranded β-barrel, the αA helix, and the DE and HI loops, was consistent with their essential functions in capsid assembly and stability. Between the strands conserved at the primary structure level, many variable regions (VRs, commonly represented in the loop structures) were present, and these formed the surface of the viral capsid. For example, the GH loop between the βG and βH strands contained ~240 residues.

Fig. 1 Prediction of the structure of VP2 of PBoV3C (red, αA helix; blue, β-barrel; green, DE-loop and HI-loop, which form a channel on the surface; white, variable regions; yellow, bilateral terminus)

Phylogenetic and recombination analysis

Phylogenetic analysis of the nearly complete genome sequences of PBoV3C and other parvoviruses indicated that PBoV3C was most closely related to PBoV3/4-HK and PBoV3/4-UK [18, 19], with 78 %-81 % sequence identity. The genetic distance of PBoV3C to its closest relative in the NS1 gene (PBoV3-HK) was 19 %. Furthermore, compared with other bocaviruses, PBoV3C showed nucleotide sequence divergence of >19 % and >16 % in NP1 and VP1, respectively. According to the most recent International Committee on Taxonomy of Viruses (ICTV) species demarcation criteria for members of the genus Bocavirus (at least 5 % divergence in nonstructural gene nucleotide sequences), PBoV3C represents a novel PBoV species (Fig. 2).
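The demarcation criterion can be illustrated with an uncorrected pairwise distance (p-distance). The Python sketch below is a simplification: the distances reported here may have been calculated with a different substitution model, the function names are hypothetical, and the toy alignment is invented for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ("-") in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared

def exceeds_species_threshold(divergence, threshold=0.05):
    """True if divergence meets the 'at least 5 %' criterion quoted in the text."""
    return divergence >= threshold

# Hypothetical toy alignment (20 columns, one gap, two mismatches).
toy_a = "ATGGCTAGCTAGGATCCATG"
toy_b = "ATGGCTTGCTA-GATCGATG"
toy_divergence = p_distance(toy_a, toy_b)
print(toy_divergence, exceeds_species_threshold(0.19))
```

With the 19 % NS1 distance reported for PBoV3C, exceeds_species_threshold(0.19) returns True, in line with the conclusion that PBoV3C represents a distinct species.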

Fig. 2 Phylogenetic analysis of viruses in the family Parvoviridae. The phylogenetic tree was generated from nearly full-length nucleotide sequences from members of the subfamilies Parvovirinae (genera Parvovirus, Erythrovirus, Dependovirus, Amdovirus, and Bocavirus) and Densovirinae, using the MEGA 4.1 software (neighbor-joining method with 1,000 bootstrap replicates). PBoV3C is indicated by a black circle. MVC, minute virus of canines; BPV, bovine parvovirus; HBoV, human bocavirus; PBoV-WUH1, porcine boca-like virus; GBoV, gorilla bocavirus; CslBoV, California sea lion bocavirus

In the NS1 region, PBoV3C shared the highest amino acid identities (77.5 %-81 %) with PBoV3/4-UK and PBoV3/4-HK. The tree showed five clusters: PBoV1-CHN, PBoV2-CHN, MVC and CslBoV comprised cluster 1; PBo-likeV, cluster 2; PBoV3/4-UK, PBoV3/4-HK and PBoV3C, cluster 3; HBoV and GBoV, cluster 4; and BPV alone comprised cluster 5 (Fig. 3a).

Fig. 3 Phylogenetic relationships of three protein sequences (NS1, NP1, VP1) encoded by ORFs of bocaviruses. Phylogenetic trees of NS1 (a), NP1 (b) and VP1 (c) based on amino acid sequences were constructed using MEGA ver. 4.1 software (neighbor-joining method with 1,000 bootstrap replicates). PBoV3C is labeled with a black circle. MVC, minute virus of canines; BPV, bovine parvovirus; HBoV, human bocavirus; PBoV-WUH1, porcine boca-like virus; 6 V/7 V, partial Chinese porcine bocavirus sequences; GBoV, gorilla bocavirus; CslBoV, California sea lion bocavirus

In the NP1 region, there was also a high degree of amino acid sequence identity (71.7 %-73.4 %) between PBoV3C and PBoV3/4-UK and PBoV3/4-HK. However, compared with PBoV-WUH1 (PBo-likeV), PBoV3C showed very low similarity (28 %). A phylogenetic tree based on amino acid sequences indicated that PBoV could be divided into three clusters, which were represented by PBo-likeV, PBoV1-CHN and PBoV3C (Fig. 3b). Bootscan analysis showed possible recombination between PBoV clusters 2 and 3 in the NP1 region, which may have generated the NP1 of PBoV cluster 1 (Fig. 4). Putative recombination regions were located around site 346 in our alignment (corresponding to site 267 of the reference NP1 sequence of PBoV-WUH1, HQ223038). Additionally, for all bocaviruses, the putative recombination breakpoint near position 346 was also the boundary between the conserved and non-conserved regions of NP1 (Clustal X2, data not shown).

Fig. 4 Putative recombination in the bocavirus NP1 region. Bootscan analysis using the complete NP1 gene sequence of PBoV cluster 1 (PBoV-WUH1 and PBoV-H18) as the query sequence (SimPlot 3.5.1, F84 model; window size, 200 bp; step, 10 bp) on a nucleotide alignment generated using Clustal_X2. PBoV cluster 3 (PBoV3C, PBoV3/4-UK and PBoV3/4-HK) is indicated by a red line, and PBoV cluster 2 (PBoV1/2-CHN and PBoV-A6) is indicated by a deep blue line
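The sliding-window scheme behind this kind of scan (window size, 200 bp; step, 10 bp) can be sketched as follows. This Python example computes a simple per-window identity profile; it is not SimPlot's bootscan, which evaluates bootstrapped trees within each window. The function names and the toy alignment are hypothetical.

```python
def sliding_windows(alignment_length, window=200, step=10):
    """Return (start, end) pairs, 0-based and half-open, for every full window."""
    return [(start, start + window)
            for start in range(0, alignment_length - window + 1, step)]

def window_identity(query, reference, start, end):
    """Fraction of identical columns in one window, skipping gapped columns."""
    pairs = [(q, r) for q, r in zip(query[start:end], reference[start:end])
             if q != "-" and r != "-"]
    if not pairs:
        return float("nan")
    return sum(q == r for q, r in pairs) / len(pairs)

# Hypothetical toy alignment: the query matches reference A in its first
# half and reference B in its second half, mimicking a recombinant.
query = "A" * 300 + "C" * 300
ref_a = "A" * 600
ref_b = "C" * 600

windows = sliding_windows(len(query))
profile = [(start,
            window_identity(query, ref_a, start, end),
            window_identity(query, ref_b, start, end))
           for start, end in windows]
print(profile[0], profile[-1])
```

In the toy profile, identity to reference A falls from 1.0 to 0.0 along the alignment while identity to reference B rises, and the crossover marks the simulated breakpoint. A crossover of this kind is the type of signal that a bootscan plot is used to detect.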

A phylogenetic tree based on the VP1 amino acid sequence showed three clusters. HBoV and GBoV comprised cluster 1, as they did in trees based on NS1 and NP1. PBoV3C was most closely related to 6 V/7 V (82.9 %/84.1 %), PBoV3/4-UK (76 %/84.6 %) and PBoV3/4-HK (76.3 %/83.8 %). Similar to HBoV and MVC, PBoV3/4-UK, PBoV3/4-HK and PBoV3C were more closely related to BPV and HBoV in the VP1 region, consequently forming cluster 2. Other bocaviruses, including PBo-likeV, PBoV1-CHN, PBoV2-CHN, MVC and CslBoV, constituted cluster 3 (Fig. 3c).

Discussion

Next-generation genome sequencing is a new molecular biology technology. This method expedites the entire process of novel virus discovery, identification and genome sequencing. In this study, high-throughput DNA sequencing and relevant data analysis were used to generate large numbers of cDNA sequences from fecal samples from piglets. In this way, a novel porcine bocavirus, PBoV3C, was successfully identified. Further, its genome of 5235 nucleotides was sequenced using the genome-walking method.

PBoV is a novel member of the genus Bocavirus, and its pathogenicity has not been clearly established because the available evidence is limited. Blomstrom et al. detected PBo-likeV in 88 % and 46 % of pigs with and without PMWS, respectively, suggesting that PBo-likeV may be associated with PMWS [7]. Zhai et al. found a PBo-likeV positivity rate of 38.7 % in pigs suffering from respiratory tract symptoms and 7.3 % in healthy pigs [26]. However, in 2010, a serological investigation showed a high prevalence (nearly 40 %) of PBo-likeV in healthy pigs in Hubei Province, China [27]. Indeed, other PBo-likeV, PBoV1/2-CHN, PBoV3/4-UK, and the newly reported PBoV3/4-HK all had a high prevalence in healthy pigs [8, 18, 19, 28]. In this study, a high PBoV detection rate (57.6 %) was found in healthy piglets using a pan-PCR method. These sequences shared only 69 %-89.9 % sequence identity with known porcine bocaviruses, strongly suggesting the presence of a novel porcine bocavirus. These results indicate high diversity, prevalence and distribution of bocaviruses in swine. More novel bocaviruses will probably be identified in swine in the future.

Like other bocaviruses, the PBoV3C genome contained three ORFs encoding the NS1, NP1 and VP1/2 proteins. NS1, the major nonstructural protein in parvoviruses, has DNA helicase and ATPase activity and is essential for DNA replication. Because NS1 is relatively conserved, its divergence is a common standard for species definition [2]. Molecular diagnostic screening based on the NS1 sequence is the most important method for detecting bocaviruses. At the nucleotide level, PBoV3C shared only 80.2 % NS1 sequence identity with the most proximate member of the genus Bocavirus, PBoV4-UK. The second relatively conserved gene, NP1, is unique to bocaviruses. Although the biological function of NP1 is currently not clear, NP1 has been shown to be relevant to the process of DNA replication [29]. We found possible recombination in NP1, and the recombination breakpoint was at the boundary between the relatively conserved and non-conserved regions. No other recombination signal was found in either the complete genome or coding genes by SimPlot analysis (data not shown).

VP1 is a bocavirus capsid protein and probably influences tissue tropism and, potentially, pathogenesis [30]. A calcium-dependent phospholipase A2 (PLA2) enzymatic activity motif was found in VP1u of PBoV3C, which increases the efficiency of viral release [31]. A previous study indicated that VP2 plays a critical role in bocavirus antigenicity and infectivity [25]. In the absence of VP1, VP2 retained the capacity to assemble HBoV virus-like particles (VLPs) [32]. A detailed analysis of the viral VP2 sequence and secondary structure is important to understand viral capsid functions. Like that of HBoV, VP2 of PBoV3C contained an eight-stranded β-barrel, the αA helix, the DE loop, the HI loop and nine predicted variable surface regions (VRI-VRIX) [25]. Conserved secondary structural elements such as the β-barrel core and the αA helix, as well as the DE and HI loops, were found in PBoV3C by alignment with known bocaviruses. It is noteworthy that in parvoviruses, the conserved DE loops cluster close to the 5-fold axis and constrict a channel that is postulated to have several functions, such as VP2 externalization [33], VP1u externalization to facilitate its PLA2 function [34-37], and genomic DNA packaging. Genetic variation among bocaviruses is observed in the N-terminal, VR and C-terminal regions. Compared with other bocaviruses, different degrees of deletions, especially in the EF and GH loops, were found in PBoV3C. Most parvovirus VRs are located within the loops between the β-strands and occur at and around the 3-fold axis [38], which contains viral epitopes. Further evolutionary analysis of these gene loci is important to elucidate the tissue tropism, pathogenicity and recombination of members of the genus Bocavirus during infection [39].

In this study, we noted that classification results using different ORFs were incongruent, but in general, and consistent with a previous study [28], porcine bocaviruses can be divided into three groups: group 1 (represented by PBo-likeV), group 2 (PBoV1/2-CHN and PBoV2-A6), and group 3 (PBoV3/4-UK, PBoV3/4-HK and PBoV3C). Based on phylogenetic clustering and the homology matrix of known porcine bocaviruses (data not shown), we propose for future classification that PBoV strains showing >40 % nucleotide sequence difference in the complete VP1 gene should be considered members of different groups, whereas those showing >10 % nucleotide sequence difference should be considered members of different subgroups. The VP1 protein was selected because it is likely to strongly influence tissue tropism and, potentially, pathogenesis [30], and VP1-based classification (18 % protein and >10 % nucleotide difference in the complete VP1 gene) has been used for HBoV strains [13]. Therefore, groups 1, 2 and 3 should be defined as PBoV1, PBoV2 and PBoV3, respectively (Fig. 2). PBoV3 can be further divided into five subgroups: PBoV3A, PBoV3B, PBoV3C, PBoV3D and PBoV3E (previously known as PBoV3-UK, PBoV4-UK, PBoV3C, PBoV3-HK and PBoV4-HK).
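The proposed thresholds can be expressed as a simple decision rule. The Python sketch below is one reading of the proposal, assuming that strains differing by more than 10 % but no more than 40 % belong to the same group; the function name and the input values are hypothetical.

```python
def classify_pair(vp1_nt_difference_percent):
    """Relationship between two PBoV strains under the proposed VP1 thresholds.

    The input is the nucleotide sequence difference (in percent) across the
    complete VP1 gene.
    """
    if vp1_nt_difference_percent > 40:
        return "different groups"
    if vp1_nt_difference_percent > 10:
        return "same group, different subgroups"
    return "same subgroup"

# Hypothetical difference values, for illustration only.
for difference in (45.0, 20.0, 5.0):
    print(difference, classify_pair(difference))
```

Because both thresholds are stated as strict inequalities, a pair differing by exactly 40 % or exactly 10 % falls into the lower category in this sketch.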

Phylogenetic analysis indicated that the relationship between porcine bocaviruses and other bocaviruses is complicated. For example, PBoV2 and PBoV1 were closer to MVC and CslBoV in the NS1 region, and PBoV3 was closer to BPV, HBoV and GBoV in the VP1 region. All of the above evidence indicates that porcine bocaviruses are highly diverse and complex. Transmission of viral agents across the species barrier is known, such as in the cases of SARS coronavirus and the highly pathogenic avian influenza H5N1 virus. This has also occurred in parvoviruses, as indicated by the recent emergence of canine parvovirus (CPV2) derived from feline parvovirus (FPV). The molecular mechanisms underlying viral evolution include mutation and several forms of recombination [40]. Compared with mutation, recombination usually leads to more rapid virus evolution. Parvoviruses, among the DNA viruses, are characterized by relatively rapid evolution [41-45], with a rate close to that of some RNA viruses [46]. Several studies have indicated possible recombination between members of the same bocavirus species, such as HBoV [12, 13, 47] and PBoV [19]. Potential recombination of NP1 within PBoV was also observed in this study; however, no other recombination events were detected at the whole-genome level. To date, no definite evidence of recombination of bocaviruses from different hosts exists. However, further research on animal bocaviruses will lead to a greater understanding of the molecular changes necessary for bocaviruses to adapt to a new host, leading to possible cross-species transmission.