Background

Coronaviruses (CoVs) (order Nidovirales, family Coronaviridae, subfamily Coronavirinae) are enveloped viruses with a positive-sense, single-stranded RNA genome. With genome sizes ranging from 26 to 32 kilobases (kb) in length, CoVs have the largest genomes for RNA viruses. Based on genetic and antigenic criteria, CoVs have been organised into three groups: α-CoVs, β-CoVs, and γ-CoVs (Table 1) [1, 2]. Coronaviruses primarily infect birds and mammals, causing a variety of lethal diseases that particularly impact the farming industry [3, 4]. They can also infect humans and cause disease to varying degrees, from upper respiratory tract infections (URTIs) resembling the common cold, to lower respiratory tract infections (LRTIs) such as bronchitis, pneumonia, and even severe acute respiratory syndrome (SARS) [5,6,7,8,9,10,11,12,14]. In recent years, it has become increasingly evident that human CoVs (HCoVs) are implicated in both URTIs and LRTIs, underscoring the importance of research on coronaviruses as agents of severe respiratory illnesses [7, 9, 15,16,17].

Table 1 Organisation of CoV species (adapted from Jimenez-Guardeño, Nieto-Torres [18])

Some CoVs were originally found as enzootic infections, limited only to their natural animal hosts, but have crossed the animal-human species barrier and progressed to establish zoonotic diseases in humans [19,20,21,22,23]. Accordingly, these cross-species barrier jumps allowed CoVs like the SARS-CoV and Middle East respiratory syndrome (MERS)-CoV to manifest as virulent human viruses. The consequent outbreak of SARS in 2003 led to a near pandemic with 8096 cases and 774 deaths reported worldwide, resulting in a fatality rate of 9.6% [24]. Since the outbreak of MERS in April 2012 up until October 2018, 2229 laboratory-confirmed cases have been reported globally, including 791 associated deaths with a case-fatality rate of 35.5% [25]. Clearly, the seriousness of these infections and the lack of effective, licensed treatments for CoV infections underpin the need for a more detailed and comprehensive understanding of coronaviral molecular biology, with a specific focus on both their structural and accessory proteins [26,27,28,29,30]. Live, attenuated vaccines and fusion inhibitors have proven promising, but both also require an intimate knowledge of CoV molecular biology [29, 31,32,33,34,35,36].
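The two fatality rates quoted above follow directly from the reported case and death counts. As a minimal sketch of that arithmetic (the function name `case_fatality_rate` is an illustrative choice, not something taken from the cited reports):

```python
def case_fatality_rate(deaths: int, cases: int) -> float:
    """Return deaths as a percentage of reported cases."""
    if cases <= 0:
        raise ValueError("cases must be a positive number")
    return 100.0 * deaths / cases


# Counts as quoted in the text above.
sars_cfr = case_fatality_rate(deaths=774, cases=8096)
mers_cfr = case_fatality_rate(deaths=791, cases=2229)

print(f"SARS (2003 outbreak): {sars_cfr:.1f}%")      # 9.6%
print(f"MERS (Apr 2012 to Oct 2018): {mers_cfr:.1f}%")  # 35.5%
```

Both values agree with the rounded figures given in the text.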

The coronaviral genome encodes four major structural proteins: the spike (S) protein, nucleocapsid (N) protein, membrane (M) protein, and the envelope (E) protein, all of which are required to produce a structurally complete viral particle [29, 37, 38]. More recently, however, it has become clear that some CoVs do not require the full ensemble of structural proteins to form a complete, infectious virion, suggesting that some structural proteins might be dispensable or that these CoVs might encode additional proteins with overlapping compensatory functions [35, 37, 39,40,41,42]. Individually, each protein primarily plays a role in the structure of the virus particle, but they are also involved in other aspects of the replication cycle. The S protein mediates attachment of the virus to the host cell surface receptors and subsequent fusion between the viral and host cell membranes to facilitate viral entry into the host cell [42,43,44]. In some CoVs, the expression of S at the cell membrane can also mediate cell-cell fusion between infected and adjacent, uninfected cells. This formation of giant, multinucleated cells, or syncytia, has been proposed as a strategy to allow direct spreading of the virus between cells, subverting virus-neutralising antibodies [45,46,47].

Unlike the other major structural proteins, N functions primarily to bind to the CoV RNA genome, making up the nucleocapsid [48]. Although N is largely involved in processes relating to the viral genome, it is also involved in other aspects of the CoV replication cycle and the host cellular response to viral infection [49]. Interestingly, localisation of N to the endoplasmic reticulum (ER)-Golgi region has suggested a function for it in assembly and budding [50, 51]. However, transient expression of N was shown to substantially increase the production of virus-like particles (VLPs) in some CoVs, suggesting that it might not be required for envelope formation, but for complete virion formation instead [41, 42, 52, 53].

The M protein is the most abundant structural protein and defines the shape of the viral envelope [54]. It is also regarded as the central organiser of CoV assembly, interacting with all other major coronaviral structural proteins [29]. Homotypic interactions between the M proteins are the major driving force behind virion envelope formation but, alone, are not sufficient for virion formation [54,55,56]. Interaction of S with M is necessary for retention of S in the ER-Golgi intermediate compartment (ERGIC)/Golgi complex and its incorporation into new virions, but dispensable for the assembly process [37, 45, 57]. Binding of M to N stabilises the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and, ultimately, promotes completion of viral assembly [45, 58, 59]. Together, M and E make up the viral envelope and their interaction is sufficient for the production and release of VLPs [37, 60,61,62,63,64].

The E protein is the smallest of the major structural proteins, but also the most enigmatic. During the replication cycle, E is abundantly expressed inside the infected cell, but only a small portion is incorporated into the virion envelope [65]. The majority of the protein is localised at the sites of intracellular trafficking, viz. the ER, Golgi, and ERGIC, where it participates in CoV assembly and budding [66]. Recombinant CoVs lacking E exhibit significantly reduced viral titres, crippled viral maturation, or yield propagation-incompetent progeny, demonstrating the importance of E in virus production and maturation [35, 39, 40, 67, 68].

Main text

The envelope protein

Structure

The CoV E protein is a short, integral membrane protein of 76–109 amino acids, ranging from 8.4 to 12 kDa in size [69,70,71]. The primary and secondary structure reveals that E has a short, hydrophilic amino terminus consisting of 7–12 amino acids, followed by a large hydrophobic transmembrane domain (TMD) of 25 amino acids, and ends with a long, hydrophilic carboxyl terminus, which comprises the majority of the protein (Fig. 1) [1, 60, 72,73,74,75]. The hydrophobic region of the TMD contains at least one predicted amphipathic α-helix that oligomerizes to form an ion-conductive pore in membranes [76,77,78].

Fig. 1

Amino Acid Sequence and Domains of the SARS-CoV E Protein. The SARS-CoV E protein consists of three domains, i.e. the amino (N)-terminal domain, the transmembrane domain (TMD), and the carboxy (C)-terminal domain. Amino acid properties are indicated: hydrophobic (red), hydrophilic (blue), polar, charged (asterisks) [78]

Comparative and phylogenetic analysis of SARS-CoV E revealed that a substantial portion of the TMD consists of the two nonpolar, neutral amino acids, valine and leucine, lending a strong hydrophobicity to the E protein [79]. The peptide exhibits an overall net charge of zero, the middle region being uncharged and flanked on one side by the negatively charged amino (N)-terminus, and, on the other side, the carboxy (C)-terminus of variable charge. The C-terminus also exhibits some hydrophobicity but less than the TMD due to the presence of a cluster of basic, positively charged amino acids [80]. Computational predictions regarding the secondary structure of E suggest that the C-terminus of β- and γ-CoVs also contains a conserved proline residue centred in a β-coil-β motif [72]. This motif likely functions as a Golgi-complex targeting signal as mutation of this conserved proline was sufficient to disrupt the localization of a mutant chimeric protein to the Golgi complex and instead localized the protein to the plasma membrane [81].
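The composition and charge properties described above can be estimated from a sequence by simple residue counting. The following is a rough sketch only: the three segments are hypothetical toy strings chosen for illustration, not the actual SARS-CoV E sequence, and the charge estimate assumes neutral pH, counting only Lys/Arg as +1 and Asp/Glu as -1 while ignoring histidine and the terminal groups.

```python
def net_charge(seq: str) -> int:
    """Approximate net charge at neutral pH: (K + R) minus (D + E)."""
    positive = sum(seq.count(residue) for residue in "KR")
    negative = sum(seq.count(residue) for residue in "DE")
    return positive - negative


def fraction_of(seq: str, residues: str) -> float:
    """Fraction of positions in seq occupied by any of the given residues."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(seq.count(residue) for residue in residues) / len(seq)


# Hypothetical toy segments (NOT the real SARS-CoV E sequence).
toy_n_term = "MDSE"        # acidic amino terminus
toy_tmd = "LVLLVALLVL"     # valine/leucine-rich transmembrane stretch
toy_c_term = "KRSG"        # basic cluster in the carboxy terminus
toy_peptide = toy_n_term + toy_tmd + toy_c_term

print("N-terminus charge:", net_charge(toy_n_term))
print("TMD Val/Leu fraction:", fraction_of(toy_tmd, "VL"))
print("Overall net charge:", net_charge(toy_peptide))
```

In this toy case the negatively charged N-terminus and the basic cluster in the C-terminus cancel, giving an uncharged peptide overall with an uncharged, valine/leucine-rich middle region, which mirrors the pattern described in the text.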

The SARS-CoV E protein has recently been found to contain a binding motif known as the postsynaptic density protein 95 (PSD95)/Drosophila disc large tumour suppressor (Dlg1)/zonula occludens-1 protein (zo-1) (PDZ)-binding motif (PBM), located in the last four amino acids of the C-terminus [82]. The PDZ domain is a protein-protein interaction module that can bind to the C-terminus of target proteins such as the cellular adapter proteins involved in host-cell processes important for viral infection [83,84,85,86]. Some interaction partners capable of binding to the PBM of SARS-CoV E have been identified and appear to be involved in the pathogenesis of SARS-CoV [18, 66, 82, 27,28].

Fig. 3

Partial amino acid sequences of the E protein C-terminus for the different CoV genera. Red blocks represent the potential location of the predicted PBM motif [18]

Functions of the envelope protein

Despite the protein's enigmatic nature, research conducted to date has proposed three roles for CoV E. The interaction between the cytoplasmic tails of the M and E proteins drives VLP production, suggesting that E participates in (1) viral assembly [56, 61, 89]. The hydrophobic TMD of E is also crucial to the (2) release of virions [40, 53, 159]. Lastly, SARS-CoV E is implicated in the (3) pathogenesis of the virus [18, 82, 27,28, 247].

While an extensive amount of research has gone into identifying potential treatment options, most have only shown promise in vitro and will likely not progress further as they often have one or more limitations. Anti-viral candidates either exhibit only a narrow spectrum of activity, are only effective at unusually high therapeutic dosages, or cause serious side effects or immune suppression [248]. A few studies have investigated the potential of rCoVs with a mutated E or lacking E, specifically focussing on SARS- and MERS-CoV, as live attenuated vaccine candidates with some promising results [34, 36, 165, 249, 250]. Vaccinated animal models developed robust immune responses, both cellular and humoral, and were protected against infective challenges. This shows that CoV vaccines with a mutated E or deficient in E can potentially be used for prophylactic treatment, but the duration of immunity does not seem to have been established yet.

Viruses exploit the extensive network of their host cell’s signalling pathways to promote viral replication and propagation [251, 252]. This dependence on protein-protein interactions (PPIs) offers the unique opportunity to target both viral-host and intraviral PPIs and, thereby, stop viral replication and propagation. Therapies that use small-molecule drugs have the advantage of small size, which allows the drugs to cross cell membranes efficiently, but it also severely limits the selectivity and targeting capabilities of the drug, which often leads to undesired side-effects [253]. Interactions between proteins take place over large, flat surface areas that feature shallow interaction sites. Small-molecule drugs, however, tend to bind to deep grooves or hydrophobic pockets not always found on the surface of target proteins, making it difficult for such drugs to disrupt PPIs (Fig. 6) [253,254,255]. Larger, protein-based therapies, on the other hand, make use of insulin, growth factors, and engineered antibodies that form many more, and much stronger, interactions, making these therapies more potent and selective for their targets. Such properties result in fewer side-effects but the size of these agents also restricts their ability to cross the membranes of target cells [253]. This calls for therapeutic agents that bridge the gap: molecules that are large enough to be specific and potent for their targets, yet small enough to cross target cell membranes efficiently, and that can also be manufactured easily.

Fig. 6

Mechanisms of interaction between small molecules and proteins, and protein-protein interactions. Left: The binding of biotin to avidin occurs in a deep groove, while the interaction between the human growth hormone (hGH) and the hGH receptor (hGHR) occurs over a larger, flatter area [254]

Stapled peptides fulfil these criteria to a large extent and have been applied to various human diseases and fields such as cancer, infections, metabolism, neurology, and endocrinology [256,257,258,259,260]. In fact, Aileron Therapeutics have already developed two stapled peptides, ALRN-5281 and ATSP-7041. The company has already completed the first-in-human trial with ALRN-5281 for the treatment of rare endocrine diseases, such as adult growth hormone deficiency. Moreover, ATSP-7041 was designed to target intracellular PPIs, specifically murine double minute 2 (MDM2) and murine double minute X (MDMX) [261]. To the best of the author’s knowledge, only a few studies so far have investigated the potential of stapled peptides as antiviral agents, with promising results for both intracellular and extracellular targets. The focus so far has only been on HIV-1, RSV, and HCV [260, 262,263,264,265].

Granted, the therapeutic application of stapled peptides, particularly regarding viral infections, is still relatively new, but their numerous advantages give them tremendous potential as antiviral agents. Stapled peptides (1) can inhibit PPIs; (2) are more specific for their targets than small-molecule drugs, which also decreases the risk of unwanted side-effects; (3) can target diseases that are otherwise difficult to treat, referred to as “undruggable”; (4) can be modified easily to enhance membrane permeability, potency, and half-life; and (5) have a short time to market [253, 266, 267]. As more viral PPIs for CoV E are identified, the repertoire of stapled peptide targets also expands, making it easier to limit viral replication, propagation, and even pathogenesis. Stapled peptides have the potential to be used as antiviral agents that can work effectively at multiple levels.

Autophagy is a cellular process that recycles excess or damaged cellular material to maintain the energy levels of the cell and ensure its survival. The material is removed from the cytoplasm by enclosing it in double-membrane vesicles (DMVs) known as autophagosomes, which then fuse with lysosomes so that their contents can be degraded [268, 269]. Recent studies have increasingly pointed to the involvement of autophagy components in viral infections [270]. Some suggest that it might have an antiviral function by inhibiting viral replication [271,272,273]. Others reported inhibition or subversion of autophagy as a defence mechanism to promote viral propagation [274,275,276]. Others still, notably RNA viruses, appear to exploit autophagy for the purpose of viral propagation [277, 278]. Regarding CoVs, replication of TGEV is negatively regulated by autophagy [279]. Interestingly, PRRSV activates autophagy machinery, possibly to enhance viral replication as certain components of autophagy are required for MHV replication [280, 281]. These studies suggest the possibility of CoVs exploiting autophagy for replicative purposes. It has even been proposed that the DMVs formed in CoV-infected cells might be the result of autophagy and derived from the rough ER [281]. Recently, an increase in cytosolic Ca2+, presumably from the ER lumen, has been implicated in autophagy induction by protein 2B (P2B) of the foot and mouth disease virus (FMDV) [282]. The rotavirus non-structural protein 4 (NSP4) reportedly induces autophagy by a similar mechanism [283]. Considering these studies, along with the ability of SARS-CoV E to channel Ca2+, it is not inconceivable that CoV E viroporin could induce autophagy in CoV-infected cells by increasing cytosolic Ca2+. However, experimental evidence would be required to support the possibility of such a mechanism in CoVs.

The multifunctional role of the CoV E protein: A central role in assembly, release, and pathogenesis?

From the available studies, it appears that some viral proteins do not have unique, definitive functions. Despite the deletion of some viral genes, the viral life cycle continues, suggesting that other viral genes can compensate for this loss. This was recently shown to be the case for the vaccinia virus [284]. It is also evident in the varied requirements of the E protein for different CoVs, and the reason(s) for this are not understood. Trafficking and maturation of TGEV virions is arrested without E [40]. Virions of MHV ΔE are capable of producing viable, replicating progeny [39]. Deletion of E from SARS-CoV attenuates the virus whereas, in the case of MERS-CoV, virions are propagation deficient [35, 165]. Certain CoV accessory proteins appear to be able to complement, or sometimes even compensate for, the absence of E in processes such as assembly, release, and the pathogenesis of some CoVs [30]. It is particularly noteworthy that SARS-CoV encodes two accessory proteins, 3a and 8a, that might exhibit relative compensatory functions in the absence of E [285, 286]. In terms of viral replication in vivo and in vitro, 3a could partially compensate for the loss of E. Moreover, 3a also contains a PBM and might be able to compensate for the loss of E to an extent, although it utilises different signalling pathways [285]. Although the study showed that even the accessory proteins exhibit some measure of dispensability, the virus still encodes these additional proteins with overlapping functions. The dynamics between these proteins, however, are not quite clear yet and warrant further investigation. What is clear, though, is that viroporin proteins, case in point IAV M2, can exhibit a multitude of different functions independent of their ion-channel properties [153, 184].

The studies in this review have shown that CoV E could be involved in multiple aspects of the viral replication cycle: from assembly and induction of membrane curvature to scission or budding and release to apoptosis, inflammation and even autophagy. Although a lot of progress has been made on CoV E, there is still much to be discovered about this small, enigmatic protein.