Introduction

Coronavirus disease has caused global disruption since it emerged in December 2019, developing into a pandemic that has claimed millions of lives to date. The 2019 novel coronavirus was first traced to the Huanan Seafood Wholesale Market in Wuhan, a city in Hubei province, China. On 31st December 2019, 27 cases of pneumonia of unknown etiology were reported from this city. These patients, who worked at or lived around the seafood market, showed clinical symptoms of fever, dyspnea and dry cough, with bilateral lung infiltrates on imaging [1]. The virus was identified from patients' throat swab samples by the Chinese Centre for Disease Control and Prevention (CCDC) as a novel betacoronavirus (a member of the beta group of coronaviruses) on 7th January 2020 and was initially named 2019-nCoV (2019 novel coronavirus) by the World Health Organisation (WHO) [2]. Later, the International Committee on Taxonomy of Viruses (ICTV) named this novel virus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [3]. As the infection spread worldwide and became a massive global outbreak, WHO declared it the sixth Public Health Emergency of International Concern on 30th January 2020. In February 2020, WHO officially named the disease COVID-19 (where CO stands for corona, VI for virus, D for disease and 19 for the year 2019 in which it first emerged) [2].

Origin and progress of COVID-19

Wuhan is an emerging business hub of China, and its seafood market trades in fish and a variety of live animal species including poultry, bats, snakes and marmots. The outbreak of the novel coronavirus affected more than seventy thousand individuals and killed more than eighteen hundred within its first fifty days [4]. The genetic sequence of the virus suggested that patients infected in Wuhan may have visited this market or consumed infected animals as food. However, cases then began to increase exponentially among people with no record of visiting the seafood market, suggesting that the virus has strong potential for human-to-human transmission [5]. Environmental samples taken from the Huanan seafood market tested positive, suggesting that the virus originated there and that pathogens were likely transmitted from animals to humans [6]. In contrast, one genomic study claimed that the role of the Huanan seafood market in propagating the disease is not clear and suggested that the virus may have been introduced into the market from an unknown location and then spread rapidly there [7]. SARS-CoV-2 was reported to be phylogenetically related to SARS-like bat CoV, with a sequence similarity of more than 90%, suggesting that bats could be the key reservoir or zoonotic source [8]. More recently, Lam et al. isolated Malayan pangolin CoV genomes, found 85.5–92.4% similarity to SARS-CoV-2 and concluded that the pangolin may be the intermediate host for SARS-CoV-2 [9]. Nonetheless, whether bats transmit SARS-CoV-2 directly or require an intermediate host to cause infection still needs to be confirmed so that zoonotic transmission patterns can be established and understood [10]. The novel coronavirus has since spread to other regions in Asia, North America, South America, Europe, Africa and Oceania, making it a global pandemic.
The outbreak has not only damaged the economies of all countries but has also put medical and public health infrastructure under severe strain [2]. The novel coronavirus has proven to be more contagious, with a higher transmission rate, than SARS and MERS (Middle East respiratory syndrome) [11].

Novel coronavirus 2019 (SARS-CoV-2)

Coronaviruses are a large group of viruses that mainly infect the respiratory and gastrointestinal tracts and are present in various species of birds, bats, snakes and mammals. They are named for their crown-like bulbous appearance (“corona” means crown) [3]. SARS-CoV-2 belongs to the subfamily Orthocoronavirinae in the family Coronaviridae and the order Nidovirales, whose members are enveloped viruses with a positive-sense single-stranded RNA (ssRNA) genome [12]. The virions are spherical with club-shaped spikes and have a particle size of about 125 nm, as shown in Fig. 1a.

Fig. 1

a Structure of respiratory syndrome causing human coronavirus. This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution and reproduction in any medium provided the original work is properly cited. Copyright@ Elsevier [11]. b The infection cycle of SARS-CoV-2 inside the host cell. The sequence of events, from host cell recognition through the release of new virion, is represented graphically as steps 1 to 12. Reprinted with permission from [91]. c Genome organization of the SARS-CoV-2. The viral genome encodes 16 Non-structural proteins (Nsps) required for replication/transcription along with the structural proteins required for the assembly of new virions. The proteins are marked below the genome with their respective coding regions. A short description of the functions of different proteins is also shown. Reprinted with permission from [91]

The subfamily Orthocoronavirinae is further subdivided on the basis of serological pattern into four genera: (a) alphacoronavirus, which includes human coronavirus (HCoV)-229E and HCoV-NL63; (b) betacoronavirus, which includes severe acute respiratory syndrome coronavirus (SARS-HCoV), HCoV-OC43, HCoV-HKU1 and MERS-CoV; (c) gammacoronavirus, which contains viruses of whales and birds; and (d) deltacoronavirus, which consists of viruses of pigs and birds. SARS-CoV-2, along with SARS-CoV and MERS-CoV, belongs to the betacoronavirus genus [13]. The life cycle of SARS-CoV-2 infection is illustrated in Fig. 1b.

On the basis of genomic analysis, SARS-CoV-2 showed > 90% homology with bat SARS-like CoV ZXC21, 82% with human SARS-CoV and 50% with MERS-CoV [14]. SARS emerged in 2002 in Guangdong province of south-eastern China as an epidemic of unusual pneumonia and acute respiratory distress syndrome (ARDS); by 2004 it had reached 26 countries and affected 8096 people, with 774 deaths. A decade later, in 2012, a similar respiratory tract infection came to light in Middle Eastern countries (Saudi Arabia, UAE), eventually affecting 27 countries in total, including the US and Malaysia, with 2428 cases and 838 deaths [11]. Table 1 summarizes a few differences between SARS-CoV-2, SARS and MERS.
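
The case and death counts quoted above can be turned into a crude case fatality ratio for comparison. The following is a minimal sketch using only the figures cited in the text [11]; the function and variable names are chosen here for illustration, and the ratio is a simple deaths-over-reported-cases estimate, not an adjusted epidemiological measure.

```python
# Case and death counts as quoted in the text above [11].
outbreaks = {
    "SARS-CoV": {"cases": 8096, "deaths": 774},
    "MERS-CoV": {"cases": 2428, "deaths": 838},
}


def case_fatality_ratio(cases: int, deaths: int) -> float:
    """Return deaths as a percentage of reported cases (crude estimate)."""
    if cases <= 0:
        raise ValueError("cases must be a positive number")
    return 100.0 * deaths / cases


for name, counts in outbreaks.items():
    ratio = case_fatality_ratio(counts["cases"], counts["deaths"])
    print(f"{name}: {ratio:.1f}% of reported cases were fatal")
```

On these figures the crude ratio is roughly 9.6% for SARS and 34.5% for MERS, which illustrates why MERS is usually described as the deadlier of the two despite far fewer cases.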

Table 1 Characteristics and features of SARS-CoV-2, SARS-CoV and MERS

SARS-CoV-2 has a genome of about 20–30 kb that encodes a large non-structural polyprotein, which is proteolytically cleaved to generate 15 or 16 non-structural proteins, together with accessory proteins (ORF3a, ORF6, ORF7, ORF8, ORF9) and four structural proteins. The genome contains 14 open reading frames [14]. At the 5’ terminal region of the genome, ORF1 and ORF2 encode the non-structural proteins (nsps) important for virus replication, while the 3’ terminal region encodes the structural proteins. The four structural proteins are the outer spike glycoprotein (S), membrane (M), envelope (E) and nucleoprotein (N), which are important for virus assembly and infection. The 16 nsps of SARS-CoV-2 are encoded by the largest gene, orf1ab, as well as by the orf1a gene [15]. The genomic organisation of SARS-CoV-2 is depicted in Fig. 1c.

The virus also expresses other polyproteins and membrane proteins, such as RNA polymerase, papain-like protease (PLpro), 3-chymotrypsin-like protease (3CLpro) and helicase [14]. To gain entry into the human cell, SARS-CoV-2 and SARS-CoV recognize angiotensin-converting enzyme 2 (ACE2) as the key receptor, while MERS-CoV requires dipeptidyl peptidase 4 (DPP4). Researchers have also noted other differences between SARS-CoV-2 and SARS-CoV: SARS-CoV-2 lacks the 8a protein and differs in the number of amino acids in the 3c and 8b proteins [16].

Current status of COVID-19

Antarctica had been the only continent free of the novel coronavirus until the end of December 2020, when 36 COVID-19 cases were reported there. Scientists and WHO are now assessing the risk of transmission of the coronavirus from humans to Antarctic wildlife and taking appropriate measures to protect the wildlife population (seabirds, penguins, seals, dolphins, whales) [17]. Since its emergence, the virus has undergone mutations that adapt it to factors such as weather and host population, and some of these mutations have raised public alarm. They affect proteins such as spike, envelope and membrane. The Centers for Disease Control and Prevention has classified some of the resulting variants as variants of concern (VOC) and variants of interest (VOI). The D614G mutation, at amino acid position 614 in the S protein, was reported in the early phase of the pandemic and raised public concern. The D614G mutant spread quickly enough to become globally dominant by June-July 2020 [18]. Owing to the high evolutionary rate of the virus, its transmissibility has been reported to increase. Around September 2020, a new variant designated B.1.351 (Beta variant, VOC), or 501Y.V2, emerged in South Africa and the number of COVID-19 cases started to rise rapidly. In this variant, the N501Y mutation was seen in the receptor binding domain (RBD) of the S protein along with the E484K mutation. It was also reported that these mutations drastically increase the binding efficiency of the virus to the cell surface receptor and reduce neutralization by the monoclonal antibody LY-CoV016; this variant has already spread to over 20 countries [19]. Another variant, B.1.1.7 (Alpha variant, VOC; also called 501Y.V1), carrying the same N501Y mutation, was reported to have arisen independently in the UK and was found to have 70% more transmission capacity [20].
In the B.1.1.7 variant, besides N501Y, mutations in 17 other amino acids (including 8 in the S protein) and a deletion at amino acid positions 69 and 70 in the S protein were also reported. In early 2021, this variant rapidly spread to more than 100 countries in Europe, America and Asia [20]. Scientists reported that antibodies are raised against multiple parts of the S protein of SARS-CoV-2, so there is a high chance that vaccines will retain efficacy against these variants [21]. Brazil reported another variant, P.1 (Gamma variant, VOC), derived from the lineage B.1.1.28, with mutations in the S protein, which is primarily responsible for entry of viral particles into human cells [22]. The N501Y and E484K mutations were found in this variant, which has high transmissibility. The Zeta variant (lineage P.2), which carries the E484K mutation and is derived from Gamma, also emerged in Brazil but has a lower transmission rate than Gamma [22]. In India, two VOIs, B.1.617 and B.1.618, as well as B.1.617.2 (Delta variant, VOC), a sub-lineage of B.1.617, were reported and found to be more infectious and transmissible [23]. The Delta variant has a 50% higher transmission rate than B.1.1.7 and has already spread across many countries. It carries the E484Q and L452R mutations, which are involved in increased interaction of virus particles with human receptor cells and hence an increased rate of infection. Shortly after the Delta variant, a new strain, B.1.617+ (Delta plus variant), emerged in some parts of India during the second wave of COVID-19; along with E484Q and L452R, it carries a new P618R mutation that is responsible for reduced antibody binding as well as evasion of natural immunity. B.1.617+ has spread four times faster than the Alpha variant and rapidly expanded to other countries [23]. In late 2021, another variant of concern known as Omicron (B.1.1.529) emerged with a large number of mutations, including at least 34 in the S protein, and it is more transmissible than the Delta variant [24].
This variant has two lineages, BA.1 and BA.2, and has raised major concerns because of its ability to evade the protection conferred by therapeutic monoclonal antibodies and vaccines [24]. As per WHO, a total of 534,245,759 COVID-19 cases have been reported to date in 228 countries and territories worldwide, with 6,317,736 deaths and 505,168,553 recoveries [25].
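
The substitution labels used above (D614G, N501Y, E484K and so on) follow a compact convention: the original amino acid, the residue position, then the replacement amino acid. The sketch below parses such labels and groups the variants named in the text by mutated position. The variant table only restates what this section reports (it is not a complete mutation list for any lineage), and all function and variable names are hypothetical, chosen for this illustration.

```python
import re
from typing import Dict, List, NamedTuple, Set


class Substitution(NamedTuple):
    wild_type: str  # original amino acid, one-letter code
    position: int   # residue number in the protein
    mutant: str     # replacement amino acid, one-letter code


# One letter, one or more digits, one letter (for example "D614G").
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'N501Y' into its three components."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


# S-protein substitutions per variant, exactly as reported in the text above.
variants: Dict[str, List[str]] = {
    "Beta (B.1.351)": ["N501Y", "E484K"],
    "Alpha (B.1.1.7)": ["N501Y"],
    "Gamma (P.1)": ["N501Y", "E484K"],
    "Delta (B.1.617.2)": ["E484Q", "L452R"],
}


def variants_by_position(table: Dict[str, List[str]]) -> Dict[int, Set[str]]:
    """Map each mutated residue position to the variants that alter it."""
    by_position: Dict[int, Set[str]] = {}
    for variant, labels in table.items():
        for label in labels:
            position = parse_substitution(label).position
            by_position.setdefault(position, set()).add(variant)
    return by_position


for position, names in sorted(variants_by_position(variants).items()):
    print(position, sorted(names))
```

Grouping by position rather than by full label makes it visible that residue 484 is altered in three of the listed variants, although the replacement differs (K in Beta and Gamma, Q in Delta).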

In this review, all the structural and non-structural proteins (nsps) of SARS-CoV-2 that are involved in causing COVID-19 infection in humans are comprehensively discussed. The interaction of these proteins with the cell receptor (ACE2) to gain entry into the host, and their roles in processes such as proteolytic cleavage, fusion, transcription, translation, viral packaging, assembly and exocytosis in the life cycle of the virus, are explained in detail. The repurposing of antiviral drugs that have shown efficacy against proteins at any step of the viral life cycle is outlined and illustrated. Other therapeutic options, such as immunotherapy and cellular therapy, as well as the approved vaccines, are also summarised.

Key proteins responsible for SARS-CoV-2 and possible druggable sites

Spike protein and its interaction with ACE2

The spike (S) protein of coronavirus is a large glycoprotein of about 180 kDa containing approximately 1273 amino acids and 20 asparagine-linked glycans. The spike glycoprotein, present on the surface of the novel coronavirus (SARS-CoV-2) as a homotrimer, plays an important role in the attachment of the virus to host cell receptors as well as in membrane fusion. The trimers are formed from S monomers in the endoplasmic reticulum (ER) of virus-producing cells [26]. The S protein consists of three segments: a large ectodomain, a single-pass transmembrane anchor and a short intracellular tail. The ectodomain is composed of two functional domains/subunits: the S1 domain is responsible for receptor binding and the S2 domain is responsible for fusing the viral and host cell membranes. The S1 domain (residues 14-685) is further divided into an N-terminal subdomain (NTD) and a C-terminal subdomain (or C-domain) [27]. The S2 domain (residues 686-1273) consists of the fusion peptide, heptapeptide repeat sequences (HR1, HR2), transmembrane domain and cytoplasmic domain. The S protein trimer is located on the surface of the viral envelope and carries a large number of N-linked glycans that are essential for proper S folding and for controlling accessibility to host proteases and neutralizing antibodies. There are 22 N-linked glycosylation sequons per protomer in SARS-CoV-2, of which 20 are homologous to those of the SARS-CoV S protein [59].
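
The residue ranges given above can be used to decide which subunit a given spike position falls in. The following is a minimal sketch that assumes only the two ranges stated in the text (S1: 14-685, S2: 686-1273); the dictionary and function names are hypothetical, and positions outside both ranges simply return `None`.

```python
from typing import Optional

# Subunit boundaries (inclusive residue numbers) as stated in the text.
SPIKE_SUBUNITS = {
    "S1": (14, 685),
    "S2": (686, 1273),
}


def subunit_for_residue(position: int) -> Optional[str]:
    """Return 'S1' or 'S2' for a spike residue number, or None if outside both."""
    for name, (start, end) in SPIKE_SUBUNITS.items():
        if start <= position <= end:
            return name
    return None


# Positions mentioned elsewhere in this review: 501 and 614 are mutation
# sites, and 682-685 is the multibasic cleavage site at the S1/S2 boundary.
for position in (501, 614, 682, 685, 686, 1273):
    print(position, subunit_for_residue(position))
```

With these boundaries, the substitution sites 501 and 614 and the cleavage-site residues 682-685 all fall within S1, while residue 686 is the first residue of S2.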

Role of host proteases

Furin-like protease

The host proteases that cleave the S protein differ between coronaviruses, and this determines the epidemiological as well as pathological features of each virus. For example, trypsin, human airway trypsin-like protease (HAT) and TMPRSS2 are host proteases that are expressed in many essential organs [60]. A study by Wang et al. found that SARS-CoV-2 has a novel, unique multibasic cleavage site (–RRAR–) at the S1/S2 boundary, located between residues 682 and 685, which distinguishes it from other coronaviruses [39]. This site is most likely cleaved by the convertase furin, which enhances fusion of the viral and host cell membranes. Furin cleaves protein and peptide precursors and converts them into their biologically active state. It is a type 1 membrane-bound serine protease and a member of the calcium-dependent, subtilisin-like proprotein convertase family. It cycles from the trans-Golgi network to the cell membrane and through the endosomal system. It specifically recognizes and cleaves the R-X-K/R-R motif in the presence of Ca2+ [61]. This protease is highly expressed in organs and tissues including the lung, gastrointestinal tract, brain, pancreas, reproductive tissues and liver. Hence, SARS-CoV-2 can also infect these organs, resulting in systemic infection and enhanced transmission and pathogenicity [39]. Furin binds the S protein in a clamp-like fashion, clipping tightly onto the S1/S2 cleavage site. The substrate-binding pocket of furin has a canyon-like crevice, and its key amino acid residues are specifically positioned to interact with the S glycoprotein. The presence of 12 additional nucleotides upstream of the exposed -RRAR- sequence in the S glycoprotein corresponds to a unique canonical furin-like cleavage site. The amino acid residues between N657 and Q690 of the S glycoprotein, which interact strongly with furin, are organised in a flexible loop [62].
SARS-CoV-2 uses this convertase to activate the S glycoprotein, which gives the virus a gain of function for efficient transmission. Van der Waals interactions and hydrogen bonding facilitate the interaction between furin and the S glycoprotein. MERS-CoV contains an -RSVR- sequence, which is most probably cleaved by furin during virus egress, whereas the S protein of SARS-CoV remains uncleaved [63].
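
Searching a protein sequence for a cleavage motif such as the R-X-K/R-R pattern described above amounts to a simple pattern match. The sketch below is illustrative only: the strict pattern transcribes the motif quoted in the text, while the relaxed R-X-X-R pattern is an assumption added here and is not taken from the text. The peptide "RAKR" is a made-up example, and only the four-residue stretches quoted in the text are used for the viral sites rather than full spike sequences. Read literally, -RRAR- has alanine at the third position, so it matches the relaxed pattern but not the strict one.

```python
import re
from typing import List

# Motif quoted in the text: Arg, any residue, Lys or Arg, Arg.
STRICT_MOTIF = r"R.[KR]R"
# Assumption for this sketch (not from the text): a relaxed Arg-X-X-Arg form.
RELAXED_MOTIF = r"R..R"


def find_motif(sequence: str, pattern: str) -> List[int]:
    """Return 1-based start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets matches overlap (for example in "RRRRR").
    lookahead = re.compile(f"(?=(?:{pattern}))")
    return [m.start() + 1 for m in lookahead.finditer(sequence.upper())]


peptides = {
    "SARS-CoV-2 site quoted in text": "RRAR",
    "MERS-CoV site quoted in text": "RSVR",
    "hypothetical strict-motif example": "RAKR",
}

for label, peptide in peptides.items():
    strict = find_motif(peptide, STRICT_MOTIF)
    relaxed = find_motif(peptide, RELAXED_MOTIF)
    print(f"{label}: {peptide} strict={strict} relaxed={relaxed}")
```

A pattern match of this kind only flags candidate sites; whether a site is actually cleaved depends on structural context such as the exposed flexible loop described above.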

TMPRSS2 (Transmembrane protease serine 2)

TMPRSS2 is a type II transmembrane host enzyme that belongs to the serine protease family and is encoded by the TMPRSS2 gene. Apart from the S protein-ACE2 interaction, host proteases also play a major role in the entry of the virus particle into the host cell. TMPRSS2 helps in the priming of the S protein, thereby causing fusion of the cellular and viral membranes. TMPRSS2 is localized in the epithelial cells of the lungs [64]. One study reported that overexpression of TMPRSS2 in the Vero E6 cell line increased the chance of coronavirus infection, suggesting that overexpression of TMPRSS2 in the lung makes patients more vulnerable to SARS-CoV-2 [65]. Several studies have shown that entry of SARS-CoV-2 depends on the priming activity of TMPRSS2 and is blocked by the TMPRSS2 inhibitor camostat [66]. TMPRSS2 is therefore another protein target for the treatment of SARS-CoV-2 infection. In a TMPRSS2 knockout mouse model, the knockout mice were reported to be immune to coronavirus [67].

Cathepsin L

Cathepsins are host cysteine proteases that play an important role in the entry of SARS-CoV-2 viral particles into host cells via the endocytic pathway and also in protein catabolism in endosomes and lysosomes. Cathepsin L primes the S protein after the virus enters the endosome, causing fusion of the viral and endosomal membranes and release of the viral genome into the cytoplasm. Cathepsin L can be targeted for the treatment of SARS-CoV-2 infection, as its inhibitors have been reported to prevent pulmonary fibrosis. Cathepsin L inhibitors include SID 26681509 and E-64-d [68]. One study reported that a glycopeptide antibiotic, teicoplanin, was able to block cathepsin L activity and thus inhibit the entry of SARS-CoV and MERS-CoV [69]. Another study reported inhibition of SARS-CoV-2 pseudovirus entry into host cells. It demonstrated that SARS-CoV-2 pseudovirus specifically uses cathepsin L to enter 293/hACE2 cells, and that E-64-d (a broad-spectrum cathepsin inhibitor) reduced viral entry by 92.5% while the inhibitor SID 26681509 reduced it by 76% [70]. Different types of cathepsins have been found to play a key role in viral entry into host cells, and targeting them along with other target proteins may be more beneficial in treating SARS-CoV-2 patients [71].

Therapeutic interventions for COVID-19

COVID-19 is a serious international concern, and research teams and health officials all around the world are working tirelessly to cope with the disease. Since its outbreak, countries have taken measures to slow the spread of the virus by announcing lockdowns, testing, isolating and treating patients with drugs, carrying out contact tracing, limiting travel and quarantining citizens. Some of the methods used to treat COVID-19 patients are described below.

All the proteins associated with SARS-CoV-2 have been proposed as potential targets for treating the infection, and different treatment options have been developed to date. Initially, when no specific antiviral drugs had been developed for this coronavirus, attention shifted to drugs developed for MERS and SARS, which had shown promising results. Table 3 describes the antiviral drugs that have been developed for SARS-CoV-2 as well as some repurposed drugs evaluated for their potency against SARS-CoV-2 infection. Most synthetic small molecules under clinical trial for COVID-19 are being repurposed, having already been reported as effective against other disease states [71]. Drugs being clinically tested against COVID-19 infection include Favipiravir, Ribavirin, Nafamostat, Nitazoxanide, Penciclovir, Baricitinib and Arbidol. However, most of them showed only moderate results when tested in vitro on clinical samples from COVID-19 patients [11]. Remdesivir has caught the attention of many researchers because of its promising activity against the virus. It is an adenosine nucleoside analogue that targets the viral RNA-dependent RNA polymerase and is incorporated into the viral RNA chain, causing premature chain termination. It also evades the proofreading viral exoribonuclease. This drug has been used effectively against Ebola virus infection [72]. Sheahan et al. conducted a clinical study reporting that SARS-CoV-2 replication was significantly blocked and patients recovered clinically when remdesivir was used alone or in combination with chloroquine/interferon beta [73]. Remdesivir was successfully used in the treatment of the first COVID-19 patient in the USA, in January 2020, and improved the patient's worsening pneumonia on the 7th day of hospitalization [74]. Remdesivir has been found to work effectively under compassionate use in severe cases of COVID-19, though extensive research is still going on.
Controlled trials of antiviral drugs are required to determine their side effects, if any [73]. Terali et al. reported an in silico study to identify drugs that can be repurposed as potential human ACE2 inhibitors, such as lividomycin, quisinostat, burixafor, fluprofylline, spirofylline, pemetrexed, diniprofylline and edotecarin. These proved to be promising drug candidates for blocking viral entry [75]. Food and Drug Administration (FDA) approved drugs such as Lopinavir, Ritonavir, Darunavir, Boceprevir and Telaprevir are being investigated as Mpro inhibitors and have shown encouraging results. Compounds such as Disulfiram, Baicalein, Ebselen, Carmofur, Shikonin, PX-12, Camostat, Calpeptin, Calpain inhibitor and Tideglusib have been reported as likely protease-targeting drug candidates for SARS-CoV-2 with good potency and selectivity [76]. Scientists are continuing to search for and develop more effective antiviral drug candidates to treat COVID-19. Figure 2 shows the various inhibitors acting at different stages of the SARS-CoV-2 infection cycle.

Table 3 Therapeutic treatment option FDA approved or under clinical trial for COVID-19
Fig. 2

Inhibitors which act at different stages of SARS-CoV-2 life cycle

The use of monoclonal or polyclonal antibodies is another suggested method that can serve as a prophylactic and therapeutic tool against viral infection and provide some relief during the pandemic. These are laboratory-engineered molecules designed to bind antigens. It has been reported that monoclonal antibodies (mAbs) target the S protein of SARS-CoV and prevent the virus from entering host cells [77]. Some of the mAbs are described in Table 3. Other treatment options were also considered to improve the condition of COVID-19 patients. Convalescent plasma containing IgG, IgA, IgM, IgE and IgD, obtained from recovered patients, has already been used effectively for treating SARS-CoV, poliomyelitis, influenza A (H1N1) and Ebola virus infections. The antiviral antibodies in the plasma of clinically recovered patients help to suppress viremia [11]. Vaccines such as inactivated or live-attenuated vaccines, viral vector-based vaccines and protein sub-unit vaccines were produced using different strains of SARS and showed promising results in animal models. Yang et al. developed a DNA vaccine (inactivated whole virus or live-vectored strain of SARS-CoV, AY278741) that successfully induced the release of neutralizing antibodies and reduced viral infection in animal models [87]. Vaccines previously developed for SARS-CoV might be re-utilized to facilitate vaccine development for COVID-19.

Scientists are working at breakneck speed to develop vaccines for COVID-19 as early as possible, and institutes and pharmaceutical companies are collaborating worldwide to this end. More than a hundred vaccines are in development and many of them have entered clinical trials [88]. Among them, more than 25 have already completed clinical trials and been approved for use to date. The approved vaccines are being administered in doses to people around the world, and their efficacy and side effects are being analysed. So far, some of these vaccines have shown good efficacy with few side effects. However, vaccinating people worldwide to make the world completely free of COVID-19 still has a long way to go. Scientists at the Gamaleya Research Institute in Russia made and registered the world’s first vaccine, Gam-COVID-Vac, using adenovirus. An LNP-encapsulated (lipid nanoparticle) mRNA-based vaccine has been developed by the US National Institute of Allergy and Infectious Diseases (NIAID) along with the biotechnology company Moderna Therapeutics and is currently being administered to people in various countries. The efficacy and side effects of the vaccine have already been investigated in adult volunteers. It is a nanoparticle-encapsulated vaccine encoding the full-length, prefusion-stabilized S protein [89]. The University of Oxford and AstraZeneca have developed a vaccine using a chimpanzee adenovirus vector (ChAdOx1-5). It has been tested for safety, reactogenicity, immunogenicity and tolerability. This vaccine is now called AZD1222/Covishield and is being given in many countries. These adenovirus vectors are weakened, replication-deficient versions of a common cold virus that carry one or a few encoded antigens and can efficiently stimulate both humoral and cytotoxic T-cell immune responses [90]. All the vaccines that are approved, and those currently in late-phase clinical trials, are listed in Table 4.

Table 4 List of vaccines approved or in late phase clinical trial

Scientists working around the clock against COVID-19 still face many challenges. Vaccines that perform efficiently can only be developed with a detailed understanding of the pathways involved in infection. The evolving nature of the virus and its strains remains the biggest challenge in developing a virus-specific vaccine. Many parameters are taken into consideration for rational vaccine design and development and for the evaluation of vaccine efficacy.

Conclusion and perspective

The novel coronavirus 2019 has challenged socio-economic, medical and health care foundations worldwide. The deadly virus has already had a devastating impact on human life. The zoonotic source of COVID-19, which originated from the seafood market in Wuhan, China, still needs to be confirmed. Bats and pangolins are considered the key reservoirs according to sequence-based analysis. Research on SARS-CoV-2 has shown many similarities to, as well as genomic variations from, SARS and MERS, so detailed phylogenetic and pathogenetic studies will enhance understanding of the virus and help in developing preventive measures. Rapid diagnosis of SARS-CoV-2 in suspected patients is needed to control transmission of the virus, which has already infected millions and claimed the lives of thousands of people, including doctors, health care workers and paramedical staff. Human-to-human transmission is a serious concern and a threat to public health. Rigorous surveillance and ongoing research will reveal new findings regarding host adaptation, molecular mechanism, transmissibility, clinical manifestations, evolution, epidemiological pattern and pathogenicity. New information about COVID-19 is becoming available every week in scientific journals, which will help the public to understand the virus better. Researchers are developing efficient and promising therapeutic strategies to cope with the pandemic. Comprehensive measures need to be devised not only to curb this global health emergency but also to prepare for future outbreaks of infection of zoonotic origin.