Introduction

A novel coronavirus, SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), was detected in Wuhan, China, in the last quarter of 2019 as the causative pathogen of COVID-19 (coronavirus disease 2019), a lethal respiratory tract infection. Genetically, this virus closely resembles the SARS virus. It has spread across the globe (more than 210 countries) within a short period. Coronaviruses pose serious health threats to both humans and animals. The mortality rate for 2019-nCoV (novel coronavirus) is not as high (approximately 2–3%) as that of SARS-CoV (severe acute respiratory syndrome coronavirus), with a fatality rate of ∼10%, or MERS-CoV (Middle East respiratory syndrome coronavirus), with a fatality rate of ∼36%, but its rapid propagation has resulted in the activation of protocols to stop its spread [1]. The genomic RNA of coronaviruses is the largest among RNA viruses, approximately 27 to 30 kb [2]. The genome of 2019-nCoV is reported to have a 79.6% sequence identity to SARS-CoV [3]. Phylogenetic analyses have revealed that SARS-CoV-2 belongs to the genus Betacoronavirus and falls under the subgenus Sarbecovirus, making it a relatively close family member of the SARS-like coronaviruses derived from bats. These viruses have been identified as bat-SL-CoVZXC21 and bat-SL-CoVZC45, having 96% sequence similarity.

Homology modeling analyses have revealed that SARS-CoV-2 and SARS-CoV have a similar receptor-binding domain structure, despite amino acid variation at some key residues [4]. The S protein (spike protein) of SARS-CoV-2 (also called 2019-nCoV) may also interact with human ACE2, as SARS-CoV does, for host infection [5, 6]. The S protein trimer is known to be cleaved into S1 and S2 subunits, of which the S1 subunit is released during the transition to the post-fusion conformation in viral infection [7,8,9,

Data analyses

Data analysis for the produced trajectories was performed using Tcl scripts previously implemented in VMD [49], and data were plotted using gnuplot (http://gnuplot.info). We also calculated the RMSF of the α-carbons for all residues and tracked structural changes by RMSD throughout the simulation run. Hydrogen bonds were identified with a donor-acceptor distance cut-off of 3.6 Å, covering backbone as well as side-chain atoms. Other analyses, such as radius of gyration (ROG), solvent accessible surface area (SASA), secondary structure content (DSSP), and H-bond formation upon ligand binding, were calculated using Tcl scripts. RMSD, RMSF, total energy, SASA, radius of gyration, and H-bonds were plotted using Prism.
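As an illustration of the quantities named above, the following minimal Python sketch computes RMSD against a reference frame, per-atom RMSF about the time-averaged position, and a donor-acceptor distance test with the 3.6 Å cut-off. It is not the Tcl/VMD workflow used in this study: the function names and the toy coordinates are hypothetical, frames are assumed to be already aligned to the reference, and the hydrogen-bond test uses distance only (no angle criterion).

```python
import math

def rmsd(frame, ref):
    """RMSD between two equally sized lists of (x, y, z) coordinates.
    Assumes the frame has already been aligned to the reference."""
    sq = sum((a - b) ** 2 for p, q in zip(frame, ref) for a, b in zip(p, q))
    return math.sqrt(sq / len(ref))

def rmsf(traj):
    """Per-atom RMSF about the time-averaged position.
    traj[frame][atom] is an (x, y, z) tuple."""
    n_frames, n_atoms = len(traj), len(traj[0])
    out = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in traj) / n_frames for k in range(3)]
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in traj
        ) / n_frames
        out.append(math.sqrt(msd))
    return out

def is_hbond_distance(donor, acceptor, cutoff=3.6):
    """Distance-only hydrogen-bond test (cut-off in angstroms)."""
    return math.dist(donor, acceptor) <= cutoff

# Toy two-atom, two-frame trajectory (made-up coordinates, in angstroms).
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
traj = [ref, [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]]
print(rmsd(traj[1], ref))   # 1.0
print(rmsf(traj))           # [0.5, 0.5]
```

In practice the coordinates would come from the simulation trajectory, restricted to backbone or α-carbon atoms as described in the text.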

Analysis for binding free energy (MMPBSA) from MD simulations

The MMPBSA.py module was used to calculate the binding free energy and interaction energy of the ligand. The energies were calculated using the following formula:

$$ \Delta {\mathrm{G}}_{\mathrm{bind}.\mathrm{solv}}=\Delta {\mathrm{G}}_{\mathrm{bind}.\mathrm{vacuum}}+\Delta {\mathrm{G}}_{\mathrm{solv}.\mathrm{complex}}-\left(\Delta {\mathrm{G}}_{\mathrm{solv}.\mathrm{ligand}}+\Delta {\mathrm{G}}_{\mathrm{solv}.\mathrm{receptor}}\right) $$

The solvation energy for all the states was calculated using the Generalized Born (GB) and Poisson-Boltzmann (PB) models. This analysis revealed the electrostatic contribution of the solvation state. The final data were plotted using Prism.
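The thermodynamic cycle in the formula above can be written out as a short Python sketch. This only reproduces the arithmetic of the equation; it is not the MMPBSA.py implementation, the function name is hypothetical, and the numbers in the example are made up purely to show how the terms combine.

```python
def binding_free_energy_solvated(dg_bind_vacuum, dg_solv_complex,
                                 dg_solv_ligand, dg_solv_receptor):
    """Binding free energy in solvent from the thermodynamic cycle:
    dG_bind,solv = dG_bind,vacuum + dG_solv,complex
                   - (dG_solv,ligand + dG_solv,receptor).
    All terms must share the same energy unit."""
    return dg_bind_vacuum + dg_solv_complex - (dg_solv_ligand + dg_solv_receptor)

# Illustrative (made-up) values, all in the same unit:
dg = binding_free_energy_solvated(
    dg_bind_vacuum=-60.0,
    dg_solv_complex=-300.0,
    dg_solv_ligand=-20.0,
    dg_solv_receptor=-310.0,
)
print(dg)  # -30.0
```

The sign convention matters: a desolvation penalty (the complex being less favorably solvated than the separated ligand and receptor) makes the solvated binding energy less negative than the vacuum value, as in this toy example.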

Results and discussion

Protein selection and preparation

When SARS-CoV-2 enters the human body and encounters a host cell, the receptor-binding domain (RBD) of its spike protein binds to the human ACE2 receptor. This is the first point of contact between human cells and SARS-CoV-2. Therefore, the protein structure used in this study is the complex between the RBD of SARS-CoV-2 and human ACE2. It is an experimentally solved structure of the ACE2 and S protein complex (PDB-ID: 6M17), which was previously subjected to a 10-μs molecular dynamics simulation to obtain the most stable structure in the lowest-energy conformation.

Virtual screening and ligand selection

Natural ligands were acquired from the ZINC natural library, with a total of ~203,458 drug molecules. These molecules were tested through blind docking against the S protein: human ACE2 complex to shortlist the best candidates. Primary virtual screening gave optimum hits for 20 compounds, listed in Supplementary Table 2. The final 4 drugs (Andrographolide, Artemisinin, Pterostilbene, and Resveratrol) were then selected on the basis of multiple criteria such as binding score and hydrophobic, electrostatic, and pi-pi cationic interactions with the protein. We therefore continued further studies with these drug candidates. Hydroxychloroquine has been reported to abolish this interaction by binding at the interface of these two proteins [52]. The focus of the study is to find alternative drugs that can inhibit this particular region while having fewer side effects than hydroxychloroquine. Hydroxychloroquine was used as a positive control to confirm and compare the interaction. The final list of thoroughly tested ligands is given with PubChem IDs and 2D structures (Fig. 1).

Fig. 1

Shortlisted drugs from natural library of over 200,000 compounds with their respective 2D structures and PubChem identification numbers

Toxicity prediction

The finalized inhibitors were then tested to compare their toxic effects using the online tool ProTox-II. The toxicity predictions suggest that the shortlisted natural ligands used in this research are less toxic than the previously used hydroxychloroquine. Andrographolide and Artemisinin were placed in class 5, making them the least toxic, while Pterostilbene, Resveratrol, and hydroxychloroquine were placed in class 4 (Table 1). Toxicity radar charts explaining the toxicity effects of all the ligands are shown in Supplementary Fig. 1.
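For readers unfamiliar with the class numbers, the sketch below maps a predicted oral LD50 to a toxicity class. It assumes the six-class scheme based on the Globally Harmonized System thresholds (5, 50, 300, 2000, and 5000 mg/kg), which is our understanding of how ProTox-II reports classes; the function name and the example LD50 values are hypothetical and are not the values predicted for the compounds in Table 1.

```python
def toxicity_class(ld50_mg_per_kg):
    """Map an oral LD50 (mg/kg) to a toxicity class from 1 (most toxic)
    to 6 (least toxic), assuming GHS-style upper bounds per class."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    # (upper bound in mg/kg, class); values above 5000 fall in class 6.
    bounds = [(5, 1), (50, 2), (300, 3), (2000, 4), (5000, 5)]
    for upper, cls in bounds:
        if ld50_mg_per_kg <= upper:
            return cls
    return 6

# Hypothetical LD50 values, for illustration only.
print(toxicity_class(1000))  # 4
print(toxicity_class(3000))  # 5
```

Under this scheme a higher class number corresponds to a higher LD50, that is, lower predicted acute toxicity.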

Table 1 Toxicity prediction results for the selected compounds as calculated using online tool ProTox-II [40]

Molecular docking results

Molecular docking studies, performed using the flexible docking module of AutoDock 4 with default settings, further strengthened the search for an alternative natural inhibitor of the S protein: human ACE2 complex.

Final inhibition scores in the form of binding energies, together with the major interacting residues for all the drug candidates, are listed in Table 2. With a binding energy of −9.1 kcal/mol, Andrographolide shows the best binding with the receptor.
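To give a sense of scale for these binding energies, a docking energy can be converted to an estimated inhibition constant through the standard relation ΔG = RT ln Ki. The sketch below assumes a temperature of 298.15 K and treats the docking score as a free energy of binding; the function name is hypothetical, and the resulting Ki values are rough estimates rather than measured constants.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def estimated_ki(delta_g_kcal_per_mol, temperature_k=298.15):
    """Estimated inhibition constant (mol/L) from a binding free energy,
    using delta_G = R * T * ln(Ki)."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL * temperature_k))

# Binding energies quoted in the text (kcal/mol).
for name, dg in [("Andrographolide", -9.1),
                 ("Pterostilbene", -8.9),
                 ("Resveratrol", -8.7)]:
    print(f"{name}: Ki ~ {estimated_ki(dg) * 1e6:.2f} uM")
```

Under these assumptions, −9.1 kcal/mol corresponds to an estimated Ki on the order of 0.2 μM, and each 0.2 kcal/mol loss in binding energy raises the estimate by a factor of roughly 1.4.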

Table 2 Binding energy of protein-ligand complex obtained after performing molecular docking using AutoDock 4.2 [41]

Further structural analyses were carried out using PyMOL (www.pymol.org) and the PoseView module of Protein Plus. We found that Andrographolide, the best-scoring inhibitor, fits well at the interface of the S protein: human ACE2 complex. Andrographolide formed H-bonds with residues Asn-33, Arg-393, and Tyr-505 of the protein complex, while His-34 and Pro-389 formed alkyl and pi-alkyl interactions (Fig. 2).

Fig. 2

Docking pose (left) for Andrographolide docked with ACE2: S protein complex in the interface between both proteins. (Right) Ligand interaction diagram showing important interactions involved in the complex

Structural analyses for the second drug candidate, Artemisinin, also showed binding at the interface of the S protein: human ACE2 complex. However, the docking score was lower than that achieved for the other candidates. Artemisinin formed one H-bond with the Tyr-505 residue of the ACE2 receptor. His-34 and Ala-387 again formed alkyl and pi-alkyl contacts with the ligand, and Pro-389 formed a carbon H-bond. The docking pose and ligand interaction diagram for Artemisinin inhibiting the protein complex are shown in Supplementary Fig. 2.

We then analyzed the third drug candidate, Pterostilbene, which had the second-best inhibition score in terms of binding energy (−8.9 kcal/mol). Structural analyses showed Pterostilbene forming a pi-pi stack with His-34 of ACE2, along with two H-bonds (Gly-496 and Ser-494) (Fig. 3).

Fig. 3

Docking pose (left) for Pterostilbene docked with human ACE2: S protein complex in the interface between both proteins. (Right) Ligand interaction diagram showing important interactions involved in the complex

Resveratrol, which belongs to the same stilbene family as Pterostilbene, showed a similar interaction, with a docking score of −8.7 kcal/mol. However, structural analyses of the complex showed, surprisingly, that Resveratrol formed only one H-bond, with Gly-496, in this interaction (Fig. 4).

Fig. 4

Docking pose (left) for Resveratrol docked with human ACE2: S protein complex in the interface between both proteins. (Right) Ligand interaction diagram showing important interactions involved in the complex

Hydroxychloroquine, a known inhibitor used as a positive control to compare and confirm this interaction, also showed the expected binding at the same interface as the other ligands. Structural analyses of the known inhibitor showed wide interactions with the ACE2 receptor, including one H-bond with Gly-496; alkyl and pi-alkyl contacts with Tyr-505, Tyr-495, and Lys-403; and an amide-pi stacking contact with Gln-388. One pi-cation interaction was also observed with Arg-393 (Supplementary Fig. 3).

Molecular dynamics simulations

To confirm the stability of the complex structures in combination with the drug candidates, we performed a cumulative 400 ns of molecular dynamics simulation across the 4 complexes. All the simulations were performed in triplicate for more robust data analysis. The production runs followed a 1-ns equilibration using NAMD. We found that the RMSD fluctuations between structures are not high, which indicates that the ligand-bound structures are stable. Overall, the trajectories for all the compounds are more or less equilibrated, with an average change of approx. 2 Å in the RMSD (Fig. 5). The largest deviation, an average RMSD change of 2.80 Å, was observed between around 40,000 and 45,000 ps for Artemisinin (shown in green) (Fig. 5). Artemisinin also had the lowest docking score, and this deviation may be due to the hydrophobic interactions of its cyclic groups with the receptor residues. The trajectories for Andrographolide (shown in red) and Pterostilbene (shown in violet) show equilibration after 36,000 ps, suggesting stable binding with the ACE2 receptor macromolecule (Fig. 5).

Fig. 5

RMSD analysis (for backbone and C-alpha) for the production run of Andrographolide (red), Pterostilbene (violet), Resveratrol (blue), and Artemisinin (green) inhibiting the S protein of SARS-CoV-2 in complex with human ACE2

Similarly, the RMSF plot for the trajectories shows approximately the same per-residue fluctuation for Andrographolide as for Artemisinin (Fig. 6). Pterostilbene and Resveratrol showed slight fluctuations. Artemisinin, as expected from the RMSD plot, showed more local residue-based fluctuation. Residues 240 to 260 show the highest fluctuation in all 4 cases. This cluster of residues could be the functional site for ligand binding. Table 3 records the average values over all three individual simulation runs: average RMSD (for backbone and C-alpha), RMSF, and number of H-bonds formed. Supplementary Fig. 4 shows the hydrogen-bonding pattern observed during the 100-ns simulation in all 4 protein-ligand complexes. At approximately 50 ns, no H-bonds were observed in any complex. The average numbers of H-bonds formed for Andrographolide, Pterostilbene, Resveratrol, and Artemisinin are 3, 3, 2, and 2, respectively (Table 3).

Fig. 6

RMSF graph (for alpha carbons) for all 4 protein-ligand complexes during the 100-ns simulation run. The RMSF curves for Pterostilbene and Resveratrol are exactly overlapped by that of Artemisinin

Table 3 The average RMSD for backbone and C-alpha trace, RMSF, and average H-bonds formed between protein and compound across simulations for all the complexes over 3 replicates of 100-ns molecular dynamics simulations

These results suggest that Andrographolide and Pterostilbene can be good inhibitors of the S protein: human ACE2 complex interface, which would inhibit the binding of the S protein of SARS-CoV-2 to the ACE2 receptor without showing any side effects.

Apart from RMSD, RMSF, and the number of hydrogen bonds formed between protein and ligand, the radius of gyration was also calculated. Supplementary Fig. 5 shows the radius of gyration plots for all 4 complexes over 100 ns of simulation time. Rg decreases over time in all cases, which suggests that ligand binding helps stabilize and compact the protein. The radius of gyration (Rg) of a particle is the root-mean-square distance of all electrons from their center of gravity. It is an important parameter and is often useful as an indicator of structural changes in a substance. Changes studied through the radius of gyration include, for instance, association and dissociation effects, conformational changes by denaturation, binding of coenzymes, and temperature effects (O. Kratky, P. Laggner, in Encyclopedia of Physical Science and Technology (Third Edition), 2003).
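The definition above translates directly into a short calculation. The following Python sketch computes a mass-weighted radius of gyration for a single frame, which is the form commonly used for atomistic trajectories (atomic masses standing in for the electron weighting of the scattering definition). The function name and the toy coordinates are hypothetical.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one frame.
    coords: list of (x, y, z); masses: list of matching atomic masses."""
    total_mass = sum(masses)
    # Center of mass of the structure.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
           for k in range(3)]
    # Mass-weighted mean squared distance from the center of mass.
    msd = sum(m * sum((c[k] - com[k]) ** 2 for k in range(3))
              for m, c in zip(masses, coords)) / total_mass
    return math.sqrt(msd)

# Toy example: two equal masses 2 angstroms apart (made-up values).
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [12.0, 12.0]))  # 1.0
```

Evaluating this per frame and plotting the result against time gives the kind of curve described for the complexes; a downward trend indicates that the structure is becoming more compact.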

The solvent accessible surface area of all the proteins was also calculated to check the effect of ligand binding on the residue profile of the protein surface. Supplementary Fig. 6 shows the solvent accessible surface area (SASA) plot for all 4 complexes, as obtained using the gmx sas command in GROMACS [53, 54] for the 100-ns simulation run. The Pterostilbene and Resveratrol plots are exactly overlapped by that of Artemisinin. This suggests that there is no major change in the structure of the protein on binding the different ligands. The total energy of the complexes and the individual energy components are depicted in Supplementary Fig. 7 and main-text Fig. 7. Individual energy components, such as van der Waals, Coulomb, and H-bond terms, were calculated using the MM-PBSA/MM-GBSA tool in GROMACS. A table (Table 4) representing these values is also included in the text. The complex of ACE2 and Andrographolide shows the highest Gibbs free energy (−48.164 kJ/mol). This suggests that Andrographolide is the best lead molecule, showing good interaction with the ACE2 receptor. In addition, Supplementary Fig. 8 shows the secondary structure change plot for all 4 complexes, as obtained using the do_dssp command in GROMACS [53, 54] for the 100-ns simulation run. The Pterostilbene and Resveratrol plots are exactly overlapped by that of Artemisinin, again pointing to Andrographolide and Pterostilbene as the best leads for further drug development.

Fig. 7

MM-PBSA/MM-GBSA graph of all 4 complexes providing the distribution of energy components during the course of simulation

Table 4 MMPBSA/MMGBSA analysis performed using the MMPBSA.py module, showing the different energy contributions during the 100-ns molecular dynamics simulation for each of the four complexes

Conclusions

Initial molecular dynamics, primary screening, molecular docking, and post-complex molecular dynamics simulations for 100 ns each (in triplicate) in this research suggested that the interaction within the S protein: human ACE2 complex is very important. The interactions are concentrated on the helices of the human ACE2 protein, which are important in the interaction with the receptor-binding domain of the S protein of SARS-CoV-2. This was shown in the initial 10-μs simulation by DE Shaw Research [36]. This interface interaction also explains why it is important to abolish it. The most important residue seen in all the ligand interaction diagrams is His-34 of the human ACE2 receptor, which lies on the surface and is hence very important for interaction with the S protein. We compare and show that our positive control as well as all suggested drug candidates interact with His-34 through non-polar binding. This interaction will be an important factor in abolishing the connection between the S protein and human ACE2, further stopping the spread at this first point of contact. Andrographolide and Pterostilbene have shown promising binding and stability results in molecular dynamics, indicating their usefulness in inhibiting this important complex. Experimental in vitro studies with Andrographolide and Pterostilbene are suggested for further analysis and corroboration.