Introduction

Pangolins are 30–100 cm long ant- and termite-feeders found in Africa (two Phataginus spp. and two Smutsia spp.) and Asia (four Manis spp.)1. They have converged morphologically with unrelated South American anteaters. All three species that occur in East Asia [Chinese pangolin (M. pentadactyla), Philippine pangolin (M. culionensis) and Malayan pangolin (M. javanica)] are Critically Endangered, while the Indian pangolin (M. crassicaudata) is Endangered2. The four African species are either Endangered or Vulnerable2, and all eight species have declining populations2. One of the main threats to the conservation of pangolins is poaching for body parts used in traditional medical remedies3. Pangolins are especially important to conserve because of their phylogenetic uniqueness: they are the only extant members of their order (Pholidota).

One of the main challenges in pangolin conservation is that captive pangolins usually die from infection3. This makes it very difficult to maintain captive breeding programs or to return rescued animals to the wild. In response to this problem, rigorous hygiene protocols have enabled us to establish a captive Malayan pangolin population up to the third filial generation in China4. This vulnerability to infection is possibly due to the pseudogenisation of immune system genes in the pangolin genome, including Interferon Epsilon (IFNE)1, Interferon-Induced with Helicase C domain 1 (IFIH1, also known as MDA5, a cytoplasmic RNA sensor that helps initiate the innate immune response to viral infection)5, cyclic GMP-AMP Synthase (cGAS)6, Stimulator of Interferon Genes (STING, the interaction partner of cGAS)6, Toll-Like Receptor 5 (TLR5)7, and probably also Toll-Like Receptor 11 (TLR11)7.

Owing to its rarity and protected status, Malayan pangolin specimens that can be examined are difficult to obtain. We investigated a possible infection in a Malayan pangolin that was seized by customs in the Guangdong province of China, which subsequently died8. We previously reported that this specimen’s brain (cerebrum and cerebellum) and lungs were infected by a coronavirus (pCoV) closely related to SARS-CoV-217.

Low-quality bases with a Phred score lower than 20, adapter sequences, and PCR primers were trimmed using Cutadapt v1.9.118. 21,956,237 pairs of reads of length 150 bp generated from the pangolin skin passed the quality check.
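As an illustration of the quality-trimming step, the sketch below trims the 3′ end of a read with a running-sum scheme of the kind Cutadapt documents for its quality cutoff. It is a simplified stand-in, not the code that was run in this study: the function name `quality_trim_3prime` and the example Phred scores are hypothetical, and adapter or primer removal is not shown.

```python
def quality_trim_3prime(qualities, cutoff=20):
    """Return the number of leading bases to keep after 3' quality trimming.

    Walks from the 3' end while accumulating (cutoff - q). The read is cut
    at the position where this running sum peaks, and the walk stops once
    the sum drops below zero, i.e. once good bases outweigh the bad ones.
    """
    running = 0
    best = 0
    keep = len(qualities)
    for i in range(len(qualities) - 1, -1, -1):
        running += cutoff - qualities[i]
        if running < 0:
            break
        if running > best:
            best = running
            keep = i
    return keep


# Hypothetical Phred scores for a 10-base read whose 3' end degrades.
example_quals = [38, 37, 36, 35, 30, 12, 25, 8, 5, 2]
kept = quality_trim_3prime(example_quals, cutoff=20)
print(f"keep the first {kept} of {len(example_quals)} bases")
```

Note that an isolated acceptable base (the score of 25 here) does not rescue the low-quality tail around it, which is the main difference from simply cutting at the first base below the threshold.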

The healthy Malayan pangolin skin sample was downloaded from our previous study (project accession PRJNA283328; run SRR3923846) and quality checked. The Malayan pangolin (Manis javanica) primary genome assembly (ManJav1.0; accession: GCF_001685135.1) and the corresponding GTF annotation were downloaded from the National Center for Biotechnology Information (NCBI). 24,805,754 pairs of reads of length 150 bp from normal pangolin skin passed the quality check. Then, the primary genome assembly was indexed using Bowtie2 v2.4.219 and the trimmed reads from the healthy and Dahu skin samples were mapped to the genome using TopHat v2.1.120,21,22. The overall mapping rates for the pangolin skin and healthy pangolin skin were 76.2% and 67.4%, respectively.

We also downloaded the data of human lung biopsies from the study of Blanco-Melo et al.23 (project accession PRJNA615032), including healthy human lung biopsies from a 72-year-old male and a 60-year-old female (runs SRR11517725, SRR11517726, SRR11517727, SRR11517728, SRR11517729, SRR11517730, SRR11517731, and SRR11517732), and SARS-CoV-2 infected lung biopsies from a deceased 74-year-old male (runs SRR11517733, SRR11517734, SRR11517735, SRR11517736, SRR11517737, SRR11517738, SRR11517739, and SRR11517740) (Supplementary Table S1). The reads were quality-checked, indexed, and mapped using the same approaches as above. Technical replicates were merged before being analysed. The overall mapping rates for the healthy human lung samples were 87.7% and 85.0%, respectively, and those for the SARS-CoV-2 infected human lung samples were 61.0% and 71.0%, respectively.

Examination of the presence of SARS-CoV-2 RNA

All reads were mapped to the pCoV genome to confirm the presence of pCoV RNA in the pangolin skin transcriptome. The pCoV genome was obtained from […]26, and visualised using Integrative Genomics Viewer (IGV) v2.4.927,28,29.

Phylogenetic analysis

The pCoV partial genome or gene sequences from the mapped reads were extracted from IGV v2.4.927,28,29. Phylogenetic trees were constructed using MEGA-X30. Sequences were initially aligned using Multiple Sequence Comparison by Log-Expectation (MUSCLE) aligner31. The alignments were manually curated to ensure accuracy. Maximum-likelihood trees were inferred using the Tamura-Nei DNA substitution model and nodal support was estimated using 1,000 bootstrap replicates.

Differential expression (DE) analysis

The read counts and normalised Fragments Per Kilobase of transcript per Million mapped reads (FPKM) for each gene were generated according to the Cufflinks pipeline32. Genes with an FPKM below one were considered lowly expressed or noise and were filtered out. The remaining genes were considered up-regulated DEGs if their log2 fold change (FC) was higher than one, and down-regulated DEGs if their log2 FC was lower than −1 (coronavirus versus control). For human samples, we downloaded the raw data from Blanco-Melo et al.23 and processed them using the same approach to allow an accurate comparison. To compare human gene expression with that of the pangolin, we generated the normalised read counts for each human sample using the same approaches as above.
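The filtering and thresholding rule above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study. Several details are assumptions not stated in the text: a gene is filtered only if its FPKM is below one in both samples, a small pseudocount keeps the ratio finite when one sample has zero expression, and the down-regulation threshold is taken as the symmetric log2 FC < −1. The function name `classify_gene` is hypothetical.

```python
import math


def classify_gene(fpkm_infected, fpkm_control,
                  min_fpkm=1.0, lfc_cutoff=1.0, pseudocount=1e-3):
    """Label one gene as 'filtered', 'up', 'down' or 'unchanged'.

    Assumptions (not from the source): a gene is filtered when it is
    below min_fpkm in BOTH samples, and a pseudocount is added before
    taking the ratio so that zero expression does not divide by zero.
    """
    if fpkm_infected < min_fpkm and fpkm_control < min_fpkm:
        return "filtered"
    log2_fc = math.log2((fpkm_infected + pseudocount) /
                        (fpkm_control + pseudocount))
    if log2_fc > lfc_cutoff:
        return "up"
    if log2_fc < -lfc_cutoff:
        return "down"
    return "unchanged"


# Hypothetical FPKM pairs (infected, control) for four genes.
examples = {"geneA": (8.0, 2.0), "geneB": (2.0, 8.0),
            "geneC": (0.5, 0.2), "geneD": (3.0, 2.5)}
labels = {gene: classify_gene(*pair) for gene, pair in examples.items()}
print(labels)
```

With a single infected sample and a single control, a fold-change cutoff of this kind is a descriptive threshold rather than a statistical test, which is why no p-value appears in the sketch.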

Functional enrichment analysis

Gene set enrichment analysis (GSEA) and over-representation analysis (ORA) were performed using clusterProfiler v3.18.033. GSEA was run on all pangolin or human genes using the fgsea method34.

We performed ORA using pangolin skin-specific or human lung-specific DEGs, compared against all the genes in pangolin or human (as background). Gene sets used in ORA and GSEA were based on human annotations, including gene ontology (GO) biological process (BP) gene sets35 queried using AnnotationHub v2.22.036 and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway gene sets37. For ORA, we defined an enrichment score for each enriched term as follows:

$$\frac{{F}_{in}/{F}_{all}}{{B}_{in}/{B}_{all}},$$

where \({F}_{in}\) and \({B}_{in}\) are the numbers of genes belonging to the term in the foreground (test) and background gene sets, and \({F}_{all}\) and \({B}_{all}\) are the total numbers of genes in the foreground (test) and background gene sets. Results with false discovery rate (FDR) adjusted p-values lower than 0.05 were considered significant.
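As a worked instance of the enrichment score defined above, the short sketch below evaluates the ratio for one term. The function name `enrichment_score` and the gene counts are hypothetical and serve only to show the arithmetic. The expression is rearranged algebraically into a single division, which gives the same value as the formula.

```python
def enrichment_score(f_in, f_all, b_in, b_all):
    """Compute (F_in / F_all) / (B_in / B_all) for one term.

    Rearranged as (F_in * B_all) / (F_all * B_in) so that only one
    floating-point division is performed.
    """
    if f_all <= 0 or b_all <= 0 or b_in <= 0:
        raise ValueError("f_all, b_all and b_in must be positive")
    return (f_in * b_all) / (f_all * b_in)


# Hypothetical counts: 30 of 300 foreground genes and 200 of 20,000
# background genes belong to the term.
score = enrichment_score(f_in=30, f_all=300, b_in=200, b_all=20000)
print(score)
```

A score above one means the term is more frequent among the tested DEGs than in the background. In the text, significance is judged separately by the FDR-adjusted p-value.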

Analysing the expression of ERV genes

A total of 3,162 common viral proteins were downloaded from the UniProt database using the criteria: "Viruses [10239] " (name:helicase OR gene:gag OR gene:c OR gene:pol OR gene:env OR gene:tat OR gene:s OR gene:rev OR gene:rep OR name:polymerase OR gene:nef) AND reviewed:yes. Endogenous viral genes were identified from the TBLASTN38 output obtained by querying the common viral proteins against the unmasked host genome. We retained hits with more than 40% identity, an e-value lower than \(1 \times 10^{-6}\), and a bitscore higher than 60. Among the retained hits, we then selected one representative sequence from each set of overlapping results by choosing the hit with the highest completeness. We removed all non-retroviral hits and genes whose coordinates overlap known exons. The filtered results were considered ERV genes in the pangolin genome.
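The hit-filtering and representative-selection steps can be sketched as below. This is an illustration under stated assumptions, not the authors' code. The `Hit` record and its field names are hypothetical, `completeness` is assumed to be the aligned fraction of the query protein, coordinates are assumed to be normalised so that start ≤ end, and strand is ignored. The later removal of non-retroviral hits and exon-overlapping genes is not shown.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str           # viral protein name (hypothetical field)
    contig: str          # host genome sequence the hit lies on
    start: int           # genomic start, assumed start <= end
    end: int
    identity: float      # percent identity
    evalue: float
    bitscore: float
    completeness: float  # assumed: aligned fraction of the query protein


def passes_thresholds(hit):
    """Thresholds from the text: >40% identity, e-value < 1e-6, bitscore > 60."""
    return hit.identity > 40 and hit.evalue < 1e-6 and hit.bitscore > 60


def pick_representatives(hits):
    """Keep one hit (highest completeness) per group of overlapping hits."""
    kept = sorted((h for h in hits if passes_thresholds(h)),
                  key=lambda h: (h.contig, h.start))
    representatives, cluster, cluster_end = [], [], None
    for h in kept:
        if cluster and h.contig == cluster[0].contig and h.start <= cluster_end:
            cluster.append(h)
            cluster_end = max(cluster_end, h.end)
        else:
            if cluster:
                representatives.append(max(cluster, key=lambda x: x.completeness))
            cluster, cluster_end = [h], h.end
    if cluster:
        representatives.append(max(cluster, key=lambda x: x.completeness))
    return representatives


# Hypothetical hits: the first two overlap, the third fails the identity cutoff.
hits = [
    Hit("gag", "contig1", 100, 400, 55.0, 1e-20, 150.0, 0.6),
    Hit("pol", "contig1", 350, 900, 60.0, 1e-30, 200.0, 0.9),
    Hit("env", "contig1", 2000, 2300, 35.0, 1e-10, 90.0, 0.8),
    Hit("pol", "contig2", 10, 300, 45.0, 1e-8, 70.0, 0.5),
]
selected = pick_representatives(hits)
print([(h.query, h.contig, h.start) for h in selected])
```

Overlap is resolved here by single-linkage grouping of intervals on the same contig. Other definitions of "overlapping results" are possible and would change which hits compete with each other.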

Results

The presence of pCoV subgenomic mRNA in the skin

To confirm that this specimen’s skin was infected by pCoV (in addition to its lungs and brain8), we searched for viral sequences among the transcriptome data. A total of 193 reads mapped to the reference pCoV genome (Fig. S1 in Supplement 1). The distribution of these mapped reads was consistent with the corresponding locations of the subgenomic mRNAs39. We also observed an individual read that spanned precisely the splicing sites: this read was 150 bp in length and its 5′ 47 bp mapped to the 5′ region of the pCoV genome, while its 3′ 103 bp mapped to the 3′ region of the same genome. This read indicates that the CoV RNA has been processed in the host cell40. We consistently detected the N gene (which is used in diagnostic human SARS-CoV-2 testing) in all the tested organs using qRT-PCR8, including the skin. To confirm the identity of the viral RNA in the skin, we compared consensus sequences from our mapped reads with CoV from other species (Fig. 1; Supplementary Table S2). We observed that the pCoV genome and genes from the specimen’s skin were almost identical (sequence identity = 99.2–100%) to their counterparts from another pangolin, Guangdong pCoV isolate MP78941, confirming the presence of pCoV RNA in the current pangolin’s skin.
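The logic behind the junction-spanning read can be sketched as a simple check on a split alignment. This is a hypothetical illustration, not the procedure used in the study. The function name, the `leader_end` and `min_jump` thresholds, and the genome coordinates in the example are all assumptions. Only the read length (150 bp) and the 47 bp / 103 bp split come from the text.

```python
def spans_leader_body_junction(read_length, blocks,
                               leader_end=100, min_jump=1000):
    """Return True if a read looks like a leader-to-body junction read.

    blocks: [(read_start, read_end, genome_start, genome_end), ...] in read
    order, 0-based half-open. leader_end and min_jump are hypothetical
    thresholds, not values taken from the source.
    """
    if len(blocks) != 2:
        return False
    (r1s, r1e, _g1s, g1e), (r2s, r2e, g2s, _g2e) = blocks
    covers_read = r1s == 0 and r1e == r2s and r2e == read_length
    starts_in_leader = g1e <= leader_end      # 5' part lies in the leader
    jumps_downstream = g2s - g1e >= min_jump  # 3' part lies far downstream
    return covers_read and starts_in_leader and jumps_downstream


# 150 bp read: 47 bp at the 5' end of the genome, 103 bp near the 3' end.
# Genome coordinates are invented for illustration.
junction_read = [(0, 47, 20, 67), (47, 150, 28200, 28303)]
contiguous_read = [(0, 150, 500, 650)]
print(spans_leader_body_junction(150, junction_read))
print(spans_leader_body_junction(150, contiguous_read))
```

A read from genomic RNA aligns as one contiguous block, whereas a read from a subgenomic mRNA joins the 5′ leader directly to a downstream region, which is the pattern the 47 bp + 103 bp read shows.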

Figure 1

Phylogenetic trees of the coronavirus RNA and gene sequences from the Malayan pangolin’s skin. Phylogenetic trees were generated using the maximum-likelihood method, with 1000 bootstrap replicates. Nodes with bootstrap support values of 70 or greater are indicated.

Comparative analysis of transcription in Malayan pangolin skin and human lungs

Our comparative analysis of transcription focused on Malayan pangolin skin and human lungs. It is crucial to clarify that this comparison was not due to direct similarity between these tissues, but rather because of the availability of comprehensive data on DEGs in coronavirus-infected human lungs, which contrasts with the absence of such data for coronavirus-infected human skin. We leveraged this comparison as part of our cross-validation strategy for the DEGs identified in pangolin skin, operating under the assumption that certain similarities in immune response exist between human lungs and pangolin skin, given their shared mammalian traits.

Here, we identified 3201 DEGs (1810 up-regulated and 1391 down-regulated) in the Malayan pangolin specimen’s skin (Supplementary Table S3). To validate our data, we investigated the differences and similarities of host responses between the Malayan pangolin’s skin and the human lungs by comparing DEG lists between our study and an external human lung study (Supplementary Table S1)23. Our comparative analysis revealed 366 DEGs shared between species (i.e., ‘common DEGs’ below), 2835 Malayan pangolin skin-specific DEGs, and 1527 human lung-specific DEGs. As anticipated, the common DEGs were enriched in the Coronavirus disease - COVID-19 pathway, followed by the MAPK signalling pathway, apoptosis, the C-type lectin receptor signalling pathway and Kaposi sarcoma-associated herpesvirus infection. These findings are consistent with the pCoV infection in the Malayan pangolin’s skin (Fig. 2A).
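The partition into common and tissue-specific DEGs amounts to set arithmetic on the two DEG lists, as sketched below. The function name `partition_degs` and the toy gene identifiers are hypothetical. The sketch also assumes that pangolin and human genes have already been mapped to shared identifiers, a step the text does not describe here.

```python
def partition_degs(pangolin_degs, human_degs):
    """Split two DEG lists into common and species-specific sets.

    Assumes both lists use shared gene identifiers (e.g. after
    orthologue mapping, which is not shown).
    """
    pangolin, human = set(pangolin_degs), set(human_degs)
    return {
        "common": pangolin & human,
        "pangolin_specific": pangolin - human,
        "human_specific": human - pangolin,
    }


# Toy identifiers, for illustration only.
parts = partition_degs(["g1", "g2", "g3"], ["g2", "g3", "g4", "g5"])
print({name: sorted(genes) for name, genes in parts.items()})

# Consistency check on the counts reported in the text.
common, pangolin_specific, human_specific = 366, 2835, 1527
pangolin_total = common + pangolin_specific  # matches the 3201 pangolin DEGs
human_total = common + human_specific        # implied total of human lung DEGs
print(pangolin_total, human_total)
```

The reported counts are internally consistent: the 366 common and 2835 skin-specific DEGs sum to the 3201 pangolin DEGs, and the human lung DEG total implied by the same partition is 1893.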

Figure 2

Comparative analysis and interferon-related responses in Malayan pangolin (Manis javanica) skin and human lung tissue. (A) Enriched pathways in common differentially expressed genes (DEGs) between the Malayan pangolin skin and human lung. (B) Enriched pathways in Malayan pangolin skin-specific DEGs. (C) Interferon-specific responses significantly enriched in human-specific DEGs. (D) Significant interferon-related terms from the human gene set enrichment analysis (GSEA) results. (E) Interferon pathway related gene expressions in healthy human lungs (hControl), SARS-CoV-2 infected human lungs (hCoV +), healthy pangolin skin (pControl), and pCoV infected pangolin skin (pCoV +). Gene expression in Fragments Per Kilobase of transcript per Million mapped reads (FPKM) were log10 transformed and only expressed genes are shown. GO gene ontology, BP biological process, FDR false discovery rate adjusted p-value, KEGG Kyoto Encyclopedia of Genes and Genomes.

Consistent with the pangolin’s unique immune characteristics1,5,6,7, especially the loss of IFNE which plays an important antiviral role in epithelial cells42,43,4.

The mechanism of pCoV entry into Malayan pangolin skin cells remains unclear. While the low or undetectable expression of angiotensin-converting enzyme 2 (ACE2) in this specimen's skin aligns with the observed low interferon levels, it does not definitively preclude the presence of ACE2 or its function as a viral entry receptor. We draw this conclusion because ACE2 expression is typically low even in healthy human lungs, where it serves as the primary receptor for SARS-CoV-268. Additionally, our analysis suggests the potential involvement of alternative receptors such as DPP4, which was detected in the infected pangolin skin and is recognised as a potential binding target for SARS-CoV-269. This consideration is particularly relevant given the distinct expression profiles of pangolin keratinocytes compared to those of humans. Therefore, while our findings suggest a potential role for ACE2 in pCoV infection, alternative or supplementary entry mechanisms, such as entry through DPP4, cannot be ruled out. In humans, clinical and histopathological studies of COVID-19 patients have reported dermatologic manifestations such as petechiae (a rash and haemorrhagic dot-like areas)70,71,72,73, and it has been suggested that ACE2, which SARS-CoV-2 uses to enter the host, can be highly expressed in keratinocytes74,75. Therefore, we cannot rule out the possibility that the skin of these patients was indeed infected by SARS-CoV-2, and novel mechanisms may exist that help CoV infect the skin.

Our functional enrichment analyses are generally consistent with the results observed in human patients with SARS-CoV-2 infection51. Furthermore, cell cycle processes were suppressed in pangolin skin. It has been shown that CoV can arrest the cell cycle to boost viral replication efficiency76 through mechanisms such as regulation of the cyclin-CDK complex77, the p53-dependent pathway78, the coronavirus N protein79, and direct interaction with host cell cycle proteins80. At the pathway level, our analysis showed that the COVID-19 pathway, immunity and inflammation (except for IFN-related pathways), cell proliferation, and coagulation pathways were the most significantly upregulated pathways in the Malayan pangolin’s skin51. In the CoV-infected pangolin skin, the interferon-specific pathways were not enriched, and many interferon pathway-related genes were undetectable and/or not significantly differentially expressed. Therefore, the IFNE-mediated pathways, including interferon-stimulated gene responses, are unlikely to be activated or upregulated in the naturally IFNE- and IFIH1-deficient pangolins upon CoV infection1.

In this study, we examined the expression of ERV genes in the context of pCoV infection in pangolin skin. ERVs are known to be significant modulators of the innate immune system and can support antiviral responses through various mechanisms14,15,16. Our analysis aimed to understand how ERV gene expression in pangolins responds to pCoV, given the potential role of ERVs in enhancing antiviral defence. In healthy pangolin skin, many ERV genes were expressed, suggesting that they are biologically significant. We believe that these ERV genes benefit the host, for example by boosting its immunity81,82, which is especially important in IFNE-deficient pangolins. Our data showed that most of the ERV genes were downregulated after infection, leading us to speculate that pCoV might directly or indirectly suppress ERV genes to benefit its own proliferation. It is important to note that skin tissue was chosen for RNA-Seq analysis because an appropriate control sample was available, whereas no suitable control was available for the lung. Our investigation of skin tissue provides insights into the transcriptional antiviral response of pangolins, especially given their unique immune characteristics and their lack of IFNE, a gene that is normally expressed in both skin and lungs. This focus improves our understanding of the species' response to viral infections and contributes to our broader knowledge of pangolin immunity.

An alternative explanation for the apparent replication of pCoV in the skin is contamination by pCoV-infected blood. A further limitation of this study is that our observations are based on a single sample: the Malayan pangolin is a Critically Endangered species found in Southeast Asia, and specimens are difficult to access, which makes the species extremely hard to study. It would therefore be useful to validate these results with more samples when they become available.

Conclusion

We observed pathway dysregulation consistent with CoV infection of an organism lacking multiple immunity-related genes. We also report the presence of replicating virus in the skin (proven by the presence of pCoV subgenomic/spliced mRNAs, which only occur in infected cells) and transcriptomic hallmarks of the host response to CoV infection. This study highlights the unique transcriptional response of pangolins to viral infections, which is shaped by the pseudogenisation of key immunity-related genes. It also underscores the value of studying pangolin antiviral responses to enhance our understanding of similar processes in humans.