Introduction

Retinopathy of prematurity (ROP) is a retinal vascular disease affecting prematurely born infants. Although there have been many advances in neonatal care and management for ROP, ROP remains a leading cause of childhood blindness throughout the world1,2. The most significant risk factors for ROP include low birth weight (BW), early gestational age (GA) and oxygen treatment3. However, some high-risk infants with low BW and early GA do not develop ROP, whereas some low-risk infants develop severe ROP including aggressive posterior ROP (AP-ROP)4,5,6,7. In these infants at phenotypic extremes, a study from our group demonstrated that known clinical risk factors were not significantly associated with development of ROP, suggesting presence of other risk factors for ROP8.

The i-ROP consortium includes collaborators from 14 academic institutions throughout the world with the goal of develo** better methods for diagnosing, understanding and treating ROP through computer-based image analysis, genetic analysis and biomedical informatics analysis. Over the past 10 years, over 1700 subjects have participated in the research where demographic data, eye exam data, images taken at the eye exams, as well as other systemic health data have been collected and stored in a large data repository. The 100 non-Hispanic Caucasian preterm infants in this study were selected from the i-ROP consortium samples with the goal of building a sample set that is enriched in phenotypic extremes.

Genetic factors have been suggested to predispose to ROP3,9. Racial differences in incidence and severity of ROP3,10,11, high concordance rate among monozygotic twins12,13, and strain differences in animal models of ROP14,15,16 suggest possible roles of genetic factors in ROP. However, the field of ROP genetics is still in its infancy. A number of studies investigated the frequency of specific genetic variants in premature infants with or without ROP9. However, few of them showed strong association or were replicated in other populations, although some studies reported promising gene variants17,18. Moreover, most studies examined only a few variants from a small number of candidate genes, which were mostly related to retinal angiogenesis9. Studies using updated technologies such as genome-wide association studies (GWAS) or next-generation sequencing (NGS) based approaches have not been reported in ROP.

Whole exome sequencing (WES) has revealed novel pathogenic genes or variants especially in Mendelian disorders (e.g. retinitis pigmentosa) and also in multifactorial diseases (e.g. glaucoma19,20,21,22, age-related macular degeneration23,24,25) in ophthalmology. Also, combining WES and pathway analysis has enabled researchers to find novel biological pathways or polygenic burdens associated with disease in small to moderate-scale sequencing studies19,26,27,28. This study aimed to identify potentially pathogenic genes and biological pathways associated with ROP by analyzing whole exome sequencing data from 100 preterm infants enriched in phenotypic extremes, by analyzing the variants both by rare variant methods and common variant methods.

Results

Characteristics of subjects

The overall scheme of this study is shown in Fig. 1. Fifty-eight premature infants with severe ROP (49 type 1 and 9 type 2 or pre-plus) and 42 with mild or no ROP (5 mild and 37 no ROP) in the worse eye or both eyes were included. Characteristics of 100 subjects including demographics, characteristics of ROP and associated morbidities of prematurity are summarized in Table 1. In the severe ROP group, 12 patients with bilateral AP-ROP were included.

Figure 1
figure 1

The overall scheme of this study. ROP retinopathy of prematurity, SKAT-O sequence kernel association optimal unified test, BW birth weight, GA gestational age, KEGG Kyoto Encyclopedia of Genes and Genomes, PANTHER Protein ANalysis THrough Evolutionary Relationships.

Table 1 Characteristics of subjects: demographics, retinopathy of prematurity characteristics, and associated morbidities of prematurity.

Sequencing results

The mean percentage of map** on genome was 99.8% and mean sequencing depth on target 149.8 ± 36.8 (Supplemental Table S1). A minimum 1- and tenfold coverage per base was achieved on average for 95.9% and 92.0% of the target region, respectively (Supplemental Table S1). The number of each type of variant is summarized in Supplemental Table S2. Mean number of missense variants per patient was 9283 ± 271.

Gene-based analysis

Gene-based analysis using SKAT-O, which combines the effects of all rare variants across a gene, with adjustment for BW and GA revealed no genes that reached genome-wide significance (P < 2.5 × 10–6). From the rare/low-frequency variants analysis, the most strongly associated genes by SKAT-O included RNF225, PMS1, PRMT3, SPANXD, ANKRD30A and THSD4 (Table 2). Among the 263 candidate genes, CD36, NOX4, NTRK2, IGFBP7, and NTF4 were the most strongly associated with ROP (Supplemental Table S3). From the common variants analysis, which analyzes each variant separately, the most strongly associated genes by SKAT-O included ZNF341, ALG10B, OR4D6, KCNE1, and FAM198A (Table 3).

Table 2 Results of rare/low-frequency variants analysis: the 20 most strongly associated genes in 100 non-Hispanic Caucasian preterm infants by SKAT-O.
Table 3 Results of common variants analysis: the 20 most strongly associated genes in 100 non-Hispanic Caucasian preterm infants by SKAT-O.

In order to check if there is confounding between genes that are contributing to prematurity and genes that are contributing to ROP specifically, we compared the twenty genes with the largest associations from the rare variant and common variant analysis sets with the largest gene list for prematurity to date29 and found no intersection.

Gene set enrichment analysis (GSEA)

GSEA identified one pathway (GO:0005790, smooth endoplasmic reticulum) from the rare/low-frequency variants analysis that was significantly enriched after correction for multiple tests (Table 4). GSEA also identified 25 pathways from databases including GO, KEGG, PANTHER, Reactome, and Wikipathways with empirical P values of less than 0.05, including the beta2 adrenergic receptor signaling pathway, dopamine receptor mediated signaling pathway, T cell activation, nicotinate/nicotinamide metabolism, and platelet-derived growth factor (PDGF) signaling pathway (Table 4). From the common variants analysis, GSEA identified one significantly enriched pathway (Reactome R-HSA-196836, Vitamin C metabolism) (Table 5). Other pathways with empirical P values of less than 0.05 are listed in Table 5.

Table 4 Results of rare/low-frequency variants analysis: a list of the most significantly enriched pathways (P value < 0.05) from gene set enrichment analysis using WebGestalt.
Table 5 Results of common variants analysis: a list of the most significantly enriched pathways (P value < 0.05) from gene set enrichment analysis using WebGestalt.

Discussion

This study aimed to identify potential genes and pathways associated with ROP by analyzing rare and low-frequency variants from whole exome sequencing data of 100 preterm infants enriched in phenotypic extremes. The key findings from this study are as follows: (1) gene-based analysis with adjustment for BW and GA revealed no genes that reached genome-wide significance; and (2) GSEA identified 2 significantly enriched pathways (smooth endoplasmic reticulum and vitamin C metabolism) that may be important in the pathogenesis of ROP.

Gene-based analysis, which aggregates all the rare variants in each gene, using SKAT-O with adjustment for BW and GA identified many genes with low P values, although no genes reached formal genome-wide significance (P < 2.5 × 10–6). The top most strongly associated genes included several genes of potential interest (Table 2 and 3). THSD4 (P = 5.04 × 10–4) encodes thrombospondin type-1 domain-containing protein 4, also known as “A disintegrin and metalloproteinase with thrombospondin motifs-like protein 6” (ADAMTSL-6). ADAMTSL-6, expressed in various tissues including retina (from FANTOM5 data), is an extracellular matrix protein that promotes assembly of the fibrillin-1 matrix30. Fibrillin-1 controls activation of TGF-β31,32, which is an angiogenic activator and has been implicated in retinal vascular diseases such as diabetic retinopathy33,34. Therefore, THSD4 is one potential target for studies on ROP pathogenesis.

OPA1, the most common gene mutated in dominant optic atrophy, encodes a dynamin-related GTPase that is necessary for mitochondrial inner membrane fusion and maintenance of mitochondrial architecture35. Recently, it was suggested that diabetes resulted in reduced opa1 gene expression and mitochondrial dysfunction in an animal model of diabetic retinopathy (Verma A et al. IOVS 2016;57:ARVO E-Abstract 5446). However, the role of OPA1 in retinal angiogenesis has not been investigated.

Calpain 2, which is encoded by CAPN2, has been known to be involved in neurodegeneration. A recent study showed that inhibition of calpain 2 ameliorated retinal ischemic injury, suggesting that calpain-2 inhibitor might prevent ischemia-induced retinal degeneration36. Therefore, CAPN2 is a potential target for the treatment of ROP.

Among the 263 candidate genes, CD36 showed the lowest P value from SKAT-O test (Supplemental Table S3). CD36 encodes platelet glycoprotein 4, also known as the thrombospondin receptor. Platelet glycoprotein 4 was suggested to be involved in angiogenesis in various ways with or without mediating thrombospondin37,38. Also, as a multi-ligand scavenger receptor, it has been implicated in retina homeostasis38. Therefore, CD36 may also be a potential target for future ROP studies.

GSEA identified several pathways that may be important in the pathogenesis of ROP. Identified pathways included the β2 adrenergic receptor signaling pathway, dopamine receptor mediated signaling pathway, T cell activation, PDGF signaling pathway, Robo4 and VEGF signaling pathways crosstalk, and vitamin C (ascorbate) metabolism (Table 4 and 5). A series of studies showed that the β2 adrenergic receptors play a pivotal role in the regulation of vascular endothelial growth factor (VEGF) production and retinal neovascularization39. Also, the β1/β2 adrenergic receptor antagonist propranolol showed reduction in VEGF expression, retinal neovascularization and vascular leakage in oxygen induced retinopathy39. A recent systematic review on clinical trials which investigated the effect of beta-blockers on ROP concluded that low to moderate quality evidence suggests that prophylactic administration of oral beta-blockers might reduce progression towards stage 3 ROP and decrease the need for treatment40.

Retinal dopamine, which is synthesized in and released by subtypes of amacrine and interplexiform cells, and its receptor signaling pathway has not been associated with retinal angiogenesis41. However, studies suggested that dopamine mediates diverse functions including retina development, visual signaling, and myopic eye growth41,42. Thus, further studies to examine the relationships between dopamine receptor mediated signaling and retinal vascular development or ROP is warranted.

A growing body of evidence supports important roles of inflammation in ROP. Recently, a study showed that regulatory T cells are recruited to the retina in oxygen-induced retinopathy and reduce severe microvascular disease43. In this manner, the T cell activation pathway might be related to development of ROP. The PDGF signaling pathway was also important for pericyte viability and the subsequent prevention of VEGF/VEGFR-2 overexpression and angiogenesis in oxygen-induced retinopathy44. Although PDGF antagonists have been tried for patients with neovascular age-related macular degeneration with unclear benefits (https://clinicaltrials.gov/ct2/show/NCT01944839; Last accessed on 09/13/2019)45, the exact role of PDGF in ROP requires further investigation.

One of the identified pathways (GO:0005790, smooth endoplasmic reticulum) was significantly enriched after correction for multiple tests. In general, smooth endoplasmic reticulum is associated with production of carbohydrate and lipids such as steroid hormones, cholesterol and membrane phospholipids. It also plays an important role in protein modification and intracellular protein transport. However, whether biological processes in smooth endoplasmic reticulum have specific roles in ROP is not known.

A few studies investigated the role of vitamin C in ROP or retinal angiogenesis. An in vitro study showed that vitamin C prevented VEGF-induced increases in endothelial permeability46, and retinal level of vitamin C was reduced in the rat model of ROP47. However, a randomized controlled trial in 2005 which compared high or low dose supplementation of vitamin C on clinical outcome including ROP showed no significant effects on the development of (any stage) ROP48. Further studies to examine the relationships between vitamin C metabolism and ROP is warranted.

This study has several limitations. First, the statistical power was not high because of small sample size (58 severe ROP and 42 mild or no ROP). The small sample size may be one of the reasons why no genes reached genome-wide significance in this study. Second, although we tried to include phenotypic extremes to overcome low statistical power due to small sample size, we could not identify enough phenotypically extreme patients from the i-ROP database. In this study, the severe ROP group included 12 patients with AP-ROP and the remaining patients were selected for highest birthweight in severe ROP group and lowest birthweight in no or mild ROP group. Thus, we believe that our subjects are enriched in phenotypic extremes, which may mean possible enrichment of rare pathogenic variants in our subjects. Third, as a result of the first two limitations, identified association with pathways that are plausible biologically need to be replicated in larger studies.

This study is, to the best of our knowledge, the first NGS-based genetic study in ROP. In this study, we analyzed WES data of 100 premature infants from the large-scale multicenter i-ROP consortium. Although no genes reached genome-wide significance, the results revealed genes and pathways that may be important in development or progression of ROP. Novel genes and pathways may be the targets of future genetic studies such as GWAS.

Methods

This study was approved by the Institutional Review Board at the coordinating center (Oregon Health and Science University) and at each of 8 study centers (Columbia University, University of Illinois at Chicago, William Beaumont Hospital, Children’s Hospital Los Angeles, Cedars-Sinai Medical Center, University of Miami, Weill Cornell Medical Center, and the Genomics Institute at The Lundquist Institute / Harbor-UCLA Medical Center). This study was conducted in accordance with the Declaration of Helsinki49. Written informed consent for the study was obtained from parents of all infants enrolled.

Subjects

The subjects for this study were selected from participants of the Imaging and Informatics for ROP (i-ROP) study, a prospective multicenter cohort study which enrolled preterm infants for ROP screening and collected clinical and imaging data (retinal images obtained using a wide-angle fundus camera [RetCam; Natus Medical Incorporated, Pleasanton, CA]) and blood or saliva samples. In this study, the ROP diagnosis of each eye exam was made by combining clinical exam at each study center and image-based diagnoses by 3 trained graders, as previously described50. The i-ROP study database was reviewed to identify 100 non-Hispanic Caucasian infants enriched in phenotypic extremes (e.g. AP-ROP, non-LBW infants with severe ROP, LBW infants with no ROP) to maximize variant discovery. Among the 373 non-Hispanic Caucasian infants enrolled between July 2011 and October 2016, patients with AP-ROP in at least one eye were selected first. Remaining cases were selected for lowest birthweight in the case of no or mild ROP (defined as ROP less than type 2 ROP), and highest birthweight in severe ROP (defined as pre-plus, type-2 or type 1 ROP), with an enforced ratio of approximately 4:1:1:4 of no, mild, pre-plus/type-2, and type 1 ROP, respectively. In cases where a patient had a twin in the selected set, the most phenotypically extreme infant of the two was selected. Patients without detailed information on demographics, ROP screening, co-morbidities of prematurity, or imaging data were excluded.

Whole exome sequencing

Genomic DNA was extracted from peripheral blood samples and whole-exome sequencing was performed at Bei**g Genomics Institute (BGI; Hong Kong, China). Briefly, genomic DNA was randomly fragmented and enriched for exome sequences using the SureSelect Human All Exon Kit (Agilent Technologies, Santa Clara, CA, USA) and sequencing was performed on a BGISEQ-500 sequencing platform (BGI; Hong Kong, China). Sequencing-derived raw image files were processed by BGISEQ-500 base-calling Software for base-calling and the sequence data of each sample was generated as paired-end reads51.

Bioinformatics analyses

After removing reads containing sequence adaptors and low-quality reads, reads of each sample were aligned to the reference human genome, Genome Reference Consortium Human Build 37 using Burrows-Wheeler Aligner software. Local realignment around insertions and deletions (InDels) and base quality score recalibration were performed using Genome Analysis Toolkit (GATK), with duplicate reads removed by Picard MarkDuplicates. The HaplotypeCaller of GATK was used to call SNPs. After performing hard-filtering for SNPs and Indels as previously described52, the SnpEff tool was used to annotate SNPs (reported in dbSNP v.141) and Indels. The variants were annotated with the allele frequency in 1000 Genomes Project (http://www.internationalgenome.org/)53 or ESP6500 database (http://evs.gs.washington.edu/EVS/), and with prediction algorithms including SIFT54, PolyPhen255, MutationAssessor56, FATHMM57, and MutationTaster58.

Selection of candidate genes

Previously reported candidate genes in ROP and additional genes involving pathways in retinal angiogenesis/vasculogenesis, neuronal development, neuroprotection, and retinal inflammation were selected (n = 164; Supplemental Table S4). Also, additional genes (n = 99) that are related to the 164 candidate genes were identified using Cytoscape with GeneMANIA plugin which finds related genes using large-scale functional association data including protein and genetic interactions, pathways, co-expression, co-localization and protein domain similarity (Supplemental Table S5)59.

Gene-based test

Gene and pathway analyses were run on two subsets of data. First, gene-based testing was performed to detect rare and low-frequency variants with relatively large effect. For the test, variants with a MAF of greater than 0.05 in 1000 Genomes Project and ESP6500 were excluded. In gene-based testing, in order to provide sufficient power, all rare variants within the gene are combined for the statistical analysis. Second, for the common variant analysis, variants with a MAF of ≥ 0.05 in 1000 Genomes Project or ESP6500 were selected. In common variant analysis, each variant is analyzed separately. The qualifying variants for gene-based tests included coding nonsynonymous variants, including missense, stop-gain, stop-loss, start loss, and splice acceptor and donor variants that were predicted to be functionally damaging from at least 1 of the 5 prediction algorithms (SIFT, PolyPhen2, MutationAssessor, FATHMM, and MutationTaster). These qualifying variants in individual genes from the two subsets of data (rare / low-frequency variants and common variants) were subject to sequence kernel association optimal unified test (SKAT-O)60 using the genipe (GENome-wide Imputation PipelinE) v.1.4.0 module61 with adjustment for BW and GA.

Gene set enrichment analysis

Gene set enrichment analysis (GSEA) was performed using WebGestalt (WEB-based Gene SeT AnaLysis Toolkit, http://www.webgestalt.org) using two sets of genes from the rare/low-frequency and common variants analyses, respectively62. For GSEA, we included Gene Ontology (GO, http://geneontology.org)63, KEGG (Kyoto Encyclopedia of Genes and Genomes, http://www.kegg.jp/)64, PANTHER database (Protein ANalysis THrough Evolutionary Relationships, http://www.pantherdb.org)65, Reactome (https://reactome.org)66, and WikiPathways (https://wikipathways.org)67. The obtained P values were adjusted by Benjamini–Hochberg correction (false discovery rate, P < 0.05).

Conference presentation

Portions of this study were presented at the 2018 ARVO annual meeting in Honolulu, HI, on May 1, 2018.