Introduction

ATP-binding cassette (ABC) transporters are a superfamily of membrane proteins that, in humans, comprise 48 genes. ABC transporters catalyse the translocation of a wide spectrum of endogenous substrates across biological membranes, including amino acids, sugars, nucleosides, vitamins, lipids, bile acids, leukotrienes, prostaglandins, uric acid, antioxidants, as well as a multitude of natural toxins (Liang et al. 2015). In addition, ABC transporters mediate the export of a plethora of drug substrates, including calcium channel blockers, HIV protease inhibitors, vinca alkaloids, topoisomerase inhibitors, methotrexate, anthracyclines, and taxanes, into the extracellular space and are thus key modulators of drug resistance, particularly in oncology (Robey et al. 2018). Hence, ABC transporters are of specific clinical and regulatory interest for their involvement in drug–drug interactions (König et al. 2013; Marquez and Van Bambeke 2011; Zhang et al. 2018).

Genetic variants in ABC transporters contribute to the inter-individual variability in the risk of adverse drug reactions and treatment efficacy, and are key modulators of drug resistance. Arguably, the most studied are polymorphisms in ABCB1 (encoding MDR1, P-gp), which have been associated with methotrexate clearance (Kim et al. 2012a), response to antiretroviral protease inhibitors (Coelho et al. 2013), as well as with pharmacokinetics, response, and toxicity of imatinib (Dulucq et al. 2008; Ma et al. 4a).

In addition to this backbone, some transporters have additional domains. ABCA transporters have two large extracellular domains (ECDs), while transporters of the ABCB and ABCC subfamilies contain an additional N-terminal TMD0 domain with unclear functional relevance. Furthermore, seven ABC genes of the ABCB subfamily encode only half-transporters (one NBD and one TMD domain) that require homo- or heterodimerization for transporter activity.

Fig. 4

Structural analysis of putatively deleterious genetic variants of the ABC transporter superfamily. a Illustration of the tertiary structures of ABCA, ABCB, and ABCC transporters. As representative examples, the structures of ABCA1 (PDB identifier 5XJY), ABCB10 (ABCB half transporter; PDB identifier 4AYT), ABCB11 (BSEP; ABCB full transporter), and ABCC7 (CFTR; PDB identifier 5UAK) are shown. Transmembrane domains (TMDs) are shown in red, nucleotide-binding domains (NBDs) are depicted in blue and turquoise, Walker motifs are colored in salmon and the N-terminal Lasso motif is depicted in yellow. b Overview of the genetically encoded structural variability stratified by ABC subfamily and domain. c Schematic topology models as well as 3D protein structures of MDR1 encoded by ABCB1. Different domains in the topology models are shaded based on the identified number of deleterious variants per amino acid in the respective domain. MDR1 comprises two pseudo-symmetrical TMDs and NBDs encoded in a single polypeptide, colored in orange and blue, respectively. Detailed 3D structures of key protein domains with functionally relevant variants (sticks in cyan or magenta) and substrates (sticks in yellow) are shown as insets under the topology model. In the 3D model, all putatively deleterious variants with MAF > 0.1% are shown as light red spheres, whereas the corresponding part of the secondary structure motif is highlighted in salmon in case of variants with MAF < 0.1%. Note that N21D localizes to the lasso motif for which no crystallographic data were available and the variant is thus not shown. ECD extracellular domain, TMD transmembrane domain, NBD nucleotide-binding domain

When stratifying by domains, we found that genetic variability differed substantially between transporters (Fig. 4b). The lowest number of variants per residue was found in the TMD0 domains of ABCB transporters, with 0.21 variants/amino acid. In contrast, the NBD2 domains of ABCB and ABCC transporters were more variable (0.35 variants/amino acid). For individual genes, the TMD1 (0.05 variants/amino acid) and NBD1 domains (0.07 variants/amino acid) of ABCB7 were most conserved, while the TMD1 and TMD2 domains of ABCC7 (0.65 variants/amino acid) and ABCA7 (0.56 variants/amino acid), respectively, were more than 10-fold more variable.
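The variability metric used here is a simple ratio of the number of variants observed in a domain to the domain length in amino acids. The following minimal Python sketch illustrates the calculation and the fold comparison between domains. The `Domain` class, the gene labels, and all counts and lengths are hypothetical and chosen only for illustration; they are not values from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """A protein domain with its length and the number of variants observed in it."""
    gene: str
    name: str
    length_aa: int   # domain length in amino acids
    n_variants: int  # number of distinct variants observed within the domain


def variants_per_residue(domain: Domain) -> float:
    """Return the number of variants per amino acid for one domain."""
    if domain.length_aa <= 0:
        raise ValueError("domain length must be positive")
    return domain.n_variants / domain.length_aa


def fold_difference(variable: Domain, conserved: Domain) -> float:
    """Return how many times more variable one domain is than another."""
    reference = variants_per_residue(conserved)
    if reference == 0:
        raise ValueError("reference domain has no variants; fold difference is undefined")
    return variants_per_residue(variable) / reference


# Hypothetical example domains (illustrative numbers only).
conserved_domain = Domain("GENE_X", "TMD1", length_aa=300, n_variants=15)
variable_domain = Domain("GENE_Y", "TMD1", length_aa=300, n_variants=195)

if __name__ == "__main__":
    print(variants_per_residue(conserved_domain))
    print(variants_per_residue(variable_domain))
    print(fold_difference(variable_domain, conserved_domain))
```

Normalizing by domain length makes domains of different sizes comparable, which a raw variant count would not.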

Finally, we aimed to corroborate our computational variant predictions using structural mapping approaches by focussing on the pharmacogenetically most important ABC transporter, MDR1 (also known as P-gp; encoded by ABCB1), for which high-resolution crystal structures are available (Kim and Chen 2018) (Fig. 4c). The clinically important missense variation A893S/T is located in the second intracellular loop of TMD2, which interacts with NBD1, and is necessary for structural stability. The S400N polymorphism is localized directly adjacent to the critical tyrosine at position 401, which coordinates the ATP in its binding pocket in NBD1 by direct van der Waals interactions with the adenine of the bound ATP molecule. Q1107P resides within the NBD2 Q-loop, which is necessary for ATPase activity and stabilizes the NBD dimer. No common variants were identified in any transmembrane helix or extracellular domain. However, we found a variety of rare variations in structurally important residues, including variants at the catalytic glutamate residue 556, which is required for ATP hydrolysis (Sauna et al. 2002), as well as various amino acid exchanges in the functionally critical NBD1 and NBD2 Q-loops (Zolnerciks et al. 2014).

Ethnogeographic distribution of pathogenic ABC alleles can inform about Mendelian disease epidemiology

We previously showed that the frequencies of loss-of-function variants in SLC transporter genes implicated in recessive Mendelian disorders are suitable proxies to estimate population-specific disease risk (Schaller and Lauschke 2019). Here, we analyzed whether similar associations could be identified for ABC transporter genes. To this end, we compared the frequencies of loss-of-function variants, defined as frameshift, start-lost, or stop-gain variations or variants that affected critical splice site residues, between ABC transporter genes with and without implication in hereditary disease (Fig. 5).
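The loss-of-function definition above amounts to a filter on variant consequence followed by a per-gene, per-population aggregation. The Python sketch below shows one way this could look. The consequence labels, the record layout, and the choice to aggregate by summing allele frequencies are assumptions made for illustration (summation is only a reasonable approximation when the variants are rare and seldom co-occur on the same haplotype); they do not describe the annotation pipeline used in this study.

```python
from collections import defaultdict

# Hypothetical consequence labels; real annotation tools use their own vocabularies.
LOF_CONSEQUENCES = {"frameshift", "start_lost", "stop_gained", "splice_site"}


def aggregate_lof_frequency(variants):
    """Sum loss-of-function allele frequencies per (gene, population).

    `variants` is an iterable of dicts with the keys 'gene', 'population',
    'consequence', and 'allele_frequency'. Variants whose consequence is not
    in LOF_CONSEQUENCES are ignored.
    """
    totals = defaultdict(float)
    for variant in variants:
        if variant["consequence"] in LOF_CONSEQUENCES:
            key = (variant["gene"], variant["population"])
            totals[key] += variant["allele_frequency"]
    return dict(totals)


# Hypothetical records (illustrative numbers only).
example_variants = [
    {"gene": "ABCC6", "population": "EAS", "consequence": "stop_gained", "allele_frequency": 0.002},
    {"gene": "ABCC6", "population": "EAS", "consequence": "frameshift", "allele_frequency": 0.001},
    {"gene": "ABCC6", "population": "EAS", "consequence": "missense", "allele_frequency": 0.050},
]

if __name__ == "__main__":
    for (gene, population), frequency in aggregate_lof_frequency(example_variants).items():
        print(gene, population, frequency)
```

In this example the missense record is excluded, so only the stop-gain and frameshift frequencies contribute to the aggregate.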

Fig. 5

Genetic variability in ABC genes associated with genetic disorders can inform about population-specific disease risk. The gene-wise aggregated frequencies of loss-of-function (LoF) variants (frameshifts, start-lost, stop-gain, and splice site variants) are shown for ABC genes with known associations with congenital diseases (a) as well as for non-disease-associated genes (b)

Overall, 17 of 48 ABC genes are linked to autosomal recessive Mendelian disorders (Supplementary Table 3). Reduced CFTR (ABCC7) function is associated with cystic fibrosis (CF; OMIM 219700). We calculated homozygosity frequencies for ABCC7 loss-of-function variants of 1 in 1850 and 1 in 4300 in Ashkenazim and European individuals, respectively, whereas frequencies in individuals of African and Asian ancestry were 1 in 24,000 and < 1 in 40,000, respectively. Impaired function variants in ABCC6 are associated with pseudoxanthoma elasticum (PXE; OMIM 264800). In our data set, we found the highest aggregated ABCC6 loss-of-function frequency in individuals of East Asian ancestry (0.5%), resulting in estimates of affected individuals of 1 in 42,530. Similarly high carrier rates were identified in Europeans (0.4%; 1 in 52,000) and Finns (0.4%; 1 in 82,000), whereas risk allele prevalence was significantly lower in all other populations. Congenital generalized hypertrichosis (OMIM 135400) is a rare disease with varying presentations and comorbidities that is speculated to be, at least in part, caused by loss of ABCA5 function (DeStefano et al. 2014). While global prevalence rates have, to our knowledge, not been reported, the disease was originally described in individuals of Mexican ancestry (Pavone et al. 2015), aligning with our finding of highest ABCA5 loss-of-function frequencies in Latino populations (0.7%; 1 in 20,500).
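The reported "1 in N" estimates are of the magnitude expected when an aggregated loss-of-function allele frequency q is squared, as for a recessive disorder under Hardy–Weinberg equilibrium. The following Python sketch illustrates that calculation. It assumes that the aggregated frequency can be treated as a single allele frequency, that mating is random, and that penetrance is complete; whether the estimates in this study were derived in exactly this way is an assumption. The function names are hypothetical.

```python
def expected_affected_fraction(lof_allele_frequency: float) -> float:
    """Expected fraction of individuals carrying two loss-of-function alleles.

    Assumes Hardy-Weinberg equilibrium, so the fraction is q squared.
    """
    q = lof_allele_frequency
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return q ** 2


def one_in_n(lof_allele_frequency: float) -> float:
    """Express the expected affected fraction as 'one in N' individuals."""
    fraction = expected_affected_fraction(lof_allele_frequency)
    if fraction == 0.0:
        raise ValueError("no affected individuals expected at frequency 0")
    return 1.0 / fraction


if __name__ == "__main__":
    # Rounded frequencies as quoted in the text (0.5% and 0.7%).
    for q in (0.005, 0.007):
        print(f"q = {q:.3f}: about 1 in {one_in_n(q):,.0f}")
```

With the rounded values quoted in the text, q = 0.5% corresponds to about 1 in 40,000 and q = 0.7% to about 1 in 20,400, close to the reported 1 in 42,530 and 1 in 20,500; the remaining differences are consistent with rounding of the percentages.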

In conclusion, these data provide an overview of the frequency of ABC loss-of-function variants in the general population that can be used to estimate population-specific Mendelian disease risk, thus providing valuable information for epidemiological rare disease research and clinical geneticists.

Discussion

The ABC superfamily of transporters is of importance for drug response and toxicity, as well as for genetic rare disease research. ABC transporters translocate a wide spectrum of endogenous substrates and medications. Consequently, identification of ABC transporters that interact with a drug candidate constitutes a critical step in drug discovery and development (Benadiba and Maor 2016; Yee et al. 2018). Previous clinical studies implicated genetic germline polymorphisms in at least 12 ABC genes with risk of adverse drug reactions or altered chemotherapy efficacy (Tables 1, 2, 3 and Supplementary Table 2). In addition, genetic variations in 21 ABC genes are causative for Mendelian disorders. Therefore, understanding the genetic landscape of ABC transporters constitutes a potentially important area for the personalization of oncological therapy and for epidemiological studies of risk alleles for the relevant Mendelian diseases.

In this study, we detected a total of 62,793 exonic variants, the vast majority (98.5%) of which are rare and functionally poorly understood. In addition to these single-nucleotide variants and indels, we identified 1003 ABC alleles in which at least one exon was deleted or duplicated. Notably, somatic ABC gene CNVs have been implicated in acquired drug resistance. Studies of drug-resistant cell lines derived from human neoplasms identified amplifications of at least 13 ABC transporter genes, including ABCB1, ABCC1 and ABCC4 (Yasui et al. 2004). Conversely, deletions of the multi-drug resistance transporters predicted response to neoadjuvant therapy in breast cancer patients (Litviakov et al. 2016). Notably, while drug resistance is primarily characterized by somatic amplification events, the majority of CNVs in our data set were deletions. It will be interesting to observe whether patients with germline deletions of pharmacologically important drug transporters are predisposed to favorable therapeutic responses to drugs that are substrates of the deleted transporter.

There is an increasing body of evidence describing differences in drug response, ADRs and clinical outcomes from chemotherapy based on genetic differences between ethnic groups (Phan et al. 2011). For instance, Caucasian colon cancer patients were at significantly higher risk of developing diarrhea, nausea, vomiting, and stomatitis during adjuvant 5-fluorouracil-based chemotherapy compared to African Americans (McCollum et al. 2002). Moreover, the risk of dose-limiting ADRs due to taxanes or platinum therapy was significantly lower in Caucasian lung cancer patients compared to patients of Asian descent, whereas response rates consistently showed inverse correlations (Gandara et al. 2009; Lara et al. 2009, 2010). This variability is likely to be caused, at least in part, by differences in the allelic distribution of genes involved in the disposition of the respective chemotherapeutics.

Mounting evidence suggests that the targeted interrogation of candidate pharmacogenetic polymorphisms is not sufficient to accurately predict the drug response of a given patient (Lauschke and Ingelman-Sundberg 2016, 2018). Importantly, our previous data indicate that variant burden rather than allele status of specific ABC variants is a predictor of clinical outcomes, thus corroborating that NGS-based approaches can add value to personalized cancer prognostics.

Mapping of clinically impactful variants onto the 3D structure of MDR1 revealed a preferential localization in NBDs. Generally, the NBDs in MDR1 are highly conserved compared to the substrate-binding domains, indicating that NBDs might be more sensitive to functional alterations, whereas impacts of variations in the substrate-binding domain or translocation channel seem to be less pronounced (Wolf et al. 2011). The two synonymous variants indicated here (G412G and I1145I), although not resulting in amino acid exchange, have been suggested to affect transporter function by disrupting the cotranslational folding process via introduction of rare codons (Kimchi-Sarfaty et al. 2007). The triallelic variation at position A893, which localizes to a less conserved transmembrane helix, has not been reported to affect transporter function in vitro (Kimchi-Sarfaty et al. 2002). Thus, functional effects associated with this variant might be due to the strong linkage with G412G and I1145I (Fung and Gottesman 2009).

Overall, we found that the genetic variability of the ABC transporter superfamily was highly population-specific and that inter-ethnic variability was commensurate with other genetically diverse pharmacogene families, including CYPs (Zhou et al. 2017), SLCOs (Zhang and Lauschke 2019) and UGTs (Kaniwa et al. 2005). Overall, 74.9% of all variants that were predicted to affect the functionality of the respective ABC transporter were specific to a single population and the overall load of functional genetic variability differed considerably between the analyzed populations. Inter-ethnic variability was furthermore reflected in differences in population-specific prevalence of ABC-associated Mendelian diseases with autosomal recessive inheritance. For instance, frequencies of CF are around 1 in 2500–3500 newborns of Caucasian ancestry, whereas only 1 in 17,000 and 1 in 31,000 children of African and Asian ancestry are affected, which closely aligns with predictions based on loss-of-function carrier rates (1 in 1850 in Europeans, 1 in 24,000 in Africans, and < 1 in 40,000 in East Asians). Similarly, PXE has been reported to have a prevalence around 1 in 50,000 Dutch individuals (Kranenburg et al. 2019), compared to our estimates of 1 in 52,000 in Europeans based on ABCC6 loss-of-function allele frequencies. Interestingly, ABCC6 was also the ABC gene that was found to harbour most CNVs, which aligns with previous studies describing genomic deletions in this locus in PXE patients (Costrop et al. 2010; Katona et al. 2005). Combined, these data suggest that population-scale sequencing data provide an important tool to predict Mendelian ABC disease risk. Notably, however, this approach is only suitable for diseases in which heterozygous loss of gene function is phenotypically silent, thus excluding autosomal dominant or X-linked modes of inheritance.

Taken together, our analyses revealed striking ethnogeographic differences in ABC variability profiles that might explain at least part of the observed variability in chemotherapy response and incidence of Mendelian disorders between populations. Furthermore, the population-scale genomic data set presented here promises to provide a powerful resource for the evaluation of genetic ABC disease epidemiology.

In summary, we comprehensively profiled the genetic variability of the human ABC transporter superfamily and revealed a surprising extent of rare and population-specific variations. Computational evaluations of the functional impacts of these variants indicate that these variants contribute considerably to the variability in ABC transporter function with potentially important consequences for chemotherapeutic treatment regimens. Thus, these data incentivize the consideration of sequencing-based genotypes for patient stratification, particularly in the current era of clinical trial globalization. Furthermore, we expect that a deeper understanding of the functional consequences of ABC transporter variability might be useful to improve public health strategies and flag patients at risk of not responding appropriately to treatment with ABC substrates.