Background

Tuberculosis (TB) remains the most lethal infectious disease, with an estimated 1.4 million deaths in 2018 [1]. The human-adapted Mycobacterium tuberculosis complex (MTBC), the causative agent of TB, comprises eight phylogenetic lineages with a phylogeographical population structure [2, 3]. These include the Indo-Oceanic (Lineage 1), East Asian (Lineage 2), Central Asian (Lineage 3), Euro-American (Lineage 4) and Ethiopian (Lineage 7) lineages, known as Mycobacterium tuberculosis sensu stricto; West African 1 (Lineage 5) and West African 2 (Lineage 6), referred to as Mycobacterium africanum; and Lineage 8 (L8), which is geographically restricted to the African Great Lakes region [2,3,4].

Several studies have shown that genomic differences among MTBC lineages or sublineages can affect the clinical and epidemiological characteristics of TB infection [5,6,7,8]. In recent decades, some Mycobacterium tuberculosis (Mtb) lineages/sublineages have attracted wide attention because of features such as transmission potential, pathogenic properties and association with drug resistance [9, 10]. Lineages 2 and 4 are widely distributed and seem to have a higher pathogenic power than geographically restricted lineages [2, 11, 12]. In West and South Asia, a sharp increase has been documented in the circulation of certain sublineages, such as NEW-1 (Lineage 4) and CAS (Lineage 3) strains, that are prone to emerging as resistant clones [13,14,15]. This increase seems to be more important in Iran, where the national average TB rate is 14 cases per 100,000 population, due to the influx of Afghan refugees and population growth [1]. Accordingly, acquiring comprehensive insight into the dynamics of the Mtb population structure is an essential step toward adopting effective TB control strategies and improving therapeutic methods and vaccines.

Therefore, the current systematic review and meta-analysis was conducted to determine (1) the overall prevalence of Mtb genotypes/sublineages and (2) the dominant multidrug-resistant (MDR) Mtb genotypes in TB patients in Iran.

Methods

Study protocol

The meta-analysis followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [16]. The study protocol was registered in the PROSPERO database (CRD42020186561).

Search strategy and selection criteria

To evaluate the diversity of Mtb isolates in Iran, a comprehensive literature search was conducted using the international electronic databases MEDLINE and Scopus as well as Iranian databases. English-language studies published until April 2020 were retrieved using the following keywords: “Mycobacterium tuberculosis”, “tuberculosis”, “molecular typing”, “genetic diversity”, “genotyping” and “Iran” combined with the Boolean operators “OR”, “AND” and “NOT” in the Title/Abstract/Keywords field. Additional keywords such as “lineage” combined with “Mycobacterium tuberculosis” were used to avoid missing any articles. Similar strategies using Persian keywords were used to find relevant Persian original articles in Iranian databases, such as Scientific Information Database (www.sid.ir), Irandoc (www.irandoc.ac.ir), Magiran (www.magiran.com), and Iranmedex (www.iranmedex.com).

The titles and abstracts of all the identified articles were reviewed for eligibility, and the relevant articles were then screened by reviewing the full texts.

The inclusion criteria were: 1) studies reporting the prevalence of Mtb genotypes among TB patients, 2) studies presenting data from Iran irrespective of the publication year, and 3) studies that used spoligotyping, mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) typing or whole-genome sequencing (WGS) for genotyping. The exclusion criteria were: 1) studies presenting prevalence data on Mtb genotypes only among drug-resistant Mtb isolates, 2) studies providing incomplete data, 3) studies published as meta-analyses or systematic reviews, 4) studies not in English or Persian, 5) studies limited to a single genotype, 6) studies that lacked genotyping data, and 7) studies that were not related to human TB molecular epidemiology. Data screening was performed by two reviewers independently.

Data extraction and quality assessment

Data from the studies meeting our inclusion criteria were extracted. We required the following data: first author’s name, year of publication, study area, molecular techniques, genotype, number of genotypes, total sample size, MDR genotype, sample type and nationality.

We evaluated the methodological quality of the included studies against the items of the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklist, using the pre-defined criteria presented in Table 1. The checklist covers various methodological aspects; the maximum quality score was 32, and articles with scores below 18 were excluded from this study [51]. Data extraction and quality assessment were also carried out by two reviewers independently.

Table 1 Characteristics of 34 included studies in this meta-analysis

Statistical analysis

Pooled proportions and 95% CIs were used to assess the prevalence of the genotypes in the pulmonary tuberculosis (PTB) and extrapulmonary tuberculosis (EPTB) samples. A generalized linear mixed model (random-intercept logistic regression) was used to estimate pooled prevalence [52]. The heterogeneity of prevalence between the included studies was tested with Cochran’s Q test and quantified with the I² index [53]. The Clopper-Pearson method was used to compute the proportion and confidence interval for the individual studies. A continuity correction of 0.5 was applied in studies with zero cell frequencies [54]. The pooled proportion, as an overall prevalence of the genotypes, was derived from the random effects model because of significant heterogeneity between the individual studies. Publication bias was tested by Egger’s linear regression test and Begg’s test (P < 0.05 was set as the significance level for publication bias) [55]. All statistical analyses were performed using the metafor R package and MedCalc software.
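The quantities named above can be illustrated with a minimal Python sketch that uses only the standard library. It is not the analysis code of this study: the pooling step uses the simpler inverse-variance DerSimonian-Laird estimator on logit-transformed proportions rather than the generalized linear mixed model fitted with metafor, and the function names (`clopper_pearson`, `pool_logit_dl`) and the study counts in the usage lines are hypothetical.

```python
import math


def clopper_pearson(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) CI for x events out of n, found by bisection.

    Intended for moderate n (a few hundred); very large n can overflow floats.
    """
    def cdf(k, p):
        # P(X <= k) for X ~ Binomial(n, p); decreasing in p
        return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

    def solve(f, target):
        # f is monotone decreasing on [0, 1]; locate p with f(p) = target
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(lambda p: cdf(x - 1, p), 1 - alpha / 2)
    upper = 1.0 if x == n else solve(lambda p: cdf(x, p), alpha / 2)
    return lower, upper


def pool_logit_dl(events, totals):
    """Random-effects pooled proportion (DerSimonian-Laird, logit scale)."""
    y, v = [], []
    for x, n in zip(events, totals):
        if x == 0 or x == n:
            # 0.5 continuity correction: add 0.5 to events and to non-events
            x, n = x + 0.5, n + 1.0
        y.append(math.log(x / (n - x)))      # logit of the study proportion
        v.append(1 / x + 1 / (n - x))        # approximate variance of the logit
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0      # I² in percent
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0          # between-study variance
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def expit(t):
        return 1 / (1 + math.exp(-t))

    return {
        "pooled": expit(mu),
        "ci": (expit(mu - 1.96 * se), expit(mu + 1.96 * se)),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Hypothetical counts of one genotype in three studies (illustration only)
events, totals = [12, 30, 7], [80, 120, 95]
print([clopper_pearson(x, n) for x, n in zip(events, totals)])
print(pool_logit_dl(events, totals))
```

Pooling on the logit scale keeps the back-transformed interval inside 0 to 1. A mixed-model fit can give somewhat different estimates from this two-step approximation, particularly when individual studies report very few isolates of a genotype.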

Results

Search results and studies’ characteristics

A total of 316 articles were identified by the primary search strategy, of which 34 articles met the eligibility criteria and were included in this study (Fig. 1). The selected studies included 8329 clinical samples. Most of the studies were conducted in Tehran (capital of Iran). Publication year of these studies ranged from 2006 to 2020. Spoligotyping and MIRU-VNTR typing were identified as the most common methods of genotyping.

Fig. 1 Flow diagram of literature selection process in the meta-analysis

Quality assessment

On the STROBE checklist, the highest and lowest scores were obtained by the studies of Velayati et al. (2014) and Feyisa et al. (2016), respectively. The mean STROBE score was 25.72 (SD = 2.42, range: 20–31) (Table 1).

Pooled prevalence of MTBC genotypes in the PTB and EPTB samples

Results of the random or fixed effects meta-analysis are summarized in Table 2. M. bovis, a member of the animal-adapted MTBC, accounted for only 3.29% of the studied strains, and Mtb sensu stricto (Lineages 1–4) comprised the largest proportion of the studied strains. Based on the pooled prevalence of the Mtb genotypes in the PTB and EPTB samples, NEW1 (21.94%, 95% CI: 16.41–28.05%), CAS (19.21%, 95% CI: 14.95–23.86%), EAI (12.95%, 95% CI: 7.58–19.47%), and T (12.16%, 95% CI: 9.18–15.50%) were found to be the dominant circulating genotypes in Iran. West African (L5/6), Cameroon, TUR and H37Rv (parts of the Euro-American super-lineage [L4]) were identified as the genotypes with the lowest prevalence in Iran (< 2%). Forest plots of some of the genotypes (i.e., Beijing, CAS, and EAI) are shown in Fig. 2. In addition, the highest pooled prevalence of MDR strains was found in the Beijing (2.52%) and CAS (1.21%) genotypes (Table 2).

Table 2 Pooled prevalence of MTBC genotypes in each studied genotype in PTB and EPTB samples

Fig. 2 Forest plots displaying the prevalence of different Mtb genotypes in the studied geographical region

Publication bias

We observed significant heterogeneity across the studies based on the I² index, with a few exceptions (Table 2). However, publication bias was not significant based on the results of Egger’s linear regression test and Begg’s test.
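For readers unfamiliar with Egger’s test, the sketch below shows the regression behind it in plain Python: each study’s standardized effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept far from zero suggests funnel-plot asymmetry. The function name `egger_regression` and the input values are hypothetical, the effects are assumed to be on the logit scale, and the sketch returns only the intercept and its t statistic (k − 2 degrees of freedom) without a P value. It is not the code used for this study, and Begg’s rank test is not shown.

```python
import math


def egger_regression(effects, std_errors):
    """Egger's regression of standardized effect on precision.

    Returns the OLS intercept, slope, intercept standard error and t statistic.
    """
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are needed")
    z = [y / s for y, s in zip(effects, std_errors)]  # standardized effects
    x = [1 / s for s in std_errors]                   # precisions
    mx, mz = sum(x) / k, sum(z) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mz - slope * mx
    # Residual variance with k - 2 degrees of freedom
    resid = [zi - intercept - slope * xi for xi, zi in zip(x, z)]
    s2 = sum(r * r for r in resid) / (k - 2)
    se_int = math.sqrt(s2 * (1 / k + mx**2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return {"intercept": intercept, "slope": slope, "se": se_int, "t": t_stat}


# Hypothetical logit prevalences and standard errors from five studies
effects = [-1.6, -1.2, -1.9, -1.4, -1.1]
std_errors = [0.30, 0.22, 0.41, 0.18, 0.35]
print(egger_regression(effects, std_errors))
```

The t statistic would be compared with a t distribution on k − 2 degrees of freedom to obtain the P value that is checked against the 0.05 threshold.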

Discussion

Based on the pooled data, all MTBC lineages except lineages 7 and 8 were found in Iran, which reflects high diversity among MTBC strains. The phylogeographical population structure of the MTBC stems from the interplay between factors such as human migration, geography, genetic drift and host-pathogen interaction [4, 5, 56]. Iran is the main host country for Afghan refugees, but the main factor shaping the phylogeography of MTBC lineages in Iran has not been identified.

A study of global variation in MTBC strains showed that the prevalence of lineage 2, 3 and 4 strains may be increasing in West Asia, while the prevalence of lineage 1 is declining [15]. The summary of Mtb strain diversity in Iran, based on families/sublineages, showed that NEW1 (L4) (21.94%, 95% CI: 16.41–28.05%), CAS (L3) (19.21%, 95% CI: 14.95–23.86%), EAI (L1) (12.95%, 95% CI: 7.58–19.47%), and T (L4) (12.16%, 95% CI: 9.18–15.50%) were the dominant circulating Mtb genotypes. EAI (L1) and CAS (L3) are mainly confined to the areas around the Indian Ocean [11]. Movement of strains with people from these regions may explain the presence of these genotypes in Iran. In addition, the appearance of CAS as one of the prevalent Mtb subpopulations in Iran may reflect the pathogenic properties of this genotype.

In a recent study, the global proportion of MDR in the CAS population was estimated at 30.63% [57]. In our study, based on the pooled prevalence of MDR genotypes, CAS (1.21%) was one of the dominant genotypes. This finding reflects the need for better understanding and monitoring of this subpopulation.

Despite the global dissemination of the Beijing genotype as a prototype of lineage 2, it had a low prevalence in our geographical region. However, the highest pooled prevalence of MDR strains was found in the Beijing genotype (2.52%). This result is consistent with previously published reports on the prevalence of Beijing among MDR-TB isolates in Iran [58]. The low prevalence of the Beijing genotype compared to other genotypes in Iran may be explained by the prevalent Beijing sublineage, which affects its pathobiological properties and epidemiological dynamics. Further studies are warranted to identify the distribution pattern of the Beijing sublineages in Iran, which could improve the management of these infections.

The dominance of NEW1 as a specialist sublineage of the Euro-American lineage (L4) in Iran was not unexpected. Some evidence has shown that Iran is the probable origin of this family/sublineage, which may reflect ecological adaptation in this subpopulation [59]. It is noteworthy that the NEW1 genotype is prone to MDR [13]. The pooled prevalence of MDR in NEW1 was 0.8%. However, the overall MDR estimates may be less representative of the target population because, in some of the included studies, drug susceptibility testing was not reported by genotype, which may lead to variation in the final results. Other sublineages of lineage 4, such as T, Haarlem, Uganda and S, were also observed in varying proportions. This distribution pattern among the subtypes of lineage 4 in Iran may be explained by the effect of human migration and by the genetic and phenotypic characteristics of each sublineage.

In addition, we observed that the lineage 5/6 subtype had the lowest prevalence in our geographical region. Because these strains are geographically restricted [2], we can only speculate that human migration is the determinant of this distribution. A limitation of this study is that most of the included studies were conducted in Tehran (capital of Iran). Thus, our findings may not be completely representative of the overall prevalence of different Mtb populations in Iran. In addition, most of the included studies were based on spoligotyping and MIRU-VNTR typing, whereas WGS provides superior resolution compared with these PCR-based genotyping methods for identifying diversity in Mtb strains.

Conclusions

In summary, this systematic review showed that Mtb populations are genetically diverse in Iran and that the NEW1 (L4) and West African (L5/6) genotypes had the highest and lowest pooled prevalence rates, respectively. This type of evidence can contribute to better clinical and epidemiological management of Mtb infections. Further in-depth studies are also needed to gain deeper insight into the national diversity of Mtb populations and their drug resistance patterns.