Background

HIV remains a worldwide public health threat, although prevention and control efforts have made substantial progress, including reducing HIV-related morbidity and mortality and increasing life expectancy and access to antiretroviral therapy. According to the Joint United Nations Programme on HIV/AIDS (UNAIDS), 38.4 million people were living with HIV at the end of 2021 [1], and the number of new HIV infections was approximately 1.5 million [2]. In China, an estimated 1.25 million people were living with HIV in 2018, with approximately 80,000 new infections that year [3]. As of 2022, more than 80,000 people were living with HIV in Guangdong Province, which ranked fourth out of 31 provinces [4].

Guangdong Province is located in southern China and has a relatively advanced economy and a higher proportion of sexually active individuals [5]. The number of new HIV diagnoses each year is high and increasing owing to the complex mode of sexual transmission, which has been the main route since 2009 [6], accounting for 52.4% at that time and increasing continually to 90.3% in 2014. However, predictions of the number of new HIV infections in Guangdong Province have not been reported in published peer-reviewed articles. Because HIV incidence may differ between provinces, province-level predictions of new HIV infections are important for prevention and control planning, especially since heterosexual transmission is the dominant transmission mode in China.

HIV is increasingly difficult to prevent and control in China because of the intricate modes of sexual transmission, which include heterosexual transmission from heterosexual men or from men who have sex with men (MSM) during heterosexual acts, as well as homosexual transmission. HIV was observed to begin spreading from key populations to the general population in 2007 [4, 7]. Furthermore, under the influence of traditional culture, including marital pressures and filial expectations [8, 9], sexual transmission makes the HIV epidemic complex and diverse because sexual acts are mixed across marital and nonmarital partnerships, commercial and noncommercial partnerships, heterosexual and homosexual partnerships, etc. [10].

There are different methods to estimate the number of new HIV infections, including mathematical models such as compartmental models and the workbook method; however, a compartmental model is often used to estimate the number of infections [11] and is also recommended by UNAIDS [12,13,14]. Dynamic compartmental models, which go beyond the workbook method [15] in long-term prediction, are also classical mathematical models of infectious disease transmission. However, in addition to parameter estimation, there are two considerable challenges in developing a compartmental model to predict new HIV infections: defining risk groups related to MSM according to the characteristics of the local epidemic, and defining a criterion or reference for model calibration.

One definition of MSM used in predictive models of new HIV infections considers the sex of their partners in sexual acts in addition to their number of sexual partnerships; in some countries, some MSM also have female sexual partners and are known as men who have sex with men and women (MSMW). The marriage rate among Chinese MSM ranges from 31.2% to 70% [16,17,18,19] and is even higher over the lifetime [20], and the number of wives (tongqi) of MSM is between 1 million and 16 million [21]. However, previous compartmental models divided MSM into only two subgroups according to their number of sexual partnerships [22, 23]. This was inconsistent with the characteristics of the epidemic in these countries because it ignored the risk of HIV transmission [24,25,26], which was higher among MSMW than among MSM only [27]; classifying MSM into only two subgroups also makes it difficult to reconcile the predicted numbers with the numbers of reported cases [3]. MSMW may transmit HIV [26, 28]. With the HIV prevalence among MSM increasing, the HIV incidence among their female partners increased by 5.3 times from 2002 to 2010 [29]. In general, MSM did not disclose their sexual orientation or HIV seropositivity to their wives or female sexual partners in order to satisfy social and familial expectations [28, 30], which reduced awareness of HIV prevention [31] during heterosexual acts; condom use was also lower during heterosexual acts than during homosexual acts, for reproductive purposes [18, 32] or other reasons [33,34,35].

Another challenge is how to determine a criterion for model calibration for a compartmental model. Given that the problem of late diagnosis is prevalent worldwide [36,

Methods

We developed a deterministic compartmental model with four states to predict the HIV epidemic in the population aged 15 and over in Guangdong Province from 2016 to 2050. The model predicted the number of new HIV infections and its 95% credible interval (CrI). We used the Morris and Sobol methods to analyse the sensitivity of the model parameters. The predicted number was calibrated by comparing it with the number of yearly new HIV diagnoses (almost equal to the number of new cases reported in China) and the potential proportion of late diagnoses reported by AIDS experts in the Guangdong Provincial CDC or published peer-reviewed articles [37]. The model structure, population size of risk groups, model parameters, sensitivity analysis and model calibration are as follows.

Model structure

Considering births and deaths, the basic structure of a compartmental model with four states is shown in Fig. 1. Four ordinary differential equations describing the states are shown in formulas (1)–(4), denoting four states of the HIV epidemic, four risk groups, and two sexual transmission routes among MSM. The four states were S (susceptible), I (infected), D (diagnosed), and T (treated); the infectivity of people in the I and D states and the antiretroviral treatment failure of those in the T state were evaluated. We defined high-risk MSM as those who satisfied at least one of two conditions [22, 23]: (1) more than 10 sexual partners over the past six months [44] and (2) a rate of inconsistent condom use over the past six months of more than 50.0%, the rate being approximately 50.0% in Guangdong Province [45]; otherwise, they were classified as low-risk MSM. The four risk groups were heterosexual men, heterosexual women, low-risk MSM, and high-risk MSM. The two sexual transmission routes were heterosexual and homosexual. The probability of HIV acquisition per year in the S population is denoted as \(\lambda \left(t\right)\). I(t) is denoted as \(\lambda \left(t\right)S\left(t\right)\). The number of people engaging in potentially high-risk sexual acts per year is denoted as \(N\left(t\right)=S\left(t\right)+I\left(t\right)\), with I(t) including D(t) and T(t). \(\zeta \left(t\right)\) is the entry rate per year, which is the sum of the natural birth rate and natural death rate. \(d\left(S,I,D,T\right)\) is the death rate in each of the states for the risk groups.

Fig. 1 A schematic diagram of the compartmental model of HIV transmission

$$\frac{d{X}_{j,1,i}}{dt}=\sum_{j=1}^{4}{\zeta }_{j}{X}_{j,1,i}-\left(\sum_{k=2,3,4'}{\lambda }_{i,k}^{j}\left(t\right)\right){X}_{j,1,i}-{d}_{j,1}{X}_{j,1,i} \tag{1}$$
$$\frac{d{X}_{j,2,i}}{dt}=\left(\sum_{k=2,3,4'}{\lambda }_{i,k}^{j}\left(t\right)\right){X}_{j,1,i}-{d}_{j,2}{X}_{j,2,i} \tag{2}$$
$$\frac{d{X}_{j,3,i}}{dt}={\delta }_{j}{X}_{j,2,i}-{d}_{j,3}{X}_{j,3,i} \tag{3}$$
$$\frac{d{X}_{j,4,i}}{dt}={\psi }_{j}{X}_{j,3,i}-{d}_{j,4}{X}_{j,4,i} \tag{4}$$
$$j=1,2,3,4;\quad i=1,2$$

j denotes the four risk groups, \(j=1,2,\dots ,4\), where 1 indicates heterosexual men, 2 heterosexual women, 3 low-risk MSM, and 4 high-risk MSM. i denotes the HIV sexual transmission routes, \(i=1,2\), where 1 indicates heterosexual sexual acts and 2 indicates homosexual sexual acts. k denotes the four states, \(k=1,2,\dots ,4'\), where 1 is S, 2 is I, 3 is D, 4 is T, and \(4'\) is treatment failure.
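As a minimal sketch of how equations (1)–(4) can be integrated numerically, the following Python code advances a single risk group and a single transmission route with a forward Euler step. The function names, the step size, and all parameter values are illustrative assumptions rather than the values or the solver used in this study; the force of infection is held constant, and the entry term is simplified to the group's own susceptible population.

```python
def derivatives(state, zeta, lam, delta, psi, death):
    """Right-hand sides of equations (1)-(4) for one risk group and one route.

    state: (S, I, D, T); death: per-state death rates. All rates are per year.
    Simplifying assumption: the entry term uses only this group's S population.
    """
    s, i, d, t = state
    death_s, death_i, death_d, death_t = death
    ds_dt = zeta * s - lam * s - death_s * s   # (1) entry, infection, death
    di_dt = lam * s - death_i * i              # (2) new infections, death
    dd_dt = delta * i - death_d * d            # (3) diagnosis, death
    dt_dt = psi * d - death_t * t              # (4) treatment initiation, death
    return ds_dt, di_dt, dd_dt, dt_dt


def simulate(state, years, steps_per_year=100, **rates):
    """Forward Euler integration; returns the state after the given years."""
    h = 1.0 / steps_per_year
    for _ in range(years * steps_per_year):
        slopes = derivatives(state, **rates)
        state = tuple(x + h * dx for x, dx in zip(state, slopes))
    return state


# Hypothetical inputs, for illustration only.
final_state = simulate(
    (1_000_000.0, 1_000.0, 500.0, 300.0), years=5,
    zeta=0.012, lam=0.0005, delta=0.3, psi=0.8,
    death=(0.007, 0.02, 0.02, 0.01),
)
```

A fixed-step Euler scheme is used here only to keep the sketch dependency-free; an adaptive ODE solver would be the more usual choice for production work.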

Population sizes of the risk groups

The numbers of people in the risk groups (heterosexual men, heterosexual women, low-risk and high-risk MSM, and low-risk and high-risk MSMW, with risk levels among MSM and MSMW classified by the number of male sexual partners) were calculated from the population size and the proportion of people aged 15 and over by sex in the 2020 Guangdong Statistical Yearbook, which includes data from 2019 and before [46]. Data from 2020 were used with the assumption that the population size has remained relatively steady. Presuming that the population size of low-risk MSM was equal to that of high-risk MSM [45], we set the proportion of MSM to 5.0% of all men aged 15 and over and the proportion of heterosexual sexual acts among low- and high-risk MSM to 31.2% [16]. By excluding people already living with HIV, we estimated that the population sizes of heterosexual men, low- and high-risk MSM, and low- and high-risk MSMW were 47.89 million, 1.3 million, 1.3 million, 0.4 million, and 0.4 million, respectively. The population size of heterosexual women was 43.73 million, which excluded people living with HIV and lesbians, whose proportion was presumed to be 5.0%, because HIV transmission is extremely rare in this group.
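The arithmetic behind this partition can be sketched as follows. This is an illustration under stated assumptions, not the study's calculation: the total of 50 million men is a hypothetical round input, the function name is invented, MSMW are treated as a 31.2% subset of each MSM subgroup, and the exclusion of people already living with HIV is omitted.

```python
def partition_men(total_men, msm_share=0.05, msmw_share=0.312):
    """Split men aged 15 and over into the risk groups described above.

    Assumptions: low- and high-risk MSM are equal in size, and MSMW are a
    subset of each MSM subgroup (msmw_share of that subgroup).
    """
    msm = total_men * msm_share
    low_risk_msm = high_risk_msm = msm / 2
    return {
        "heterosexual_men": total_men - msm,
        "low_risk_msm": low_risk_msm,
        "high_risk_msm": high_risk_msm,
        "low_risk_msmw": low_risk_msm * msmw_share,
        "high_risk_msmw": high_risk_msm * msmw_share,
    }


# Hypothetical input: 50 million men aged 15 and over (in millions).
groups = partition_men(50.0)
```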

Model parameters

The model parameters comprised three parts: the known parameters and two unknown parameters (the probability of HIV acquisition and a mixing index). The known parameters that we collected (see the Supplementary material) included demographic, behavioural, biological and epidemiological data from peer-reviewed published articles, domestic government reports, AIDS expert interviews, and the viewpoints of key experts on World AIDS Day; these parameters were also employed in other studies and meta-analyses. Demographic parameters included the population sizes of the four groups of people aged 15 and over and the number of people who entered and left this population yearly because of natural population growth. Behavioural parameters included the number of sexual partners, condom use rates and the effectiveness of condoms, the last two of which differed between heterosexual and homosexual sexual acts. Biological parameters included the probability of HIV transmission per sexual act and death rates in different states. Epidemiological parameters included HIV prevalence, rates of new diagnoses, and rates of antiretroviral therapy.

The first unknown parameter was the probability of HIV acquisition, which was a function of the probability of not acquiring HIV during a high-risk sexual act, the number of sexual acts, the number of sexual partners, condom use rates and the effectiveness of condoms. We presumed that HIV transmission in our model occurred only via sexual transmission, including heterosexual and homosexual acts, because sexual transmission has been the dominant route since 2009 in Guangdong Province; the rate of sexual transmission increased continually to 90.3% in 2014. We also presumed that heterosexual men engaged in only heterosexual sexual acts and that women were infected only by men, who could also be MSMW. These men were indistinguishable from heterosexual men participating in heterosexual sexual acts due to the lack of a relevant compartment for heterosexual women in the compartmental model. The probability of acquiring HIV in a high-risk sexual act among MSM was calculated based on the total number of homosexual sexual acts, of which insertive and receptive anal sex were indistinguishable. People living with HIV who have experienced antiretroviral treatment failure can spread HIV; otherwise, treated people were assumed to have achieved a suppressed viral load and to be non-infectious. The function and its calculation are detailed in the Appendix.
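To illustrate how the inputs listed above can combine into a yearly probability of acquisition, the sketch below uses a standard Bernoulli-process form. This is an assumption made for illustration: the study's own function is given in the Appendix and may differ, and the function name, argument names, and the partner prevalence term are hypothetical.

```python
def acquisition_probability(per_act_risk, acts_per_partner, partners,
                            partner_prevalence, condom_use,
                            condom_effectiveness):
    """Yearly probability of acquiring HIV under a Bernoulli-process model.

    Assumed form (not taken from the study's Appendix): each act with an
    HIV-positive partner is an independent trial, and condoms reduce the
    per-act risk in proportion to their use and effectiveness.
    """
    # Per-act transmission risk after accounting for condom protection.
    effective_risk = per_act_risk * (1.0 - condom_use * condom_effectiveness)
    # Probability of escaping infection over all acts with a positive partner.
    escape_positive = (1.0 - effective_risk) ** acts_per_partner
    # Probability of escaping infection from one randomly chosen partner.
    escape_partner = 1.0 - partner_prevalence * (1.0 - escape_positive)
    # Probability of acquiring HIV from at least one partner during the year.
    return 1.0 - escape_partner ** partners
```

In this form, higher condom use or effectiveness lowers the result, and the result is zero when no partner is HIV-positive.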

The second unknown parameter was a mixing index, which we estimated using approximate Bayesian computation (ABC) [47, 48]. The index was a randomized mixing level of opting for male or female sexual partners among MSM, also called the assortativity, ranging from 0 to 1 [49], where 0 denoted that MSM would only opt for male sexual partners and 1 denoted that MSM would opt for male or female sexual partners completely at random. The number of HIV infections approached the number of HIV diagnoses as the randomized mixing level increased [49]. Finally, an adaptive sequential Monte Carlo (SMC) sampler [50] was adopted considering the cost of calculations, given that a value for the mixing index could not be acquired from epidemiological research or published peer-reviewed studies, which only presented an unconvincing presumed value of 0.5 [23].
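The idea behind ABC can be sketched with its simplest variant, rejection sampling, shown below in place of the adaptive SMC sampler used in this study. The simulator `toy_model`, the observed value, the tolerance, and the uniform prior are all hypothetical stand-ins; in practice the simulator would be the compartmental model and the observation would be the calibration data.

```python
import random


def toy_model(mixing_index, rng):
    """Hypothetical stand-in for the compartmental model's yearly prediction."""
    return 1000.0 * (1.0 + mixing_index) + rng.gauss(0.0, 20.0)


def abc_rejection(observed, tolerance, draws, seed=0):
    """Keep prior draws whose simulated output lies close to the observation."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(draws):
        candidate = rng.uniform(0.0, 1.0)           # uniform prior on [0, 1]
        simulated = toy_model(candidate, rng)
        if abs(simulated - observed) <= tolerance:  # accept close matches only
            accepted.append(candidate)
    return accepted


# Hypothetical observation and tolerance, for illustration only.
posterior = abc_rejection(observed=1500.0, tolerance=30.0, draws=5000)
posterior_mean = sum(posterior) / len(posterior)
```

Rejection sampling wastes most draws when the tolerance is tight, which is the computational cost that sequential samplers such as SMC are designed to reduce.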

Sensitivity analysis

Considering the cost of calculations, we combined a qualitative global method and a quantitative global method, the Morris and Sobol methods, to analyse the sensitivity of the number of new HIV infections predicted by our model to the model parameters. In the first step, we qualitatively identified the sensitive parameters via the Morris method [51]; in the second step, we calculated the quantitative effect of the identified parameters on the predicted number of new HIV infections via the Sobol method [52]. The sensitivity analysis excluded the mixing index because it was certainly sensitive. The outputs of the Sobol method were the total-order and first-order indices of the parameters identified qualitatively from the absolute values of the elementary effects (|EEs|) of the Morris method.
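The screening step can be illustrated with a simplified one-at-a-time estimate of the mean absolute elementary effect for each parameter. This sketch is not the full Morris trajectory design and is not the implementation used in this study; the function name, the step size `delta`, the number of base points, and the assumption that parameters are scaled to the unit interval are all illustrative.

```python
import random


def mean_abs_elementary_effects(model, n_params, n_base_points=20,
                                delta=0.1, seed=0):
    """Average |elementary effect| per parameter over random base points.

    Parameters are assumed to be scaled to [0, 1]. Each elementary effect is
    (model(x + delta * e_i) - model(x)) / delta for the i-th parameter.
    """
    rng = random.Random(seed)
    totals = [0.0] * n_params
    for _ in range(n_base_points):
        base = [rng.uniform(0.0, 1.0 - delta) for _ in range(n_params)]
        f_base = model(base)
        for i in range(n_params):
            shifted = list(base)
            shifted[i] += delta                      # perturb one parameter
            totals[i] += abs((model(shifted) - f_base) / delta)
    return [total / n_base_points for total in totals]


# Hypothetical linear model: the second parameter has no influence.
effects = mean_abs_elementary_effects(lambda x: 2 * x[0] + 5 * x[2], 3)
```

Parameters with a mean |EE| near zero can be screened out before the more expensive variance-based step.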

Model calibration

The model calibration data included the following: the yearly new HIV diagnoses based on the system of case reports on government websites [53, 54] on World AIDS Day from 2016 to 2022; and the proportion of HIV diagnoses based on published peer-reviewed studies [41, 55, 56], the viewpoints of a national AIDS expert [3] or provincial AIDS expert interviews [57] from the Guangdong Provincial CDC. Moreover, the number of new HIV infections may have increased during the COVID-19 epidemic due to temporary disruptions in health services and changes in sexual risk acts, including interrupted antiretroviral treatment, reduced HIV testing, and a higher proportion of unprotected sexual acts. Considering the proportion of late diagnoses in Guangdong Province, the difference was measured by determining the relative ratios of the number of new predicted HIV infections divided by the number of new HIV diagnoses each year, which was deemed the “goodness of fit” of the model, and a value larger than 1.2 was applied as a measure of model calibration [2. The increase in the predicted number of new HIV infections was much larger during the COVID-19 pandemic and the two years after the end of the pandemic, and then the number continuously declined until 2050. The changing tendency of the number of new HIV infections predicted for the four risk groups was in line with the changing tendency of the total number predicted each year (see Fig. 2).
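The calibration criterion described in this section, the ratio of predicted new infections to reported new diagnoses, can be sketched as follows. The function names and the example counts are hypothetical, and `implied_ratio` rests on the simplifying assumption that a year's diagnoses are a fixed share of that year's infections, which the study does not state.

```python
def relative_ratio(predicted_infections, new_diagnoses):
    """Predicted new infections divided by reported new diagnoses in a year."""
    return predicted_infections / new_diagnoses


def implied_ratio(diagnosed_proportion):
    """Ratio implied if diagnoses are a fixed share of that year's infections."""
    return 1.0 / diagnosed_proportion


def meets_criterion(ratio, threshold=1.2):
    """True when the ratio exceeds the calibration threshold of 1.2."""
    return ratio > threshold


# Hypothetical counts, for illustration only.
example = relative_ratio(predicted_infections=14000, new_diagnoses=10000)
```

Under that simplifying assumption, a diagnosed proportion of 71.3% corresponds to a ratio of about 1.40 and a proportion of 78.7% to a ratio of about 1.27, both above the threshold.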

Table 2 Predicted number of new HIV infections in four susceptible populations in Guangdong Province from 2016 to 2050

The number of new HIV infections and its proportion in the four risk groups from our model varied, as shown in Table 2. The predicted number of new HIV infections between 2016 and 2050 among heterosexual women was 2,067 (UI: 2,359–3,016) in 2016, which increased to 2,835 (UI: 2,259–2,879) in 2019, jumped to 4,346 (UI: 3,612–4,735) in 2025, and then decreased to 1,228 (UI: 1,008–2,145) in 2050; among heterosexual men, the number was 2,839 (UI: 2,237–3,243) in 2016, which increased to 3,189 (UI: 2,698–3,842) in 2018, and then decreased to 1,751 (UI: 1,427–2,738) in 2050; among high-risk MSM, the number was 2,288 (95% CrI: 660–3,393) in 2016, which slightly increased to 2,568 (95% CrI: 756–3,798) in 2025 and then decreased to 1,046 (95% CrI: 304–1,549) in 2050; and among low-risk MSM, the number was 1,094 (95% CrI: 329–1,612) in 2016, which increased to 1,209 (95% CrI: 357–1,787) in 2019, then to 1,487 (95% CrI: 448–2,193) in 2026, and decreased to 824 (95% CrI: 242–1,219) in 2050. The predicted number of new HIV infections can be broken down as follows: women accounted for approximately 25.0% of the total, homosexual transmissions accounted for approximately 40.0% of the total, high-risk MSM accounted for approximately 25.0% of the total, and MSM accounted for 55.0% of men.

Fig. 2 Predicted number (95% CrI) of new HIV infections in the population aged 15 and over from 2016 to 2050 in Guangdong Province, China

Discussion

Although this prediction included the period of the COVID-19 pandemic, the predicted number of new HIV infections was consistent with the HIV epidemic in Guangdong based on the relative ratios from 2016 to 2022. While measures and strategies for HIV prevention and control were carried out regularly before the COVID-19 pandemic, from 2016 to 2019, the predicted number of new HIV infections increased slightly, following a trend similar to that of new HIV diagnoses. Overall, the predicted number increased during the COVID-19 pandemic and for approximately two or three years after the pandemic. The proportion of women out of the total predicted number peaked at 39% during this period. Then, the predicted number decreased continually until 2050.

The calibration criterion, the relative ratio between the predicted number in this model and the number of HIV diagnoses, took values larger than 1.2. These values were consistent with the proportions of HIV diagnoses from Guangdong Provincial AIDS expert interviews on World AIDS Day, which were 71.3% in 2019 [58] and 78.7% in 2021 [57], and with the interpretations or viewpoints of national experts regarding the results predicted for China by Spectrum, which is recommended worldwide [3]. The predicted number of new HIV infections jumped during the COVID-19 pandemic, with the calibration criterion ranging from 1.4 to 1.7, because HIV services were interrupted or delayed to different extents when health care workers were diverted to COVID-19 [59]. These services included HIV testing and counselling, referrals, timely antiretroviral treatment, and the promotion of condom use. Previous studies that used simulation models and cross-sectional data reported that the COVID-19 pandemic should have led to an increase in the number of new HIV infections [43, 60]. HIV diagnoses in another province of China declined by approximately 37.0% during the COVID-19 pandemic compared to the same period before the pandemic, according to a prediction model [61]. Furthermore, there were no convincing data to imply that the variety of sexual behaviours in risk groups differed from that before the COVID-19 pandemic [62].

The predicted number of new HIV infections increased continually for two or three years after the end of the COVID-19 pandemic. High-risk individuals may have been unaware that they had HIV because of reductions in HIV testing, as those who worried about contracting COVID-19 did not go to designated institutions for HIV testing; this should have increased HIV transmission. As the COVID-19 epidemic was mitigated, HIV testing failed to promptly reduce HIV transmission, while risk behaviours for HIV may have been unchanged or may even have increased [63]. Cross-sectional data showed that the HIV-positive rate of blood donors was much higher than that before the COVID-19 pandemic [64]. The predicted number of new HIV infections increased briefly after the COVID-19 pandemic and then steadily declined after measures and strategies for prevention and control were reinstated, especially scaled-up HIV testing [65], timely antiretroviral treatment for newly diagnosed infections, and viral suppression.

Although we used proportions of late diagnoses in addition to new HIV diagnoses from 2016 to 2022 in Guangdong Province to calibrate our model, it was and remains uncertain how many new HIV infections there were and how many were diagnosed in China based on existing research and the viewpoints of national or provincial core experts. The diagnosed proportion of new HIV infections was 68.9% (61.5%, 78.3%) [55] based on predictions made in 2018 by Spectrum regarding the HIV epidemic in China. The diagnosis rate of HIV infections remains low, and the HIV epidemic may be increasing continuously, which is in contrast to predictions from the workbook method and core experts indicating that the HIV epidemic has been stable since 2007 in China [3].

Nevertheless, regarding HIV transmission from MSMW to general women [28], the sex composition and the proportions of heterosexual and homosexual sexual acts associated with the number of new HIV infections from our model were similar to those related to the number of new HIV diagnoses from 2016 to 2019 in Guangdong. The female proportion of new HIV diagnoses and the proportion of heterosexual transmission among new HIV diagnoses in Guangdong during recent years were approximately 30.0% and over 60.0% [66], respectively, which were similar to the national proportions of approximately 30.0% and 70.0% among people living with HIV, respectively [3]. The female proportion of new HIV infections was higher during the COVID-19 epidemic and the following two years, because the proportion of heterosexual acts among MSMW may have increased owing to a decline of approximately one-fifth in the number of male sexual partners among MSMW in that period [43], which could in turn have increased heterosexual transmission among MSMW.

The model parameters and population sizes in our model were selected from reliable sources. The ranges of the parameters were set according to highly cited studies and the viewpoints of core experts, and the prior distribution of the mixing index was reasonable. Parameters describing the HIV epidemic in Guangdong were preferred; if these were not available, they were replaced with parameters from another province with a similar HIV epidemic, and otherwise values from a meta-analysis for China were used. For some parameters, including the effectiveness of condoms, we opted for values from highly cited studies.

Our modelling study had three strengths. First, the assumptions in the model were in line with the characteristics of the HIV epidemic in Guangdong Province and with Chinese social norms and pressures on MSM, including low- and high-risk subgroups classified according to the number of sexual partnerships and the higher proportion of heterosexual sexual acts in China [17, 20]. Second, the calibration criteria in the model included the proportion of late diagnoses, which was approximately 30.0% and even higher during the COVID-19 epidemic, in addition to the new HIV diagnoses each year in Guangdong Province. Finally, this is the first published compartmental model to predict the number of new HIV infections in the population susceptible to sexual transmission from 2016 to 2050 in Guangdong Province. However, this study also had four limitations. First, our model could not identify whether the sexual partners of females with HIV infections were MSMW or heterosexual men. This problem could be further explored by defining subgroups for females in our model. Second, our model did not subclassify people according to CD4 cell counts, considering that viral suppression may be more reasonably defined as a state of being “Treated”, given that, starting in 2016, all people with HIV initiated antiretroviral treatment regardless of clinical stage and CD4 cell count. Third, the mixing index did not randomize between low- and high-risk MSM but did randomize between heterosexual and homosexual sexual acts, because HIV transmission from MSM to women may be substantial under the influence of Chinese traditional culture. Finally, the calibration criterion for our model may still lead to an underestimation of new HIV infections, even though we took the proportion of late diagnoses into account, and determining an upper value for model calibration is challenging due to the lack of available data.

Conclusion

We developed a deterministic compartmental model to predict the number of new HIV infections from 2016 to 2050. The presumption regarding MSM who engaged in heterosexual acts and the criterion of model calibration may both be in line with the complicated characteristics of the epidemic in Guangdong Province to some degree. After calibration, the results from the model may simulate a realistic epidemic in Guangdong Province. The predicted number slightly increased between 2016 and 2019, was much larger during the COVID-19 pandemic and for two years after its end, and then continually declined until 2050. Overall, the HIV epidemic in Guangdong Province remains serious, and it is urgent to restore services for HIV control and prevention after the COVID-19 pandemic.