1 Introduction

It is by now well established that good institutions are decisive for economic growth and development. More recently, the importance of institutions has led researchers to investigate the nature of institutional development, in particular convergence in institutional quality, to gain more insight into the economic growth process (one of the earliest contributions can be attributed to Knack and Keefer 1995). The majority of empirical studies focus on the concepts of beta or sigma convergence (examples include Elert and Halvarsson 2012, Savoia and Sen 2016, and Schönfelder and Wagner 2016) and thus do not take into account that theory suggests the existence of multiple equilibria in institutional quality (cf. Acemoglu and Robinson 2006, 2008). Against this background, empirical methods that allow for the possibility of multiple institutional clubs would be more appropriate.

The above discussion also gives rise to a potential contradiction in the institutions-development relationship: If we agree that institutions exhibit multiple equilibria but a large literature on income convergence appears to postulate a single long-term steady state equilibrium, how can institutions be a causal determinant of economic performance? Blackburn et al. (2006), among others, solve this problem theoretically by developing a dynamic general equilibrium model which shows that economic development and corruption can be jointly determined so that we have, for instance, a high corruption – low development club and a low corruption – high development club. Kar et al. (2019) are the first to also resolve this contradiction empirically. They apply the Phillips and Sul (2007) log t test to identify income and institutional convergence clubs for a sample of 111 to 117 countries, employing per capita income data from the PWT 9 and institutional quality data from the ICRG dataset. Their results suggest that there exist multiple income clubs and also multiple institutional clubs, with various countries being stuck in a low-income trap and/or a poor institutional trap. Kar et al. (2019) also show that poor institutional traps are important determinants of income traps.

The question of whether institutional traps determine income traps also appears to be very interesting in the case of China. In general, the role of institutional quality in China’s development process is puzzling: It is often argued that China has achieved miraculous GDP growth despite relatively low institutional quality by international standards. Within the country, however, the picture appears to be more nuanced: Even though Chinese provinces exhibit homogeneous constitution, law and governance structures (cf. Ji et al. 2014), the level of institutional quality differs across provinces (cf. Fan et al. 2010 and Tang et al. 2014).

Figure 1 displays the average institutional qualityFootnote 1 (ranging between zero and ten) (light blue bars, left y-axis) and the average GDP p.c. (dark blue bars, right y-axis) over the period 1997–2007 for the 22 provinces, 4 autonomous regions, and 4 municipalities of our sample. Please note that for reasons of simplicity, we will in the following refer to these 30 “provincial level administrative divisions” (省级行政区) as “provinces”.Footnote 2 Some of these “provinces” report relatively high levels of institutional quality between seven and nine, whereas the majority of provinces only score around five or below. In addition, economic development is very unequal across provinces, with average incomes over the same period ranging from ¥3,854 in the poorest province (Guizhou) to ¥38,846 in the richest (Shanghai), both in constant 2005 prices. Besides the two top performers Beijing and Shanghai, there appears to be a small number of provinces performing above average, whereas the rest report a relatively similar below-average per capita income. Interestingly, Fig. 1 indicates that many provinces reporting high levels of institutional development also exhibit relatively high per capita incomes. This impression is corroborated by a scatter plot of provincial institutional quality and log GDP p.c. (cf. Fig. 6 in Appendix A).

Fig. 1

Provincial per capita income and institutional quality (average between 1997 and 2007). Source: NBS, own calculations and Fan et al. (2010). Notes: GDP p.c. in constant 2005 yuan. Each bar depicts a province’s mean GDP p.c. (dark blue), and respectively, institutional quality (light blue) between 1997 and 2007. The horizontal dark blue (light blue) line indicates the mean GDP p.c. (mean institutional quality) of all provinces over the period 1997–2007

Overall, the institutions-development nexus appears to be particularly interesting in the case of China. However, while there is a considerable body of literature on income inequality and income convergence/divergence across China’s provinces, the development of institutional quality at the provincial level, the impact of institutions on growth, and the possibility of convergence or even multiple equilibria in income and institutional quality have rarely been studied, pointing to a clear gap in the literature (a brief overview on existing studies is provided in Sect. 2). Our paper aims to add further arguments to this branch of research by focusing on the concept of club convergence. In particular, we analyze whether there is institutional quality convergence across Chinese provinces or whether there exist multiple institutional clubs over the period 1997–2007 by using the log t test proposed by Phillips and Sul (2007).

Our findings indicate that there exist multiple institutional clubs within China. We identify three rather small clubs which follow an above-average high institutional quality path. The remaining two clubs which together account for the majority of provinces find themselves at below-average low institutional quality paths. Using the same methodology, we find that various provinces are additionally caught in a low-income trap. In a next step, we analyze the causal relationship between poor institutional traps and low-income traps in China by using a recursive bivariate probit model. We find that institutional traps are important determinants of income traps, giving rise to the recently identified phenomenon of a ‘double trap’ (cf. Kar et al. 2019). Moreover, we find that human capital is another important determinant of income traps, while globalization/trade is decisive for avoiding poor institutional traps.Footnote 3

Our research is mostly related to the paper of Kar et al. (2019) who analyze the impact of institutional traps on income traps at the country level and to the studies conducted by Glawe and Wagner (2019a, c, d, 2020b) who analyze the impact of institutions on growth across Chinese provinces and also elaborate on the institutional convergence process within China.

Our paper is the first study that analyzes multiple equilibria in institutional quality within China and also the first to show empirically that there is a kind of ‘double trap’ in China, with poor institutional traps determining low-income traps.

The remainder of this paper is organized as follows. Section 2 provides an overview of the related literature. Section 3 is then dedicated to the identification of income and institutional clubs in China over the period 1997 to 2007 by using the convergence tests developed by Phillips and Sul (2007). In Sect. 3.1, we first describe our data and outline our research methodology. The institutional and income clubs identified via the log t convergence test are then presented in Sect. 3.2. Based on these findings, in Sect. 4, we analyze the (causal) relationship between poor institutional traps and low-income traps by using a recursive bivariate probit model. We again first describe the methodology and data in Sect. 4.1 before discussing our main findings in Sect. 4.2. Concluding remarks are provided in Sect. 5.

2 Literature Review

As already mentioned in the Introductory Section, there are various strands of the literature related to our paper, namely the literature on (i) the impact of institutions on growth, (ii) the development of institutional quality, and (iii) the formation of income clubs and the underlying methodology. We first very briefly refer to the findings of the general literature and then focus more extensively on the China-related research. Finally, we show how these three strands can be combined by analyzing the development of institutions (ii) using the club convergence methodology (iii) and then examining the impact of the thus formed institutional clubs/traps on per capita income clubs/traps (i)/(iii).

2.1 Institutions and Economic Development

There is a significant body of literature focusing on the importance of good institutional quality for economic growth and development. Prominent contributions of this branch include the studies of North (1981), Hall and Jones (1999), Acemoglu et al. (2001, 2014), and Rodrik et al. (2004). Most existing research focuses on cross-country studies and there are far fewer within-country studies, e.g. Niquito et al. (2018) for Brazil, Liberto and Sideri (2015) for Italy, and Glawe and Wagner (2019b) for Europe. The role of institutional quality for economic development at the regional level appears to be particularly interesting for China since it is often argued that the country has achieved tremendous growth despite relatively low institutional quality (in cross-country comparison). Surprisingly, there is very little research on the impact of institutions across Chinese provinces. One of the few examples is the study conducted by Glawe and Wagner (2019a), which is based on OLS and 2SLS estimations. They find that at the provincial level, institutional quality did in fact play an important role for the economic success of a province in China, even more important than geography and integration. When simultaneously examining the relationship between institutions, human capital, and economic development, the authors find that human capital “trumps” everything else; however, institutional quality has a highly significant indirect effect on provincial per capita income by improving human capital. In their subsequent paper, Glawe and Wagner (2020b) employ a dynamic panel data model to analyze the role of improvements (i.e., growth) in institutional quality and human capital (rather than the levels of these two variables) for the economic success of a province in China over the period 2003 to 2007. Using system GMM estimation, they find that while growth in human capital fosters economic growth all over China, only coastal provinces record a positive effect of institutional improvements on the growth rate of per capita income.Footnote 4

There are also some other studies that investigate the role of institutional quality at the provincial level, however, in a different context. For example, Ji et al. (2014) primarily focus on the role of natural resources. In particular, they analyze the interplay between resource abundance, institutional quality, and economic growth in China. They find that resource abundance has a positive impact on economic growth at the provincial level over the period 1990–2008 and this effect depends nonlinearly on institutional quality (measured by the confidence in courts in 1995). There are also some studies examining the impact of institutional quality on the firms’ R&D activity. For instance, Ang et al. (2014) show that the effective property rights enforcement at the provincial level is critical for encouraging financing and investing in R&D. Similarly, Zhou (2014) finds that institutional quality positively affects the decision of firms to engage in R&D activities.

2.2 Institutional Development

As described above, there has been an increasing body of research since the 1990s studying the role of institutions in explaining cross-country differences in economic performance. Surprisingly, however, studies have only recently started to examine the development of institutional quality, in particular whether there is institutional convergence or divergence (cf. Savoia and Sen 2016). The only early empirical contributions are those of Knack and Keefer (1995) and Knack (1996), who find that differences in institutional quality are an important hindrance to income convergence across countries. More recently, Savoia and Sen (2016) test for convergence in legal, bureaucratic and administrative institutional quality by using cross-section and panel methods on a large sample of countries from the 1970s to 2010. They find that countries with initially poor institutions tend to slowly catch up institutionally, whether they share the same initial conditions (conditional convergence) or not (absolute convergence). In the same vein, Elert and Halvarsson (2012) examine whether there is convergence in economic institutions, drawing on the literatures of economic convergence and of industrial organization. They use the Economic Freedom of the World index over the period 1970–2009 as a proxy for economic institutions. They find evidence of institutional convergence, that is, countries with lower institutional quality experience faster institutional change than countries with higher institutional quality. Their results also show that countries with lower institutional quality have higher variability of institutional change. Using distributional analysis, they analyze institutional transition probabilities. Their results indicate that the probability of a country ending up with high-quality institutions is high in the long run. Apart from the studies of Glawe and Wagner (2019c, d), who analyze beta- and sigma-convergence of institutional quality in China (using the government efficiency index constructed by Tang et al. 2014), there is no research on how institutions have evolved in China at the provincial level. Both studies provide evidence for conditional convergence in institutions across provinces.

2.3 Club Convergence

Income convergence has long been an important topic in economics since the question of whether poor countries will stay poor or will be able to catch up to the developed economies has important policy implications. One of the earliest contributions to the income convergence literature dates back to Baumol (1986); however, the widely known empirical concepts of beta- and sigma-convergence were first introduced by Barro and Sala-i-Martin (1992) and Quah (1993). Beta-convergence occurs if a poor country tends to grow faster than a rich one so that it tends to catch up to the latter, whereas sigma-convergence applies if the dispersion of per capita income across countries declines over time. Beta-convergence is a necessary but not a sufficient condition for sigma-convergence (see Young et al. 2008). The initial cross-sectional studies were criticized since they do not control for cross-country heterogeneity, endogeneity biases, or measurement errors (see Temple 1999), giving rise to time-series and panel data analyses. An especially important contribution in this field is made by Phillips and Sul (2007). Their method overcomes various shortcomings of previous studies by allowing for different time paths as well as individual heterogeneity, and also makes it possible to distinguish between various convergence outcomes, among others absolute convergence, absolute divergence, and multiple steady states (i.e., club convergence). While there is already a significant body of literature applying the Phillips and Sul (2007) method in order to analyze income convergence across different sets of countries and also within regions,Footnote 5 only very few of these studies focus on China. For example, Tian et al. (2016) find that provincial incomes are converging into two clubs: seven east-coastal provinces (Shanghai, Tianjin, Jiangsu, Zhejiang, Guangdong, Shandong, and Fujian) and Inner Mongolia are converging into a high-income club, and the remaining provinces are converging into a low-income club. In addition, they obtain strong evidence that income inequality within a club decreases, while that between clubs increases over time. Li et al. (2018) apply the log t convergence test to identify economic growth convergence clubs in 2286 Chinese counties over the period from 1992 to 2010. The results indicate significant convergence club patterns at the county level, resulting in the gradual formation of six convergence clubs.

2.4 Institutional Clubs and their Relationship to Income Traps

To our knowledge, Kar et al. (2019) are the first to apply the Phillips and Sul (2007) method in order to identify (per capita) income clubs and also institutional clubs. Their sample comprises 111 to 117 countries over the period 1985 to 2015. For per capita income they employ PWT 9 data, and for institutional quality they primarily focus on four indicators of the ICRG dataset, namely contract viability, law and order, bureaucratic quality, and corruption. Moreover, Kar et al. (2019) show that poor institutional quality traps are determinants of low-income traps, combining the three strands of the literature presented above. In the present paper, we show that the same applies to Chinese provinces, being the first to analyze institutional club convergence at the regional (within-country) level.

3 Identifying Institutional and Income Clubs

This section is dedicated to the identification of institutional and income clubs within China by using the log t test proposed by Phillips and Sul (2007, 2009). After introducing the Phillips and Sul method and describing our two main variables in Sect. 3.1, we subsequently discuss our regression results in Sect. 3.2.

3.1 Estimation Strategy (log t test) and Data

In the following, we provide a brief description of the log t model developed by Phillips and Sul (2007, 2009). Under this framework, the specification of the panel data \({X}_{it}\) can be expressed as follows:

$${X}_{it}={g}_{it}+{a}_{it},$$
(1)

where \({g}_{it}\) represents systematic (e.g. permanent common) components and \({a}_{it}\) comprises transitory components. For example, \({X}_{it}\) can represent the level of institutional quality or the (log of) per capita income. In order to separate common components from idiosyncratic components, we transform Eq. (1) as follows:

$${X}_{it}=\left(\frac{{g}_{it}+{a}_{it}}{{\mu }_{t}}\right){\mu }_{t}={\delta }_{it}{\mu }_{t},$$
(2)

where \({\mu }_{t}\) is the common factor and \({\delta }_{it}\) is a time-varying factor loading coefficient which absorbs any idiosyncratic movements in \({X}_{it}\). As argued by Phillips and Sul (2007: 1780) it is impossible to estimate \({\delta }_{it}\) directly without imposing additional structure and assumptions on the dynamic latent factor model, i.e. on \({\delta }_{it}\) and \({\mu }_{t}\). Therefore, we remove the common factor \({\mu }_{t}\) by constructing the following relative transition paths:

$${h}_{it}=\frac{{X}_{it}}{{N}^{-1}{\sum }_{i=1}^{N}{X}_{it}}=\frac{{\delta }_{it}}{{N}^{-1}{\sum }_{i=1}^{N}{\delta }_{it}},$$
(3)

where \({h}_{it}\) is the relative transition parameter which measures the loadings \({\delta }_{it}\) in relation to the panel average at time \(t.\) That is, like the loading coefficient, \({h}_{it}\) traces out a transition path for economy \(i\); in contrast to \({\delta }_{it}\), however, it does so in relation to the panel average. Equation (3) indicates the following two properties of \({h}_{it}\): (i) the cross-sectional mean of \({h}_{it}\) is unity; (ii) if the factor loadings \({\delta }_{it}\) converge to δ, the relative transition paths given by \({h}_{it}\) converge to unity. In that case, the cross-sectional variance of the relative transition parameter \({h}_{it}\) converges to zero asymptotically, as expressed in Eq. (4):

$${H}_{t}={N}^{-1}{\sum }_{i=1}^{N}{\left({h}_{it}-1\right)}^{2}\to 0\;\mathrm{as }\;t \to \infty .$$
(4)
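As a minimal sketch of Eqs. (3) and (4), the relative transition paths and their cross-sectional variance can be computed as below. The function names and the three-unit toy panel are hypothetical and serve only to illustrate the two properties of \({h}_{it}\) stated above; they are not the data or code used in this paper.

```python
def relative_transition_paths(X):
    """Eq. (3): h[i][t] = X[i][t] divided by the cross-sectional mean at t."""
    n_units, n_periods = len(X), len(X[0])
    h = [[0.0] * n_periods for _ in range(n_units)]
    for t in range(n_periods):
        panel_mean = sum(X[i][t] for i in range(n_units)) / n_units
        for i in range(n_units):
            h[i][t] = X[i][t] / panel_mean
    return h


def cross_sectional_variance(h):
    """Eq. (4): H_t = mean over i of (h_it - 1)^2, one value per period."""
    n_units, n_periods = len(h), len(h[0])
    return [
        sum((h[i][t] - 1.0) ** 2 for i in range(n_units)) / n_units
        for t in range(n_periods)
    ]


# Hypothetical toy panel (rows = units, columns = periods), not real data.
X = [
    [8.0, 7.0, 6.5],
    [4.0, 5.0, 5.5],
    [3.0, 3.0, 3.0],
]
h = relative_transition_paths(X)
H = cross_sectional_variance(h)
```

In this toy panel the first two units move closer together while the third stays put, so \({H}_{t}\) falls over time without reaching zero.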

Decreasing cross-sectional variation does not necessarily imply overall convergence; it can also occur when there is, for instance, local convergence within subgroups. In order to allow for this possibility, following Phillips and Sul (2007: 1785), we model \({\delta }_{it}\) in semi-parametric form as expressed in Eq. (5):

$${\delta }_{it}={\delta }_{i}+\frac{{\sigma }_{i}}{L\left(t\right){t}^{\alpha }}{\xi }_{it},$$
(5)

where \({\delta }_{i}\) is fixed, \({\xi }_{it}\) are iid(0,1) across \(i\) but weakly dependent over \(t\), and \({\sigma }_{i}>0\) is the heterogeneity parameter. \(L\left(t\right)\) is a slowly varying function for which \(L\left(t\right)\to \infty\) as \(t\to \infty\), such as \(\mathrm{log}(t)\) (as suggested by Phillips and Sul). \(\alpha\) is the rate at which cross-sectional heterogeneity declines to zero over time and thus can be interpreted as the speed of convergence. This formulation guarantees that \({\delta }_{it}\) converges to \({\delta }_{i}\) for all \(\alpha \ge 0\). The null hypothesis of absolute convergence can thus be written in the semi-parametric form as

$$H_0:\delta_i=\delta\;\mathrm{for}\;\mathrm{all}\;i\;\mathrm{and}\;\alpha\geq0.$$
(6)

Regarding the corresponding alternative hypothesis, we can distinguish between two cases:

$$\begin{array}{l}H_A:\\\begin{array}{c}\\(\mathrm i)\lim_{t\rightarrow\infty}\delta_i=\delta\;\mathrm{for}\;\mathrm{all}\;i\;\mathrm{with}\;\alpha<0\end{array}\\\begin{array}{c}\\(\mathrm{ii})\lim_{t\rightarrow\infty}\delta_i\neq\delta\;\mathrm{for}\;\mathrm{some}\;i\;\mathrm{with}\;\alpha\geq0,\end{array}\end{array}$$
(7)

where the first case corresponds to absolute divergence, whereas the second case presents a situation in which sub-groups converge to different steady states with possibly some diverging units, that is the possibility of club convergence.

The hypothesis test can be implemented through the regression Eq. (8), which is also called the “log t regression model” where \(L\left(t\right)\) is set as \(\mathrm{log}(t)\):

$$\begin{array}{l}\log\left(\frac{H_1}{H_t}\right)-2\;\log\left(\log\left(t\right)\right)=a+b\;\log\left(t\right)+u_t\\\mathrm{for}\;t=\left[rT\right],\left[rT\right]+1,\dots,T\;\mathrm{with}\;r>0,\end{array}$$
(8)

where \({H}_{t}\) is defined as in Eq. (4), and \(\widehat{b}=2\widehat{\alpha }\), where \(\widehat{b}\) is the fitted coefficient of \(\mathrm{log}\left(t\right)\) and \(\widehat{\alpha }\) is the estimate of \(\alpha\), that is, the decay rate or speed of convergence (cf. also Phillips and Sul 2007: 1789). \({H}_{0}\) can be tested by a heteroscedasticity- and autocorrelation-consistent (HAC) one-sided t-test of the inequality \(\alpha \ge 0\). It is rejected at the 5 percent significance level if the t-statistic is smaller than -1.65. If the null hypothesis of absolute convergence is rejected, we can have either absolute divergence (that is, case (i)) or club convergence (that is, case (ii)). Therefore, in a next step we perform the clustering procedure described in detail below in order to identify sub-groups for which the log t test shows convergence. If we identify such sub-groups, we conclude that our sample shows club convergence, whereas in the absence of such sub-groups, we draw the conclusion that there is absolute divergence.
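The regression in Eq. (8) can be illustrated with a short sketch in plain Python. Several choices here are assumptions rather than details taken from this paper: the trimming fraction `r = 0.3`, the Bartlett (Newey-West) kernel with a single lag for the HAC variance, and the function name `log_t_regression`. The first period is always excluded because \(\mathrm{log}(\mathrm{log}(1))\) is undefined.

```python
import math


def log_t_regression(H, r=0.3, lags=1):
    """Fit Eq. (8) by OLS and return (b_hat, HAC t-statistic of b_hat).

    H[0] is H_1, H[1] is H_2, and so on. The values of r and lags are
    illustrative assumptions, not settings reported in the paper.
    """
    T = len(H)
    start = max(int(r * T), 2)  # t = 1 is unusable since log(log(1)) = log(0)
    periods = range(start, T + 1)
    y = [math.log(H[0] / H[t - 1]) - 2.0 * math.log(math.log(t)) for t in periods]
    x = [math.log(t) for t in periods]
    n = len(x)

    # OLS with intercept: slope b and intercept a.
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]

    # Newey-West long-run variance of the slope score (Bartlett weights).
    score = [(xi - x_bar) * ui for xi, ui in zip(x, resid)]
    lrv = sum(s * s for s in score)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)
        lrv += 2.0 * weight * sum(score[t] * score[t - lag] for t in range(lag, n))

    t_stat = b / math.sqrt(lrv / sxx ** 2)
    return b, t_stat
```

Under these assumptions, a returned t-statistic below -1.65 would lead to rejecting the null of convergence, and half of the fitted slope gives the implied estimate of \(\alpha\).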

Following Phillips and Sul (2007, 2009: 1170), we use the following clustering algorithm consisting of four steps:

  1. Cross-section sorting: We order the \(N\) provinces in our panel decreasingly according to their observation in \({X}_{it}\) (for instance, institutional quality) in the last period or in the last fraction of the sample (for instance, 1/2).

  2. Formation of the core group of \({k}^{*}\) provinces: We select the first \(k\) highest units to form the subgroup \({G}_{k}\) for some \(2\le k<N\) and then run the log t regression to obtain the test statistic \({t}_{k}=t({G}_{k})\) for this subgroup. We choose the core group of size \({k}^{*}\) by maximizing \({t}_{k}\) subject to \(\mathrm{min }\left\{{t}_{k}\right\}>-1.65\). If this condition is not satisfied for \(k=2\), the highest unit is dropped from the core group and we form the core group starting with the second highest unit, etc. If \(\mathrm{min }\left\{{t}_{k}\right\}>-1.65\) holds for none of these pairs, we conclude that there is absolute divergence and we exit the algorithm.

  3. Sieving provinces for new club members: We add one of the remaining provinces at a time to the core group (with \({k}^{*}\) members) and run the log t test again. The new province is included if the respective t-statistic is greater than the sieving criterion \({c}^{*}\) which we set to zero.Footnote 6 All provinces that satisfy this condition are added to the core group and we again run a log t test for this new sub-group. If it satisfies \({t}_{k}>-1.65\), we conclude that the group forms the first convergence club. If this is not the case, we raise the sieving criterion \({c}^{*}\) and repeat this step until \({t}_{k}>-1.65\).

  4. Recursion and stopping rule: We form a second group consisting of all provinces that could not be sieved in the previous step and run the log t test again for this sub-group. If \({t}_{k}>-1.65\), we conclude that there are two convergence clubs. If this is not the case, however, we repeat steps 1–3 in order to check whether this second group can itself be subdivided into convergence clubs or whether the remaining provinces diverge.

Merging: After having completed the process described above, and if more than one convergence club has been detected, we test as a final step whether these clubs can be merged to form larger clubs. To this end, we take the two highest clubs and run the log t test again. If the t-statistic is greater than -1.65, we conclude that both clubs can be merged. We then add the next highest club until the convergence hypothesis is rejected, that is, \({t}_{k}\le -1.65\), and proceed to identify more mergers from the remaining clubs. After all possible mergers have been completed, we have our final convergence clubs.
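The core-group and sieving steps (steps 2 and 3) can be sketched as follows. This is one possible reading of the algorithm, not the implementation used for the results below: the search stops enlarging a candidate group at the first rejection, the increment `step` for raising \({c}^{*}\) is an assumed value, and `toy_t_stat` is a hypothetical stand-in for the log t statistic of a group.

```python
def find_core_group(ordered, t_stat_of, crit=-1.65):
    """Step 2: choose the core group from units sorted from highest to lowest."""
    n = len(ordered)
    for start in range(n - 1):
        best = None  # (t-statistic, group) with the largest t so far
        max_k = min(n - start, n - 1)  # keep k < N as in the text
        for k in range(2, max_k + 1):
            group = ordered[start:start + k]
            t_k = t_stat_of(group)
            if t_k <= crit:
                break  # convergence rejected, stop enlarging this group
            if best is None or t_k > best[0]:
                best = (t_k, group)
        if best is not None:
            return best[1]
    return None  # no converging pair found: absolute divergence


def sieve_club(core, remaining, t_stat_of, c_star=0.0, crit=-1.65, step=0.1):
    """Step 3: add units one at a time, raising c_star if the club fails."""
    while True:
        added = [u for u in remaining if t_stat_of(core + [u]) > c_star]
        club = core + added
        if t_stat_of(club) > crit:
            return club
        c_star += step  # assumed increment; the text only says to raise it


# Hypothetical stand-in for the log t statistic: tighter groups score higher.
latent = {"A": 10.0, "B": 9.8, "C": 9.7, "D": 5.0, "E": 4.9, "F": 1.0}


def toy_t_stat(group):
    values = [latent[u] for u in group]
    return 2.0 - 2.0 * (max(values) - min(values))


ordered = sorted(latent, key=latent.get, reverse=True)  # step 1: sorting
core = find_core_group(ordered, toy_t_stat)
club = sieve_club(core, [u for u in ordered if u not in core], toy_t_stat)
```

In this toy setting the core group consists of the two highest units and sieving adds the third. Step 4 would then repeat the same procedure on the units left over.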

In the following, we briefly describe the data used for the log t test. We employ a panel dataset of 30 provincial level administrative divisions (in particular, 22 provinces, 4 direct-administered municipalities, and 4 autonomous regions)Footnote 7 over the period 1997 to 2007. The choice of time period and regions (we do not include Tibet, Taiwan, Hong Kong, and Macao) is due to data availability on the marketization index. However, since this period lies between the Southern Tour of Deng Xiaoping earlier in the 1990s (and the opening up of the inland provinces to foreign direct investment) and the global financial crisis, it seems to be a reasonable choice. Moreover, it coincides with China’s Western Development Strategy (“Go West”) that was launched in 1999, which makes it an interesting study period. Descriptive statistics of our two main variables – institutional quality and per capita income – are provided in Table 1. The mean of per capita GDP is approximately 12,220 and the average institutional quality is about 5.48 (with a quite high standard deviation of 2.00). The correlation coefficient of the two variables amounts to 0.78 (significant at the 1-percent level).

Table 1 Descriptive Statistics (I)

The marketization index constructed by Fan et al. (2010) is used as a measure of institutions. It is used by various empirical studies (for instance by Che and Wang (2013) and Zhou (2014)) to measure the quality of institutions at the provincial level in China. The marketization index varies between 0 and 10, a higher score indicating stronger institutions, and it consists of five sub-indices (namely, “government and market relation”, “development of the non-state enterprise sector”, “development of the commodity market”, “development of factor markets” and “market intermediaries and the legal environment for the market”) as well as a total of 23 basic indicators. It should be noted that the set of institutions that matter for economic performance is far more complex and cannot be fully captured by the marketization index. However, since the marketization index comprises important aspects of institutional quality (for instance regarding contracting institutions and property rights institutions) and due to the serious data limitation, we decided to focus on this index. Data on the (log) per capita income is obtained from the National Bureau of Statistics of China (NBS 2018). Since this data is only available in current RMB, we divided the series by the consumer price index for 2005 to obtain the time series in constant 2005 RMB.
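The conversion to constant 2005 RMB can be illustrated with a small sketch. It assumes the standard deflation formula, in which each nominal value is scaled by the ratio of the base-year CPI to the CPI of the respective year. The function name and all numbers are hypothetical and are not NBS data.

```python
def to_constant_prices(nominal, cpi, base_year):
    """Express nominal values in prices of base_year using a CPI series."""
    base = cpi[base_year]
    return {year: value * base / cpi[year] for year, value in nominal.items()}


# Hypothetical figures purely for illustration (not NBS data).
nominal_gdp_pc = {2004: 9000.0, 2005: 10000.0, 2006: 11000.0}
cpi = {2004: 98.0, 2005: 100.0, 2006: 102.0}
real_gdp_pc = to_constant_prices(nominal_gdp_pc, cpi, base_year=2005)
```

By construction, the base-year value is unchanged, earlier years are scaled up and later years are scaled down whenever prices rise over time.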

3.2 Estimation Results

As outlined in Sect. 3.1, we first test whether there is overall convergence in both p.c. income and institutional quality. The t-statistics displayed in the first row of each Panel of Table 2 indicate that the null hypothesis of absolute convergence among the Chinese provinces is rejected in both cases, since both statistics lie below the critical threshold of -1.65. Therefore, in a next step, we perform the clustering procedure (Steps 1–4) described in Sect. 3.1 in order to identify sub-groups for which the log t test shows convergence (the alternative outcome would be absolute divergence).

Table 2 Log t test for institutional quality and per capita income

Regarding per capita income, we initially identify six clubs and one diverging province. We next test whether any of the identified clubs can be merged to form larger convergence clubs. The second column of Table 3 displays the merging procedure. Our results indicate that the initial clubs 2 and 3 can be merged to form the new club 2 (consisting of six provinces). We also checked for the possibility of further merging the remaining clubs; however, the respective tests indicate that no further merging is possible. A list of the final clubs, the number of provinces and the respective test statistics are provided in Table 2, Panel B. Detailed information on the provinces forming each club is provided in Table 14 in Appendix A. In addition, Fig. 7 in Appendix A shows the spatial distribution of the income convergence clubs and the diverging province. The respective relative transition paths are displayed in Fig. 2. We can see that clubs 1 and 2 lie significantly above the remaining four clubs. Moreover, the clubs do not seem to converge to each other; only club 1, consisting of just three provinces (Beijing, Tianjin, and Shanghai), shows a slightly decreasing tendency.

Table 3 Convergence club classification, detailed: 30 provinces from 1997 to 2007
Fig. 2

Transition paths for income clubs. Source: Own calculations based on NBS (2018) data. Note: The relative transition path of the club is defined as the cross-section mean of the members of club i divided by the cross-sectional mean of the whole sample

Regarding institutional quality, the clustering procedure reveals the existence of five clubs and one diverging province (see Table 3, Column 1, Panel A). The merging analysis indicates that the clubs cannot be merged to form larger clubs (see Table 3, Column 2, Panel A). The relative transition paths are depicted in Fig. 3. The high institutional quality clubs 1, 2, and 3, each comprising only very few provinces, have relative transition curves (far) above the overall mean institutional quality (see also Table 15 in the Appendix A for more detailed information on the members of each club).Footnote 8 Moreover, all members of these three clubs have relatively high levels of institutional quality, indicating that in that case institutions are rather persistent. In addition, the transition paths of club 1 and – to a somewhat lesser extent – also of club 2 show a slightly increasing tendency. In contrast, the rather large poor institutional club 4, consisting of 18 provinces, lies below the cross-section average institutional quality and shows a decreasing tendency.

Fig. 3

Transition paths for institutional clubs. Source: Own calculations based on data of Fan et al. (2010). Note: The relative transition path of club i is defined as the cross-section mean of the members of club i divided by the cross-section mean of the whole sample

Fig. 4

Internal relative transition paths for per capita income clubs. Source: Own calculations based on NBS (2018) data. Note: The relative transition path of the province i is defined as the value of province i divided by the cross-sectional mean of the whole club

Club 5 which is made up of only two provinces (namely Gansu and Qinghai) brings up the rear; even though its transition curve shows a slightly increasing tendency, it lies far below those of the high institutional clubs 1–3 and even below that of the poor institutional club 4. On average, club 5 members reach only 60 percent of the overall mean level of institutional quality (in contrast, club 1 members realize values over 150 percent).

Figures 4 and 5 depict the internal transition paths of the six per capita income convergence clubs and the five institutional quality convergence clubs. There is a clearly visible convergence tendency within each club, confirming the results of the log t test. Figure 8 in Appendix A provides a graphical illustration of the institutional convergence clubs.
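The link between the internal transition paths and the log t test can be sketched as follows. The sketch assumes the usual form of the Phillips and Sul (2007) regression, in which log(H_1/H_t) - 2 log(log t) is regressed on a constant and log t over the last part of the sample (starting at the integer part of rT, with r = 0.3), where H_t is the cross-section variance of the relative transition coefficients around one. It returns only the OLS slope; the actual test relies on a one-sided t statistic with HAC standard errors, which is omitted here. All function names and the toy data are hypothetical.

```python
import math


def internal_transition_paths(club_panel):
    """h_it: value of province i divided by the club mean in period t."""
    names = list(club_panel)
    n_periods = len(club_panel[names[0]])
    club_means = [
        sum(club_panel[p][t] for p in names) / len(names) for t in range(n_periods)
    ]
    return {p: [club_panel[p][t] / club_means[t] for t in range(n_periods)] for p in names}


def cross_section_dispersion(paths):
    """H_t: mean squared deviation of the transition paths from one."""
    names = list(paths)
    n_periods = len(paths[names[0]])
    return [
        sum((paths[p][t] - 1.0) ** 2 for p in names) / len(names) for t in range(n_periods)
    ]


def log_t_slope(dispersion, r=0.3):
    """OLS slope of log(H_1/H_t) - 2*log(log t) on log t for t = floor(r*T), ..., T."""
    T = len(dispersion)
    start = max(2, math.floor(r * T))  # log(log t) requires t >= 2
    ts = list(range(start, T + 1))
    x = [math.log(t) for t in ts]
    y = [
        math.log(dispersion[0] / dispersion[t - 1]) - 2.0 * math.log(math.log(t))
        for t in ts
    ]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


# Hypothetical club of three provinces over 11 periods whose gaps halve each period.
toy_club = {
    name: [10.0 + c * 0.5 ** t for t in range(1, 12)]
    for name, c in (("low", -1.0), ("mid", 0.0), ("high", 1.0))
}
toy_dispersion = cross_section_dispersion(internal_transition_paths(toy_club))
toy_slope = log_t_slope(toy_dispersion)
```

In this toy club the dispersion H_t shrinks in every period, so the estimated slope is positive, which is the pattern one would expect for provinces that converge within their club.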

Fig. 5

Internal relative transition paths for institutional clubs. Source: Own calculation based on data of Fan et al. (2010). Note: The relative transition path of the province i is defined as the value of province i divided by the cross-section mean of the whole club

In a next step, following Kar et al. (2019), we define which provinces are caught in a poor institutions trap and a low-income trap, respectively. The above findings indicate that the provinces belonging to the (rather large) poor institutional quality club 4 and the smaller institutional club 5 may be stuck in a low institutional trap as the transition curves of both clubs do not only lie far below the other transition curves (and also below the cross-section mean of all provinces of our sample) but also do not show any real tendency of narrowing the gap to the high institutional clubs. Moreover, club 4 shows a slightly decreasing tendency over the entire period towards the even lower club 5. Analogously, the provinces identified as club members of the low-income clubs 3, 4, 5, and 6 are apparently caught in a low-income trap. The relative transition paths of clubs 1 and 2 lie significantly above the other clubs and clubs 3–6 also show no catching-up tendency. These findings are the basis for our subsequent analysis in which we want to investigate whether poor institutions traps determine low-income traps.

It has to be noted that, of course, in the (very) long run there might be convergence in per capita income (and maybe also in institutional quality) across Chinese provinces. However, during the transition period in which China moves from middle-income status to high-income status and in which there is the danger of a prolonged growth slowdown (‘middle-income trap’), multiple equilibria might temporarily emerge (as supported by our empirical evidence), which can have important implications for the future growth path (and, thus, also for long-run convergence). For achieving high growth at the national level (in order to quickly reach (lower) middle-income status), it might even have been positive that some provinces far outperformed others in terms of income and institutional quality for some time. This is especially true for China due to its huge size and the impossibility of developing the entire country at once. In the long run, however, this growth strategy is not sustainable. The growth of the top performers (such as Beijing and Shanghai) will naturally slow down, and the growth potential of the “low-performing” provinces is not utilized optimally (that is, it is unnecessarily kept low). After China’s impressive growth performance over the last decades, this point could be reached quite soon, and an analysis of the multiple equilibria in income and institutions during this period, especially their interrelatedness, appears to be very interesting.

Finally, since our sample period coincides with the Western Development Strategy, we will briefly discuss our results in the context of this initiative. In 1999, the Chinese government launched its Western Development Strategy (WDS, also known as the “Go West” strategy) in order to accelerate the development of Western regions through various policy incentives and financial investments. The Chinese government also aimed at narrowing the economic development gap between Eastern and Western China. The WDS comprises a large number of initiatives and projects with a focus on infrastructure, ecological protection, promoting foreign investment and strengthening the reform and opening up efforts, as well as promoting education (cf. The Central People’s Government of the PRC 2009). It covers six provinces (Sichuan, Guizhou, Yunnan, Shaanxi, Gansu, and Qinghai), five autonomous regions (Tibet, Ningxia, Xinjiang, Guangxi, and Inner Mongolia) and the municipality of Chongqing, which together account for more than 70% of the country’s land area.

The results of the WDS are mixed. While there was an undeniable increase in the GDP per capita of Western provincial units, the economic gap between Eastern and Western China has even been widening. This is also partly reflected in our results regarding the geographical distribution of income convergence clubs in China: Many “provinces” in Western China are located in one of the lower income clusters, whereas the majority of eastern provinces managed to join one of the higher income clubs. However, there are also considerable differences across the Western (and Central) regions that are part of the WDS. For instance, the performance of Inner Mongolia, Chongqing, Xinjiang, and Shandong is particularly strong. Those provincial-level administrative divisions (referred to as provinces here) are located in middle-level income clubs and/or show an extremely pronounced upward trend within their respective club. In contrast, the performance of Gansu, Guizhou, and Yunnan is rather worrisome. All three provinces are located in the lowest two clubs and do not seem able to move toward a higher income cluster in the near future. A similar picture emerges with respect to institutional development: Chongqing showed the strongest performance and, as the only region covered by the WDS, managed to join the institutional club 3. In contrast, Gansu and Qinghai are both located in the lowest institutional club.

4 Factors Conditioning Club Formation

In this section we analyze whether there exists a (causal) relationship between the institutional traps and income traps which we have identified in Sect. 3.2. In particular, we want to estimate whether a province that is caught in a low institutional trap is also a victim of an income trap. In Sect. 4.1, we first present our estimation strategy (the recursive bivariate probit model) and describe the cross-sectional data used in our regressions. Section 4.2 then elaborates on our estimation results.

4.1 Estimation Strategy (Recursive Bivariate Probit Model) and Data

We employ a recursive bivariate probit model in which two equations with binary outcomes and correlated error terms are estimated simultaneously. In contrast to the “normal” bivariate model, the left-hand variable of Eq. (9) is used as an explanatory variable in Eq. (10). It can be specified as follows:

$$i^{*}=\beta_{1}^{\prime}x_{1}+\varepsilon_{1},\quad i=1\ \text{if}\ i^{*}>0\ \text{and}\ i=0\ \text{otherwise} \tag{9}$$
$$y^{*}=\beta_{2}^{\prime}x_{2}+\gamma i+\varepsilon_{2},\quad y=1\ \text{if}\ y^{*}>0\ \text{and}\ y=0\ \text{otherwise}, \tag{10}$$

where \(i^{*}\) and \(y^{*}\) denote unobserved continuous latent variables determining the observed binary outcomes \(i\) and \(y\), which indicate whether a province is caught in an institutional or income trap, respectively. \(x_{1}\) and \(x_{2}\) are vectors of regressors, \(\beta_{1}\) and \(\beta_{2}\) are the respective vectors of coefficients, and \(\gamma\) is the coefficient of the binary variable \(i\) (the institutional trap dummy). The error terms \(\varepsilon_{1}\) and \(\varepsilon_{2}\) have a joint bivariate normal distribution with correlation coefficient \(Corr\left[\varepsilon_{1},\varepsilon_{2}|x_{1},x_{2}\right]=\rho \ne 0\).Footnote 9 Recursive bivariate probit models are usually estimated using at least one exclusion restriction \(z\) (‘instrument’) which is only included in \(x_{1}\).Footnote 10 The exclusion restriction has to be exogenous, that is, \(Cov\left(z,\varepsilon_{2}\right)=0\).Footnote 11
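To make the structure of Eqs. (9) and (10) more concrete, the following sketch evaluates the log-likelihood contribution of a single province. It assumes the standard textbook form of the recursive bivariate probit likelihood, in which the probability of an observed pair (i, y) is the bivariate standard normal CDF evaluated at q1·β1′x1 and q2·(β2′x2 + γi) with correlation q1·q2·ρ, where q1 = 2i − 1 and q2 = 2y − 1. The numerical integration routine, the function names, and the example values are illustrative assumptions and do not reproduce the estimation routine used for our regressions.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def bvn_cdf(a, b, rho, lower=-8.0, steps=2000):
    """P(Z1 <= a, Z2 <= b) for a standard bivariate normal with |rho| < 1.

    Uses Phi2(a, b; rho) = integral_{-inf}^{a} phi(z) * Phi((b - rho*z)/sqrt(1 - rho^2)) dz,
    truncated at `lower` and evaluated with Simpson's rule (`steps` must be even).
    """
    if a <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)
    h = (a - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        z = lower + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * norm_pdf(z) * norm_cdf((b - rho * z) / scale)
    return total * h / 3.0


def loglik_observation(i, y, xb1, xb2, gamma, rho):
    """Log-likelihood contribution of one province.

    i, y: observed binary outcomes of Eqs. (9) and (10).
    xb1, xb2: linear indices beta_1'x_1 and beta_2'x_2 (scalars).
    """
    q1 = 2 * i - 1
    q2 = 2 * y - 1
    prob = bvn_cdf(q1 * xb1, q2 * (xb2 + gamma * i), q1 * q2 * rho)
    return math.log(prob)


# Hypothetical parameter values, for illustration only.
example = loglik_observation(i=1, y=1, xb1=0.4, xb2=-0.2, gamma=1.5, rho=0.5)
```

Summing these contributions over all provinces gives the sample log-likelihood that a maximum likelihood routine would maximize over β1, β2, γ, and ρ. Because ρ enters the joint probability directly, the two equations cannot be estimated separately unless ρ = 0.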

In the following, we briefly describe the (cross-sectional) data used in this section.Footnote 12 Table 4 presents summary statistics of the main regressors. More detailed information regarding the construction of the variables is presented in Table 16 in Appendix B and below.

Table 4 Descriptive Statistics (II)

The variables ‘institutional trap’ and ‘income trap’ are defined as suggested in Sect. 3.2. There we explain why the provinces belonging to the poor institutional quality clubs 4 and 5 are stuck in a low institutional trap. Their relative transition paths lie far below the respective paths of clubs 1–3, and neither club shows any real tendency of narrowing the gap to the high institutional clubs. Club 4 additionally shows a decreasing trend towards club 5. Analogously, we can state that the provinces identified as club members of the low-income clubs 3, 4, 5, and 6 are caught in a low-income trap. The relative transition paths of clubs 1 and 2 lie significantly above the other clubs, and clubs 3–6 do not show any sign of catching up (cf. also Kar et al. 2019).
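The mapping from club membership to the two trap dummies can be written down in a few lines. The sketch below codes a trapped province as 1 and treats a diverging province (no club) as missing; both choices, as well as all names, are assumptions made for illustration and need not match the coding used in our regressions.

```python
# Club numbers follow the classification in Sect. 3.2.
INSTITUTIONAL_TRAP_CLUBS = {4, 5}
INCOME_TRAP_CLUBS = {3, 4, 5, 6}


def trap_dummy(club, trap_clubs):
    """Return 1 if the club is a trap club, 0 otherwise, None for diverging provinces."""
    if club is None:
        return None
    return 1 if club in trap_clubs else 0


# Two memberships stated in the text: Gansu belongs to institutional club 5,
# Beijing belongs to income club 1.
gansu_institutional_trap = trap_dummy(5, INSTITUTIONAL_TRAP_CLUBS)
beijing_income_trap = trap_dummy(1, INCOME_TRAP_CLUBS)
```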

Our choice of (additional) explanatory variables of the income trap and the institutional trap is based on the results of the standard literature on the determinants of economic development and on the determinants of institutional quality, respectively. Following (among others) Mankiw et al. (1992), Hall and Jones (1999), Sachs and Warner (1997), and Rodrik et al. (2004), we choose the population (natural) growth rate, the extent of (trade and financial) globalization as well as physical and human capital as explanatory variables of the income trap (besides the institutional trap dummy). Regarding the factors determining institutional traps, we follow (among others) Easterly et al. (2006), La Porta et al. (2008), Acemoglu et al. (2001), and Sachs (2001) and employ globalization, ethnicity, and latitude as explanatory variables. As a robustness test, we also use an alternative geographical variable, namely the distance to Beijing or Shanghai (whichever is less), which is used as an instrument for institutions in the study of Glawe and Wagner (2019a). Moreover, we also use human capital as an additional explanatory variable of the institutional trap, taking into account that human capital might induce improvements in institutional quality, as argued, among others, by Glaeser et al. (2004). In the following, we provide more detailed information regarding the various variables.

We use three different measures of human capital: (i) a human capital ratio defined as the ratio of the number of students enrolled in higher education over the number of students enrolled in secondary education (as suggested, for instance, by Yao 2006 and Bonnefond 2014) and, as robustness checks, (ii) the population share aged six and above with tertiary education, and (iii) the population share aged six and above with secondary education (all calculated using NBS data).Footnote 13 Moreover, we employ a new measure of the provincial physical capital stocks devised by Holz and Sun (2018) (in logarithms). The extent of globalization is proxied by trade openness defined as the logarithm of the trade share in GDP (calculated using NBS data). As a robustness check, we use the logarithm of the FDI share in GDP as an alternative measure of (rather financial) globalization (also calculated using NBS data). Data on provincial population growth, which is supposed to hinder economic growth according to neoclassical growth theory, are also obtained from the NBS. Geography is measured by latitude, that is, the distance from the equator. As an alternative geographical variable, we compile the air distance to Beijing or Shanghai, whichever is less, calculated with the great-circle distance formula. Ethnicity is measured by the ethnic fractionalization index compiled by Yeoh (2012). It ranges between 0 and 1, where zero corresponds to a homogeneous province. Finally, in some robustness checks we add the secondary and tertiary sector shares in GDP as well as urbanization (i.e., the logarithm of the urban population share in total population), all calculated using NBS data.
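The alternative geographical variable can be illustrated with a short sketch of the great-circle calculation. It assumes the haversine form of the great-circle formula, a mean Earth radius of 6371 km, and approximate city coordinates; which reference point represents a province (for example, its capital) is likewise an assumption of this illustration and is not specified here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch

# Approximate coordinates (degrees north, degrees east), for illustration only.
BEIJING = (39.90, 116.40)
SHANGHAI = (31.23, 121.47)


def great_circle_km(point_a, point_b):
    """Great-circle distance between two (latitude, longitude) points, haversine form."""
    lat1, lon1 = map(math.radians, point_a)
    lat2, lon2 = map(math.radians, point_b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2.0) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2.0) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_nearest_hub(point):
    """Air distance to Beijing or Shanghai, whichever is less."""
    return min(great_circle_km(point, BEIJING), great_circle_km(point, SHANGHAI))


# Hypothetical reference point: approximate coordinates of Lanzhou (capital of Gansu).
lanzhou_distance = distance_to_nearest_hub((36.06, 103.83))
```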

4.2 Estimation Results

The bivariate probit estimates are displayed in Table 5. Panel A reports the estimated coefficients of the determinants of the institutional trap while Panel B shows the estimated coefficients of the determinants of the income trap. The coefficients capture the relationship between the likelihood that a province will not be in an institutional (or income) trap and the respective regressors. In Columns (1)–(3) our globalization measure is the trade share in GDP, whereas in Columns (4)–(6), we use the FDI share in GDP instead. In all columns, we use the ratio of the number of students enrolled in higher education over the number of students enrolled in secondary education to represent human capital. In additional robustness checks, we will also present the results obtained when employing alternative educational measures (namely the population share with tertiary or, alternatively, secondary education). Regarding the institutional trap equation, we find that in all columns, globalization measured either by the trade share or the FDI share in provincial GDP is statistically significant and has the expected positive effect. Including human capital as an additional regressor in Columns (2) and (5) does not change these findings; the coefficients of the globalization measures stay significant (the significance level of FDI is only slightly reduced) whereas the human capital measure is insignificant. Also adding ethnicity in Columns (3) and (6) does not change our main results; however, the significance level of the trade variable is now also slightly reduced to the 5-percent level. Ethnicity itself has a negative coefficient and is significant or very close to being so. Latitude, that is, our measure of geography, is positively signed and significant in some specifications.

Table 5 Bivariate probit estimates, latitude

Regarding the income trap equation, we find that the institutional trap dummy is highly significant at the 1-percent level and has the expected positive sign in all specifications with coefficients ranging from around 2.8 to 3.5. Regarding the remaining explanatory variables, we find that human capital is positively signed and statistically significant at the 5-percent level for most specifications. The population growth rate also has the expected negative sign and is marginally significant in Columns (1)–(3). The coefficients of physical capital have varying signs and are insignificant throughout Columns (1) to (6). The trade-to-GDP ratio has a positive sign and is significant at the 5- and 10-percent level while the FDI share is insignificant and negatively signed.

For all columns, the null hypothesis of the Wald test that rho equals zero (meaning that the two probit equations are independent) is rejected at the 1-percent level. Thus, the choice of a recursive bivariate probit model is appropriate in our case.
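The logic of this test can be sketched as follows. The sketch assumes that the Wald statistic is formed directly from the estimated correlation and its standard error and that it is compared with a chi-squared distribution with one degree of freedom; statistical packages often apply the test to a transformation of rho instead, so the numbers need not coincide with reported output. The function name and the input values are hypothetical.

```python
import math


def wald_test_rho(rho_hat, se_rho):
    """Wald test of H0: rho = 0.

    Returns the statistic W = (rho_hat / se_rho)^2 and its p-value under chi2(1).
    For one degree of freedom, P(chi2 > W) = erfc(sqrt(W / 2)).
    """
    w = (rho_hat / se_rho) ** 2
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical estimate and standard error, for illustration only.
w_stat, p_val = wald_test_rho(rho_hat=-0.9, se_rho=0.3)
```

A p-value below 0.01 corresponds to a rejection at the 1-percent level, in which case estimating the two probit equations separately would ignore the correlation between their error terms.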

In Table 6, we use an alternative geographical measure, namely the distance to Beijing or Shanghai (whichever is less). This variable is used as an instrument for institutional quality in the study of Glawe and Wagner (2019a) and passes all (possible) tests for exogeneity.Footnote 14 For an extensive discussion of the instrument see Glawe and Wagner (2019a, Sect. 2).

Table 6 Bivariate probit estimates, distance

In general, our results remain mostly unchanged. The institutional trap dummy is still highly significant with coefficients ranging from 2.7 to 3.7. Regarding the institutional quality equation, the coefficients of the trade and FDI shares are again positive and significant at the 5- to 1-percent level with coefficients around 1.7 and 1.6, respectively. The distance measure is (as expected) negatively signed and marginally significant in Columns (4) and (5). As before, adding ethnicity does not change our results. The cultural variable is again negatively signed but it fails to be statistically significant this time. Again, the p-value for the Wald test suggests that the bivariate model is appropriate (instead of running two separate probit models). Also when including latitude and the distance measure simultaneously, our results stay robust (cf. Table 7). The ethnicity variable is marginally significant (in Columns 3 and 6).

Table 7 Bivariate probit estimates, latitude and distance

In the following, we briefly discuss some further robustness checks regarding the choice of the exclusion restriction, the use of alternative proxies for human capital, as well as the consideration of structural characteristics (such as sector shares and the level of urbanization).Footnote 15 Moreover, we analyze which of our globalization measures (trade or FDI) is more important, especially for avoiding poor institutional traps.

The additional robustness checks regarding the exclusion restriction are presented in Table 8. There, we focus on a historical variable, in particular a dummy variable for having been colonized by a Western power (cf. Wang et al. 2018). Our results remain unchanged: the institutional trap dummy and the other determinants of the income and institutional traps that have been identified above as crucial stay significant.

Table 8 Bivariate probit estimates, former western colony

In a further robustness test, we employ alternative measures of human capital, namely the population share (aged six and above) with tertiary or secondary education. The respective regression results are presented in Tables 9 and 10. Again, our main findings stay robust. In some columns, the educational measures have a higher significance level (compared to the human capital ratio), particularly in the specifications that include financial globalization. The coefficients of tertiary education are on average higher than the corresponding coefficients of the secondary education measure (cf. Panel B). We also used the average years of schooling as a human capital proxy; however, even though our main findings again do not change, the mean school years barely fail to be statistically significant in some specifications. The results are reported in Appendix C, Table 17.

Table 9 Bivariate probit estimates, tertiary education
Table 10 Bivariate probit estimates, secondary education

The next robustness check is dedicated to the importance of structural characteristics. As economic development varies widely across the 30 provinces, municipalities, and autonomous regions, they consequently also find themselves at different stages of the structural transformation process: While some provinces still have a relatively strong agricultural orientation (e.g. Hainan), in various coastal provinces (such as Beijing and Tianjin), the service sector already accounts for the largest share in regional GDP (whereas the primary sector contributes only around 1 percent or less). In addition, the degree of urbanization differs significantly across provinces. Therefore, in Tables 11 and 12, we include some additional control variables to take into account these structural characteristics. Table 11 shows that when adding the sector shares,Footnote 16 the institutional trap dummy stays significant at the 1-percent level in all columns. The main regressors of the institutional trap equation also do not change; globalization is still highly significant and the geographical variables are marginally significant or close to being so. Only the human capital measure reports a decrease in its significance level when using the trade globalization measure (in the income trap equation). Regarding the additional control variables capturing differences in the provincial structural characteristics, we find that the (log) industry and tertiary sector shares in GDP are both insignificant.Footnote 17 In Table 12 we include another structural variable as an additional regressor in the income trap equation, namely the urbanization share. Unfortunately, we can only use the trade share as a measure of globalization since the model cannot be solved when using the FDI share instead. In all columns, the urbanization share is highly significant with coefficients ranging from 1.1 to 1.2. The institutional trap dummy remains significant at the 1-percent level whereas the significance of human capital is again reduced. Regarding the institutional trap equation, the trade share remains positive and highly significant. Adding human capital to the institutional trap equation also does not change our findings; as before, the coefficient of the educational measure is insignificant (cf. Table 18 in Appendix C). In sum, adding structural characteristics to our set of regressors does not change the importance of good institutions for avoiding low-income traps; however, the significance of human capital is reduced. Regarding the structural characteristics, the sector shares appear to be of minor importance whereas the level of urbanization is a decisive determinant of income traps.Footnote 18

Table 11 Bivariate probit estimates, structural characteristics
Table 12 Bivariate probit estimates, urbanization

In a final robustness check, we analyze whether trade or financial globalization is more decisive for the institutional quality of a province. When including trade and FDI simultaneously in the institutional trap equation, trade stays significant at the 5-percent level, whereas FDI turns insignificant (cf. Table 13). The main results (highly significant institutional trap dummy, positive significant impact of human capital in the income trap equation, positive impact of globalization in the institutional trap equation, etc.) do not change. These findings are robust irrespective of which geographical and cultural variables (or combination of them) we include as exclusion restrictions in the institutional equation. This indicates that trade is probably even more decisive for institutional development than FDI. We also tried to include FDI only in the institutional trap equation and trade only in the income trap equation. Both variables are highly significant and positive. However, in the reverse case (that is, when including trade only in the institutional trap equation and FDI only in the income trap equation), only trade is highly significant. Again, our main findings do not change and are robust for various combinations of exclusion restrictions (cf. Table 19 in Appendix C). Overall, we can say that both trade and FDI have a positive impact on the probability of not being caught in a poor institutional trap. However, trade additionally reduces the likelihood of being stuck in a low-income trap. Moreover, if included simultaneously, trade trumps FDI regarding its impact on institutions.

Table 13 Bivariate probit estimates, trade and financial globalization

Overall, our results suggest that poor institutional traps are crucially important determinants of low-income traps. Moreover, human capital and urbanization can play a decisive role for reducing the likelihood of income traps, whereas the likelihood of institutional traps is related to the trade and FDI performance (and, to a somewhat lesser extent, also to the ethnic fractionalization and geography of a province).Footnote 19

5 Conclusion

In our paper, we have analyzed whether there exist multiple equilibria in institutions across China’s provinces over the period 1997 to 2007. Using the log t test proposed by Phillips and Sul (2007), we find that there are multiple institutional clubs within China. In particular, we identify three rather small clubs of provinces with above-average high levels of institutional quality of which some even show a slightly increasing relative transition path. The remaining two clubs find themselves on below-average poor institutional quality paths. One of these clubs is rather large, consisting of 18 provinces, and shows a slightly declining tendency towards the other, even lower, poor institutional club which comprises only two provinces. The provinces of these two clubs are assumed to be caught in a poor institutional trap. In addition, many of these provinces are also members of the low per capita income club (in total we identify six income clubs, two high-income clubs and four low-income clubs forming the low-income trap), suggesting a positive correlation between being stuck in an income trap and being a victim of an institutions trap (the so-called phenomenon of a ‘double trap’). This hypothesis is then more formally tested by using a recursive bivariate probit model. Our results suggest that poor institutional traps are indeed important determinants of low-income traps. Moreover, human capital and urbanization appear to be additional important success factors that can increase the likelihood of avoiding a low-income trap. Population growth and the trade share are also significant factors, however to a somewhat weaker extent.

Our results imply that in order to avoid an extended period of sluggish growth, Chinese policymakers should focus on simultaneously improving the educational performance as well as the institutional framework, particularly in the low-income trapped regions.Footnote 20 Moreover, fostering trade (and also FDI) appears to be a good strategy to increase the likelihood that a province can avoid a low institutional trap. The positive impact of integration on institutional quality across Chinese provinces is also confirmed in the study of Glawe and Wagner (2019a) and is in line with findings of the general deep determinants literature, e.g. with the study of Rodrik et al. (2004). Regarding China’s future trading prospects, the picture is mixed: while the One Belt One Road initiative could provide important trade opportunities to poor (inland) provinces, the trade conflict with the US could pose a constraint on China’s trade performance. Regarding the structural characteristics of a province, we find that urbanization reduces the probability of experiencing an income trap whereas the sectoral composition appears to be only of minor importance.

As already mentioned, our findings also have important implications for whether China will become a victim of a severe growth slowdown at the middle-income range, the so-called middle-income trap. If the majority of provinces converge to lower income clubs and only a few provinces are on the growth trajectory to a high-income club, it will be increasingly difficult to sustain growth at the national level, especially when the relatively rich provinces reach the levels of more advanced economies and their growth rates naturally start decreasing (this is indeed already the case for some provinces). Activating the growth potential of the remaining provinces and also putting them on the high growth path could help counteract this tendency at the national level. However, this is only possible by first breaking through the underlying barrier, namely the institutional trap. According to our analysis, this can be fostered by improving the trade performance.Footnote 21

Finally, it is worth noting that the positive impact of more inclusive economic institutions at the provincial level (on which we focus in our paper) is also limited to some extent. As argued by Acemoglu and Robinson (2006), economic institutions do not exist in a vacuum but are supported by certain political institutions. In contrast to economic institutions which can vary across regions, all provinces are under the same political system. This could pose an additional constraint on growth at some point since according to Acemoglu and Robinson (2006) only inclusive economic and political institutions in combination can sustain long-run growth. At the same time, especially in the Chinese case, a certain political stability implied by the authoritarian system could also be beneficial for sustainable development to some extent (see Wagner 2019). Finding the right mix of political stability and increasingly inclusive political institutions (which of course can change over time) in order to ensure sustainable growth will be an interesting future challenge for China.