1 Introduction

Simulating rainfall in the tropics remains a challenge for climate models (Tian and Dong 2020). The challenge is compounded in the Maritime Continent, a large archipelago of several thousand islands in the middle of a warm ocean. The Maritime Continent plays an important role in the global weather and climate system by acting as a major source of convective heating. It is the complex interaction of the large-scale flow with the distinct diurnal cycle of rainfall over these islands that makes prediction challenging for the models (see Yoneyama and Zhang 2020 and the references therein).

Weather and climate models suffer from systematic rainfall biases (Zadra et al. 2018) that are seemingly related to biases in simulating the convective diurnal cycle (Love et al. 2011; Lee et al. 2021; Hassim et al. 2016). There are likely several reasons for these biases in current models (Dipankar et al. 2019; Yang and Slingo 2001; Peatman et al. 2014; Peatman 2021), but errors in the convection parameterization schemes (Birch et al. 2015) probably have the largest impact. Our aim, therefore, is to push the grid resolution to the limit where the convection parameterization schemes can be turned off without any loss in model performance.

With advances in supercomputing, GCMs are now capable of global climate simulations spanning a few hundred years at a grid resolution of 50–100 km (Eyring et al. 2016; Haarsma et al. 2016). These simulations can be further downscaled with a regional climate model to a 10–20 km grid resolution to capture the regional scales in detail (Qian 2008; Gianotti et al. 2012; Zhao et al. 2021). While a grid resolution of 10–20 km is found to perform reasonably well in extratropical regions (Ban et al. 2014), it is difficult to justify that such a coarse grid will perform equally well in a tropical region like Singapore, where one ideally needs a grid size of the order of kilometres (or less) to capture the small-scale local convective processes (e.g., localized thunderstorms) that characterize the local climate (Ngo-Duc et al. 2017; Nguyen et al. 2022; Hariadi et al. 2022). Based on our own experience in weather modelling and the results from existing literature (Marsham et al. 2013; Birch et al. 2014; Prein et al. 2015; Huang et al. 2019; Li et al. 2020a, b; Dipankar et al. 2020; Lu et al. 2021), we believe that a convection-permitting grid resolution with convection parameterization turned off, although computationally expensive, is better suited to studying the impact of climate change on the city-state of Singapore. A further aim of this study is to document the development of a convection-permitting climate modelling system over the Maritime Continent for downscaling CMIP6 models for the third National Climate Change projection (V3). As rainfall is the primary product of interest in V3, evaluation is based solely on rainfall characteristics over land and ocean.

Increased grid resolution with convection parameterization turned off does not, by itself, ensure good performance. Our experience in developing SINGV as the operational numerical weather prediction model (Huang et al. 2019; Dipankar et al. 2020) is an example. Several adjustments were needed (see Huang et al. 2019 and Dipankar et al. 2020 for the details) to make SINGV forecasts useful for daily use at a horizontal grid resolution of 1.5 km. These changes resulted in a science configuration specific to the tropics, as documented in Bush et al. (2019). This provides a strong basis for establishing SINGV as the regional climate model (RCM) of choice.

In this study, we investigate how well SINGV-RCM reproduces the spatial and temporal characteristics of observed rainfall beyond weather time scales. Sensitivity experiments performed to ensure the model's robustness against vertical resolution and convection parameterization are also documented. The manuscript is organized as follows: the modelling framework is described in Sect. 2, the data for model validation in Sect. 3 and the sensitivity experiments in Sect. 4. Section 5 examines the results, and the summary and discussion are given in Sect. 6.

2 Model

The current modelling system, the Singapore Regional Climate Model (SINGV-RCM), is based on the Unified Model (UM). Results presented in this study are from SINGV-RCM v1.0, which is based on the dynamical core of the Met Office Unified Model (Met UM) version 11.1 and the physics of the tropical version of the Met UM known as RA1T (Regional Atmosphere 1 – Tropical) (Bush et al. 2019). Details of the physical parameterization schemes are listed in Table 1 for completeness.

Table 1 Details of the configuration of dynamics, physics and model setup

The horizontal grid is a spherical latitude–longitude grid with Arakawa C-grid staggering of variables. The vertical grid consists of 80 levels extending from the surface to 38.5 km at the top; the levels use a height-based hybrid-η vertical coordinate with Charney and Phillips (1953) staggering of variables. A semi-Lagrangian scheme is used to treat the advection term and a semi-implicit method is used for time integration. The model time steps are 240 s (4 min) for the 8 km grid and 120 s (2 min) for the 2 km grid.

The modelling experiments are conducted to explore SINGV’s potential as a Regional Climate Model (RCM) for the region. Several previous studies with the SINGV-NWP system found it to have high skill in predicting convection realistically over the Maritime Continent (Huang et al. 2019; Dipankar et al. 2020) and over Singapore (Simón‐Moral et al. 2020; Doan et al. 2021). The downscaling is performed using ERA5 (Hersbach et al. 2019, 2020) for the initial condition and the lateral boundary conditions. The soil moisture and the sea surface temperature are also derived from ERA5. The simulation domains are shown in Fig. 1. SINGV-RCM is tested for the domains shown as solid boxes: the 8 km Maritime Continent domain [MC (D1)] and the 2 km Singapore domain [SG (D2)].

Fig. 1 Downscaling domains for this study. D1 (16.16 S–24.08 N; 79.68 E–160.248 E) is the 8 km domain, and D2 (7.29 S–9.972 N; 93.16 E–110.422 E) is the 2 km domain (in solid line)

The SINGV-RCM model setup is similar to the SINGV-NWP setup, except that it is a free run with sea surface temperature (SST) updated regularly at 3-hourly intervals from ERA5. The motivation for imposing diurnal variability of SST comes from earlier studies (Yang and Slingo 2001; Ichikawa and Yasunari 2006; Peatman et al. 2015; Dipankar et al. 2019) that have indicated its importance in forecasting precipitation in the region.

The static ancillary files used in the simulation are the Land-Sea Mask, Model orography, Soil parameters, Vegetation, land-surface type, and Leaf Area Index. The orography is from SRTM (Farr et al. 2007) whereas the rest of the static data are taken from the CCI (Defourny and ESA Land Cover CCI project team 2016). The aerosol and sea ice are fixed at their climatological values.

3 Validation data

The Integrated Multi-satellite Retrievals for GPM (IMERG) algorithm combines information from the GPM satellite constellation to estimate rainfall over the majority of the Earth's surface (Huffman et al. 2020). In the latest Version 06 release of GPM-IMERG, the algorithm fuses the early rainfall estimates collected during the operation of the TRMM satellite (2000–2015) with more recent rainfall estimates collected during the operation of the GPM satellite (2014–present). The IMERG data are available from 2001 to the present on a 0.1° spatial grid covering 60°S–60°N at all longitudes. The half-hourly data are processed to obtain hourly, daily and monthly values, when necessary, over the study period. Silva et al. (2021) carried out an extensive study of the Maritime Continent using IMERG data and showed the usefulness of this product for model validation over the region.
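The temporal aggregation mentioned above can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: it assumes that the half-hourly values are rain rates in mm/h and that samples are ordered in time and cover whole hours and days; the function names are hypothetical.

```python
from statistics import fmean


def half_hourly_to_hourly(rates_mm_per_h):
    """Average consecutive pairs of half-hourly rain rates (mm/h)
    into hourly mean rates (mm/h)."""
    if len(rates_mm_per_h) % 2 != 0:
        raise ValueError("expected an even number of half-hourly samples")
    return [
        fmean(rates_mm_per_h[i:i + 2])
        for i in range(0, len(rates_mm_per_h), 2)
    ]


def hourly_to_daily_total(hourly_rates_mm_per_h):
    """Sum hourly mean rates (mm/h), each valid for one hour,
    into daily accumulations (mm/day)."""
    if len(hourly_rates_mm_per_h) % 24 != 0:
        raise ValueError("expected whole days of hourly samples")
    return [
        sum(hourly_rates_mm_per_h[i:i + 24])
        for i in range(0, len(hourly_rates_mm_per_h), 24)
    ]


# One synthetic day at a single grid point: 48 half-hourly samples.
half_hourly = [0.0, 2.0] * 24
hourly = half_hourly_to_hourly(half_hourly)
daily = hourly_to_daily_total(hourly)
```

Averaging (rather than summing) the two half-hourly values keeps the hourly series in mm/h, which matches the units used in the figures of this paper.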

4 Sensitivity experiments

As part of the sensitivity studies, we conducted several experiments to assess how model performance responds to changes in the model setup. Those that we believe are important are listed below. The simulations were run for a month-long period and are compared with the IMERG rainfall dataset after discarding the first two days as spin-up.

4.1 Impact of changes to the vertical resolution of the forcing field data

Most global model data are coarsely resolved in the vertical compared to SINGV-RCM, which uses 80 levels up to z = 38.5 km. Vertical interpolation of driving data to higher resolution is known to produce model biases. To understand the model bias to be expected in the downscaled climate simulations due to the loss of vertical resolution in the driving data, we compared a simulation driven by ERA5 data on the full set of 137 model levels and a simulation driven by ERA5 data on only 37 pressure levels against the ERA5 reanalysis. In both cases the driving data are interpolated to the 80 hybrid-height levels of SINGV-RCM. Focus is given to the vertical velocities, considering their role in driving convection in the region. The differences between the two runs are very small in the large domain (MC; Fig. 2a) but more sizeable in the small domain (SG; Fig. 2b). The runs driven by finely and coarsely resolved data show a similar bias against the ERA5 reanalysis, which suggests that the vertical resolution of the driving data has a weaker control on the overall model behaviour than the model itself.
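To illustrate what interpolating driving data onto a different set of levels involves, the sketch below linearly interpolates a single column profile in height. It is an assumption-laden illustration only: the interpolation method actually used when preparing the driving data is not described here, pressure-level data would first need to be mapped to heights, and the profile values are made up.

```python
from bisect import bisect_right


def interp_profile(z_src, v_src, z_dst):
    """Linearly interpolate values v_src, given at ascending heights z_src,
    onto target heights z_dst. Targets outside the source range are
    clamped to the nearest end value."""
    out = []
    for z in z_dst:
        if z <= z_src[0]:
            out.append(v_src[0])
        elif z >= z_src[-1]:
            out.append(v_src[-1])
        else:
            k = bisect_right(z_src, z) - 1  # index of the level just below z
            w = (z - z_src[k]) / (z_src[k + 1] - z_src[k])
            out.append((1.0 - w) * v_src[k] + w * v_src[k + 1])
    return out


# Hypothetical coarse profile of pressure vertical velocity (Pa/s)
# on four heights (m), interpolated onto a different set of heights.
z_coarse = [0.0, 1000.0, 5000.0, 10000.0]
omega_coarse = [0.0, -0.02, -0.06, -0.01]
z_target = [0.0, 500.0, 3000.0, 10000.0]
omega_target = interp_profile(z_coarse, omega_coarse, z_target)
```

With fewer source levels, structure between the levels (for example a sharp maximum in ascent) cannot be recovered by the interpolation, which is the loss of information that the comparison between 137 and 37 levels is designed to probe.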

Fig. 2 Mean pressure vertical velocity for a Maritime Continent (MC) and b Singapore (SG) domain using different vertical levels in the forcing fields. Units in Pa/s

4.2 Explicit versus Parameterized convection

The MC domain in SINGV-RCM uses a horizontal grid spacing of 8 km, which is too coarse to resolve convection explicitly. However, global models using parameterized convection are also known for substantial biases in capturing the diurnal cycle of precipitation in the MC (Birch et al. 2015). We have, therefore, performed an experiment to see how the rainfall with and without the convection scheme compares against the observation and ERA5. The MC domain-averaged (land only) rainfall in Fig. 3 shows that the diurnal peak is too early in the parameterized simulation and the amplitude is severely overestimated when compared to IMERG. Based on this finding, we decided to turn off the convection parameterization for both the 8 km and 2 km grids in SINGV-RCM.

Fig. 3 Mean diurnal cycle of rainfall over the Maritime Continent (MC) domain. Units in mm/h

5 Results

5.1 Mean rainfall

The mean rainfall biases with respect to IMERG are shown in Fig. 4 for both MC and SG domains. The spatial structure of the bias is quite similar in the two domains and the amplitude is reasonably small. Large biases are found near the coastlines, which agrees with the earlier findings (Qian 2008; Love et al. 2011). The biases, at least in the monthly mean rainfall, do not seem to have reduced in the SG-domain with increased spatial resolution, suggesting that mean features of the rainfall can be captured relatively well even at 8 km grid resolution.

Fig. 4 Mean simulated rainfall bias with respect to IMERG for SINGV-RCM a 8 km simulations over the MC-domain, b 8 km simulations over the SG-domain and c 2 km simulations over the SG-domain. Units in mm/h. IMERG rainfall mean for the MC-domain is 0.29 mm/h and for the SG-domain is 0.385 mm/h

The mean, bias, pattern correlation coefficient (PCC) and root mean square error (RMSE) with respect to IMERG for the 8 km and 2 km simulations are shown at the top of each figure panel (Fig. 4a–c). The IMERG mean rainfall averaged over the MC-domain is 0.29 mm/h (see the figure caption). For the SINGV-RCM 8 km simulation over the MC domain (Fig. 4a), the bias is slightly positive (+0.008), the PCC is 0.57, the highest among the three panels, and the RMSE with respect to IMERG is 0.27.

The IMERG mean rainfall averaged over the SG-domain is 0.385 mm/h (see the figure caption). The simulated mean over the SG domain is 0.305 mm/h at 8 km and 0.304 mm/h at 2 km, with biases of −0.078 mm/h (8 km) and −0.080 mm/h (2 km). The bias therefore changes little between 8 and 2 km for the SG domain. Moving from the lower to the higher grid resolution in SINGV-RCM (Fig. 4b, c), the PCC increases slightly (0.42 for 8 km-explicit and 0.48 for 2 km-explicit) and the RMSE decreases slightly (0.23 for 8 km-explicit and 0.21 for 2 km-explicit).
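The four scores quoted above can be computed as in the following sketch. It assumes that model and observation are time-mean rainfall fields on a common grid, flattened to lists, that all grid points carry equal weight (no area weighting), and that the PCC is the centred Pearson correlation across grid points; the exact definitions used to produce Fig. 4 are not stated, so these choices are assumptions.

```python
from math import sqrt
from statistics import fmean


def field_scores(model, obs):
    """Return (mean, bias, pcc, rmse) of a model field against an observed
    field. Inputs are flat lists of time-mean rainfall (mm/h) on one grid."""
    if len(model) != len(obs):
        raise ValueError("model and obs must have the same number of points")
    m_mean, o_mean = fmean(model), fmean(obs)
    bias = m_mean - o_mean
    # Centred pattern correlation across grid points.
    cov = sum((m - m_mean) * (o - o_mean) for m, o in zip(model, obs))
    var_m = sum((m - m_mean) ** 2 for m in model)
    var_o = sum((o - o_mean) ** 2 for o in obs)
    pcc = cov / sqrt(var_m * var_o)
    rmse = sqrt(fmean([(m - o) ** 2 for m, o in zip(model, obs)]))
    return m_mean, bias, pcc, rmse


# Synthetic four-point example: the model is the observation shifted by 0.1.
obs_field = [0.1, 0.2, 0.3, 0.4]
model_field = [0.2, 0.3, 0.4, 0.5]
mean_value, bias, pcc, rmse = field_scores(model_field, obs_field)
```

In this synthetic case the pattern is reproduced perfectly (PCC of 1) even though every point is biased, which shows why bias, PCC and RMSE are reported together: each measures a different aspect of the agreement.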

5.2 Mean diurnal cycle of rainfall

In this section, the temporal and spatial characteristics of the mean rainfall are discussed. We start by analysing the mean diurnal cycle of rainfall over land-only (Fig. 5) and ocean-only (Fig. 6) grid points of the area bound by MC and SG domains. In the following discussions, IMERG rainfall is assumed to be the “truth” for comparison purpose whereas ERA5 rainfall is used to demonstrate the added value of downscaling over the forcing data (also ERA5).

Fig. 5 Diurnal cycle of mean rainfall over land-only grids for a MC and b SG domains. Units in mm/h

Fig. 6 Diurnal cycle of mean rainfall over ocean-only grids for a MC and b SG domains. Units in mm/h
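A mean diurnal cycle over land-only or ocean-only grid points, of the kind shown in Figs. 5 and 6, can be sketched as below. The sketch assumes hourly fields flattened to lists, a boolean mask selecting the grid points of interest, equal weighting of grid points, and a record that starts at hour 0 and covers whole days; the function and variable names are hypothetical.

```python
from statistics import fmean


def mean_diurnal_cycle(hourly_fields, mask):
    """Composite the masked-area mean rainfall by hour of day.

    hourly_fields: list of flat grids (mm/h), one per hour, starting at
                   hour 0 and covering whole days.
    mask: flat list of booleans selecting the grid points to average
          (for example land only).
    Returns 24 values, one per hour of day.
    """
    if len(hourly_fields) % 24 != 0:
        raise ValueError("expected whole days of hourly fields")
    area_mean = [
        fmean([v for v, keep in zip(field, mask) if keep])
        for field in hourly_fields
    ]
    # Every 24th sample belongs to the same hour of day.
    return [fmean(area_mean[h::24]) for h in range(24)]


# Two synthetic days on a three-point grid: two land points and one
# ocean point. Land rain occurs only at hour 15.
land_mask = [True, True, False]
fields = []
for day_peak in (2.0, 4.0):
    for hour in range(24):
        land_rate = day_peak if hour == 15 else 0.0
        fields.append([land_rate, land_rate, 9.0])

cycle = mean_diurnal_cycle(fields, land_mask)
peak_hour = cycle.index(max(cycle))
```

Passing the inverted mask gives the ocean-only cycle. The hour index refers to whatever time convention the input uses, so a conversion to local time is still needed if the fields are stored in UTC.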

It is noted that the diurnal peak time of rainfall in the ERA5 is at least a couple of hours earlier than the observation (IMERG) over the land in the MC domain. Though the rainfall intensity in the SINGV-RCM is higher than IMERG, the peak diurnal timing is well captured by the model over the MC domain. Similarly, in Fig. 5b, we see that both rainfall intensity and phase are in close agreement with the observation. It is remarkable to notice that the initiation of convection, which is quite vigorous in ERA5, is modest in SINGV-RCM at both 8 and 2 km. The initiation, however, in SINGV-RCM at 8 km appears to be delayed leading to a delay in the peak as well.

On the other hand, over the Ocean points (see Fig. 6a) the 8 km SINGV-RCM underestimates the diurnal rainfall in the night and in the early morning. The phase of the diurnal cycle appears to be reasonably captured, including the peak in the early morning at around 5 am and the afternoon peak at 2 pm SG local time. The double peak over the ocean is a pronounced feature of the diurnal rainfall in the Maritime Continent (see Love et al. 2011) wherein the peak in the afternoon is due to the daytime convection and the peak in the morning is known to be due to the night-time offshore propagation of convection. It is important to note that the rainfall in SINGV-RCM is in general underestimated over Ocean and is more pronounced over the smaller SG domain (see Fig. 6b). The reason for this underestimation is not known.

We now evaluate the spatial characteristics of the rainfall in SINGV-RCM to see if the temporal behaviour in Figs. 5 and 6 arises for the right reasons. The spatial diurnal anomalies of rainfall and 10 m winds are calculated relative to the daily averages and are depicted at 5 PM and 5 AM over the SG domain in Fig. 7. These two times are selected because they roughly correspond to the maximum rainfall over the land and the ocean points, respectively. The 10 m wind anomalies shown with the observation (IMERG) are from ERA5.
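One reading of the diurnal anomaly used here is the mean field at a given hour of day minus the daily average at each grid point. The sketch below follows that reading; it is an assumed definition for illustration, and the input is a hypothetical 24-hour composite with each field flattened to a list.

```python
from statistics import fmean


def diurnal_anomaly(composite_by_hour, hour):
    """Return the field at `hour` minus the daily mean at each grid point.

    composite_by_hour: 24 flat grids, the mean field at each hour of day.
    """
    n_points = len(composite_by_hour[0])
    daily_mean = [
        fmean([composite_by_hour[h][i] for h in range(24)])
        for i in range(n_points)
    ]
    return [composite_by_hour[hour][i] - daily_mean[i] for i in range(n_points)]


# Two synthetic grid points: point 0 rains in the late afternoon
# (land-like), point 1 rains in the early morning (ocean-like).
composite = [[0.0, 0.0] for _ in range(24)]
composite[17][0] = 2.4
composite[5][1] = 1.2

anom_5pm = diurnal_anomaly(composite, 17)
anom_5am = diurnal_anomaly(composite, 5)
```

The same function applies to each wind component separately, since the anomaly is taken independently at every grid point.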

Fig. 7 Diurnal cycle of rainfall for observation (IMERG) and SINGV-RCM at the indicated spatial resolutions, at 5 PM and 5 AM local time. The 10 m winds shown with IMERG are from ERA5. Units for rainfall in mm/h and 10 m winds in m/s

The overall agreement of the rainfall patterns in both the 8 and 2 km simulations with the observation is remarkable. The anomalous rainfall over the western coast of Sumatra and the Malay Peninsula due to wind convergence (see Fig. 7a–c) is in accord with the existing literature (Love et al. 2011; Hassim et al. 2016). Also standing out is the excessive rainfall near the eastern coast of the Malay Peninsula in SINGV-RCM, which is missing in the observation. The rainfall generation over the Malacca Strait at 5 AM is captured reasonably well in SINGV-RCM (see Fig. 7d–f). The night-time offshore propagation of rainfall off the western coast of Sumatra (Fig. 7d–f) is more pronounced in the simulations than in the observation. This overestimation is more apparent at 8 km, suggesting that increased resolution helps in reducing the bias.

5.3 Spatial representation of peak diurnal timing

Figure 8 shows the spatial variation in the timing of the diurnal rainfall peak at each grid point over the SG domain for SINGV-RCM, IMERG and ERA5. The diurnal timing in the ERA5 reanalysis differs substantially from the observation over both land and ocean grid points, whereas SINGV-RCM (Fig. 8c, d) resembles the observation reasonably well, which indicates the added value that it brings over the driving data.

Fig. 8 Spatial map of peak diurnal timing of rainfall. a IMERG, b ERA5, c SINGV-RCM 8 km, and d SINGV-RCM 2 km

We can also notice some improvement in the timing of the diurnal rainfall peak when going from the coarse-resolution SINGV-RCM 8 km simulation (Fig. 8c) to the high-resolution 2 km simulation (Fig. 8d). These experiments indicate that the explicit representation of convection combined with improved grid resolution corrects the diurnal cycle of rainfall over the Singapore domain.
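A map of peak diurnal timing such as Fig. 8 amounts to finding, at each grid point, the hour of day at which the mean diurnal cycle is largest. The sketch below assumes a 24-hour composite stored in UTC and converts to Singapore local time (UTC+8); the function name and the input layout are hypothetical.

```python
def peak_local_hour(composite_by_hour_utc, utc_offset_hours=8):
    """Return the local hour of maximum mean rainfall at each grid point.

    composite_by_hour_utc: 24 flat grids, the mean rainfall (mm/h) at each
                           UTC hour of day.
    utc_offset_hours: offset of local time from UTC (8 for Singapore).
    """
    n_points = len(composite_by_hour_utc[0])
    peaks = []
    for i in range(n_points):
        series = [composite_by_hour_utc[h][i] for h in range(24)]
        # First hour at which the maximum occurs.
        peak_utc = max(range(24), key=series.__getitem__)
        peaks.append((peak_utc + utc_offset_hours) % 24)
    return peaks


# Two synthetic grid points: point 0 peaks at 09 UTC, point 1 at 21 UTC.
composite_utc = [[0.1, 0.1] for _ in range(24)]
composite_utc[9][0] = 1.5
composite_utc[21][1] = 0.8
peak_hours = peak_local_hour(composite_utc)
```

Because the hour of day is a circular quantity, differences between two such maps should be taken modulo 24 hours (a peak at 23 h is one hour away from a peak at 0 h, not 23).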

5.4 Representation of extreme rainfall

In this section, the distribution of 95-percentile extreme rainfall at each grid point in the SINGV-RCM simulations for different model grid resolutions is discussed.

Figure 9a depicts the 95th-percentile extreme rainfall threshold at each grid point from the observed IMERG data, and Fig. 9b–d show the bias in this threshold at each grid point with respect to IMERG. For the ERA5 reanalysis (Fig. 9b), underestimation is evident at every grid point, which means that ERA5 is not able to capture extreme rainfall at the 95th-percentile threshold over the Singapore domain. The SINGV-RCM 8 km run in the MC domain, in contrast, shows a positive bias over the majority of grid points (Fig. 9c). The positive bias in the 95th-percentile rainfall intensifies from coarse to high grid resolution, i.e. from SINGV-RCM 8 km to 2 km (Fig. 9d). We also notice that the underestimation over the south-eastern Malay Peninsula region reduces progressively with increasing resolution, which is an added value of downscaling to very high grid resolution over the Singapore domain.

Fig. 9 Extreme rainfall. a 95th-percentile threshold value for IMERG; the remaining panels show the bias in this threshold with respect to IMERG for b ERA5, c SINGV-RCM 8 km, and d SINGV-RCM 2 km. Units in mm/h
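The 95th-percentile threshold and its bias at each grid point can be sketched as follows. The sketch assumes that the percentile is taken over the full time series at each grid point (dry times included) and uses linear interpolation between order statistics; whether the published figure uses all times or only wet times is not stated, so both choices are assumptions.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) of a sample, using linear interpolation
    between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def p95_bias(model_series_by_point, obs_series_by_point):
    """Model minus observed 95th-percentile rainfall at each grid point.
    Each argument is a list of time series, one per grid point."""
    return [
        percentile(m, 95) - percentile(o, 95)
        for m, o in zip(model_series_by_point, obs_series_by_point)
    ]


# One synthetic grid point: observed rates 0..20 mm/h, model 1.5 times larger.
obs_series = [[float(v) for v in range(21)]]
model_series = [[1.5 * v for v in obs_series[0]]]
obs_p95 = percentile(obs_series[0], 95)
bias_p95 = p95_bias(model_series, obs_series)
```

A positive value indicates that the model's extreme-rainfall threshold exceeds the observed one at that grid point, which corresponds to the positive-bias regions described for Fig. 9c, d.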

5.5 Distribution of rain rate over the Land and Ocean grids in SINGV-RCM

The frequency distributions of observed and modelled rainfall over the SG-domain for land-only and ocean-only grids are shown in Figs. 10 and 11, respectively.

Fig. 10 Distribution of rain rate over SINGV-RCM grids at 8 km and 2 km (for only land-grid points in SG domain). Units in mm/h

Fig. 11 Distribution of rain rate over SINGV-RCM grids at 8 km and 2 km (for only ocean-grid points in SG domain). Units in mm/h

Over land-only grid points (Fig. 10), SINGV-RCM overestimates the fraction of no-rain grid points relative to IMERG at both 8 km and 2 km, although at 2 km the percentage is quite close to IMERG. At 8 km, SINGV-RCM underestimates light rainfall in all ranges within 0–5 mm/h, whereas at 2 km rainfall is slightly overestimated in the narrow 0–0.05 mm/h range (very light rainfall category). Both resolutions slightly overestimate moderate and heavy rainfall in the ranges of 5–10 and 10–25 mm/h, respectively; encouragingly, the model remains close to the IMERG observation in these ranges at both 8 km and 2 km with explicit convection. For extreme rainfall in the range of 25–100 mm/h, the model at 8 km is slightly higher than IMERG, while the model at 2 km is slightly lower. No rainfall above 100 mm/h occurs at either resolution or in IMERG.

Over ocean-only grid points (Fig. 11), SINGV-RCM overestimates the fraction of no-rain grid points relative to IMERG at both 8 km and 2 km. SINGV-RCM also underestimates light rainfall in all ranges within 0–5 mm/h at both resolutions. Encouragingly, the model is close to the IMERG observation in the 5–10 and 10–25 mm/h ranges at both 8 km and 2 km with explicit convection. For extreme rainfall in the range of 25–100 mm/h, the model underestimates rainfall compared to IMERG at both resolutions. No rainfall above 100 mm/h occurs at either resolution or in IMERG.
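The rain-rate classes discussed above can be tabulated with a simple binning routine. The sketch below uses only the coarse classes named in the text (no rain, 0–5, 5–10, 10–25, 25–100 and above 100 mm/h); the finer sub-ranges within 0–5 mm/h, the dry threshold of exactly zero and the treatment of values that fall on a bin edge are assumptions made for illustration.

```python
from bisect import bisect_right

# Upper edges (mm/h) separating the wet classes named in the text.
EDGES = [5.0, 10.0, 25.0, 100.0]
LABELS = ["no rain", "0-5", "5-10", "10-25", "25-100", ">100"]


def rain_rate_distribution(rates, dry_threshold=0.0):
    """Return the percentage of samples in each rain-rate class.

    rates: flat list of rain rates (mm/h) pooled over time and grid points.
    dry_threshold: rates at or below this value count as "no rain".
    A rate equal to a bin edge is assigned to the higher class.
    """
    counts = [0] * len(LABELS)
    for r in rates:
        if r <= dry_threshold:
            counts[0] += 1
        else:
            counts[1 + bisect_right(EDGES, r)] += 1
    total = len(rates)
    return {label: 100.0 * c / total for label, c in zip(LABELS, counts)}


# Ten synthetic samples pooled from a hypothetical set of grid points.
sample_rates = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 7.0, 30.0]
distribution = rain_rate_distribution(sample_rates)
```

Applying the same routine to the observed and simulated samples, after regridding them to a common resolution, gives directly comparable percentages per class; the choice of dry threshold matters because satellite products and models differ in how they represent very light rain.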

With explicit representation of convection, the model simulates moderate rain rates better than light rain rates, irrespective of the model grid resolution, whereas the simulation with parameterized convection tends to over-predict light rain rates at the expense of heavy rainfall events.

6 Summary and discussion

The present work documents the regional climate model configuration (SINGV-RCM) for Singapore and the surrounding region. We have tested SINGV-RCM with two different grid resolutions of 8 km for the MC domain and 2 km over the SG domain and found that the results are generally in agreement with the observation.

Sensitivity experiments have been performed to evaluate the model’s sensitivity to the number of vertical levels and the use of convection parameterization. Results demonstrate only marginal sensitivity of SINGV-RCM to the number of vertical levels, which is desirable as most CMIP models are coarsely resolved in the vertical. The sensitivity to the use of convection parameterization is noteworthy. The diurnal phase of the mean rainfall is much better captured when the convection scheme is turned off, even at 8 km. This finding is consistent with other studies for Western Africa using the UM with a 4.5 km horizontal grid resolution (e.g., Berthou et al. 2019) and for the western Maritime Continent using the WRF model (Argüeso et al. 2020).

Although the 8 km and 2 km simulations have similar biases, the 2 km simulation shows a higher pattern correlation and a slightly lower RMSE than the 8 km simulation, which are added values of going from 8 to 2 km resolution. We also noted a better distribution of rainfall intensities (less light rain, more heavy rain). Over a large domain, the 2 km simulation is not statistically different from the 8 km explicit simulation (little sensitivity to grid resolution). There are no improvements in light rainfall in the range of 0 to 1 mm/h, but we see higher estimates in the moderate to heavy rainfall ranges of 5–10 mm/h and 10–25 mm/h over land. The model captures the heavy and extreme rainfall thresholds better than the light rainfall rates over land grid points, which is an added value of downscaling the coarse-resolution model outputs to 8 km and then to 2 km.

The overestimation of extreme rainfall in the range of 25–100 mm/h at 8 km compared to 2 km resolution (over both land and ocean) can be explained to some extent by earlier results (Pearson et al. 2014; Birch et al. 2014): the enhanced extreme rainfall at 8 km is related to a grid size that is unrealistically coarse for the explicit representation of convection, an effect that may be exacerbated where heat and moisture are plentiful, such as over the Maritime Continent.

The mean rainfall over MC and SG is found to be in good agreement with the observation overall. The diurnal phase over both land and ocean is correctly captured. The rainfall intensity over land is also correctly captured (for the right reason), but the intensity over the ocean is underestimated.