1 Introduction

Earthquake damage and loss assessment of asset portfolios is of critical importance for decision making in the context of disaster risk reduction (e.g., Calderón et al. 2021), pricing in the (re)insurance industry (e.g., Goda and Yoshikawa 2012), or rapid damage and loss estimation (e.g., Poggi et al. 2021). The accuracy and reliability of the seismic risk models used in these tasks are crucial, as uncertain or biased estimates may lead to insufficient mitigation measures, inadequate insurance policies, or erroneous allocation of resources in the aftermath of destructive events. Several studies have investigated the impact of modelling assumptions in earthquake scenarios, particularly in the characterization of the seismic demand (e.g., Crowley et al. 2005; Bommer and Crowley 2006; Goda and Hong 2008), vulnerability modelling (e.g., Silva 2019), exposure resolution (e.g., Bal et al. 2010; Dabbeek et al. 2021), asset reallocation (e.g., Bazzurro and Park 2007) and loss assessment (Silva 2016). In general, aggregated damage and loss metrics tend to be affected by large uncertainty due to the ground motion and structural capacity estimation, while the spatial resolution of the exposure model might introduce a considerable bias in the risk results. The consideration of historical earthquakes for which empirical damage and loss data are available can serve as a powerful benchmark tool for the development, validation, and calibration of seismic risk models (e.g., Spence et al. 2003; Verderame et al. 2014; Villar-Vega and Silva 2017; Riga et al. 2021). Moreover, the availability of ground motion recordings and hazard footprints (e.g., ShakeMaps—Wald et al. 2021) for such events may improve significantly the accuracy of the results (e.g., Silva and Horspool 2019) and reduce the uncertainty of the damage and loss estimates (e.g., Crowley et al. 2008).

The rapid development of machine learning (ML) algorithms has enabled new modelling approaches in the field of earthquake engineering (e.g., Xie et al. 2020), due to their ability to predict complex outcomes considering a wide range of variables. However, the assessment of damage or losses due to single events has been the target of limited investigation. Kim et al. (2020) developed a probabilistic deep neural network for assessment of losses at the urban scale. Mangalathu and Burton (2019) used damage data from the 2014 Mw 6 South Napa earthquake to develop a Long Short-Term Memory recurrent neural network that can be employed for rapid damage classification of buildings in the immediate aftermath of destructive events. Using the same dataset, Mangalathu et al. (2020) evaluated the performance of four popular ML algorithms for the prediction of the assigned ATC-20 (1995) tags on the affected buildings. Similarly, Roeslin et al. (2020) used damage data from the 2017 Mw 7.1 Puebla earthquake and four ML algorithms to develop damage classification models according to the EMS-98 scale (Grünthal 1998). In both studies, the random forest algorithm had the best performance for damage classification among all the considered algorithms, achieving an overall accuracy of 65%. Finally, Stojadinović et al. (2022) proposed a rapid earthquake loss assessment (RELA) framework, featuring a damage classification model calibrated using a subset of damage data from the 2010 Mw 5.5 Kraljevo earthquake. The RELA framework employs representative sampling to select a subset of buildings to be inspected after the occurrence of an event, and the results of the damage surveys are used by the random forest model to predict the damage states and loss of the entire portfolio.
Despite these applications of ML algorithms in the field, the majority of the suggested frameworks employ classification-based models that require input data that are usually not available, or are unknown prior to the occurrence of a seismic event. Some of the proposed methods also require the initiation of the damage assessment surveys in order to provide loss estimates. For these reasons, while such models are certainly relevant for rapid loss assessment, they are less useful for the assessment of losses considering the characteristics of past events, or probabilistic seismic risk analyses.

In the recent study by Kalakonas and Silva (2021), we explored the use of artificial neural networks (ANN; e.g., Perlovsky 2000) for seismic vulnerability modelling of building portfolios, where considerable improvements over traditional scalar methodologies were found. These benefits were assessed in terms of efficiency, sufficiency, bias and variability for a subset of typical building classes in the Balkans. The present study is the second part of an extension of the work of Kalakonas and Silva (2021), exploring the application of ANN models for earthquake scenario damage and loss assessment. To this end, five recent major seismic events were selected in the Balkan countries, a region characterized by significant seismic hazard (e.g., ESHM13, Woessner et al. 2015; ESHM20, Danciu et al. 2021) and seismic risk (e.g., Silva et al. 2020). Five alternative seismic risk modelling cases were considered with respect to the employed ground motion and fragility/vulnerability models. Regarding the former, the ANN GMM developed in the companion study by Kalakonas and Silva (2022) and common GMMs for Europe were adopted. As for the seismic fragility and vulnerability components, the ANN models from Kalakonas and Silva (2021) and the scalar models from the European Seismic Risk Model 2020 (ESRM20; Crowley et al. 2021) were adopted. The results of the earthquake scenarios amongst the different modelling cases were compared in terms of accuracy with respect to the aggregated observed damage and loss data collected in the aftermath of the considered events. To our knowledge, this is the first study to compare end-to-end the performance of ANNs against traditional methodologies for earthquake scenario simulations of building portfolios. Despite the wide range of uncertainties involved in the assessment of damage and losses due to single events, these analyses are fundamental to understand whether ML algorithms can indeed bring advantages to the field of earthquake loss assessment.

2 Description of seismic demand, exposure, fragility and vulnerability models

The exposure models of the Balkan countries considered in this study were adopted from Crowley et al. (2020), the most up-to-date European exposure model for seismic risk assessment, developed within the scope of the Horizon 2020 EU SERA project (http://www.sera-eu.org/en/home). The exposure models are publicly available through the GitLab repository https://gitlab.seismo.ethz.ch/efehr/esrm20_exposure. The exposure models cover the residential, commercial and industrial building stocks of 45 European countries, and include information regarding the number, spatial distribution, replacement cost and building typology of the assets. The GEM Building Taxonomy (Silva et al. 2022) was used to classify all building typologies in a uniform manner, considering the following attributes: construction material, lateral load resisting system (LLRS), number of stories, seismic design code level, and lateral force coefficient. The spatial resolution and level of aggregation of the exposure models vary between countries and occupancies. As further discussed in the following section, the seismic events considered in this study occurred in Albania, Croatia and Greece. In Table 1, the 30 most common building classes in the affected regions are presented.

Table 1 Most common building classes in the affected regions according to Crowley et al. (2020)

Regarding the seismic fragility and vulnerability models, two approaches were considered herein:

  • Traditional scalar IM models: We adopted the functions from the ESRM20 model (Romão et al. 2019; https://gitlab.seismo.ethz.ch/efehr/esrm20_vulnerability), which were developed following the methodology described in Martins and Silva (2020).

  • ANN vector IM models: We used the ANN models for the same building classes proposed by Kalakonas and Silva (2021).

Both approaches apply nonlinear dynamic analysis on the same equivalent SDOF oscillators, but employ different regression methods to characterize the relationship between IM and the engineering demand parameter (EDP). The scalar models employ linear regression between ln(IM) and ln(EDP) following the cloud analysis procedure (e.g., Jalayer 2015), while the ANNs use 14 input IMs and 14 neurons in a hidden layer to predict ln(EDP). The scalar models use one IM (PGA, SA(0.3 s), SA(0.6 s) or SA(1.0 s)) depending on the yielding period of the respective SDOF oscillators, while the ANNs use PGA, Arias Intensity (IA) and 12 spectral ordinates in the period range of 0.05 to 1.4 s.
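The scalar cloud-analysis regression mentioned above can be illustrated with a minimal sketch, assuming an ordinary least-squares fit of ln(EDP) against ln(IM). The function name `fit_cloud` and the IM/EDP pairs are hypothetical placeholders and are not taken from the ESRM20 models.

```python
import math


def fit_cloud(ims, edps):
    """Least-squares fit of ln(EDP) = a + b * ln(IM); returns (a, b, sigma)."""
    x = [math.log(v) for v in ims]
    y = [math.log(v) for v in edps]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual standard deviation with n - 2 degrees of freedom
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return a, b, sigma


# Synthetic placeholder data that follow EDP = 0.05 * IM**1.2 exactly
ims = [0.1, 0.2, 0.4, 0.8]
edps = [0.05 * im ** 1.2 for im in ims]
a, b, sigma = fit_cloud(ims, edps)
```

With real analysis results the residual dispersion `sigma` would be non-zero; a lower value is commonly read as a more efficient IM. The ANN approach replaces the single regressor ln(IM) with a vector of 14 IMs.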

The consequence model used in both approaches is adopted from Martins and Silva (2020), which classifies the potential damage into four states. Each damage state is associated with a limit state EDP threshold as a function of the yielding (Sdy) and the ultimate displacement (Sdu) of the capacity curve, and a mean loss ratio as presented in Table 2. Essentially, the difference between the two approaches is located in the regression analysis between IM and EDP, where the ANN models were found to be consistently sufficient and significantly more efficient than the scalar models (Kalakonas and Silva 2021). Detailed information regarding the exposure, fragility and vulnerability models can be found in the original publications and repositories.
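As an illustration of how damage-state probabilities and mean loss ratios can be combined into a single expected loss ratio, a minimal sketch follows. The loss ratios and probabilities below are placeholders chosen for the example and are not the Table 2 values; the function name `expected_loss_ratio` is hypothetical, and the labels SD, MD, ED and CD stand for the slight, moderate, extensive and complete damage states.

```python
def expected_loss_ratio(ds_probs, loss_ratios):
    """Sum of P(damage state) * mean loss ratio over all damage states."""
    if abs(sum(ds_probs.values()) - 1.0) > 1e-9:
        raise ValueError("damage-state probabilities must sum to 1")
    # States without an entry in loss_ratios (e.g. no damage) contribute zero loss
    return sum(p * loss_ratios.get(ds, 0.0) for ds, p in ds_probs.items())


# Placeholder mean loss ratios per damage state (NOT the Table 2 values)
loss_ratios = {"SD": 0.05, "MD": 0.20, "ED": 0.60, "CD": 1.00}
# Placeholder damage-state probabilities for one asset at one intensity level
ds_probs = {"none": 0.50, "SD": 0.25, "MD": 0.15, "ED": 0.07, "CD": 0.03}

elr = expected_loss_ratio(ds_probs, loss_ratios)
```

Multiplying `elr` by the replacement cost of the asset would give its expected loss under these assumed inputs.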

Table 2 Damage states, EDP thresholds and expected loss ratios from Martins and Silva (2020)

Three alternative approaches were considered for the estimation of the seismic demand:

  • ShakeMaps from the USGS.

  • Existing analytical GMMs.

  • The ANN GMM developed in the companion study (Kalakonas and Silva 2022).

The USGS ShakeMap platform provides information about the earthquake rupture characteristics and ground shaking intensities for past seismic events around the world. The IMs included are PGA, SA(0.3 s), SA(1.0 s) and SA(3.0 s), which are calculated through the integration of recorded ground motions from nearby stations, community reported intensity data and the employment of a GMM logic tree (e.g., Worden et al. 2017). Their median and standard deviation values are provided in a 30 arc-seconds resolution grid throughout the affected regions. Concerning the analytical models, three common European GMMs were employed for the estimation of SA and PGA: Akkar et al. (2014), Bindi et al. (2014) and Kotha et al. (2016); and the model of Sandıkkaya and Akkar (2017) for IA.

3 Selected seismic events

The selection of the seismic events in the Balkan countries was performed based on the total direct impact (i.e., damage and loss of physical assets due to ground shaking). Most of the major recent earthquakes in the region were considered, although an event with minor impact was also selected to evaluate the overall performance of the models. The availability of damage and loss data collected in the aftermath of the seismic events was also a critical aspect, and for this reason some recent events were excluded (e.g., 2021 Mw 6 Crete-Arkalochori). Moreover, we opted to include earthquakes as recent as possible, to ensure that the exposure models capture sufficiently the number and replacement cost of the affected buildings. Finally, earthquakes where considerable damage was caused by multiple events were not considered (e.g., 2021 Mw 6.3 Damasi earthquake in central Greece—Koukouvelas et al. 2021; Michas et al. 2022), due to the complexity in calculating accurately the impact caused by the mainshock, foreshocks and aftershocks. In total, five seismic events were selected, as described below.

3.1 2019 Mw 6.4 Durrës, Albania, earthquake

On the 26th of November 2019 an earthquake of Mw 6.4 struck the northwest region of Albania around 17 km north of the city of Durrës (e.g., Papadopoulos et al. 2020), the second largest city in the country. The rupture occurred in a reverse fault mechanism with approximately 20 km hypocentral depth (e.g., USGS; Ganas et al. 2019) and it is one of the strongest earthquakes to have occurred in Albania in the last 100 years (e.g., Vittori et al. 2021). The impact of the event was significant, with an indicative maximum intensity of VIII-IX on the EMS-98 scale. Overall, the earthquake affected 11 municipalities with varying degrees of damage, caused 51 fatalities and around 3,000 injuries (Freddi et al. 2021). In Fig. 1, the total number of buildings per administrative unit of the affected region is presented along with the location of the epicentre according to the USGS. The total direct economic losses in the residential, commercial/public, and industrial sectors were estimated by the World Bank GPURL D-RAS Team (Gunasekera et al. 2019) to be 616.7, 78.8, and 44.2 million USD, respectively. Regarding the damaged and destroyed buildings, the Earthquake Engineering Field Investigation Team (Antonov et al. 2020; Freddi et al. 2021) reported extensive damage in the masonry single-family houses, especially those built before 1990, while multi-storey RC buildings suffered primarily heavy non-structural damage. According to the Post Disaster Needs Assessment (PDNA) report by the World Bank (World Bank 2020a), 11,490 housing units were extensively damaged or destroyed, 18,980 suffered partial damage, over 64,000 had slight damage, and 714 businesses in manufacturing and trade were damaged.

Fig. 1

Epicentre of the 2019 Mw 6.4 Durrës earthquake and total number of buildings per administrative unit in the affected region

3.2 2020 Mw 5.4 Zagreb, Croatia, earthquake

The capital city of Croatia, Zagreb, was hit by a Mw 5.4 earthquake on the 22nd of March 2020, as depicted in Fig. 2. The earthquake occurred on a reverse fault with a 10 km hypocentral depth and approximately 7 km north of the city centre (e.g., Markušić et al. 2020a, b; Atalić et al. 2021). It was the strongest seismic event in the vicinity of Zagreb in the last 140 years, with an estimated MMI intensity of VII within the historic centre of the city. Severe damage was observed, causing 1 fatality, 27 injuries and displacing more than 15,000 people (Atalić et al. 2021). Damage assessment surveys were initiated in the immediate aftermath of the event and were officially concluded on the 30th of June 2020. Overall, 25,528 inspections were carried out where buildings were assigned a tag (i.e., green, yellow or red) depending on the level of structural damage (Novak et al. 2020). The green tag consisted of two subcategories: U1, usable without limitations (insignificant roof damage and minor cracks in structural and non-structural walls); and U2, usable but with recommendations (heavily damaged or collapsed chimneys, partly collapsed gable walls, damaged partition walls and small cracks in lintels and bearing walls). The yellow tags (PN1 and PN2) were assigned to buildings exhibiting moderate to significant structural and non-structural damage. The total numbers of buildings assigned to U1, U2, yellow and red tags were 10,309, 8,879, 4,988 and 1,342, respectively. The old masonry buildings in the city centre sustained most of the damage and the aggregated direct losses of the residential, commercial and public buildings were estimated at 1.20 billion EUR (or 1.32 billion USD at the time of the event) (Atalić et al. 2021).

Fig. 2

Epicentres of the 2020 Mw 5.4 Zagreb and 2020 Mw 6.4 Petrinja earthquakes, and total number of buildings per municipality in the affected region

3.3 2020 Mw 6.4 Petrinja, Croatia, earthquake

One of the strongest earthquakes in Croatia in the last 100 years occurred on the 29th of December 2020, approximately 3 km southwest of Petrinja. The earthquake rupture of Mw 6.4 took place on a shallow strike-slip fault at around 10 km depth and the ground shaking intensity near the epicentre was reported to be VIII-IX on the EMS-98 scale (Markušić et al. 2021). As a consequence, significant damage was observed in the towns of Petrinja, Glina, and Sisak, leaving 7 people dead and 26 injured. Figure 2 illustrates the epicentre of the earthquake and the total number of buildings per municipality in the affected region. A reconnaissance team reported that unreinforced masonry buildings suffered the majority of the heavy damage, especially the ones built before the implementation of the first seismic design code in 1964 (Miranda et al. 2021). Damage assessment surveys were carried out by structural engineers following the EMS-98 damage scale, which categorizes structural damage into 5 grades, from negligible or slight damage (DG1) to destroyed (DG5). As of the 21st of February 2021, the Rapid Damage and Needs Assessment (RDNA) study by the World Bank reported that 25,914 buildings had been inspected out of the 34,552 reported damaged buildings (World Bank 2021). The damage assessment inspections continued in the following months, and the final damage distribution of the 46,634 surveyed buildings is as follows: 29,078 in DG1 (negligible to slight damage); 9,741 in DG2 (moderate damage); 4,371 in DG3 (substantial to heavy damage); 2,759 in DG4 (very heavy damage) and 685 in DG5 (destroyed) (CCEE 2022). The total economic loss in the residential, commercial and industrial sectors was estimated at 2.646 billion EUR (or 3.2 billion USD at the time of the event) (World Bank 2021).

3.4 1999 Mw 6.0 Athens, Greece, earthquake

On the 7th of September 1999 Athens was struck by a Mw 6.0 earthquake at approximately 20 km from the city centre, where the rupture occurred on a normal fault with 8 km hypocentral depth (Papadimitriou et al. 2002). The event caused widespread damage primarily in 12 municipalities of Athens, and led to 143 fatalities, 7,000 injuries, and affected around 42,000 buildings (Pomonis 2002). Figure 3 presents the location of the epicentre of the event along with the total number of buildings per municipality of Athens and suburban regions. Damage assessment surveys were carried out by the Earthquake Planning and Protection Organisation (OASP), and were concluded in October 1999. The inspected buildings were classified into 3 categories according to the level of damage: green (mainly light non-structural damage), yellow (repairable structural or non-structural damage), and red (severe structural damage that will most likely lead to demolition). Overall, 64,349 buildings were surveyed, of which 38,165 were yellow-tagged, and 4,682 were assigned a red tag. The total economic cost related to the reconstruction and repair of buildings was estimated at 3 billion EUR at the time of the event, while the total loss was estimated to reach 4 billion EUR, making it the costliest natural disaster in Greece’s history (e.g., Pomonis 2002).

Fig. 3

Epicentres of the 1999 Mw 6.0 and 2019 Mw 5.3 Athens earthquakes, and total number of buildings per municipality in the affected region

3.5 2019 Mw 5.3 Athens, Greece, earthquake

A moderate earthquake occurred on the 19th of July 2019 in the vicinity of the destructive event of 1999 in Athens, as shown in Fig. 3. The rupture occurred on a normal fault at a hypocentral depth similar to that of the 1999 rupture (8 km), at around 23 km northwest of the city centre (Kouskouna et al. 2021). The Mw of the event was estimated to be 5.1 according to Kouskouna et al. (2021) and Kapetanidis et al. (2020), and 5.3 according to the USGS. The damage caused by the event was concentrated in the western and southern urban region of Athens, where certain municipalities were heavily affected by the 1999 earthquake. The damage assessment inspections were carried out by teams of engineers covering 33 municipalities and following the same methodology as in the aftermath of the 1999 event. The results of the surveys were geocoded and reported 730, 681, and 8 buildings in green, yellow and red tags, respectively (Kouskouna et al. 2021). Estimates of the economic loss related to the repair of the damaged buildings were not found.

4 Damage assessment and loss estimation framework

In the portfolio loss analysis carried out herein, five alternative modelling cases were considered associated with the characterization of the seismic demand and employment of fragility/vulnerability models:

  A. ShakeMap and scalar fragility/vulnerability models.

  B. Analytical GMMs and scalar fragility/vulnerability models.

  C. ANN GMM and scalar fragility/vulnerability models.

  D. ANN GMM and ANN fragility/vulnerability models.

  E. Analytical GMMs and ANN fragility/vulnerability models.

The earthquake scenario simulations employing ShakeMaps or analytical GMMs and scalar models (cases A and B) were carried out in the OpenQuake-engine, an open-source software for seismic hazard and risk analysis (e.g., Pagani et al. 2014; Silva et al. 2014). Silva and Horspool (2019) integrated ShakeMaps within the OpenQuake-engine, allowing direct earthquake scenario simulations for any event included in the USGS database. On the other hand, the OpenQuake-engine allows modelling finite ruptures as single or multiple plane(s), to calculate the spatial distribution of ground shaking. In this process, the rupture characteristics (i.e., Mw, hypocentre, depth, rake, strike, dip) are employed to calculate the rupture plane, and then used by the GMMs to calculate ground motion fields in the affected area (e.g., Pagani et al. 2022).

Due to the open-source nature of the OpenQuake-engine, it is possible to employ its hazard and risk libraries within a custom seismic risk assessment framework in a Python environment. In this study, such a framework was developed to carry out the earthquake scenario simulations using the ANN models. The core functions used by the OpenQuake-engine to calculate the rupture geometry, the rupture surface projection and Rjb were employed in this framework. These functions were used in the modelling cases C, D and E to predict the ground motion fields by the ANNs and the analytical GMMs. In particular, the GMMs of Akkar et al. (2014), Bindi et al. (2014), and Kotha et al. (2016) were used to estimate PGA and the spectral ordinates in modelling cases B and E, namely, cases B1/2/3 and E1/2/3, respectively. As for Arias Intensity required by the fragility and vulnerability ANN models, the model by Sandıkkaya and Akkar (2017) was used in all modelling subcases E, while the ANN GMM was used for case D. We note that the ShakeMaps are only employed with the scalar fragility and vulnerability models (case A), as the ANNs require IMs that are not currently available in the USGS platform. All the employed ground motion models in the scenario simulations use Vs30 as a parameter to account for local site effects. The Vs30 site models for the three countries were adopted from the ESRM20 (Weatherill et al. 2022), which are provided on a 30-arcsecond grid and were developed using topographic and geological proxy datasets.

Certain assumptions and modelling choices were followed in all scenario simulations. Due to the absence of regional maps to model the systematic site effects, the between site (φ) and within event/site (ψ) variabilities were combined and considered as the total intra-event variability, calculated as \(\sqrt{\phi^{2} + \psi^{2}}\). The intra-event residuals of spectral ordinates at different periods at the same site from a given event are cross-correlated as shown by various studies (e.g., Baker and Cornell 2006). Thus, it is necessary to account for such correlation when considering several IMs at multiple sites during portfolio risk analysis. We used the cross-correlation model proposed by Baker (2007) due to its ability to account for the correlation between PGA, IA and spectral ordinates. However, there is another source of correlation between the intra-event residuals of IMs at sites separated by a given distance, termed spatial cross-correlation (e.g., Weatherill et al. 2015). This source of correlation was not explicitly modelled herein due to the complexity of modelling the spatially correlated sampling of intra-event residuals for the cases C and D. We note, however, that the findings of Silva (2016) and Schiappapietra et al. (2022) suggest that the consideration of spatial correlation in earthquake scenario simulations of large heterogeneous building portfolios only increases the dispersion of the aggregated losses and does not affect the mean estimates.
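The combination of the two variability terms and the sampling of cross-correlated intra-event residuals at a single site can be sketched as follows. This is a minimal illustration under stated assumptions: the correlation matrix and standard deviations are placeholders (they are not the Baker 2007 coefficients or any GMM output), the function names are hypothetical, and sampling through a Cholesky factor is one common choice rather than a description of the framework actually implemented.

```python
import math
import random


def total_intra_event_sigma(phi, psi):
    """Combine between-site and within-event/site terms: sqrt(phi^2 + psi^2)."""
    return math.sqrt(phi ** 2 + psi ** 2)


def cholesky(matrix):
    """Lower-triangular factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_residuals(sigmas, corr, rng):
    """One set of cross-correlated intra-event residuals (ln units) at a site."""
    lower = cholesky(corr)
    z = [rng.gauss(0.0, 1.0) for _ in sigmas]
    eps = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(len(z))]
    return [s * e for s, e in zip(sigmas, eps)]


# Placeholder correlation between three IMs, e.g. PGA, SA(0.3 s) and IA
corr = [[1.0, 0.8, 0.7],
        [0.8, 1.0, 0.6],
        [0.7, 0.6, 1.0]]
# Placeholder (phi, psi) pairs per IM
sigmas = [total_intra_event_sigma(0.3, 0.4)] * 3
residuals = sample_residuals(sigmas, corr, random.Random(1))
```

In this sketch each site is sampled independently, which mirrors the choice of not modelling spatial correlation explicitly.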

The number of simulations for each modelling case was set to 1000 in order to achieve convergence, within an acceptable level of confidence, in the aggregated damage and loss metrics as suggested by Silva (2016). Due to the different levels of exposure aggregation at the various administrative units, all exposure models were disaggregated to a 60-arcsecond resolution following the findings of Kalakonas et al. (2020) and Dabbeek et al. (2021). The exposure disaggregation was applied following the methodology described in Dabbeek et al. (2021). We also note that due to the exposure aggregation and the modelling of between-event variability per IM for each simulation, spatial correlation is already accounted for to some degree in the risk calculations.
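One simple way to inspect convergence with the number of simulations is to track the relative standard error of the mean aggregated loss. The sketch below assumes this metric purely for illustration, since the convergence criterion is not specified here; the losses are synthetic lognormal samples and the function name is hypothetical.

```python
import math
import random
import statistics


def relative_standard_error(samples):
    """Standard error of the mean divided by the mean."""
    mean = statistics.fmean(samples)
    return statistics.stdev(samples) / (math.sqrt(len(samples)) * mean)


rng = random.Random(42)
# Synthetic aggregated losses, one per simulation (placeholder distribution)
losses = [rng.lognormvariate(0.0, 0.8) for _ in range(1000)]

for n in (100, 250, 500, 1000):
    print(n, round(relative_standard_error(losses[:n]), 4))
```

The relative standard error shrinks roughly with the square root of the number of simulations, so the acceptable level of confidence determines how many are needed.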

Finally, it should be highlighted that the information regarding the rupture characteristics for all seismic events was adopted from the USGS archive. The authors recognize that for certain earthquakes more accurate rupture models may exist in the literature than the ones adopted herein. Nevertheless, for the sake of consistency in the comparisons and assumptions between all the events and modelling cases, it was decided to use only the available information from the USGS. We note that such an approach will be followed after the occurrence of future seismic events, simply because accurate rupture models are typically unavailable for several months after the occurrence of earthquakes.

5 Results and discussion

For all seismic events, the median PGA estimates provided by the ShakeMap and calculated by the GMMs are presented, enabling a direct comparison of the results amongst all modelling cases. Throughout the comparisons discussed in this section, the results are presented in summary tables for each event, where a colour is assigned to the values of the predicted metrics depending on their accuracy with respect to the observations. The applied colour scheme is as follows:

  • Satisfactory accuracy (green colour): estimates within 25% of the observed value.

  • Overestimation (red colour): estimates more than 25% higher than the observed value.

  • Underestimation (blue colour): estimates more than 25% lower than the observed value.
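The colour scheme above amounts to a simple threshold rule on the ratio of predicted to observed values, sketched below. The function name `accuracy_colour` is hypothetical, and assigning an estimate that deviates by exactly 25% to the green class is an assumption of this sketch. The observed value of 740 million USD is the Durrës total loss; the predicted values are made up for the example.

```python
def accuracy_colour(predicted, observed, tolerance=0.25):
    """Classify a predicted metric against the observation."""
    if observed <= 0:
        raise ValueError("observed value must be positive")
    ratio = predicted / observed
    if ratio > 1.0 + tolerance:
        return "red"    # overestimation
    if ratio < 1.0 - tolerance:
        return "blue"   # underestimation
    return "green"      # satisfactory accuracy (boundary included by assumption)


observed_loss = 740.0  # million USD
colours = [accuracy_colour(p, observed_loss) for p in (500.0, 700.0, 1000.0)]
```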

The damage and loss information collected differs significantly between the five events, and the damage classification does not follow the four damage states adopted by both the scalar and ANN fragility functions. For this reason, it was necessary to convert and adjust the reported information into metrics compatible with the damage scale used by the fragility models. All of these assumptions and conversions are described in the following subsections, along with a brief discussion of the results for each event.

5.1 Comparison between estimated and observed impact

5.1.1 2019 Mw 6.4 Durrës, Albania, earthquake

The total direct aggregated economic loss associated with the replacement cost of the damaged and destroyed buildings in the three sectors is estimated at 740 million USD. According to the PDNA report, 31.4% and 30.8% of the total losses in the residential sector occurred in the municipalities of Durrës and Tirana, respectively, while the remaining 37.8% occurred in 9 municipalities. The PDNA reported the total number of damaged and destroyed housing units (i.e., dwellings), so it is necessary to convert these values into the number of damaged buildings. In the EEFIT report, the ratio of the number of dwellings to the number of buildings is estimated for Durrës, Tirana, and at the national level. These ratios are reported to be 1.8, 2.4, and 1.7, respectively. A reasonable approach to estimate the respective number of buildings from housing units is to calculate the weighted ratio as 1.8*0.314 + 2.4*0.308 + 1.7*0.378 = 1.947. Using the weighted ratio, the numbers of buildings were calculated from the 11,490 destroyed or extensively damaged, 18,980 partially damaged and 64,000 slightly damaged housing units. Moreover, the 714 damaged businesses responsible for the losses in the commercial and industrial occupancies were assumed to be equally distributed between the moderate damage state and the joint extensive and complete damage state. The predicted and observed aggregated damage and loss data are presented in Table 3.
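The dwelling-to-building conversion described above can be reproduced with a few lines. All input values are those quoted in the text; the variable names are illustrative, and the building counts are left unrounded because the rounding convention is not stated.

```python
# Share of residential losses per area (PDNA) and dwellings per building (EEFIT)
loss_share = {"Durres": 0.314, "Tirana": 0.308, "other": 0.378}
dwellings_per_building = {"Durres": 1.8, "Tirana": 2.4, "other": 1.7}

# Loss-weighted ratio of dwellings to buildings
weighted_ratio = sum(loss_share[k] * dwellings_per_building[k] for k in loss_share)

# Reported housing units per damage level (PDNA)
housing_units = {
    "extensive or destroyed": 11_490,
    "partial": 18_980,
    "slight": 64_000,
}

# Equivalent number of buildings per damage level
buildings = {level: units / weighted_ratio for level, units in housing_units.items()}

for level, count in buildings.items():
    print(f"{level}: {count:.0f} buildings")
```

Note that the national ratio of 1.7 is applied to the nine remaining municipalities, following the weighting in the text.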

Table 3 2019 Mw 6.4 Durrës event: Comparison between the aggregated observed and predicted damage and loss metrics from all modelling cases

The ShakeMap and scalar models (case A) overestimated the aggregated losses and damaged buildings in all damage states, especially in the extensive and complete joint state. This can be explained by the higher PGA predictions in larger areas, as depicted in Fig. 4, where the ground motion attenuates at larger distances in comparison to the other GMMs. The significantly smaller coefficient of variation of the total economic loss in comparison to the other modelling cases indicates the reduced uncertainty in the ground motion fields, a consequence of incorporating ground motion recordings and reported intensity data. Regarding the performance of analytical GMMs and scalar models (case B), the subcases B1 and B3 provided reasonable estimations. On the other hand, even though the predictions of B2 are better in the SD and MD damage states, the major overestimation in the joint ED and CD state led to a higher total economic loss.

Fig. 4

Median PGA estimates of all modelling cases for the 2019 Mw 6.4 Durrës earthquake. The black rectangle represents the earthquake rupture projection to the surface

Employing the ground shaking ANN (cases C and D) led to relatively accurate results in terms of the number of buildings in the SD and MD states, while for the joint ED and CD state there is a major and a slight overestimation by the scalar fragility models and the ANN, respectively. Consequently, the vulnerability ANNs predicted a total economic loss with a satisfactory accuracy, while the scalar vulnerability models significantly overestimated the losses. Regarding the analytical GMMs and ANN models (case E), subcases E1 and E3 significantly underestimated the damage and losses. The accuracy of the E2 subcase in the SD and MD states is quite similar to that of subcase B2. However, the number of buildings in the joint ED and CD state is noticeably less overestimated than in subcase B2, thus the predicted total loss is relatively accurate.

5.1.2 2020 Mw 5.4 Zagreb, Croatia, earthquake

For this event, the documented damage data did not require any additional adjustments in order to perform the comparisons presented in Table 4. The total loss due to the damaged and destroyed assets includes public buildings such as hospitals, schools and governmental facilities. Therefore, it should be noted that the loss associated with the residential, commercial, and industrial sectors is expected to be slightly lower than 1.32 billion USD. For the events in Croatia, additional information about the replacement cost per square meter and surface area per building estimated by the local experts was available (World Bank 2020b, 2021). For this reason, we adjusted the values in the exposure model from Crowley et al. (2020) to match these estimates. This resulted in an increase of the average cost per square meter by 9% (i.e., 1050 €/m²) and an increase of the average surface area per building by 16% (i.e., 378 m²) in the city of Zagreb.

Table 4 2020 Mw 5.4 Zagreb event: Comparison between the aggregated observed and predicted damage and loss metrics from all modelling cases

The estimates of all modelling cases show a major overestimation in the number of slightly damaged buildings, especially in Cases A and D, where the overestimation is by a factor greater than 5. The reason for such large differences is potentially related to the definitions of the SD state in the methodology of Martins and Silva (2020) and the green tags (U1 and U2) of the damage assessment methodology. The former defines the initiation of slight damage at 75% of the yielding displacement (Sdy) to account for damage in the non-structural components (e.g., partition walls) prior to the occurrence of structural damage. On the other hand, the green tags U1 and U2 were assigned to buildings suffering at least minor cracks in structural and non-structural walls and minor roof damage. Consequently, it is expected that a fraction of the buildings in the SD state was not reported or surveyed in the aftermath of the event. Modelling Cases A and C substantially overestimated the number of buildings in MD and CD, leading to high economic losses, due to the higher predicted ground shaking in the centre of Zagreb, as observed from Fig. 5. On the contrary, the analytical GMMs and ANN models (case E) heavily underestimated the overall impact of the event. The estimates of cases B and D are more realistic, but still quite diverse. The total loss estimated by subcase B2 is quite satisfactory, but the predicted number of buildings in the CD state is substantially overestimated. The accuracy of the ANN models (case D) in the MD and CD states is quite high, although the severe underestimation in the ED state led to a slightly lower economic loss than the observed value.

Fig. 5

Median PGA estimates of all modelling cases for the 2020 Mw 5.4 Zagreb earthquake. The black rectangle represents the earthquake rupture projection to the surface

5.1.3 2020 Mw 6.4 Petrinja, Croatia, earthquake

Certain assumptions need to be made to match the EMS-98 damage scale to the damage states used by Martins and Silva (2020). Given the definitions of DG4 (i.e., very heavy damage) and DG5 (i.e., destruction) of the EMS-98, it is plausible to match both damage grades to the complete damage state (CD) used by the fragility models. Also, DG1 (i.e., no structural and slight non-structural damage) can be mapped to the slight damage state (SD). Regarding DG2 (i.e., slight structural damage and moderate non-structural damage), it is reasonable to assume that a part should be matched to SD, and the remainder to the moderate damage state (MD). Similarly, DG3 (i.e., moderate structural and heavy non-structural damage) can be mapped to a joint state of moderate damage (MD) and extensive damage (ED). Considering the above, a decision was made to map 50% of the buildings in DG2 to SD and the remainder to a joint MD and ED state, along with the buildings in DG3. We note that a similar approach was proposed in the RDNA report (World Bank 2021) to correlate the EMS-98 damage grades to usability labels. As previously explained for the 2020 Zagreb event, the replacement cost per square meter and surface area per building in the affected counties of the exposure model were adjusted using information from the local experts (World Bank 2020b, 2021). This resulted in an increase of the average cost per square meter by 69% (i.e., 1050 €/m2) and an increase of the average surface area per building by 29% (i.e., 239 m2) in the affected counties. The results of the scenario simulations along with the reported damage assessment data are presented in Table 5.
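The mapping described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the dictionary keys, and the building counts are hypothetical placeholders and are not the surveyed values; only the mapping rules (DG1 to SD, 50% of DG2 to SD, the rest of DG2 plus DG3 to the joint MD and ED state, DG4 and DG5 to CD) come from the text.

```python
def map_ems98_to_damage_states(counts, dg2_to_sd_fraction=0.5):
    """Aggregate EMS-98 damage-grade counts into SD, joint MD+ED, and CD.

    counts: dict with keys "DG1".."DG5" (number of buildings per grade).
    dg2_to_sd_fraction: share of DG2 buildings assigned to SD (50% in the text).
    """
    dg2_to_sd = dg2_to_sd_fraction * counts["DG2"]
    return {
        "SD": counts["DG1"] + dg2_to_sd,
        "MD+ED": (counts["DG2"] - dg2_to_sd) + counts["DG3"],
        "CD": counts["DG4"] + counts["DG5"],
    }


# Placeholder counts for illustration only (NOT the reported survey data).
example_counts = {"DG1": 1000, "DG2": 800, "DG3": 400, "DG4": 150, "DG5": 50}
mapped = map_ems98_to_damage_states(example_counts)
print(mapped)
```

Exposing the DG2 split as a parameter makes it straightforward to test how sensitive the comparison is to the 50% assumption.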

Table 5 2020 Mw 6.4 Petrinja event: Comparison between the aggregated observed and predicted damage and loss metrics from all modelling cases

The estimates of each modelling case are dominated by the employed fragility and vulnerability models. The scalar models (cases A, B and C) overestimated the total impact of the event, while the ANN models led to overall accurate estimates, regardless of the employed ground motion model. Similar to the previous two seismic events, the predictions provided by Case A are considerably higher than the observed number of damaged buildings, because of the higher ground shaking estimation, as shown in Fig. 6.

Fig. 6

Median PGA estimates of all modelling cases for the 2020 Mw 6.4 Petrinja earthquake. The black rectangle represents the earthquake rupture projection to the surface

5.1.4 1999 Mw 6.0 Athens, Greece, earthquake

For the comparison of the results from this particular event, the approach suggested by Riga et al. (2021) was followed to map the yellow and red tags to the damage states used by Martins and Silva (2020). In this approach, the yellow-tagged buildings were assumed to be either in MD or ED states, while the red-tagged buildings were matched entirely to the CD state.

To compare the total economic loss, the reported loss of 3 billion EUR at the time of the event has to be projected to an equivalent 2020 value. To this end, a few different assumptions may be applied, accounting for different sources of impact. For example, the inflation rate of the euro from 1999 to 2020 is undoubtedly a factor to consider. One euro in 1999 is equivalent to 1.43 EUR in 2020 in terms of purchasing power according to the European Central Bank's Harmonised Index of Consumer Prices (e.g., Euro Inflation Calculator https://www.officialdata.org/Euro-inflation). Therefore, the equivalent of 3 billion EUR in 1999 is approximately 4.29 billion EUR in 2020 (or 4.9 billion USD). In the study of Pomonis (2002), it is reported that the reconstruction and repair cost of the buildings was around 1.5% of the estimated value of buildings and contents in Attica and the affected regions. Using the exposure model of Crowley et al. (2020), we calculated that 1.5% of the total replacement cost in the affected regions is approximately 4.8 billion USD, remarkably close to the previous value. Furthermore, one might claim that the same seismic event in 2020 would have a similar financial impact, as a percentage of the GDP of Greece, as in 1999. The total financial impact including indirect losses was estimated to be around 3% of Greece's GDP in 1999 (e.g., Pomonis 2002). The GDP of Greece in 2020 was 189 billion USD according to the World Bank data, and 3% of this value would be 5.67 billion USD. Recognizing that the latter estimate includes other types of losses that are not considered in the scenario simulations, we assumed that the equivalent loss in 2020 would be in the range of 4.9 to 5.67 billion USD. The observed and predicted damage and loss metrics are presented in Table 6.
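The arithmetic behind this range can be reproduced with a short Python sketch. All input values are the ones quoted in the text; the variable names are illustrative. The EUR to USD exchange rate is not stated in the text, so the sketch only back-calculates the rate implied by the quoted 4.29 billion EUR and 4.9 billion USD figures, which is an inference rather than a reported value.

```python
# Inflation-based projection (values quoted in the text).
loss_1999_eur_bn = 3.0         # reported loss at the time of the event, billion EUR
hicp_factor_1999_2020 = 1.43   # 1 EUR (1999) is equivalent to 1.43 EUR (2020)
loss_2020_eur_bn = loss_1999_eur_bn * hicp_factor_1999_2020

# The text quotes 4.9 billion USD for this amount; the exchange rate below is
# merely the one implied by the two quoted figures (an assumption, not a source value).
loss_2020_usd_bn_quoted = 4.9
implied_usd_per_eur = loss_2020_usd_bn_quoted / loss_2020_eur_bn

# GDP-based projection (values quoted in the text).
gdp_2020_usd_bn = 189.0        # GDP of Greece in 2020, billion USD
gdp_share = 0.03               # total financial impact as a share of GDP in 1999
loss_gdp_based_usd_bn = gdp_share * gdp_2020_usd_bn

loss_range_usd_bn = (loss_2020_usd_bn_quoted, loss_gdp_based_usd_bn)
print(f"Inflation-adjusted loss: {loss_2020_eur_bn:.2f} billion EUR")
print(f"Implied exchange rate:   {implied_usd_per_eur:.3f} USD/EUR")
print(f"GDP-based loss:          {loss_gdp_based_usd_bn:.2f} billion USD")
```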

Table 6 1999 Mw 6.0 Athens event: Comparison between the aggregated observed and predicted damage and loss metrics from all modelling cases

For this scenario, the trends in the damage and loss estimates are primarily influenced by the employed fragility and vulnerability models. On the one hand, the scalar models substantially overestimate the economic loss and the number of buildings in the CD state. This is particularly evident in Case A, due to the higher ground shaking predictions, as presented in Fig. 7. On the other hand, the predictions of cases D, E1 and E3 are relatively accurate. Subcase E2 is an exception, as it overestimates the economic losses and the number of buildings in the CD state. This can be attributed to the ground motion model of Bindi et al. (2014), as subcase B2 predicted the highest metrics among all cases.

Fig. 7

Median PGA estimates of all modelling cases for the 1999 Mw 6.0 Athens earthquake. The black rectangle represents the earthquake rupture projection to the surface

Furthermore, it is interesting to observe the general underestimation in the joint MD and ED state. In fact, these might be viewed as more plausible estimations rather than underestimations, as the majority of the red-tagged buildings were reconstructed and a large portion of the yellow-tagged buildings were repaired within a few years after the occurrence of the earthquake (e.g., Pomonis 2002). In 2000, a revised version of the Greek seismic design code was published and enforced, which integrated recent scientific advances in earthquake engineering and recommendations from Eurocode 8 (EC8). Although it is unknown what portion of the red- and yellow-tagged buildings were reconstructed and repaired according to the new seismic design provisions, an overall improvement of the seismic performance of the new buildings is expected. Therefore, the number of buildings in the joint MD and ED state in a 2020 simulation should be lower than the number of yellow-tagged buildings in 1999. Under the same assumption, one might argue that the number of buildings in the CD state should also be lower than the number of red-tagged structures in 1999.

5.1.5 2019 Mw 5.3 Athens, Greece, earthquake

The same approach followed for the 1999 event was used for this event to match the coloured tags to the damage states, as the same methodology was applied during the damage assessment surveys. The results are illustrated in Table 7.

Table 7 2019 Mw 5.3 Athens event: Comparison between the aggregated observed and predicted damage and loss metrics from all modelling cases

Unsurprisingly, the results follow similar trends to the ones observed and discussed for the 1999 event, given that the earthquake occurred in the vicinity of the 1999 rupture. Nevertheless, Case A performed relatively better compared to Cases B and C, considering the lower ground motion predicted by the ShakeMap (see Fig. 8). All modelling cases substantially overestimated the number of buildings tagged as green. The reasons for such differences are similar to the ones discussed for the 2020 Zagreb event. A large portion of the buildings that suffered only slight non-structural damage were potentially not reported, and therefore not surveyed. Generally, the scalar fragility models overestimated the number of damaged buildings, while the ANN fragility models predicted more plausible damage distributions. These overestimations are potentially related to the fact that the rupture is modelled with an Mw of 5.3 for the sake of consistency with the ground motion fields calculated by the ShakeMap. However, the Mw of the earthquake was reported to be 5.1 in several studies (e.g., Kouskouna et al. 2021; Kapetanidis et al. 2020). This discrepancy led to higher ground shaking, and an overall higher number of damaged buildings.

Fig. 8

Median PGA estimates of all modelling cases for the 2019 Mw 5.3 Athens earthquake. The black rectangle represents the earthquake rupture projection to the surface

To better understand the performance of each modelling approach for all seismic events, Fig. 9 illustrates their comparison to the observed economic losses and number of buildings in complete damage. The identity line is also included to facilitate the identification of the modelling approaches which led to results closer to the observations. For the modelling cases B and E, the mean estimates across all subcases were considered. These results indicate that modelling cases D and E, which employ the ANN models for the fragility and vulnerability component, led to overall satisfactory results. Case D in particular seems to consistently lead to greater accuracy, for both economic losses and buildings across all damage states.

Fig. 9

Comparison of the mean estimates among all modelling cases for all events in terms of (a) number of buildings in complete damage (top) and (b) total economic loss (bottom). The coefficient of variation of the mean estimates can be found in Tables 3, 4, 5, 6 and 7

The authors acknowledge that the accuracy of the observed damage and reported losses greatly affects the discussion in this study, and recognize that these data are affected by epistemic uncertainty and subjectivity. While the data that were used seemed to be the most up-to-date and reliable for these past seismic events, the results presented herein should be carefully interpreted. Additionally, the assumptions made to match the reported number of buildings per damage grade to the respective damage states of the fragility models are also subject to considerable uncertainty and subjectivity. We also recognize that the comparisons presented and discussed herein can be greatly extended. For example, for certain events the spatial damage distribution for each building class can be compared to the observed data (e.g., Riga et al. 2021). Such comparisons may provide important insights to explain the differences found between the estimates of the modelling cases for most of the events. Nonetheless, the focus of the present study is not to benchmark the accuracy of different seismic risk models, but rather to explore the potential benefits of employing ANNs for earthquake scenario simulations and to suggest a framework which can be used along with traditional models found in the literature.

6 Final remarks

In the present study, the application of artificial neural networks in earthquake scenario simulations of building portfolios was explored. We compared the performance of ANNs against traditional GMMs and scalar fragility/vulnerability models in the damage and loss assessment of earthquake scenarios. Five seismic events in the Balkan region with varying degrees of impact were selected, and five alternative modelling approaches were defined based on the characterization of seismic demand, fragility and vulnerability. The scenario simulations which employed ShakeMaps, analytical GMMs, and scalar fragility and vulnerability models were carried out using the OQ-engine. For the simulations using ANNs, a custom risk assessment framework was developed, which used certain components of the OQ-engine. The damage and loss predictions of the distinct modelling cases were compared to the observed data collected in the aftermath of the selected earthquakes. This is the first study to compare in an objective manner the performance of ANNs and traditional approaches in earthquake scenarios of building portfolios.

The ground motion model used for the estimation of the seismic demand has a significant impact on the aggregated damage and loss metrics. This fact has been identified and discussed by other studies in the field (e.g., Crowley et al. 2005; Silva 2016). Regarding the fragility/vulnerability component, the ANNs led to reasonable estimates in comparison to the scalar models. The best performing modelling cases throughout all events are cases D and E. This suggests that the employment of ANNs for earthquake scenarios enabled a more accurate estimation of the aggregated damage and losses. Nonetheless, further comparisons are essential to explore their performance and benefits over the traditional approaches in other applications, such as probabilistic seismic risk assessment.

Because the ANNs take a vector of IMs as input, it is necessary to employ an IM cross-correlation model when using them for the damage and loss estimation. This may be viewed as a disadvantage, although its implementation is fairly simple and does not increase the computational cost. Nevertheless, the employment of a spatial cross-correlation model (e.g., Weatherill et al. 2015) might be cumbersome and computationally demanding due to the large number of IMs, as opposed to scalar models. Another disadvantage of the ANNs is that they need a large amount of data to be trained and evaluated, as opposed to scalar models. However, in the case of analytical fragility and vulnerability models, this does not pose a considerable difficulty given the high availability of ground motion recordings (e.g., Kalakonas and Silva 2021).
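To illustrate why a (non-spatial) IM cross-correlation model is simple to apply, the following Python sketch draws standard-normal ground motion residuals that are correlated across a vector of IMs using a Cholesky factorization. It is a generic sketch and not the implementation used in this study: the IM list, the correlation coefficients, and the function name are hypothetical placeholders that are not taken from any published cross-correlation model, and spatial correlation between sites is deliberately ignored.

```python
import numpy as np

# Hypothetical IM vector and inter-IM correlation matrix (placeholder values,
# NOT taken from any published cross-correlation model).
ims = ["PGA", "SA(0.3)", "SA(1.0)"]
rho = np.array([
    [1.0, 0.8, 0.5],
    [0.8, 1.0, 0.7],
    [0.5, 0.7, 1.0],
])


def sample_cross_correlated_residuals(rho, n_samples, rng):
    """Draw standard-normal residuals (epsilons) correlated across IMs.

    Returns an array of shape (n_samples, n_ims) whose rows follow N(0, rho).
    """
    chol = np.linalg.cholesky(rho)                       # rho = chol @ chol.T
    z = rng.standard_normal((n_samples, rho.shape[0]))   # independent N(0, 1)
    return z @ chol.T


rng = np.random.default_rng(42)
eps = sample_cross_correlated_residuals(rho, 20_000, rng)

# The empirical correlation of the samples should approximate the target matrix.
empirical_rho = np.corrcoef(eps, rowvar=False)
print(np.round(empirical_rho, 2))
```

In a scenario calculation, each row of `eps` would scale the standard deviations of the ground motion model for the corresponding IMs at one site and one realisation. Adding spatial correlation would enlarge the matrix to (sites × IMs), which is where the computational burden mentioned above arises.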

It is interesting to note that for most of the events, Case A (i.e., ShakeMaps and scalar fragility/vulnerability functions) led to a significant overestimation of the results, despite the fact that ShakeMaps are undoubtedly the most reliable source of ground shaking footprints for past events at the global scale. We noted that for these particular events, only a few seismic stations were considered in the calibration of the ShakeMaps, and most of them were more than 100 km away. On the other hand, hundreds of reports regarding observations were integrated (the so-called "Did You Feel It?" reports), which can be converted to Modified Mercalli Intensity (MMI) values. For the calibration of the ShakeMaps in terms of ground motion (i.e., PGA and SA), these values were converted using the conversion model proposed by Worden et al. (2012). This conversion model was developed using damage data for California, where the vulnerability of the building stock is potentially lower than that in the Balkan region. In our region of interest, high levels of MMI can be caused by weaker ground shaking. This trend can lead to an overestimation of the ground shaking for past events where the number of seismic stations is limited. Such findings reinforce the need to incorporate, as much as possible, data from recording stations in the calibration of ground shaking footprints for past events (e.g., Silva and Horspool 2019).

Overall, the results of all modelling cases are affected by significant uncertainty, regardless of the employed fragility and vulnerability models. We anticipated that the considerably reduced record-to-record variability of the ANNs compared to the scalar fragility models, reported in Kalakonas and Silva (2021), would be reflected in the variability of the aggregated damage and loss metrics. However, this uncertainty primarily originates in the aleatory uncertainty of the prediction of the ground motion fields. For additional information regarding the reduction of uncertainty in the ground shaking component for past seismic events, readers are referred to the work of Silva and Horspool (2019). On the one hand, the significant uncertainty of the risk estimates can be constrained by employing more advanced, region- and site-specific ground motion models. On the other hand, given the recent success of other ML tools in the field, it is essential to explore their performance in the damage and loss assessment of building portfolios without the use of ground motion models. This endeavour may be very challenging and considerably different from the traditional risk frameworks, although its success would substantially decrease the uncertainty, complexity and computational cost of seismic risk assessment of building portfolios.