Abstract
Reliable earthquake fatality prediction is an important reference for post-earthquake emergency response. Seismic data are the basis for constructing earthquake casualty prediction models, and because destructive earthquake samples are scarce, the selection and evaluation of earthquake features become especially critical. To make full use of the high-dimensional survey data on destructive earthquakes recorded in the Earthquake Reports in Yunnan Province since 1992, and to improve the prediction of earthquake casualties, this paper proposes a hybrid feature importance evaluation method based on four conventional feature contribution methods: information gain (IG), the Pearson product-moment correlation coefficient (PPMCC), the Spearman rank correlation coefficient (SRCC) and mean decrease in impurity (MDI). The method ranks the importance of 63 features that affect the number of earthquake casualties in Yunnan Province, and the feature dimension is reduced accordingly. Cross-validation is then used to compare the accuracy of four machine learning models before and after dimensionality reduction. We found that (1) among the 10 features with the highest hybrid importance, there were 8 population distribution features, 1 geological hazard feature (number of landslides) and 1 damage degree feature (highest earthquake intensity); (2) feature dimensionality reduction based on hybrid feature importance can effectively improve the prediction accuracy of machine learning models; and (3) among the compared methods, the Particle Swarm Optimized Support Vector Machine model had the highest prediction accuracy, with an R² over 0.934. The results show that this method can significantly improve the prediction accuracy of machine learning models and has reference value for earthquake emergency rescue and post-disaster reconstruction work.
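The idea of combining several feature contribution scores into one ranking can be illustrated with a minimal sketch. The abstract does not state how the four scores are merged, so the scheme below (min-max normalize each method's scores, then average them) is an assumption rather than the authors' procedure. The feature names, the toy samples, and the IG and MDI values are hypothetical placeholders; only PPMCC and SRCC are actually computed here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient (PPMCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation coefficient (SRCC): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


def min_max(scores):
    """Rescale one method's scores to [0, 1] so methods are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in scores.items()}


def hybrid_importance(score_tables):
    """Assumed combination rule: mean of the normalized per-method scores."""
    normalized = [min_max(table) for table in score_tables]
    return {
        name: sum(table[name] for table in normalized) / len(normalized)
        for name in normalized[0]
    }


# Toy data (hypothetical): three features observed for five earthquakes.
features = {
    "population_density": [10, 20, 30, 40, 50],
    "landslide_count": [1, 3, 2, 5, 4],
    "aftershock_count": [3, 1, 4, 1, 5],
}
deaths = [2, 4, 6, 8, 10]

ppmcc = {name: abs(pearson(col, deaths)) for name, col in features.items()}
srcc = {name: abs(spearman(col, deaths)) for name, col in features.items()}
# Placeholder scores standing in for IG and MDI, which would normally come
# from a mutual information estimator and a fitted random forest.
ig = {"population_density": 0.9, "landslide_count": 0.5, "aftershock_count": 0.1}
mdi = {"population_density": 0.6, "landslide_count": 0.3, "aftershock_count": 0.1}

hybrid = hybrid_importance([ig, ppmcc, srcc, mdi])
ranking = sorted(hybrid, key=hybrid.get, reverse=True)
print(ranking)
```

Dimensionality reduction would then amount to keeping the top-ranked features from `ranking` before fitting the prediction models; the cutoff is a modelling choice that this sketch does not fix.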
Funding
This research was funded by the National Natural Science Foundation of China (Grant Nos. 41971369, 41561086, 42171392, 41861048); the Yunnan Province Science and Technology Fundamental Special Key Project, "Research and Application of Key Technologies for Hybrid Enhanced Smart Space Crowdsourcing in Smart Border Control between China and Myanmar" (202001AS070032, 2020–2023); and the Yunnan Provincial High-level Science and Technology Talents and Innovation Team Selection Special Project: Reserve Talents for Young and Middle-aged Academic and Technical Leaders (202205AC160014).
Author information
Contributions
YC and SP contributed to conceptualization; ML and JL contributed to methodology; ML and JL contributed to software; ML and YC contributed to validation; ML, SP and YC contributed to formal analysis; YC contributed to investigation; YC contributed to resources; ML and JL contributed to data curation; ML contributed to writing—original draft preparation; SP and YC contributed to writing—review and editing; ML contributed to visualization; SP and BH contributed to supervision; SP and BH contributed to funding acquisition; all authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Luo, M., Peng, S., Cao, Y. et al. Earthquake fatality prediction based on hybrid feature importance assessment: a case study in Yunnan Province, China. Nat Hazards 116, 3353–3376 (2023). https://doi.org/10.1007/s11069-023-05812-6
DOI: https://doi.org/10.1007/s11069-023-05812-6