Abstract
The selection of nonlandslide samples is a key issue in landslide susceptibility modeling (LSM). Because random sampling of nonlandslide locations introduces subjectivity and randomness, this paper treats LSM as a positive-unlabeled (PU) learning problem and proposes a two-step deep neural network framework (T-DNN). Using the Spy technique and iteratively trained binary classifiers, the framework identifies high-confidence negative samples from random subsamples of the unlabeled set. We tested and validated logistic regression (LR), support vector machine (SVM), and deep neural network (DNN) models trained with negative samples from this framework and from traditional random sampling. Taking the Changbai Mountain area in Jilin Province, China, as a case study, we evaluated landslide susceptibility from the regional landslide inventory and the meteorological, geographical, and human factors associated with frequent disasters. The results show that the proposed T-DNN method improves the selection of negative samples and makes the landslide susceptibility assessment more reliable and accurate: the area under the receiver operating characteristic curve (AUC) reaches 0.953. In addition, compared with traditional random negative sampling, the optimized sample set yields more stable and better prediction performance across the different classifiers.
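The first step of a Spy-based PU approach can be illustrated with a minimal sketch. It is not the authors' T-DNN: a small logistic regression stands in for the deep neural network, there is no iterative retraining, and the data are synthetic two-feature points. The names `spy_reliable_negatives`, `spy_ratio`, and `quantile`, and the values 15% and 5%, are assumptions made for illustration only.

```python
import math
import random


def predict_proba(w, b, x):
    """Sigmoid score of one sample under weights w and bias b."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=300, lr=0.5):
    """Fit logistic regression by batch gradient descent (stand-in classifier)."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += err * xij
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def spy_reliable_negatives(positives, unlabeled, spy_ratio=0.15,
                           quantile=0.05, seed=0):
    """Return unlabeled samples that score below a low quantile of spy scores.

    spy_ratio and quantile are hypothetical defaults, not values from the paper.
    """
    rng = random.Random(seed)
    n_spies = max(1, int(len(positives) * spy_ratio))
    spy_idx = set(rng.sample(range(len(positives)), n_spies))
    spies = [p for i, p in enumerate(positives) if i in spy_idx]
    kept_pos = [p for i, p in enumerate(positives) if i not in spy_idx]

    # Hide the spies in the unlabeled set and train positives (1) vs. the rest (0).
    X = kept_pos + unlabeled + spies
    y = [1] * len(kept_pos) + [0] * (len(unlabeled) + len(spies))
    w, b = train_logistic(X, y)

    # Spies are known positives, so their scores show how hidden positives
    # behave; a low quantile of those scores serves as the cutoff.
    spy_scores = sorted(predict_proba(w, b, s) for s in spies)
    threshold = spy_scores[int(quantile * (len(spy_scores) - 1))]

    return [u for u in unlabeled if predict_proba(w, b, u) < threshold]


def make_points(rng, center, n):
    """Synthetic two-feature samples drawn around (center, center)."""
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]


rng = random.Random(42)
positives = make_points(rng, 2.0, 60)            # known landslide samples
unlabeled = (make_points(rng, 2.0, 20)           # unrecorded landslides
             + make_points(rng, -2.0, 80))       # true nonlandslide samples
reliable = spy_reliable_negatives(positives, unlabeled)
print(len(reliable), "of", len(unlabeled), "unlabeled samples kept as negatives")
```

In a two-step scheme, the samples returned here would then serve as the negative class for training the final classifier, in place of negatives drawn purely at random.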
Funding
This research was supported by the National Natural Science Foundation of China (grant Nos. 41977221 and 41972267), the Interdisciplinary Research Funding Program for Ph.D. Students of Jilin University (No. 101832020DJX074), and the Jilin Provincial Science and Technology Department (No. 20190303103SF).
Cite this article
Yao, J., Qin, S., Qiao, S. et al. Application of a two-step sampling strategy based on deep neural network for landslide susceptibility mapping. Bull Eng Geol Environ 81, 148 (2022). https://doi.org/10.1007/s10064-022-02615-0
DOI: https://doi.org/10.1007/s10064-022-02615-0