Abstract
Land cover classification of mountainous environments continues to be a challenging remote sensing problem, owing to the landscape complexity of these regions. This study explored a multiple classifier system (MCS) approach to the classification of mountain land cover for the Khumbu region in the Himalayas using Sentinel-2 images and a cloud-based model framework. The relationship between classification accuracy and MCS diversity was investigated, and the effects of different diversification and combination methods on MCS classification performance were comparatively assessed for this environment. We present ten MCS models that implement a homogeneous ensemble approach, using the high-performing Random Forest (RF) algorithm as the selected classifier. The base classifiers of each MCS model were developed using different combinations of three diversity techniques: (1) distinct training sets, (2) Mean Decrease Accuracy feature selection, and (3) ‘One-vs-All’ problem reduction. The base classifier predictions of each RF-MCS model were combined using: (1) majority vote, (2) weighted argmax, and (3) a meta RF classifier. All MCS models reported higher classification accuracies than the benchmark classifier (overall accuracy with 95% confidence interval: 87.33%±0.97%), with the highest-performing model reporting an overall accuracy (±95% confidence interval) of 90.95%±0.84%.
Our key findings include: (1) MCS is effective in mountainous environments prone to noise from landscape complexities, (2) problem reduction appears to be a stronger method than feature selection for improving MCS diversity, (3) although MCS diversity and accuracy are positively correlated, our results suggest that this relationship is weak for mountainous classifications, and (4) the selected diversity methods improve the MCS's discrimination of vegetation and forest classes in mountainous land cover classifications and have a cumulative effect on MCS diversity in this context.
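The first two combination rules named above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the toy arrays, and the weights are hypothetical, and "weighted argmax" is assumed here to mean summing each base classifier's class probabilities scaled by a per-classifier weight and then taking the class with the largest total. The third rule, a meta RF classifier trained on the base predictions, is not shown.

```python
import numpy as np


def majority_vote(labels):
    """Combine hard labels of shape (n_classifiers, n_samples) by plurality."""
    labels = np.asarray(labels)
    n_classes = labels.max() + 1
    # One-hot encode every vote, then sum over classifiers:
    # counts has shape (n_samples, n_classes).
    counts = (labels[:, :, None] == np.arange(n_classes)).sum(axis=0)
    # Ties resolve to the lowest class index (argmax returns the first maximum).
    return counts.argmax(axis=1)


def weighted_argmax(probas, weights):
    """Combine probabilities of shape (n_classifiers, n_samples, n_classes).

    Assumed rule: weight each classifier's probabilities, sum, take the argmax.
    """
    probas = np.asarray(probas, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Contract the classifier axis: result has shape (n_samples, n_classes).
    combined = np.tensordot(weights, probas, axes=1)
    return combined.argmax(axis=1)


# Hypothetical toy data: three base classifiers voting on four pixels.
votes = [[0, 1, 2, 1],
         [0, 2, 2, 1],
         [1, 1, 0, 1]]
voted = majority_vote(votes)

# Hypothetical toy data: two base classifiers, two pixels, two classes.
probas = [[[0.6, 0.4], [0.3, 0.7]],
          [[0.2, 0.8], [0.6, 0.4]]]
equal = weighted_argmax(probas, [1.0, 1.0])
skewed = weighted_argmax(probas, [0.9, 0.1])
```

With equal weights the second classifier's confident vote decides the first pixel, whereas weighting the first classifier more heavily reverses that decision, which is the behaviour a weighted rule is meant to allow.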
Acknowledgements
The authors sincerely thank the anonymous reviewers for their insights and suggestions that helped improve this manuscript. The authors also thank the Google Earth Engine, Google Colab, scikit-learn, PyCM, and Geemap teams for their great work.
Electronic Supplementary Material
11629_2021_7130_MOESM1_ESM.pdf
Diversity-accuracy assessment of multiple classifier systems for the land cover classification of the Khumbu region in the Himalayas
About this article
Cite this article
Hanson, C.C., Brabyn, L. & Gurung, S.B. Diversity-accuracy assessment of multiple classifier systems for the land cover classification of the Khumbu region in the Himalayas. J. Mt. Sci. 19, 365–387 (2022). https://doi.org/10.1007/s11629-021-7130-7
DOI: https://doi.org/10.1007/s11629-021-7130-7