Exploring class imbalance with under-sampling, over-sampling, and hybrid sampling based on Mahalanobis distance for landslide susceptibility assessment: a case study of the 2018 Iburi earthquake induced landslides in Hokkaido, Japan

  • Article
Geosciences Journal

Abstract

This study evaluates how resampling with under-sampling, over-sampling, and hybrid sampling techniques affects the performance of a random forest (RF) model for landslide susceptibility assessment (LSA). The study area is Hokkaido, Japan, where the 2018 Iburi earthquake triggered a total of 5,625 landslides in a single event. The objective is to address the class imbalance issue and improve the accuracy of LSA. Conditioning factors are obtained from multiple data sources, and objective absence data sampling based on Mahalanobis distance is used to tackle the unlabeled sample problem. The RF model calculates landslide susceptibility values, from which the LSA is generated. These values are then evaluated with two diagnostic tools suited to validating and interpreting binary classification models on imbalanced data: the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). The results show improved performance with larger sample sizes, and the resampling approach yields more consistent results than random sampling within the study area. To improve the accuracy and consistency of machine learning techniques used to reduce landslide risk, the study recommends using the hybrid sampling technique and Mahalanobis distance-based absence data sampling in LSA.
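The abstract does not spell out how the Mahalanobis distance-based absence sampling is carried out, so the following is only a minimal sketch of one plausible reading: measure each candidate non-landslide location's Mahalanobis distance from the distribution of conditioning factors at landslide (presence) locations, and draw absence samples only from candidates beyond a chosen distance. The function names, the threshold rule, and the synthetic two-factor data are assumptions made for illustration and are not taken from the article.

```python
import numpy as np


def mahalanobis_to_presence(presence, candidates):
    """Mahalanobis distance of each candidate row from the presence distribution.

    presence   : (n_presence, n_factors) conditioning factors at landslide cells
    candidates : (n_candidates, n_factors) conditioning factors at unlabeled cells
    """
    mean = presence.mean(axis=0)
    cov = np.cov(presence, rowvar=False)
    # Pseudo-inverse so that a near-singular covariance does not raise an error.
    cov_inv = np.linalg.pinv(cov)
    diff = candidates - mean
    # Quadratic form diff_i^T * cov_inv * diff_i for every row i.
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))


def sample_absence(presence, candidates, n_absence, threshold, seed=0):
    """Draw absence indices at random among candidates farther than `threshold`.

    The threshold rule is an assumption of this sketch, not the article's rule.
    """
    rng = np.random.default_rng(seed)
    dist = mahalanobis_to_presence(presence, candidates)
    eligible = np.flatnonzero(dist > threshold)
    if eligible.size < n_absence:
        raise ValueError("not enough candidates beyond the distance threshold")
    return rng.choice(eligible, size=n_absence, replace=False)


# Synthetic example with two hypothetical conditioning factors.
rng = np.random.default_rng(42)
presence = rng.normal(loc=[30.0, 0.5], scale=[5.0, 0.1], size=(200, 2))
candidates = rng.uniform(low=[0.0, 0.0], high=[60.0, 1.0], size=(1000, 2))
absence_idx = sample_absence(presence, candidates, n_absence=200, threshold=2.0)
print(candidates[absence_idx][:3])
```

Selecting absence cells that lie far from the presence distribution reduces the chance of labeling an unrecorded landslide-prone cell as a non-landslide, which is the unlabeled sample problem the abstract refers to. The resulting presence and absence sets could then be resampled and passed to an RF classifier.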

Acknowledgments

This study was carried out with the support of R&D Program for Forest Science Technology (Project No. 2023476C10-2325-BB01) provided by Korea Forest Service (Korea Forestry Promotion Institute).

Author information

Corresponding author

Correspondence to Byung-Gon Chae.

Additional information

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nam, K., Kim, J. & Chae, BG. Exploring class imbalance with under-sampling, over-sampling, and hybrid sampling based on Mahalanobis distance for landslide susceptibility assessment: a case study of the 2018 Iburi earthquake induced landslides in Hokkaido, Japan. Geosci J 28, 71–94 (2024). https://doi.org/10.1007/s12303-023-0033-6

  • DOI: https://doi.org/10.1007/s12303-023-0033-6
