Enhancing landslide susceptibility modelling through a novel non-landslide sampling method and ensemble learning technique

Research output: Contribution to journal › Article › Research › peer review

Authors

  • Chao Zhou
  • Yue Wang
  • Ying Cao
  • Ramesh P. Singh
  • Bayes Ahmed
  • Mahdi Motagh
  • Yang Wang
  • Ling Chen
  • Guangchao Tan
  • Shanshan Li

External Research Organisations

  • China University of Geosciences
  • Helmholtz Centre Potsdam - German Research Centre for Geosciences
  • Chapman University
  • University College London (UCL)
  • Hydrogeology and Engineering Geology Institute of Hubei Geological Bureau

Details

Original language: English
Article number: 2327463
Number of pages: 25
Journal: Geocarto international
Volume: 39
Issue number: 1
Early online date: 27 Mar 2024
Publication status: Published - 2024

Abstract

In recent years, several catastrophic landslide events have been observed around the globe, threatening lives and infrastructure. Landslide susceptibility maps are therefore important for minimizing the impact of landslides. This study aims to extract high-quality non-landslide samples and improve the accuracy of landslide susceptibility modelling (LSM) outcomes by coupling ensemble learning with machine learning (ML). The Zigui-Badong section of the Three Gorges Reservoir area (TGRA) in China was selected as the study area. Twelve influencing factors were selected as inputs for LSM, and the relationship between each causal factor and landslide spatial development was quantitatively analyzed. A total of 179 landslides were used. About 70% of the landslide pixels were randomly selected for training, and the remaining 30% were used for validation. A Logistic Regression (LR) model was applied to produce an initial susceptibility map, and the non-landslide samples were selected within the classified low-susceptibility zone. Subsequently, two ML classifiers, the Classification and Regression Tree (CART) and the Multi-Layer Perceptron (MLP), and four coupling models, CART-Bagging, CART-Boosting, MLP-Bagging, and MLP-Boosting, were used for LSM. Finally, the receiver operating characteristic (ROC) curve and statistical analysis were applied for accuracy assessment. The results show that altitude and distance to rivers were the main causal factors of landslides in the study area. The LR-MLP-Boosting model performed best, with an accuracy of 0.986, followed by LR-CART-Bagging, LR-CART-Boosting, and LR-MLP-Bagging. Accuracy comparisons demonstrate that ensemble learning algorithms can notably enhance the LSM performance of ML classifiers, and that the Boosting algorithm marginally outperforms the Bagging algorithm. Moreover, the LR model can effectively constrain the selection range of non-landslide samples. The non-landslide sampling method constrained by LR yields higher-quality samples than the traditional random sampling method with no constraints, which leads to a better LSM.
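
The LR-constrained sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two features (stand-ins for normalised altitude and distance to rivers), the synthetic data, the 0.2 probability cut-off for the "low-susceptibility zone", and all function names are assumptions made for illustration. It also assumes the initial LR model is fitted on landslide cells against randomly drawn cells, which the abstract does not specify.

```python
import math
import random


def predict_proba(w, b, x):
    """Logistic-regression susceptibility (probability of landslide) for one cell."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=1000):
    """Fit LR weights by plain batch gradient descent; returns (weights, bias)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def sample_non_landslides(w, b, candidates, n_samples, threshold=0.2, seed=0):
    """Draw non-landslide samples only from cells the LR map rates as low susceptibility."""
    low_zone = [x for x in candidates if predict_proba(w, b, x) < threshold]
    rng = random.Random(seed)
    return rng.sample(low_zone, min(n_samples, len(low_zone)))


# Synthetic study area (assumption): landslides cluster at low values of both features.
rng = random.Random(42)
landslides = [[rng.uniform(0.0, 0.4), rng.uniform(0.0, 0.4)] for _ in range(100)]
unlabeled = [[rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)] for _ in range(500)]

# Step 1: initial LR model, landslide cells versus randomly drawn cells.
initial_negatives = rng.sample(unlabeled, 100)
X = landslides + initial_negatives
y = [1] * len(landslides) + [0] * len(initial_negatives)
w, b = train_logistic_regression(X, y)

# Step 2: resample non-landslides inside the low-susceptibility zone only.
non_landslides = sample_non_landslides(w, b, unlabeled, n_samples=50)
print(f"selected {len(non_landslides)} non-landslide samples")
```

In a workflow like the one the abstract outlines, the resampled non-landslide cells would then be combined with the landslide cells to train the downstream classifiers (CART or MLP, alone or with Bagging or Boosting); that stage is omitted here.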

Keywords

    ensemble learning, machine learning, non-landslide sampling, Reservoir landslides, susceptibility mapping

Cite this

Enhancing landslide susceptibility modelling through a novel non-landslide sampling method and ensemble learning technique. / Zhou, Chao; Wang, Yue; Cao, Ying et al.
In: Geocarto international, Vol. 39, No. 1, 2327463, 2024.

Zhou, C, Wang, Y, Cao, Y, Singh, RP, Ahmed, B, Motagh, M, Wang, Y, Chen, L, Tan, G & Li, S 2024, 'Enhancing landslide susceptibility modelling through a novel non-landslide sampling method and ensemble learning technique', Geocarto international, vol. 39, no. 1, 2327463. https://doi.org/10.1080/10106049.2024.2327463
Zhou, C., Wang, Y., Cao, Y., Singh, R. P., Ahmed, B., Motagh, M., Wang, Y., Chen, L., Tan, G., & Li, S. (2024). Enhancing landslide susceptibility modelling through a novel non-landslide sampling method and ensemble learning technique. Geocarto international, 39(1), Article 2327463. https://doi.org/10.1080/10106049.2024.2327463
Zhou C, Wang Y, Cao Y, Singh RP, Ahmed B, Motagh M et al. Enhancing landslide susceptibility modelling through a novel non-landslide sampling method and ensemble learning technique. Geocarto international. 2024;39(1):2327463. Epub 2024 Mar 27. doi: 10.1080/10106049.2024.2327463
@article{af2fb90ecf714c228da84f78b54d8384,
title = "Enhancing landslide susceptibility modelling through a novel non-landslide sampling method and ensemble learning technique",
abstract = "In recent years, several catastrophic landslide events have been observed around the globe, threatening lives and infrastructure. Landslide susceptibility maps are therefore important for minimizing the impact of landslides. This study aims to extract high-quality non-landslide samples and improve the accuracy of landslide susceptibility modelling (LSM) outcomes by coupling ensemble learning with machine learning (ML). The Zigui-Badong section of the Three Gorges Reservoir area (TGRA) in China was selected as the study area. Twelve influencing factors were selected as inputs for LSM, and the relationship between each causal factor and landslide spatial development was quantitatively analyzed. A total of 179 landslides were used. About 70% of the landslide pixels were randomly selected for training, and the remaining 30% were used for validation. A Logistic Regression (LR) model was applied to produce an initial susceptibility map, and the non-landslide samples were selected within the classified low-susceptibility zone. Subsequently, two ML classifiers, the Classification and Regression Tree (CART) and the Multi-Layer Perceptron (MLP), and four coupling models, CART-Bagging, CART-Boosting, MLP-Bagging, and MLP-Boosting, were used for LSM. Finally, the receiver operating characteristic (ROC) curve and statistical analysis were applied for accuracy assessment. The results show that altitude and distance to rivers were the main causal factors of landslides in the study area. The LR-MLP-Boosting model performed best, with an accuracy of 0.986, followed by LR-CART-Bagging, LR-CART-Boosting, and LR-MLP-Bagging. Accuracy comparisons demonstrate that ensemble learning algorithms can notably enhance the LSM performance of ML classifiers, and that the Boosting algorithm marginally outperforms the Bagging algorithm. Moreover, the LR model can effectively constrain the selection range of non-landslide samples. The non-landslide sampling method constrained by LR yields higher-quality samples than the traditional random sampling method with no constraints, which leads to a better LSM.",
keywords = "ensemble learning, machine learning, non-landslide sampling, Reservoir landslides, susceptibility mapping",
author = "Chao Zhou and Yue Wang and Ying Cao and Singh, {Ramesh P.} and Bayes Ahmed and Mahdi Motagh and Yang Wang and Ling Chen and Guangchao Tan and Shanshan Li",
note = "Funding Information: We are grateful to the anonymous reviewers for providing useful comments/suggestions that have helped us to improve an earlier version of the manuscript. The first author would like to thank the China Scholarship Council for funding his research at the German Research Centre for Geosciences. This research is funded by the National Natural Science Foundation of China (No. 42371094 and No. 41702330) and the Key Research and Development Program of Hubei Province (No. 2021BCA219).",
year = "2024",
doi = "10.1080/10106049.2024.2327463",
language = "English",
volume = "39",
journal = "Geocarto international",
issn = "1010-6049",
publisher = "Taylor and Francis Ltd.",
number = "1",

}

TY - JOUR

T1 - Enhancing landslide susceptibility modelling through a novel non-landslide sampling method and ensemble learning technique

AU - Zhou, Chao

AU - Wang, Yue

AU - Cao, Ying

AU - Singh, Ramesh P.

AU - Ahmed, Bayes

AU - Motagh, Mahdi

AU - Wang, Yang

AU - Chen, Ling

AU - Tan, Guangchao

AU - Li, Shanshan

N1 - Funding Information: We are grateful to the anonymous reviewers for providing useful comments/suggestions that have helped us to improve an earlier version of the manuscript. The first author would like to thank the China Scholarship Council for funding his research at the German Research Centre for Geosciences. This research is funded by the National Natural Science Foundation of China (No. 42371094 and No. 41702330) and the Key Research and Development Program of Hubei Province (No. 2021BCA219).

PY - 2024

Y1 - 2024

AB - In recent years, several catastrophic landslide events have been observed around the globe, threatening lives and infrastructure. Landslide susceptibility maps are therefore important for minimizing the impact of landslides. This study aims to extract high-quality non-landslide samples and improve the accuracy of landslide susceptibility modelling (LSM) outcomes by coupling ensemble learning with machine learning (ML). The Zigui-Badong section of the Three Gorges Reservoir area (TGRA) in China was selected as the study area. Twelve influencing factors were selected as inputs for LSM, and the relationship between each causal factor and landslide spatial development was quantitatively analyzed. A total of 179 landslides were used. About 70% of the landslide pixels were randomly selected for training, and the remaining 30% were used for validation. A Logistic Regression (LR) model was applied to produce an initial susceptibility map, and the non-landslide samples were selected within the classified low-susceptibility zone. Subsequently, two ML classifiers, the Classification and Regression Tree (CART) and the Multi-Layer Perceptron (MLP), and four coupling models, CART-Bagging, CART-Boosting, MLP-Bagging, and MLP-Boosting, were used for LSM. Finally, the receiver operating characteristic (ROC) curve and statistical analysis were applied for accuracy assessment. The results show that altitude and distance to rivers were the main causal factors of landslides in the study area. The LR-MLP-Boosting model performed best, with an accuracy of 0.986, followed by LR-CART-Bagging, LR-CART-Boosting, and LR-MLP-Bagging. Accuracy comparisons demonstrate that ensemble learning algorithms can notably enhance the LSM performance of ML classifiers, and that the Boosting algorithm marginally outperforms the Bagging algorithm. Moreover, the LR model can effectively constrain the selection range of non-landslide samples. The non-landslide sampling method constrained by LR yields higher-quality samples than the traditional random sampling method with no constraints, which leads to a better LSM.

KW - ensemble learning

KW - machine learning

KW - non-landslide sampling

KW - Reservoir landslides

KW - susceptibility mapping

UR - http://www.scopus.com/inward/record.url?scp=85189149475&partnerID=8YFLogxK

U2 - 10.1080/10106049.2024.2327463

DO - 10.1080/10106049.2024.2327463

M3 - Article

AN - SCOPUS:85189149475

VL - 39

JO - Geocarto international

JF - Geocarto international

SN - 1010-6049

IS - 1

M1 - 2327463

ER -